A standard for providing agents the ability to run commands to validate their work.
The rise of AI-powered code generation tools like Cursor, GitHub Copilot, and Claude has transformed how we write code. However, these tools often generate code without understanding your project's specific development workflow, leading to inconsistent formatting, broken tests, and type errors.
- Agents don't know how to validate code formatting - Generated code often doesn't match your project's formatting standards (Prettier, ESLint rules) and requires manual cleanup after every generation
- Agents don't know how to test your code - Generated code may introduce breaking changes or regressions, but agents either don't verify that existing tests still pass before suggesting changes, or they try to and use the wrong tools
- Agents don't know how to typecheck your code - Generated code often fails type checking, leading to compilation errors that could be caught earlier if the agent knew how to validate it
- Agents don't know how to run your code - There are a hundred different ways to run the code (npm start vs yarn dev, etc.), and agents don't know which one applies, so they can't run the code to validate it
- Agents don't know how to configure their environment - Agents need to run commands that usually require some sort of environment (for example, how do you run E2E tests without your .env set up properly?)
We need our agents to not only generate code, but validate that the code they wrote:
- ✅ passes tests
- ✅ is formatted correctly
- ✅ can transpile
The agents.js standard leverages the existing scripts (see docs) functionality built into npm, pnpm, yarn, and other package managers.
To configure the agents command you need to handle a few use cases:
- monorepos: running at the root vs a project/workspace
- environment: how to set up the env for the agent
- scope: which commands they should run and against what code
To achieve this you need to add a script to the package.json at the root of the repository similar to the following:
{
"scripts": {
"agents": "CI=true pnpm run --tui=false -t"
}
}
As you can see, this uses pnpm directly (you can swap in yarn, npm, or others) to proxy to the correct command and set the CI=true environment variable.
This lets the scripts (aka tasks) run in a CI-like environment, since test runners and other tools usually pick up on the variable. The same approach can be used to do things like inject Infisical environment variables automatically:
{
"scripts": {
"agents": "infisical run --projectId=agent-project --env=dev --silent -- pnpm run --tui=false -t"
}
}
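For context on why CI=true matters, here is a minimal sketch of how a tool might react to it. The isCiLike helper is hypothetical and not taken from any real test runner; actual tools each have their own detection logic, so treat this only as an illustration of the idea.

```javascript
// Hypothetical helper (an assumption for illustration, not a real library API):
// reports whether the given environment looks like CI.
function isCiLike(env) {
  const value = (env.CI ?? "").toLowerCase();
  return value === "true" || value === "1";
}

// A tool could use this to skip watch mode and interactive prompts,
// which is what an agent needs in order to get a single pass/fail result.
const interactive = !isCiLike(process.env);
console.log(interactive ? "watch mode and prompts allowed" : "single run, plain output");
```

Running a script through the agents command with CI=true set would take the non-interactive branch in a tool written this way.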
The agents.js standard also supports monorepos:
{
"scripts": {
"agents": "CI=true nx run-many --tui=false --parallel --projects=packages/* -t"
}
}
As you can see, this uses nx, but many if not all monorepo management systems allow a similar syntax. This lets you run commands like these from the root:
- pnpm agents test - run tests across all packages
- pnpm agents check - run linting and formatting across all packages
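The trailing -t is what makes these commands work: pnpm and yarn append any extra arguments to the script's command (with npm they go after a -- separator), so the argument becomes the target name. The sketch below models that expansion; expandScript is a hypothetical helper and it ignores the shell quoting a real package manager applies.

```javascript
// Hypothetical helper (an assumption for illustration): models how extra CLI
// arguments are appended to the command defined in a package.json script.
// Real package managers also handle shell quoting, which is skipped here.
function expandScript(scriptCommand, extraArgs) {
  return [scriptCommand, ...extraArgs].join(" ");
}

// The monorepo "agents" script from the example above.
const agentsScript =
  "CI=true nx run-many --tui=false --parallel --projects=packages/* -t";

// "pnpm agents test" roughly becomes:
console.log(expandScript(agentsScript, ["test"]));
// CI=true nx run-many --tui=false --parallel --projects=packages/* -t test
```

This is why the script is written to end with -t: the agent only has to supply the task name.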
You can also tweak the configuration to run against the entire codebase like so:
{
"scripts": {
"agents": "CI=true nx run-many --tui=false --parallel -t"
}
}
With --projects=packages/* removed, it will run against the entire codebase. At the project level (the code in the packages or apps dir) you can do something like this:
{
"scripts": {
"agents": "CI=true nx run --tui=false"
}
}
This runs the command only in that specific project, allowing your agents to target just the projects they need to validate.
Agents expect certain commands to exist and we recommend using these:
- test - Run the script(s) for tests
- check - Run the script(s) for linting and formatting
- fix - Run the script(s) for automatic fixes to linting and formatting
- typecheck - Run the script(s) for TypeScript type checking
- build - Run the script(s) to build (transpile) the code
- dev - Start the development server or script(s)
- validate - Run all of the validation script(s); this is used to combine things like check, typecheck, and test into one command
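As a rough illustration of the list above, the sketch below checks a parsed package.json for the recommended script names. The missingAgentScripts helper and the example project are hypothetical, and the tools in the example scripts (vitest, eslint, prettier, tsc) are assumptions; substitute whatever your project actually uses.

```javascript
// Script names recommended by the list above.
const RECOMMENDED = ["test", "check", "fix", "typecheck", "build", "dev", "validate"];

// Hypothetical helper: returns the recommended script names that a parsed
// package.json object does not define.
function missingAgentScripts(pkg) {
  const scripts = pkg.scripts ?? {};
  return RECOMMENDED.filter((name) => !(name in scripts));
}

// Hypothetical project-level package.json; the tool choices are assumptions.
const examplePkg = {
  scripts: {
    test: "vitest run",
    check: "eslint . && prettier --check .",
    fix: "eslint . --fix && prettier --write .",
    typecheck: "tsc --noEmit",
    build: "tsc",
    validate: "pnpm run check && pnpm run typecheck && pnpm run test",
  },
};

console.log(missingAgentScripts(examplePkg)); // [ 'dev' ]
```

A check like this could run in CI so that an agent never asks for a task a project has not defined.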
You will need to configure your agents to use these commands by updating your AGENTS.md, CLAUDE.md, or similar files (e.g. Cursor rules).
Caution: Work in progress
You can view pre-built examples here.