Agents.Code — Autonomous 3-Agent Coding Harness

An autonomous coding harness built with the Claude Agent SDK. It takes a short prompt and builds a complete application using three specialized agents. The planner picks the stack based on the prompt; the harness itself is stack-agnostic.

Architecture

User Prompt → [Planner] → spec.md → [Generator] → app → [Evaluator/QA] → feedback
                                          ↑                                    │
                                          └────────── fix round ◄──────────────┘

Agent	Role	Tools
Planner	Expands a 1-4 sentence prompt into a full product spec	File I/O
Generator	Builds the frontend and backend from the spec	File, Bash, Git
Evaluator	QA-tests the running app and grades it against criteria	Playwright MCP
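
The agent table above could be mirrored in code as a small typed registry. This is only a sketch: the names `AgentDefinition`, `AGENTS`, and `toolsFor` are hypothetical and are not taken from the repository. Only the roles and tool lists come from the table.

```typescript
// Hypothetical type and constant names; only the roles and tools come from the table.
type AgentName = "planner" | "generator" | "evaluator";

interface AgentDefinition {
  name: AgentName;
  role: string;
  tools: string[];
}

const AGENTS: AgentDefinition[] = [
  { name: "planner", role: "Expands a 1-4 sentence prompt into a full product spec", tools: ["File I/O"] },
  { name: "generator", role: "Builds the frontend and backend from the spec", tools: ["File", "Bash", "Git"] },
  { name: "evaluator", role: "QA-tests the running app and grades it against criteria", tools: ["Playwright MCP"] },
];

// Look up the tools a given agent is allowed to use.
function toolsFor(name: AgentName): string[] {
  const agent = AGENTS.find((a) => a.name === name);
  if (!agent) throw new Error(`Unknown agent: ${name}`);
  return agent.tools;
}

console.log(toolsFor("generator").join(", ")); // prints "File, Bash, Git"
```

Keeping the tool list per agent in one place makes it easy to see that only the generator can run shell commands and only the evaluator drives a browser.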

Quick Start

# Install dependencies
npm install

# Set your API key
cp .env.example .env
# Edit .env with your ANTHROPIC_API_KEY

# Install Playwright (for evaluator agent)
npx playwright install chromium

# Run the harness
npx tsx src/index.ts "Build a task management app with kanban boards"

Options

--output-dir <path>     Output directory (default: ./output)
--model <model>         Claude model
--max-rounds <n>        Max QA rounds (default: 3)
--max-budget <usd>      Max budget in USD (default: 50)
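
As an illustration of how these flags and defaults fit together, here is a minimal hand-rolled parser. It is a sketch under assumptions, not the harness's actual argument handling: `HarnessOptions`, `parseCliArgs`, and `parsePositive` are hypothetical names, and because no default is documented for `--model`, the field is left optional.

```typescript
// Hypothetical option shape; flag names and defaults are taken from the list above.
interface HarnessOptions {
  prompt: string;
  outputDir: string;
  model?: string; // no default is documented, so it stays optional in this sketch
  maxRounds: number;
  maxBudgetUsd: number;
}

// Parse a numeric flag value and reject anything that is not a positive number.
function parsePositive(flag: string, raw: string): number {
  const n = Number(raw);
  if (!Number.isFinite(n) || n <= 0) {
    throw new Error(`${flag} expects a positive number, got "${raw}"`);
  }
  return n;
}

function parseCliArgs(argv: string[]): HarnessOptions {
  const opts: HarnessOptions = { prompt: "", outputDir: "./output", maxRounds: 3, maxBudgetUsd: 50 };
  const positionals: string[] = [];
  const queue = [...argv];

  while (queue.length > 0) {
    const arg = queue.shift() as string;
    if (!arg.startsWith("--")) {
      positionals.push(arg); // anything that is not a flag is treated as the prompt
      continue;
    }
    const value = queue.shift(); // every flag in this sketch takes exactly one value
    if (value === undefined) throw new Error(`Missing value for ${arg}`);
    switch (arg) {
      case "--output-dir": opts.outputDir = value; break;
      case "--model": opts.model = value; break;
      case "--max-rounds": opts.maxRounds = parsePositive(arg, value); break;
      case "--max-budget": opts.maxBudgetUsd = parsePositive(arg, value); break;
      default: throw new Error(`Unknown option: ${arg}`);
    }
  }

  const [prompt] = positionals;
  if (prompt === undefined || positionals.length !== 1) {
    throw new Error("Expected exactly one prompt argument");
  }
  opts.prompt = prompt;
  return opts;
}
```

For example, `parseCliArgs(["Build a task management app with kanban boards", "--max-rounds", "5"])` would keep the default output directory and budget while raising the QA round limit to 5.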

How It Works

1. Planning: The planner agent takes your short prompt and expands it into an ambitious product spec with features, user stories, design direction, and AI-powered features.
2. Building: The generator agent reads the spec and builds the complete application in whatever stack the planner chose, committing to git at milestones.
3. QA: The evaluator agent uses Playwright to interact with the running app like a real user. It grades against four criteria (Product Depth, Functionality, Visual Design, Code Quality) and files specific bugs with file/line references.
4. Iteration: If QA fails, the generator gets the evaluator's feedback and fixes issues. This loop repeats up to --max-rounds times.
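
The plan, build, QA, and fix cycle described above can be sketched as a small orchestration loop. Everything here is an assumption made for illustration: the `Agents` interface, `QaReport` shape, 0-10 score scale, and pass threshold of 7 are hypothetical, and the real agents would be backed by the Claude Agent SDK rather than injected functions. The sketch also treats one round as one evaluation and leaves out budget tracking.

```typescript
// All names below are hypothetical; only the four criteria come from the README.
type Criterion = "Product Depth" | "Functionality" | "Visual Design" | "Code Quality";

const CRITERIA: Criterion[] = ["Product Depth", "Functionality", "Visual Design", "Code Quality"];

interface QaReport {
  scores: Record<Criterion, number>; // assumed 0-10 scale
  bugs: string[];                    // bug descriptions with file/line references
}

interface Agents {
  plan(prompt: string): Promise<string>;                  // returns the spec text
  build(spec: string, feedback: string[]): Promise<void>; // builds or fixes the app
  evaluate(spec: string): Promise<QaReport>;              // QA against the running app
}

// Assumed pass rule: every criterion must reach the threshold.
function passes(report: QaReport, threshold = 7): boolean {
  return CRITERIA.every((c) => report.scores[c] >= threshold);
}

async function runHarness(
  prompt: string,
  agents: Agents,
  maxRounds = 3,
): Promise<{ passed: boolean; rounds: number }> {
  const spec = await agents.plan(prompt);
  await agents.build(spec, []); // initial build, no feedback yet

  for (let round = 1; round <= maxRounds; round++) {
    const report = await agents.evaluate(spec);
    if (passes(report)) return { passed: true, rounds: round };
    // Fix round: hand the evaluator's bugs back to the generator if rounds remain.
    if (round < maxRounds) await agents.build(spec, report.bugs);
  }
  return { passed: false, rounds: maxRounds };
}
```

Injecting the agents as plain async functions keeps the loop testable with stubs, so the round limit and pass rule can be checked without calling a model.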

Skills & Plugins

# Frontend design skill
npx skills add vercel-labs/agent-skills

# Next.js best practices
npx skills add vercel-labs/next-skills --skill next-best-practices

# Chromium browser for Playwright MCP (used by the evaluator)
npx playwright install chromium

References

Harness Design for Long-Running Apps — Anthropic Engineering
Claude Agent SDK — Official Docs
DotNet Skills — .NET agent skills

