agent-trace

Note

agent-trace is prior systems work, not my current product lane. I keep it public as an OSS reference for local-first trace and replay of AI agent tool use, and as background for my work on reviewable AI-assisted engineering.

Important

The core idea is not “more autonomous agents.” The core idea is making AI-assisted work inspectable after the fact: what ran, what changed, what failed, and where a human should review.

Warning

This project is experimental. Do not treat it as a production security boundary without your own threat model, controls, and testing.

strace for AI agents.

Why

A coding agent rewrites 20 files in a background session. You get a pull request. You do not get the story. Which files did it read first? Why did it call the same tool three times? What failed before it found the fix?

Most tools trace LLM calls. That is one layer. The gap is everything around it: tool calls, file operations, decision points, error recovery, the actual commands the agent ran. agent-strace captures the full session and lets you replay it later. Export to Datadog, Honeycomb, New Relic, or Splunk for production observability. Set rules that stop the agent when it crosses a cost ceiling, touches the wrong file, or makes too many tool calls. When a rule trips, the agent stops: no prompt, no retry, no further damage.

Install

# With uv (recommended)
uv tool install agent-strace

# Or with pip
pip install agent-strace

# Or run without installing
uvx agent-strace replay

Zero dependencies. Python 3.10+ standard library only.

Quick start

Option 1: CLI hooks — captures prompts, responses, and hook-visible tool calls

agent-strace setup                # Claude Code hooks in ~/.claude/settings.json
agent-strace setup --cli codex    # OpenAI Codex hooks in ~/.codex/hooks.json
agent-strace setup --cli gemini   # Gemini CLI extension under ~/.gemini/extensions
agent-strace setup --cli cursor   # Cursor project hooks in .cursor/hooks.json
agent-strace setup --cli copilot  # GitHub Copilot CLI hooks in ~/.copilot/hooks
agent-strace list                 # list sessions
agent-strace replay               # replay the latest session

Full config and JSON: docs/setup.md

Option 2: MCP proxy — wraps any MCP server, works with Cursor, Windsurf, and Copilot Desktop MCP servers

agent-strace record -- npx -y @modelcontextprotocol/server-filesystem /tmp
agent-strace replay

Option 3: Python decorator — no MCP required

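from agent_trace import trace_tool, start_session, end_session

start_session(name="my-agent")

@trace_tool
def search_codebase(query: str) -> str:
    # search() is a placeholder for your own implementation
    return search(query)

search_codebase("retry logic")  # calls made while the session is open are traced

end_session()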

Full setup guide: docs/setup.md

What you can do

Understand a session

Command	What it does
`agent-strace replay <id>`	Replay a session in the terminal or as HTML
`agent-strace replay <id-a> --diff <id-b>`	Side-by-side session comparison with tool args and output delta
`agent-strace explain <id>`	Plain-English phase summary, no LLM required
`agent-strace timeline <id>`	Phase-by-phase view with costs and retries
`agent-strace why <id> <event>`	Causal chain for a specific decision
`agent-strace diff <id-a> <id-b>`	Structural or semantic session comparison
`agent-strace compare <id-a> <id-b>`	Regression report with verdict

Control and protect

Command	What it does
`agent-strace watch`	Live monitor with kill-switch rules
`agent-strace watch --timeout 30m --budget '$5'`	Watchdog mode: kills on limit and heartbeats sessions for postmortems
`agent-strace mcp-scan`	Scan runtime MCP poisoning indicators
`agent-strace audit <id>`	Audit tool calls against a policy file
`agent-strace approval list`	Human-in-the-loop approval queue
`agent-strace rbac assign`	Org and workspace-scoped role assignments
`agent-strace auth login`	SSO/OIDC login to a hosted collector
`agent-strace apply`	Apply `.agent-strace.yaml` config to local store or collector
`agent-strace workspace new`	Create an isolated workspace
`agent-strace compliance export`	Export compliance reports (EU AI Act, SOC 2, HIPAA)
`agent-strace record`	Strip secrets from traces before storage by default
`agent-strace export --anonymize`	Remove PII at export time

Analyse across sessions

Command	What it does
`agent-strace dashboard`	Multi-session overview
`agent-strace budget-report`	Weekly spend digest
`agent-strace team-report`	Team spend by author, branch, or PR
`agent-strace cognitive-debt`	Unreviewed agent-written code by session
`agent-strace context-score`	Score AGENTS.md and CLAUDE.md from session outcomes
`agent-strace lint <id>`	Flag bad behaviour patterns (loops, spirals, waste)
`agent-strace drift`	Detect behavioural drift over time
`agent-strace fingerprint`	Baseline an agent's behavioural profile
`agent-strace tree`	Show parent/child session hierarchy
`agent-strace freeze`	Freeze a tool-call sequence for regression checks
`agent-strace standup`	Plain-English summary of yesterday's sessions
`agent-strace eval <id>`	Score a session against behavioural baselines
`agent-strace eval ci`	Fail CI on behavioural regression

Export and integrate

Command	What it does
`agent-strace export --format otlp-genai`	Export to Datadog, Honeycomb, Grafana, Jaeger
`agent-strace export --format eu-ai-act`	Generate Article 12/13 audit packages
`agent-strace export --metrics`	Export per-session behavioral metrics as OTLP gauges
`agent-strace identity show`	Machine identity — sign and verify sessions
`agent-strace server`	Server-side collector for multi-agent, multi-machine
`agent-strace share <id>`	Generate a shareable HTML replay
`agent-strace sample`	Export worst sessions as JSONL for eval datasets

Full flag reference: docs/commands.md

VS Code extension

Install agent-strace from the Extensions panel to see live session activity without leaving the editor.

Feature	Description
Status bar	Live cost, tool call count, and active tool name. Click to open the event stream.
Gutter annotations	Blue border on files the agent read, amber on files it modified.
Event stream panel	Live feed: every tool call, file op, LLM request, and error.
Pause button	Stops the agent mid-session via SIGSTOP.

pip install agent-strace   # 1. install
agent-strace setup         # 2. add hooks to Claude Code; use --cli codex, gemini, cursor, or copilot for other CLIs
# 3. open project in VS Code — extension activates when .agent-traces/ exists
# 4. start Claude Code — status bar appears immediately

Full docs: docs/vscode.md

Production

OTLP export — sessions become traces, tool calls become spans:

agent-strace export <session-id> --format otlp-genai \
  --endpoint http://localhost:4318
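
The session-to-trace mapping can be pictured with a small sketch. This is not the exporter's actual code: the input `session` dict (with `id`, `tool_calls`, `start_ns`, `end_ns`) is a hypothetical shape chosen for illustration, and the `gen_ai.tool.name` attribute is an assumption about which GenAI semantic-convention key fits a tool call. The output follows the general OTLP/JSON layout of `resourceSpans`, `scopeSpans`, and `spans`.

```python
import hashlib


def _hex_id(seed: str, nbytes: int) -> str:
    """Derive a stable hex ID of nbytes bytes from a seed string."""
    return hashlib.sha256(seed.encode()).hexdigest()[: nbytes * 2]


def session_to_otlp(session: dict) -> dict:
    """Map one session to an OTLP/JSON payload: one trace, one span per tool call."""
    trace_id = _hex_id(session["id"], 16)  # OTLP trace IDs are 16 bytes
    spans = []
    for i, call in enumerate(session["tool_calls"]):
        spans.append({
            "traceId": trace_id,
            "spanId": _hex_id(f"{session['id']}:{i}", 8),  # span IDs are 8 bytes
            "name": call["tool"],
            "kind": 1,  # SPAN_KIND_INTERNAL
            # 64-bit integers are carried as strings in OTLP/JSON
            "startTimeUnixNano": str(call["start_ns"]),
            "endTimeUnixNano": str(call["end_ns"]),
            "attributes": [
                # Assumed attribute key; the real exporter may use others.
                {"key": "gen_ai.tool.name", "value": {"stringValue": call["tool"]}},
            ],
        })
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": "agent-strace"}},
            ]},
            "scopeSpans": [{"scope": {"name": "sketch"}, "spans": spans}],
        }]
    }
```

OTLP/HTTP receivers conventionally accept a JSON payload of this shape at the `/v1/traces` path, which is why every span in a session shares one `traceId` while each tool call gets its own `spanId`.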

Per-backend setup (Datadog, Honeycomb, Grafana, New Relic, Splunk, Langfuse): docs/production.md

Server-side collector — for containers, CI, and multi-machine setups:

agent-strace server --port 4317 --storage ./traces
AGENT_STRACE_ENDPOINT=http://collector:4317 python my_agent.py

Full guide: docs/server.md

Auto-instrumentation — no code changes required:

from agent_trace.integrations import instrument_langchain
instrument_langchain()

Supported: OpenAI Agents SDK, LangChain, LangGraph, CrewAI, LiteLLM, Anthropic SDK, OpenAI SDK, AWS Strands. Guide: docs/integrations.md

GitHub Actions — run evals in CI, post results to the step summary, fail on regression:

- uses: Siddhant-K-code/agent-trace@gha-v1
  with:
    config: .agent-evals.yaml
    baseline: .agent-evals-baseline.json
    tolerance: "0.05"

Marketplace listing · Action reference

How it works

Claude Code hooks — Claude Code fires hook events at every stage of its agentic loop. agent-strace registers as a handler, reads JSON from stdin, and writes trace events. Each hook runs as a separate process; session state in .agent-traces/.active-session correlates PreToolUse and PostToolUse for latency measurement.
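
A minimal sketch of the correlation step described above, not the project's actual handler. It assumes the hook payload carries `hook_event_name`, `tool_name`, and `session_id` fields, and it uses a hypothetical JSON state file in place of `.agent-traces/.active-session`, whose real format is not documented here. The function name `handle_hook` is likewise made up for illustration.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Optional


def handle_hook(payload: dict, state_path: Path, now: Optional[float] = None) -> dict:
    """Turn one hook payload into a trace event, pairing Pre/Post by tool name."""
    now = time.time() if now is None else now
    # Each hook runs in its own process, so pending start times live on disk.
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    event = {
        "session_id": payload.get("session_id"),
        "type": payload.get("hook_event_name"),
        "tool": payload.get("tool_name"),
        "ts": now,
    }
    key = f"{event['session_id']}:{event['tool']}"
    if event["type"] == "PreToolUse":
        state[key] = now  # remember when the call started
    elif event["type"] == "PostToolUse" and key in state:
        event["latency_s"] = round(now - state.pop(key), 6)
    state_path.write_text(json.dumps(state))
    return event


def demo() -> dict:
    """Simulate two separate hook invocations sharing one state file."""
    with tempfile.TemporaryDirectory() as tmp:
        state_path = Path(tmp) / "active-session.json"
        base = {"session_id": "s1", "tool_name": "Read"}
        handle_hook({**base, "hook_event_name": "PreToolUse"}, state_path, now=100.0)
        return handle_hook({**base, "hook_event_name": "PostToolUse"}, state_path, now=100.25)
```

In a real hook process the payload would come from `json.load(sys.stdin)` rather than a literal dict. Keying the state by session and tool name is a simplification: overlapping calls to the same tool would need a per-call identifier.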

MCP stdio proxy — sits between the agent and the MCP server, reads JSON-RPC messages (Content-Length framed or newline-delimited), classifies each one, and writes a trace event. Messages are forwarded unchanged. The agent and server do not know the proxy exists.
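
The two framings and the classification step can be sketched as follows. This is an illustration under stated assumptions, not the proxy's source: `read_messages` and `classify` are hypothetical names, and the category labels simply follow the JSON-RPC 2.0 distinction between requests, notifications, responses, and errors.

```python
import io
import json
from typing import BinaryIO, Iterator, List


def read_messages(stream: BinaryIO) -> Iterator[dict]:
    """Yield JSON-RPC messages from a stream using either framing style."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        stripped = line.strip()
        if not stripped:
            continue  # blank separator line
        if stripped.lower().startswith(b"content-length:"):
            length = int(stripped.split(b":", 1)[1].decode("ascii"))
            # Skip any remaining headers up to the blank line that ends them.
            while stream.readline().strip():
                pass
            yield json.loads(stream.read(length))
        else:
            yield json.loads(stripped)  # newline-delimited JSON


def classify(msg: dict) -> str:
    """Label a message as request, notification, response, or error."""
    if "method" in msg:
        return "request" if "id" in msg else "notification"
    if "error" in msg:
        return "error"
    return "response"


def demo() -> List[str]:
    """Feed one Content-Length framed message and two newline-delimited ones."""
    body = b'{"jsonrpc":"2.0","id":1,"method":"tools/call"}'
    framed = b"Content-Length: %d\r\n\r\n%s" % (len(body), body)
    lines = (
        b'{"jsonrpc":"2.0","method":"notifications/initialized"}\n'
        b'{"jsonrpc":"2.0","id":1,"result":{}}\n'
    )
    return [classify(m) for m in read_messages(io.BytesIO(framed + lines))]
```

A transparent proxy would write the original bytes to the other side untouched and use the parsed copy only for the trace event, which is what keeps the agent and server unaware of it.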

MCP HTTP/SSE proxy — same idea, different transport. Listens on a local port, forwards POST and SSE requests to the remote server, captures every JSON-RPC message in both directions.
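
On the SSE side, capturing messages means pulling the JSON payload out of each server-sent event. The sketch below shows one way that could look. It assumes each event's `data:` field holds a complete JSON-RPC message, and `sse_json_events` is a hypothetical helper, not a function from this project.

```python
import json
from typing import Iterable, Iterator, List


def sse_json_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each server-sent event in a line stream."""
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield json.loads("\n".join(data))
                data = []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event, id, retry) are ignored in this sketch.


def demo() -> List[dict]:
    """Parse a keep-alive comment, a one-line event, and a two-line event."""
    stream = [
        ": keep-alive\n",
        "event: message\n",
        'data: {"jsonrpc":"2.0","id":7,"result":{"ok":true}}\n',
        "\n",
        'data: {"jsonrpc":"2.0",\n',
        'data: "method":"notifications/progress"}\n',
        "\n",
    ]
    return list(sse_json_events(stream))
```

An event that is still incomplete when the stream ends is dropped rather than emitted, matching how SSE clients treat a trailing event with no terminating blank line.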

Python decorator — @trace_tool logs a tool_call event before execution and a tool_result after. Errors and timing are captured automatically. @trace_llm_call does the same for LLM calls.
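
To make the before/after logging concrete, here is a rough sketch of how a decorator of this kind could be written. It is not the library's `@trace_tool`: the name `trace_tool_sketch`, the in-memory `EVENTS` list, and the event field names are all assumptions made for illustration.

```python
import functools
import time
from typing import Any, Callable, List

EVENTS: List[dict] = []  # stand-in for the trace store


def trace_tool_sketch(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record a tool_call event before the call and a tool_result after it."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        EVENTS.append({"type": "tool_call", "tool": func.__name__,
                       "args": args, "kwargs": kwargs})
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            EVENTS.append({"type": "tool_result", "tool": func.__name__,
                           "error": repr(exc),
                           "duration_s": time.perf_counter() - start})
            raise  # tracing must not swallow the tool's failure
        EVENTS.append({"type": "tool_result", "tool": func.__name__,
                       "result": result,
                       "duration_s": time.perf_counter() - start})
        return result

    return wrapper


@trace_tool_sketch
def word_count(text: str) -> int:
    """Example tool: count whitespace-separated words."""
    return len(text.split())
```

Calling `word_count("a b c")` appends one `tool_call` and one `tool_result` event to `EVENTS`. Re-raising the exception after logging keeps the decorated function's behaviour unchanged for the caller.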

Running tests

python -m unittest discover -s tests -v

License

MIT. Use it however you want.

Sponsor · ADRs · Security · PyPI
