agentic-rc-mcp

An MCP server that turns an AI agent into an autonomous operator of interactive local programs. Spawn flutter run, npm run dev, REPLs, TUIs — then drive, observe, introspect, and quit them through structured tool calls. No human in the loop pressing r, copy-pasting log excerpts, or reading the Dart VM Service URL off the terminal.

22 MCP tools · 64 unit tests · 6 live-driven demo scripts · Claude Code skill bundled · v0.6.0 real-world hardened.

The problem

When you tell Claude Code "run my app and verify the new feature works", today it gets stuck in the same place every time:

It spawns the process in the background. ✅
It tails the log a few times. ✅
The log stops scrolling. It can't tell if the app is ready or deadlocked. Asks you.
To trigger hot-reload it has to press r. It can't. Asks you to press it and paste what happened.
Something crashes. The full exception is somewhere in 5000 lines of scroll. It has to grep, guess where the error block ends, hope it didn't miss anything.
The bug is "the counter Text widget doesn't show the right value". The agent can't see the widget. It can only re-read the source code and guess. It has no introspection.

agentic-rc-mcp removes every one of those blockers.

What you get — three layers

| Layer | What it does | Why it matters |
| --- | --- | --- |
| 1. PTY control (8 tools) | Spawn programs in a real pseudo-terminal. Send keys (`<Enter>`, `<Tab>`, `<C-c>`, …). Read the rendered screen, including TUIs like Flutter, vim, top. Wait for patterns with timeout. | The agent can press `r`, see what changed, and know when "ready" appeared, exactly like a human at the terminal. |
| 2. Flutter / Dart-VM lifecycle (7 tools) | Auto-detect the VM-service WebSocket URL from `flutter run`'s output. Open a programmatic connection. Trigger hot-reload with a structured `{success, duration_ms}` result. Subscribe to Stdout / Stderr / Logging / Extension / Debug streams. Evaluate Dart in the live app. Capture screenshots. | No more grep-the-console for `Exception caught`: exceptions arrive as structured events with file:line, widget name, stack trace. No copy-pasting Debug URLs. |
| 3. Flutter inspector (3 tools) | Fetch the live widget tree as JSON with source locations. Search by `Key`, runtime type, description substring, or source file. Read any widget's properties: colour, alignment, text content, callback bindings. | The agent can see the UI structurally without a screenshot. "Find the FAB" → `valueId`. "What does the counter Text say?" → `data: "You have pushed the button this many times:"`. |

The three layers compose: at the bottom you can still rc_send_keys("r") for anything; at the top you can rc_flutter_widget_find({by: "key", value: "submit"}) and get back an exact widget reference in milliseconds. Same MCP server, one session ID flowing through all of it.

Architecture

+------------------+   stdio    +────────────────────── agentic-rc-mcp ──────────────────────+
|  Claude Code     | <-------> |                                                              |
|  (MCP client)    |  JSON-RPC |   ┌─ SessionManager ─────────────────────────────────────┐  |
+------------------+           |   │   id → Session                                       │  |
                               |   └──────────┬──────────────────────────────────────────┘  |
                               |              │ owns                                          |
                               |   ┌─ Session ▼─────────────────────────────────────────┐    |
                               |   │                                                    │    |
                               |   │   ┌───── PTY layer ──────┐                          │    |
                               |   │   │  node-pty <══>       │ ──→ child process        │    |
                               |   │   │  @xterm/headless     │     (flutter / vite / …) │    |
                               |   │   │  + raw ring buffer   │                          │    |
                               |   │   └──────────┬───────────┘                          │    |
                               |   │              │ feeds                                 │    |
                               |   │   ┌──── Endpoint sniffer ─────────────────────────┐ │    |
                               |   │   │ regex over PTY output → ws / http / devtools │ │    |
                               |   │   └──────────┬───────────────────────────────────┘ │    |
                               |   │              │ unblocks                               │   |
                               |   │   ┌──── VmServiceClient ──── WS ─────► Dart VM       │   |
                               |   │   │   getVM, evaluate,                                │   |
                               |   │   │   streamListen(Stderr,                            │   |
                               |   │   │   Extension, Debug, …)                            │   |
                               |   │   └──────────┬─────────────────                       │   |
                               |   │              │ wraps                                   │   |
                               |   │   ┌──── FlutterService ────┐  ─── ext.flutter.* ───►   │   |
                               |   │   │  error buffer, logs,    │  ext.flutter.inspector.* │   |
                               |   │   │  hot-reload, eval,      │                          │   |
                               |   │   │  screenshot, inspector  │                          │   |
                               |   │   └─────────────────────────┘                          │   |
                               |   └───────────────────────────────────────────────────────┘   |
                               +───────────────────────────────────────────────────────────────+

PTY: real pseudo-terminal via node-pty, so the child program thinks it's interactive (isatty(0)==1).
Screen rendering: @xterm/headless runs xterm.js without a DOM, applying ANSI/curses sequences and exposing the rendered viewport programmatically — so TUIs like Flutter, vim, top render correctly.
Endpoint sniffer: parses every chunk of Flutter output for the four forms Flutter prints (Chrome / macOS desktop / iOS / Android each emit different strings). When the WS URL isn't printed explicitly it's synthesised from the DevTools URL's ?uri= query param or the HTTP URL.
VM-service client: JSON-RPC 2.0 over WebSocket. Used for everything that isn't a keystroke or screen-read.
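
The synthesis step of the endpoint sniffer can be sketched roughly as follows. This is a minimal illustration, not the repo's implementation: the function names `wsFromHttp` and `wsFromDevTools` are hypothetical, and the `/ws` suffix is an assumption inferred from the `ws_url` shape shown in the example loop further down.

```typescript
// Hypothetical helpers; the real sniffer may be structured differently.

/** Turn a VM-service HTTP URL into its WebSocket form (assumed `/ws` suffix). */
function wsFromHttp(httpUrl: string): string {
  // http -> ws, https -> wss
  const swapped = httpUrl.replace(/^http/, "ws");
  const withSlash = swapped.endsWith("/") ? swapped : swapped + "/";
  return withSlash + "ws";
}

/** Pull the VM-service URL out of a DevTools URL's `uri` query parameter. */
function wsFromDevTools(devToolsUrl: string): string | null {
  const match = /[?&]uri=([^&#\s]+)/.exec(devToolsUrl);
  const raw = match?.[1];
  if (raw === undefined) return null;
  // The parameter may or may not be percent-encoded; decoding handles both.
  return wsFromHttp(decodeURIComponent(raw));
}
```

Under these assumptions, `wsFromHttp("http://127.0.0.1:51658/hSQyXpnxQEo=/")` yields `ws://127.0.0.1:51658/hSQyXpnxQEo=/ws`, and a DevTools URL without a `uri` parameter yields `null` so the caller can keep waiting for more output.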

Tool reference

1. Generic PTY tools (any program)

| Tool | Does |
| --- | --- |
| `rc_start` | Spawn a command inside a real PTY. Returns `session_id`. |
| `rc_send_keys` | Write input. Supports `<Enter>`, `<Tab>`, `<Esc>`, `<C-c>`, `<C-d>`, arrows, F-keys, `<M-x>`. Plain text passes through. |
| `rc_read_screen` | Read the rendered viewport. Modes: `screen` / `scrollback` / `tail`. |
| `rc_read_stream` | Read raw bytes since a cursor (for log-style apps). |
| `rc_wait_for` | Block (with timeout) until a pattern appears. Literal substring or `/regex/flags`. |
| `rc_status` | Status of one or all sessions: pid, state, exit_code, bytes I/O, Flutter endpoints once detected. |
| `rc_stop` | Terminate a session. SIGTERM → 2 s grace → SIGKILL. |
| `rc_resize` | Change cols/rows of a running PTY. |
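
To make the `rc_wait_for` pattern rule concrete ("literal substring or `/regex/flags`"), here is one way such a pattern string could be compiled into a matcher. It is a sketch under assumptions: `compilePattern` is a hypothetical name, and the choice to fall back to a literal match when the trailing part is not a valid flag set is mine, not something the tool is documented to do.

```typescript
type Matcher = (text: string) => boolean;

// Hypothetical helper: `/body/flags` becomes a RegExp, anything else a substring test.
function compilePattern(pattern: string): Matcher {
  const parsed = /^\/([\s\S]+)\/([gimsuy]*)$/.exec(pattern);
  if (parsed === null) {
    return (text) => text.includes(pattern);
  }
  const [, body = "", flags = ""] = parsed;
  // Drop the stateful `g` and `y` flags so repeated test() calls stay independent.
  const regex = new RegExp(body, flags.replace(/[gy]/g, ""));
  return (text) => regex.test(text);
}
```

Restricting the flag characters means a path-like pattern such as `/usr/bin` is treated as a literal substring rather than a regex with invalid flags.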

2. Flutter / Dart-VM lifecycle tools

| Tool | Does |
| --- | --- |
| `rc_flutter_endpoints` | Returns sniffed WS / HTTP / DevTools URLs (auto-synthesised on macOS desktop where Flutter omits the WS line). |
| `rc_flutter_connect` | Opens the VM-service WebSocket + subscribes to Stdout / Stderr / Logging / Extension / Debug. Idempotent. |
| `rc_flutter_drain_errors` | Returns + clears structured exception events. Use this instead of grepping the console. |
| `rc_flutter_drain_logs` | Returns + clears structured log events. |
| `rc_flutter_hot_reload` | Triggers `r`, parses Flutter's report into `{success, libraries_reloaded, duration_ms}` or `{success:false, reason, console_excerpt}`. |
| `rc_flutter_eval` | Run arbitrary Dart in the root library scope of the main isolate. |
| `rc_flutter_screenshot` | PNG via `ext.flutter.screenshot`. Graceful `extension_not_registered` fallback on macOS desktop; pair with Peekaboo for that platform. |
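
As an illustration of how a console report could be turned into the `rc_flutter_hot_reload` result shape, consider the sketch below. It assumes Flutter prints a line of the form `Reloaded 1 of 753 libraries in 135ms`; that wording, the function name `parseHotReload`, and the `reason` value `"no_reload_report"` are assumptions for the example, not taken from the repo.

```typescript
type HotReloadResult =
  | { success: true; libraries_reloaded: number; libraries_total: number; duration_ms: number }
  | { success: false; reason: string; console_excerpt: string };

// Hypothetical parser over the text captured from the PTY after sending `r`.
function parseHotReload(consoleText: string): HotReloadResult {
  const report = /Reloaded (\d+) of (\d+) libraries in ([\d,]+)\s*ms/.exec(consoleText);
  if (report === null) {
    // No report line found: return the tail of the console for diagnosis.
    return { success: false, reason: "no_reload_report", console_excerpt: consoleText.slice(-500) };
  }
  const [, reloaded = "0", total = "0", elapsed = "0"] = report;
  return {
    success: true,
    libraries_reloaded: Number(reloaded),
    libraries_total: Number(total),
    // Durations may carry thousands separators, e.g. "1,204ms".
    duration_ms: Number(elapsed.replace(/,/g, "")),
  };
}
```

Returning a tagged union lets the caller branch on `success` and get the right fields in each case without optional-property checks.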

3. Flutter inspector tools (agentic UI introspection)

| Tool | Does |
| --- | --- |
| `rc_flutter_widget_tree` | Fetch live widget hierarchy as JSON. Defaults to user-code-only: framework subtrees collapse to `{_elided:true, framework_node_count:N}` markers. Opts: `include_framework`, `source_prefix` (strict path filter), `flat:true` (returns list with ancestry paths instead of nested tree; saves ~70% tokens). |
| `rc_flutter_widget_find` | Search by `key` / `type` / `description` / `source_contains`. Returns matches with ancestry `path` and `valueId`. |
| `rc_flutter_widget_properties` | Diagnostic properties of any widget by `valueId`: text content, padding, colour, callbacks (incl. closure name!), …. |
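
The `flat:true` idea, a list with ancestry paths instead of a nested tree, can be sketched as below. The `WidgetNode` and `FlatEntry` shapes and the `flattenTree` name are hypothetical simplifications; the ` > ` separator mirrors the `path` strings shown in this README's example output.

```typescript
// Simplified, hypothetical node shape; the real inspector JSON carries more fields.
interface WidgetNode {
  type: string;
  valueId: string;
  children: WidgetNode[];
}

interface FlatEntry {
  path: string;
  valueId: string;
}

// Depth-first walk that records each node's ancestry as a single string.
function flattenTree(node: WidgetNode, ancestors: string[] = []): FlatEntry[] {
  const trail = [...ancestors, node.type];
  const entries: FlatEntry[] = [{ path: trail.join(" > "), valueId: node.valueId }];
  for (const child of node.children) {
    entries.push(...flattenTree(child, trail));
  }
  return entries;
}
```

Each entry is self-describing, so a consumer can filter or search the list without re-walking nested JSON.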

4. Agentic gesture injection (tap & verify)

This is where agentic-rc-mcp replaces Peekaboo and chrome-devtools-mcp for Flutter apps — both of which struggle with Flutter's custom-rendered canvas. We don't dispatch OS-level pointer events (the framework's handlePointerEvent is @visibleForTesting and the VM-service eval refuses to compile references to it). Instead the tap tool walks to the nearest interactive widget and invokes its onPressed / onTap closure directly — same setState, same rebuild, same side-effects, no GUI access needed.

| Tool | Does |
| --- | --- |
| `rc_flutter_tap` | Tap a widget by `key` / `type` / `text` / `value_id` / `coordinate`. Default walker order: self → descendants → ancestors (so custom wrappers like `TPKButton` around `TextButton` work). Detects ambiguous descendants and asks you to disambiguate. `descend:false` opts into the pre-v0.6 self → ancestors-only behaviour. |
| `rc_flutter_widget_geometry` | Returns `{rect:{x,y,width,height}, widget_type}` for a matched widget; useful for layout verification. Supports `by:'text'`. |
| `rc_flutter_wait_for_widget` | Block (with timeout) until a widget matching `{by, value}` appears (or disappears, with `appear:false`). Supports `by:'text'`. Bubbles up eval errors instead of polling silently. |
| `rc_flutter_enter_text` | Fill a TextField / TextFormField. Walks to the underlying `EditableText`, mutates its `TextEditingController.text` (so `onChanged` fires, validators run, listeners notify). Modes: `replace` (default), `append`, `clear`. Must-have for any login / form / search-bar flow; without this the agent can't get past an auth gate. |
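
The walker order described for `rc_flutter_tap` (self → descendants → ancestors, with ambiguity detection) might look roughly like this on a toy widget tree. Everything here is a hypothetical model: `ToyWidget`, `widget`, `tap`, and the `"ambiguous_descendants"` reason are invented for illustration, and the rule "more than one interactive descendant is ambiguous" is my reading of the description, not confirmed behaviour.

```typescript
interface ToyWidget {
  type: string;
  onPressed: (() => void) | null;
  children: ToyWidget[];
  parent: ToyWidget | null;
}

// Builds a node and wires up parent pointers for its children.
function widget(type: string, children: ToyWidget[] = [], onPressed: (() => void) | null = null): ToyWidget {
  const node: ToyWidget = { type, onPressed, children, parent: null };
  for (const child of children) child.parent = node;
  return node;
}

type TapResult =
  | { success: true; callback: string }
  | { success: false; reason: "ambiguous_descendants" | "no_callback_found"; candidates: string[] };

// Collects the nearest interactive widget on each branch below `node`.
function interactiveDescendants(node: ToyWidget): ToyWidget[] {
  const found: ToyWidget[] = [];
  for (const child of node.children) {
    if (child.onPressed !== null) found.push(child);
    else found.push(...interactiveDescendants(child));
  }
  return found;
}

function fire(target: ToyWidget): TapResult {
  if (target.onPressed !== null) target.onPressed();
  return { success: true, callback: `${target.type}.onPressed` };
}

function tap(target: ToyWidget, descend = true): TapResult {
  if (target.onPressed !== null) return fire(target); // 1. self
  if (descend) {                                      // 2. descendants
    const candidates = interactiveDescendants(target);
    if (candidates.length > 1) {
      return { success: false, reason: "ambiguous_descendants", candidates: candidates.map((c) => c.type) };
    }
    const only = candidates[0];
    if (only !== undefined) return fire(only);
  }
  for (let up = target.parent; up !== null; up = up.parent) { // 3. ancestors
    if (up.onPressed !== null) return fire(up);
  }
  return { success: false, reason: "no_callback_found", candidates: [] };
}
```

In this model, tapping a wrapper with a single button inside reaches the inner button's closure, while `descend = false` skips step 2 and only looks upward.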

Diagnostic discipline (v0.6+): every gesture tool result now carries eval_ok / eval_kind / eval_error / expression_preview so a failure tells you why — eval_kind:"@Error" with a Dart compile error is acted upon differently than eval_kind:"@Instance" with reason:"no_callback_found". See docs/learnings/eval-diagnostic-discipline.md.
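
One way a caller could act on those diagnostic fields is sketched below. The field names come from the paragraph above, but their exact types, the `GestureResult` interface, the `triage` function, and the `NextStep` labels are assumptions made for the example.

```typescript
// Assumed result shape; only the field names are taken from the README.
interface GestureResult {
  success: boolean;
  eval_ok: boolean;
  eval_kind: string; // e.g. "@Error" or "@Instance"
  eval_error: string | null;
  expression_preview: string;
  reason: string | null;
}

type NextStep = "done" | "fix_expression" | "retarget_widget" | "inspect_manually";

// Hypothetical triage: separate "the Dart eval itself failed" from
// "the eval ran, but the matched widget had nothing to invoke".
function triage(result: GestureResult): NextStep {
  if (result.success) return "done";
  if (!result.eval_ok || result.eval_kind === "@Error") return "fix_expression";
  if (result.eval_kind === "@Instance" && result.reason === "no_callback_found") return "retarget_widget";
  return "inspect_manually";
}
```

The point of the split is that a compile error calls for changing the expression, whereas `no_callback_found` calls for choosing a different widget.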

The composition that makes this powerful: rc_flutter_enter_text to fill, rc_flutter_tap to submit, rc_flutter_widget_find + rc_flutter_widget_properties to verify the state change. End-to-end behavioural testing entirely through MCP. See scripts/flutter-tap-demo.mjs — 7 synthetic taps on the counter app's FAB, each verified by re-reading the Text widget's data property (0 → 7).

Install

Requires Node ≥ 20.

git clone <this-repo>
cd agentic_rc_cli
npm install        # postinstall fixes node-pty's spawn-helper perms on macOS
npm run build
npm link            # makes `agentic-rc-mcp` available globally

Heads-up: npm 10 occasionally extracts node-pty's spawn-helper prebuilt binary without the executable bit, which manifests at runtime as posix_spawnp failed. The included postinstall script (scripts/fix-node-pty-permissions.js) chmods it back. If you ever see that error after a clean install, re-run npm install.

Wire it into Claude Code

Drop .mcp.json next to the project you want the agent to drive (or merge into an existing one):

{
  "mcpServers": {
    "agentic-rc": {
      "command": "agentic-rc-mcp"
    }
  }
}

Restart Claude Code. The tools appear as mcp__agentic-rc__rc_start, mcp__agentic-rc__rc_flutter_widget_find, etc. See .mcp.json.example for variants (direct dist path, dev mode via tsx).

Install the bundled Claude Code skill

This repo ships a Claude Code skill at .claude/skills/agentic-rc/SKILL.md that teaches Claude when to reach for each tool — the canonical Flutter agentic loop, the inspector pattern, named-key cheat sheet, platform gotchas.

Project-local: the skill is auto-loaded when you open Claude Code in this repo's directory.
Global: copy it to your global skills dir so it's available in every project:
```
npm run install:skill
# → ~/.claude/skills/agentic-rc/SKILL.md
```
Idempotent — re-run after each git pull.

Example: the full agentic loop on a Flutter app

// 1. Spawn the app — same as `flutter run` from the terminal.
rc_start { command: "flutter", args: ["run", "-d", "macos"],
           cwd: "/path/to/my-flutter-app" }
// → { session_id: "8fa45718", pid: 79314 }

// 2. Open the Dart VM Service — endpoints are auto-sniffed from the
//    PTY output, no copy-pasting URLs.
rc_flutter_connect { session_id: "8fa45718", wait_ms: 180000 }
// → { connected: true,
//     ws_url: "ws://127.0.0.1:51658/hSQyXpnxQEo=/ws",
//     main_isolate_id: "isolates/6257046507251003" }

// 3. Edit a Dart file (regular Edit / Write tool — not part of this MCP),
//    then trigger hot reload programmatically.
rc_flutter_hot_reload { session_id: "8fa45718" }
// → { success: true, libraries_reloaded: 1, libraries_total: 753,
//     duration_ms: 135 }

// 4. Did the new code throw? Get every exception as a structured event —
//    no console scraping.
rc_flutter_drain_errors { session_id: "8fa45718" }
// → { count: 1, errors: [
//     { timestamp: "2026-…", stream: "Extension",
//       message: "EXCEPTION CAUGHT BY WIDGETS LIBRARY … main.dart:72:5 …" } ] }

// 5. Introspect the live UI to see what's actually rendered.
rc_flutter_widget_find { session_id: "8fa45718",
                         by: "type", value: "FloatingActionButton" }
// → { count: 1, matches: [
//     { valueId: "inspector-11",
//       path: "[root] > MyApp > … > FloatingActionButton",
//       source_location: "lib/main.dart:115:29" } ] }

// 6. Read the bound callback to confirm wiring.
rc_flutter_widget_properties { session_id: "8fa45718",
                               value_id: "inspector-11" }
// → { properties: [
//     { name: "onPressed",
//       description: "Closure: () => void from Function '_incrementCounter@…'" },
//     { name: "tooltip", description: "\"Increment\"" }, … ] }

// 7. ACT — tap the button (no Peekaboo, no chrome-devtools, no GUI access).
rc_flutter_tap { session_id: "8fa45718",
                 by: "type", value: "FloatingActionButton" }
// → { success: true, callback: "FloatingActionButton.onPressed" }
//
// The widget's onPressed closure runs directly. setState fires. Frame rebuilds.

// 8. VERIFY — re-read the counter Text's `data` to confirm the state change.
rc_flutter_widget_find { session_id: "8fa45718", by: "type", value: "Text",
                         refresh: true }
rc_flutter_widget_properties { session_id: "8fa45718",
                               value_id: "<counter-text-valueId>" }
// → { properties: [ { name: "data", description: "\"1\"" }, … ] }

// 9. Run arbitrary Dart in the app's context.
rc_flutter_eval { session_id: "8fa45718",
                  expression: "WidgetsBinding.instance.framesEnabled" }
// → { kind: "Instance", valueAsString: "true" }

// 10. Clean shutdown.
rc_send_keys { session_id: "8fa45718", keys: "q" }
//   …or fall back to a signal:
rc_stop { session_id: "8fa45718", wait_ms: 3000, remove: true }

That sequence is exactly what scripts/flutter-inspector-demo.mjs, scripts/flutter-vm-agentic-loop.mjs, and scripts/flutter-tap-demo.mjs run as end-to-end smoke tests against the sample flutter_example/ counter app. The tap demo executes 7 synthetic taps on the FAB and asserts the counter Text's data property transitions 0 → 7 — pure VM-service, no GUI access.
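
The tap-and-verify part of that sequence can be sketched as a small driver. This is not the code in `scripts/flutter-tap-demo.mjs`: `CallTool`, `tapAndVerify`, and `fakeCounterApp` are hypothetical names, the transport is injected so the sketch runs against an in-memory fake, and it assumes the counter Text's `valueId` stays valid between taps (a real run may need to re-find it with `refresh: true`, as in step 8).

```typescript
// Hypothetical transport; in practice this would wrap an MCP client's tool call.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<Record<string, unknown>>;

async function tapAndVerify(
  callTool: CallTool,
  sessionId: string,
  taps: number,
  counterValueId: string,
): Promise<string[]> {
  const seen: string[] = [];
  for (let i = 0; i < taps; i++) {
    const tap = await callTool("rc_flutter_tap", {
      session_id: sessionId, by: "type", value: "FloatingActionButton",
    });
    if (tap["success"] !== true) throw new Error(`tap ${i + 1} failed`);
    const props = await callTool("rc_flutter_widget_properties", {
      session_id: sessionId, value_id: counterValueId,
    });
    const list = props["properties"] as Array<{ name: string; description: string }>;
    const data = list.find((p) => p.name === "data");
    seen.push(data ? data.description : "<missing>");
  }
  return seen;
}

// In-memory stand-in for the counter app, for trying the driver without Flutter.
function fakeCounterApp(): CallTool {
  let count = 0;
  return async (name) => {
    if (name === "rc_flutter_tap") {
      count += 1;
      return { success: true, callback: "FloatingActionButton.onPressed" };
    }
    if (name === "rc_flutter_widget_properties") {
      return { properties: [{ name: "data", description: `"${count}"` }] };
    }
    throw new Error(`unexpected tool ${name}`);
  };
}
```

Injecting `callTool` keeps the loop testable in isolation; swapping the fake for a real MCP client should leave `tapAndVerify` unchanged, provided the tool results have the shapes assumed here.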

Named-key cheat sheet (`rc_send_keys`)

| Token | Bytes sent |
| --- | --- |
| `<Enter>` / `<Return>` | `\r` |
| `<Tab>` | `\t` |
| `<Esc>` / `<Escape>` | `\x1b` |
| `<Space>` | `\x20` (a single space) |
| `<Backspace>` / `<BS>` | `\x7f` |
| `<Delete>` | `\x1b[3~` |
| `<Up>` `<Down>` `<Right>` `<Left>` | `\x1b[A` `\x1b[B` `\x1b[C` `\x1b[D` |
| `<Home>` / `<End>` | `\x1b[H` / `\x1b[F` |
| `<PageUp>` / `<PageDown>` | `\x1b[5~` / `\x1b[6~` |
| `<F1>`..`<F12>` | xterm sequences |
| `<C-c>` / `<Ctrl-c>` (any letter) | `\x03` for `c`; other letters send their own control byte |
| `<M-x>` / `<Alt-x>` (any letter) | `\x1b` + the letter |

Plain characters pass through verbatim. Set "raw": true to skip the parser and send literal < / >.
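
A rough sketch of how such tokens could be expanded into bytes follows. It is not the parser shipped in this repo: `encodeKeys` and `NAMED` are hypothetical, F-keys are omitted, and both the case-insensitive name lookup and the pass-through of unknown tokens are assumptions.

```typescript
// Named keys from the cheat sheet (F-keys left out of this sketch).
const NAMED: Record<string, string> = {
  enter: "\r", return: "\r", tab: "\t", esc: "\x1b", escape: "\x1b", space: " ",
  backspace: "\x7f", bs: "\x7f", delete: "\x1b[3~",
  up: "\x1b[A", down: "\x1b[B", right: "\x1b[C", left: "\x1b[D",
  home: "\x1b[H", end: "\x1b[F", pageup: "\x1b[5~", pagedown: "\x1b[6~",
};

function encodeKeys(input: string, raw = false): string {
  if (raw) return input; // raw mode: send `<` and `>` literally
  return input.replace(/<([^<>]+)>/g, (whole: string, name: string) => {
    const ctrlLetter = /^(?:C|Ctrl)-([a-zA-Z])$/.exec(name)?.[1];
    if (ctrlLetter !== undefined) {
      // Ctrl+letter is the letter's code with the top bits masked off: c -> 0x03.
      return String.fromCharCode(ctrlLetter.toLowerCase().charCodeAt(0) & 0x1f);
    }
    const metaLetter = /^(?:M|Alt)-([a-zA-Z])$/.exec(name)?.[1];
    if (metaLetter !== undefined) return "\x1b" + metaLetter;
    const named = NAMED[name.toLowerCase()];
    // Unknown tokens are left untouched (an assumption of this sketch).
    return named === undefined ? whole : named;
  });
}
```

For example, `encodeKeys("flutter run<Enter>")` produces the text followed by `\r`, and `encodeKeys("<Enter>", true)` returns the seven literal characters unchanged.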

When to use which read tool

rc_read_screen with mode: "screen" — for any TUI that redraws (Flutter, vim, top, npm run dev with spinners). You get what the user would see on the terminal right now.
rc_read_screen with mode: "scrollback" or "tail" — for the history of what was rendered, post-curses processing. Best for log lines that scrolled off the viewport.
rc_read_stream — for pure log-style apps (no cursor tricks) where you want every byte in order, with a cursor for incremental reads.
rc_flutter_drain_errors is always preferred over grepping the PTY output once a session is connected to the VM service. You get structured events with stream origin, timestamp, message, and the raw VM-service payload.
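
To illustrate what "read since a cursor" means for `rc_read_stream`, here is a minimal bounded log with cursor-based incremental reads. It is a sketch only: `RingLog`, `StreamRead`, and the `dropped` field are hypothetical, and it counts string characters where the real tool is described as working on raw bytes.

```typescript
interface StreamRead {
  data: string;
  next_cursor: number;
  dropped: number; // how much the caller missed because it was already evicted
}

class RingLog {
  private buffer = "";
  private start = 0; // absolute stream offset of buffer[0]
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  append(chunk: string): void {
    this.buffer += chunk;
    const overflow = this.buffer.length - this.capacity;
    if (overflow > 0) {
      // Evict the oldest data and advance the absolute start offset.
      this.buffer = this.buffer.slice(overflow);
      this.start += overflow;
    }
  }

  readSince(cursor: number): StreamRead {
    const end = this.start + this.buffer.length;
    // Clamp the cursor into the range that is still retained.
    const from = Math.min(Math.max(cursor, this.start), end);
    return {
      data: this.buffer.slice(from - this.start),
      next_cursor: end,
      dropped: Math.max(0, this.start - cursor),
    };
  }
}
```

A caller passes the previous `next_cursor` back in on each read, so it sees every retained character exactly once and can tell from `dropped` when it fell behind the buffer.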

Develop

npm test               # vitest unit tests (keys, sessions, endpoints, inspector)
npm run typecheck      # strict tsc --noEmit
npm run build          # emit dist/

# Live end-to-end demo scripts (each drives a fresh MCP server over stdio):
npm run smoke                                # 8 generic PTY tools
node scripts/flutter-drive.mjs               # spawn flutter, hot-reload, quit
node scripts/flutter-error-detect.mjs        # detect runtime exceptions via PTY
node scripts/flutter-vm-agentic-loop.mjs     # full structured loop via VM service
node scripts/flutter-inspector-demo.mjs      # widget-tree + find + properties
node scripts/flutter-tap-demo.mjs            # 7 taps + assert counter 0 → 7
node scripts/flutter-login-demo.mjs          # full login flow: enter email + pw, submit, verify

What this is not (yet)

Not network-remote. Stdio only — MCP client and controlled processes run on the same machine. (Architecture is ready for it; just no transport written.)
Not multi-user. Single process, single session registry, no auth.
No persistence. Killing the MCP server kills every child it started.
No pixel taps inside non-Flutter windows. For Flutter apps, rc_flutter_tap fires onPressed/onTap directly, so Peekaboo and chrome-devtools-mcp are not needed for taps. For other GUI apps (Electron, native Cocoa, web) you still need an OS-level driver such as Peekaboo or chrome-devtools-mcp; you can then drain errors via this MCP to see what your tap broke.
No Windows yet. node-pty supports ConPTY; untested with this code.

License

MIT — see LICENSE.
