feat: @celom/prose-observer + apps/console (Prose Console v1)#3

Open: celom wants to merge 17 commits into main from feat/prose-observer-console

celom commented May 19, 2026

Summary

Introduces a local-dev observability tool for @celom/prose:

  • @celom/prose-observer: FlowObserver impl + in-memory ring buffer + HTTP/WS server, packaged with the bundled SPA so end users install one extra dep and get the UI.
  • apps/console — Vite + React SPA with three views:
    • Trace — Gantt timeline of one execution + diff inspector
    • Catalog — per-flow runs / p50 / p95 / error rate with per-step drilldown
    • Live tail — WS-driven feed of in-flight events
  • CLI: prose-console bin and prose console subcommand (lazy-imports the observer package).
  • Docs: README + apps/docs/src/content/docs/guides/console.mdx.

Slices

Eleven sequential slices, one per commit, each independently verified:

  1. Package skeleton + ObserverEvent discriminated union.
  2. ConsoleObserverImpl, EventStream (ring buffer + subscribers), default redaction, correlation id with warn-once, shallow state diff, mergeObservers.
  3. aggregateExecutions() over retained records (runs / p50 / p95 / errorRate / retryRate, per-flow + per-step).
  4. HTTP + WS server: /healthz, /api/executions[/:cid], /api/flows, WS /stream with per-tick backpressure + drop-oldest heartbeat. Loopback-by-default with allowRemote: true opt-in.
  5. SPA shell wired through Vite proxy on /api + /stream.
  6. Trace view: Gantt (parallel → one row, retry badge, skipped/broken styling) + three-pane diff inspector.
  7. Catalog view: flow list + per-step drilldown + last-20 executions linked into the trace.
  8. Live tail: WS feed, pause/resume, auto-scroll toggle, 500-row DOM cap, dropped-heartbeat row.
  9. Build wiring: nx bundle prose-observer produces dist/static/index.html. Cycle broken by isolating the SPA-copy step from the inferred build edges; runtime staticDir auto-resolves via import.meta.url.
  10. CLI: prose-console bin + prose console subcommand with ERR_MODULE_NOT_FOUND install hint.
  11. Docs + README + stateCapture: 'full' oversized-snapshot warn-once.

End-to-end verification

nx run-many -t test typecheck lint -p prose prose-observer console docs green:

  • prose-observer — 49 tests
  • console — 17 tests
  • prose — existing suite preserved
  • docs — nx build docs produces /guides/console/index.html

Live smoke against an in-process flow with { authorization: 'Bearer …' } in the input:

  • GET /api/executions/:id → input.authorization === [REDACTED], userId visible.
  • .parallel() → exactly one step.start + one step.complete (slice-2 invariant).
  • GET /api/flows → { flowName: 'orders', runs: 1, perStep: [3 steps] }.
  • GET / → bundled SPA HTML.
  • prose console --port=4321 standalone serves { ok: true } from /healthz.

Notable design choices

  • Cycle avoidance. apps/console imports types from @celom/prose-observer, while the observer's bundle needs the console's build output, so the two projects would otherwise depend on each other. apps/console/package.json drops the back-edge (the workspace symlink plus the absence of a dep-checks rule on console keeps this safe), and the SPA→package bundle copy moves to a separate prose-observer:bundle target. The prose console subcommand uses a string-concat dynamic import to dodge Nx's static graph analyzer.
  • Redaction first. Default keys (authorization, password, apiKey, token, …) are stripped at every event before it reaches the stream, with cycle detection and an 8-level depth cap. The user redact: (event) => event | null composes on top.
  • Loopback by default. Server refuses non-127.0.0.1 bind without an explicit allowRemote: true (red-on-console warning when opted into). No auth — local-dev tool, documented.

Known v1 limitations

Called out in plan + README + docs guide:

  • Parallel sub-branches render as one row (gated on a future @celom/prose API change for named handlers).
  • No cross-restart persistence; ring buffer is in-memory only.
  • Catalog only shows flows that have executed at least once.
  • Concurrent runs of the same flow without options.correlationId may cross-attribute flow-level events (observer warns once per process when missing).

Test plan

  • npm exec nx run-many -t test typecheck lint -p prose prose-observer console docs green
  • npm exec nx bundle prose-observer produces packages/prose-observer/dist/static/index.html
  • In a fresh terminal: node packages/prose/dist/cli.js console --port=4321 then curl http://127.0.0.1:4321/healthz returns {"ok":true}
  • In-process: consoleObserver() + startServer({ port, eventStream: observer.events }) + a flow → /api/executions/:id shows the event log with secrets redacted

🤖 Generated with Claude Code

celom and others added 17 commits May 19, 2026 15:53
Replaces the generator placeholder with the real public surface:

- `ObserverEvent` discriminated union covering every FlowObserver hook
  (flow.start/complete/error/break, step.start/complete/error/retry/skipped),
  each carrying { correlationId, flowName, ts } + per-hook payload.
- `consoleObserver()` factory typed as FlowObserver — returns a no-op for
  now; the ring buffer / redaction / diff impl lands in slice 2.
- @celom/prose declared as a workspace dependency so the linker and the
  @nx/dependency-checks rule both see it; ws is deferred to slice 4 where
  it's actually consumed.

Verify: nx test/typecheck/lint prose-observer all green.
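
As a rough illustration of the discriminated-union idea, the sketch below models a handful of the hooks listed above. Only the event names and the shared `{ correlationId, flowName, ts }` fields come from this commit; the payload fields (`input`, `durationMs`, `error`, `stepName`, `attempt`) and the names `BaseEventSketch`, `ObserverEventSketch`, and `describeEvent` are hypothetical.

```typescript
// Shared fields named in the commit message; the unit of `ts` is assumed.
interface BaseEventSketch {
  correlationId: string;
  flowName: string;
  ts: number; // assumed: epoch milliseconds
}

// A subset of the hooks. Payload fields are illustrative assumptions.
type ObserverEventSketch =
  | (BaseEventSketch & { type: 'flow.start'; input: unknown })
  | (BaseEventSketch & { type: 'flow.complete'; durationMs: number })
  | (BaseEventSketch & { type: 'flow.error'; error: string })
  | (BaseEventSketch & { type: 'step.start'; stepName: string })
  | (BaseEventSketch & { type: 'step.complete'; stepName: string; durationMs: number })
  | (BaseEventSketch & { type: 'step.retry'; stepName: string; attempt: number });

// Switching on `type` narrows the payload, so each branch can only
// touch the fields that belong to that event variant.
function describeEvent(event: ObserverEventSketch): string {
  switch (event.type) {
    case 'flow.start':
      return `${event.flowName} started`;
    case 'flow.complete':
      return `${event.flowName} completed in ${event.durationMs}ms`;
    case 'flow.error':
      return `${event.flowName} failed: ${event.error}`;
    case 'step.start':
      return `${event.flowName}/${event.stepName} started`;
    case 'step.complete':
      return `${event.flowName}/${event.stepName} completed in ${event.durationMs}ms`;
    case 'step.retry':
      return `${event.flowName}/${event.stepName} retry #${event.attempt}`;
  }
}
```

The benefit of this shape is that consumers such as a timeline or a live feed get compile-time checking per event kind without casts.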

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…, diff

Replaces the no-op factory with the real wiring that runs end-to-end:

- `ConsoleObserverImpl` converts every `FlowObserver` hook into a typed
  `ObserverEvent` carrying { correlationId, flowName, ts } plus the
  per-hook payload. `consoleObserver()` returns it as
  `FlowObserver & { events: EventStream }`.
- `EventStream` groups events by correlationId into `ExecutionRecord`s,
  caps retained executions at `maxExecutions` (default 100) with FIFO
  eviction, and exposes `push` / `subscribe` / `getExecution` /
  `listExecutions` — the surface that slice 4's HTTP server will sit on.
- Default redaction: case-insensitive deep walk that replaces the value
  of common-secret keys (authorization, password, apiKey, token, …)
  with `[REDACTED]`, with cycle detection and an 8-level depth guard.
  Users can extend via the `redact: (event) => event | null` option,
  applied after the default pass; returning null drops the event.
- Correlation ids are auto-generated per `onFlowStart` via
  `crypto.randomUUID()`. When a flow runs without `options.correlationId`,
  the observer warns once per process pointing the user to set it.
- Shallow state diff (added / removed / changed by Object.is) is attached
  to every `step.complete` event by default. `stateCapture: 'full'`
  swaps in before/after snapshots; `'off'` omits state entirely.
- `mergeObservers(...observers)` returns a delegating `FlowObserver`,
  since `flow.execute({ observer })` only takes one.
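
The default redaction pass described in this list can be sketched roughly as follows. This is a minimal illustration, not the package's code: the key list is only the part spelled out above (the elided remainder is not guessed), and the `[Circular]` and `[MaxDepth]` placeholders, along with the name `redactDeep`, are assumptions.

```typescript
const REDACTED = '[REDACTED]';
// Lower-cased so the lookup is case-insensitive. Partial list: the commit
// message elides the rest of the defaults.
const SECRET_KEYS = new Set(['authorization', 'password', 'apikey', 'token']);
const MAX_DEPTH = 8;

function redactDeep(
  value: unknown,
  depth = 0,
  ancestors = new WeakSet<object>(),
): unknown {
  if (value === null || typeof value !== 'object') return value;
  // Assumed placeholders: what the real implementation emits here is unknown.
  if (ancestors.has(value)) return '[Circular]';
  if (depth >= MAX_DEPTH) return '[MaxDepth]';

  ancestors.add(value);
  let result: unknown;
  if (Array.isArray(value)) {
    result = value.map((item) => redactDeep(item, depth + 1, ancestors));
  } else {
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = SECRET_KEYS.has(key.toLowerCase())
        ? REDACTED
        : redactDeep(child, depth + 1, ancestors);
    }
    result = out;
  }
  // Only objects on the current path count as cycles, so a value that is
  // merely shared between two siblings is still walked normally.
  ancestors.delete(value);
  return result;
}
```

Building a fresh copy instead of mutating in place keeps the caller's input untouched, which matters when the same object is still flowing through the executor.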

Locked in by tests: parallel groups emit exactly one step.start /
step.complete pair, ring buffer evicts the oldest record, the warn fires
exactly once across multiple flow runs, redaction strips known keys at
depth, and the diff payload matches the executor's merge behaviour for
`.step()` results.
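
For readers unfamiliar with the shallow diff mentioned above, a minimal sketch of an added / removed / changed comparison using `Object.is` might look like this. The names `ShallowDiffSketch` and `shallowDiff`, and the choice to report keys rather than values, are assumptions made for illustration.

```typescript
interface ShallowDiffSketch {
  added: string[];
  removed: string[];
  changed: string[];
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Compares top-level keys only. Nested objects count as "changed" whenever
// their reference differs, even if their contents are structurally equal.
function shallowDiff(
  before: Record<string, unknown>,
  after: Record<string, unknown>,
): ShallowDiffSketch {
  const diff: ShallowDiffSketch = { added: [], removed: [], changed: [] };
  for (const key of Object.keys(after)) {
    if (!hasOwn(before, key)) diff.added.push(key);
    else if (!Object.is(before[key], after[key])) diff.changed.push(key);
  }
  for (const key of Object.keys(before)) {
    if (!hasOwn(after, key)) diff.removed.push(key);
  }
  return diff;
}
```

`Object.is` treats `NaN` as equal to itself, so a step that leaves a `NaN` value in place does not show up as a change.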

Verify: nx test/typecheck/lint prose-observer all green (30 tests).
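
The FIFO eviction of retained executions can be pictured with a `Map`, which iterates in insertion order. The sketch below is an assumption about one way to do it, not the actual `EventStream`; the class name `ExecutionBufferSketch` and the event shape are hypothetical, and only `push`, `getExecution`, `listExecutions`, and the default cap of 100 echo the description above.

```typescript
interface BufferedEventSketch {
  correlationId: string;
  type: string;
}

class ExecutionBufferSketch {
  private readonly records = new Map<string, BufferedEventSketch[]>();
  private readonly maxExecutions: number;

  constructor(maxExecutions = 100) {
    this.maxExecutions = maxExecutions;
  }

  push(event: BufferedEventSketch): void {
    let events = this.records.get(event.correlationId);
    if (!events) {
      // New execution: evict the oldest record first if the cap is reached.
      if (this.records.size >= this.maxExecutions) {
        const oldest = this.records.keys().next().value;
        if (oldest !== undefined) this.records.delete(oldest);
      }
      events = [];
      this.records.set(event.correlationId, events);
    }
    events.push(event);
  }

  getExecution(correlationId: string): BufferedEventSketch[] | undefined {
    return this.records.get(correlationId);
  }

  listExecutions(): string[] {
    return Array.from(this.records.keys());
  }
}
```

Note that eviction here is by first-seen order, so a long-running execution that keeps receiving events can still be evicted once enough newer executions arrive.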

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`aggregateExecutions(records)` walks the in-memory `ExecutionRecord`s and
rolls them up into:

- `FlowAggregate`: runs, p50/p95 of total flow duration, errorRate, and
  per-step breakdown.
- `PerStepAggregate`: runs (step starts), p50/p95 of step duration,
  errorRate (last-attempt step errors / runs), retryRate (retries / runs).

Quantile is sort-then-linear-interpolate. `broken` flows count as
successful for the flow errorRate — the break is intentional.

Pure function on top of `EventStream` snapshots — slice 4's
`GET /api/flows` endpoint will call it on demand, no separate storage.

Verify: nx test/typecheck/lint prose-observer green (40 tests; 10 new
covering quantile edge cases, multi-flow rollup, per-step retries +
errors, and ordering).
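
A sort-then-linear-interpolate quantile, as named above, could be written along these lines. The exact interpolation convention and the empty-input result are assumptions (this sketch interpolates between the two nearest ranks at position `(n - 1) * q` and returns `NaN` for an empty list); the name `quantile` is illustrative.

```typescript
function quantile(values: number[], q: number): number {
  if (values.length === 0) return NaN; // assumed behaviour for "no runs yet"
  const sorted = values.slice().sort((a, b) => a - b);
  const position = (sorted.length - 1) * q;
  const lower = Math.floor(position);
  const upper = Math.ceil(position);
  const weight = position - lower;
  // When position is an integer, lower === upper and the weight is 0.
  return sorted[lower] + (sorted[upper] - sorted[lower]) * weight;
}

const durationsMs = [40, 10, 30, 20];
const p50 = quantile(durationsMs, 0.5); // halfway between 20 and 30
const p95 = quantile(durationsMs, 0.95);
```

With a single sample, every quantile collapses to that sample, which is a reasonable thing for a catalog row to show after one run.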

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`startServer({ port, host, eventStream, staticDir, allowRemote })` boots a
local-dev HTTP server on `127.0.0.1` (by default) backed by Node's `http`
plus a `ws` `WebSocketServer`:

- `GET /healthz`                       → `{ ok: true }` liveness.
- `GET /api/executions`                → `listExecutions()` summaries.
- `GET /api/executions/:correlationId` → full `ExecutionRecord` JSON (404
  with `{ error: 'not_found' }` for misses).
- `GET /api/flows`                     → `aggregateExecutions()` rollup.
- `GET /*`                             → optional static fallback (slice 9
  wires the SPA here; safe against `..` traversal).
- `WS  /stream`                        → live event push.

Security defaults: refuses to bind to a non-loopback host without
`allowRemote: true`; logs a red startup warning when remote binding is
explicitly opted into. Non-GET methods get 405.
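
The bind guard could be as small as the following sketch. Treating `::1` and `localhost` as loopback alongside `127.0.0.1` is an assumption, as are the function name `checkBind` and the error wording; the source only states that non-loopback hosts are refused without `allowRemote: true`.

```typescript
// Assumed loopback set; the commit only names 127.0.0.1 explicitly.
const LOOPBACK_HOSTS = new Set(['127.0.0.1', '::1', 'localhost']);

// Throws when a remote bind was not opted into. The returned flag lets the
// caller decide whether to print the startup warning.
function checkBind(host: string, allowRemote = false): { remote: boolean } {
  const remote = !LOOPBACK_HOSTS.has(host);
  if (remote && !allowRemote) {
    throw new Error(
      `Refusing to bind to ${host}: pass allowRemote: true to expose the server`,
    );
  }
  return { remote };
}
```

Failing at startup rather than silently falling back to loopback makes the misconfiguration visible to whoever asked for the remote bind.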

Backpressure on `/stream`: each subscriber queue is flushed once per
`setImmediate` tick. Past the 256-event high-water mark we drop the
OLDEST queued event and surface a `{ type: 'dropped', count }` heartbeat
so the client can re-fetch the affected execution via
`/api/executions/:id` rather than carry on with a torn stream.
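
The drop-oldest policy can be sketched as a per-subscriber queue. Everything below is illustrative: the class name, the `event` wrapper message, and the injectable scheduler are assumptions (the default uses `setTimeout` so the sketch does not depend on Node; passing `setImmediate` would match the per-tick flush described above). Only the 256-event mark and the `{ type: 'dropped', count }` heartbeat come from the commit message.

```typescript
type StreamMessageSketch =
  | { type: 'event'; payload: unknown }
  | { type: 'dropped'; count: number };

type Scheduler = (task: () => void) => void;

class SubscriberQueueSketch {
  private queue: unknown[] = [];
  private dropped = 0;
  private scheduled = false;
  private readonly send: (message: StreamMessageSketch) => void;
  private readonly highWaterMark: number;
  private readonly schedule: Scheduler;

  constructor(
    send: (message: StreamMessageSketch) => void,
    highWaterMark = 256,
    schedule: Scheduler = (task) => {
      setTimeout(task, 0);
    },
  ) {
    this.send = send;
    this.highWaterMark = highWaterMark;
    this.schedule = schedule;
  }

  enqueue(payload: unknown): void {
    this.queue.push(payload);
    if (this.queue.length > this.highWaterMark) {
      this.queue.shift(); // drop the OLDEST queued event
      this.dropped += 1;
    }
    if (!this.scheduled) {
      this.scheduled = true;
      this.schedule(() => this.flush());
    }
  }

  private flush(): void {
    this.scheduled = false;
    if (this.dropped > 0) {
      // Tell the client how many events it missed before the survivors.
      this.send({ type: 'dropped', count: this.dropped });
      this.dropped = 0;
    }
    const pending = this.queue;
    this.queue = [];
    for (const payload of pending) this.send({ type: 'event', payload });
  }
}
```

Sending the heartbeat ahead of the surviving events is one possible ordering; the source does not say where in the flush it appears.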

`EventStream` gains `listRecords()` for the aggregate endpoint (no copy
beyond the iteration array).

Verify: 8 new server specs (binding defaults, endpoint shapes, 404/405,
non-loopback refusal, WS event delivery, WS-on-wrong-path rejection,
firehose backpressure). Full project: `nx run-many -t test typecheck lint
-p prose-observer` green (48 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`apps/console` is now an actual Prose-Console SPA, not the Nx welcome
placeholder.

- `src/api.ts` — typed wrappers over the slice-4 endpoints:
  `listExecutions`, `fetchExecution(cid)`, `listFlows`, and a
  `connectStream(onEvent)` WS helper that returns its own `close()`.
- `src/views/Trace.tsx` — slice-5 placeholder that pulls
  `/api/executions/:correlationId` for `?correlationId=...` and dumps
  the raw event list. Intentionally ugly; slice 6 replaces it with the
  Gantt timeline + diff inspector.
- `src/app/app.tsx` — minimal router shell with header nav across
  `/` (trace), `/catalog`, and `/live`. The other two routes show
  slice-targeted placeholders.
- `apps/console/vite.config.mts` — dev-time proxy for `/api` and
  `/stream` (with `ws: true`) at `127.0.0.1:4000`. In production the
  bundle is served by the same observer process so paths resolve
  naturally without a proxy.
- `apps/console/package.json` — adds `@celom/prose-observer` as a
  workspace dep for the type imports in `api.ts` and `Trace.tsx`.
- `examples/console-quickstart.ts` — local-dev script that boots the
  observer server on `127.0.0.1:4000`, runs a multi-step flow with a
  parallel block, a validation step that sometimes fails, and a default
  `authorization` field in the input (covers the redaction path in the
  UI). Loops every 2s; clean Ctrl-C teardown via SIGINT/SIGTERM.
- Removes the auto-generated `nx-welcome.tsx` and rewrites
  `app.spec.tsx` against the new shell using role-based queries.

Verify: nx run-many -t test typecheck lint -p console prose-observer
clean (4 SPA tests + 48 observer tests). The two-terminal smoke
(quickstart.ts + nx dev console) is verified at the e2e step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trace view at `/?correlationId=<id>` now renders a real timeline.

- `components/Gantt.tsx` — walks `ExecutionRecord.events` into one row
  per step (parallel groups stay a single bar, matching the v1 emit
  contract locked in by slice 2). Status comes from the closing event:
  complete → emerald, error → red, skipped → faded gray, broken →
  purple, in-flight → blue. Retries surface as a `⟲N` badge next to
  the step name. Bars are absolute-positioned inside a percent rail so
  the layout scales to whatever the flow's total duration is.
- `components/DiffInspector.tsx` — three-pane layout: flow input,
  step result, state delta. Reads `state.mode` to pick between the
  shallow `added/removed/changed` view and the `full` before/after
  snapshots. Falls back to a "stateCapture: 'off'" hint when the
  observer omitted state.
- `views/Trace.tsx` — split into `TraceView` (routing + fetch) and a
  pure `TraceContent` consumer. Tests bypass the network by rendering
  `TraceContent` directly with a fixture record.
- Lib `dom` added to `tsconfig.spec.json` so `HTMLElement.getAttribute`
  and `.textContent` resolve in spec files.

Tests (5 new, 9 total in `apps/console`): one row per step, correct
data-status for skipped + broken + complete, retry badge surfaces, row
selection wires through to the diff inspector, and a parallel block
renders as exactly one row whose merged result lists both branch
outputs.
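
The percent-rail positioning boils down to simple arithmetic on timestamps. A minimal sketch, assuming each step has already been reduced to a start and end timestamp (the `StepSpanSketch` and `BarLayoutSketch` shapes and the `layoutBars` name are hypothetical):

```typescript
interface StepSpanSketch {
  name: string;
  startTs: number;
  endTs: number;
}

interface BarLayoutSketch {
  name: string;
  leftPct: number;
  widthPct: number;
}

// Maps each step onto a 0..100 rail relative to the flow's total duration.
function layoutBars(
  spans: StepSpanSketch[],
  flowStartTs: number,
  flowEndTs: number,
): BarLayoutSketch[] {
  // Guard against a zero-length flow so the division stays finite.
  const total = Math.max(flowEndTs - flowStartTs, 1);
  return spans.map((span) => ({
    name: span.name,
    leftPct: ((span.startTs - flowStartTs) / total) * 100,
    widthPct: ((span.endTs - span.startTs) / total) * 100,
  }));
}
```

In a React component these values would typically be applied as `left` and `width` percentages on an absolutely positioned element inside a relatively positioned row.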

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`/catalog` lists every flow that has executed at least once and lets
you drill in.

- `views/Catalog.tsx` — split into `CatalogView` (fetch
  `/api/flows` + `/api/executions`) and a pure `CatalogContent`.
  Left table: per-flow runs / p50 / p95 / error %. Right pane:
  per-step stats + the last 20 executions for the selected flow.
  Each execution row is a `<Link>` straight into the trace view at
  `/?correlationId=...`.
- Plumbs the new route into `app.tsx` and drops the slice-7 placeholder
  from the App spec.
- v1 limitation called out in the empty-state copy: flows that have
  never run won't appear (the catalog reads only from the runtime ring
  buffer, not from MCP `analyze-flow`).

Tests (5 new for Catalog, 13 total in `apps/console`): flow rows render
the formatted numbers, per-step drilldown shows by default for the
first flow, clicking a different flow swaps the drilldown, recent
executions are filtered + linked to the trace, and the empty state
fires when no flows are recorded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`/live` opens a WS subscription to `/stream` and renders incoming
events in arrival order.

- `views/Live.tsx` — subscribes via `connectStream` on mount; the
  `subscribe` prop is injectable so tests skip the real WebSocket.
  500-row in-DOM cap with FIFO eviction. Pause / resume button (uses
  a ref to dodge stale-closure reads from the callback). Auto-scroll
  toggle that no-ops under jsdom (no `scrollIntoView`). Clicking a
  row navigates to the trace view at `/?correlationId=...`. The
  backpressure heartbeat `{ type: 'dropped', count }` renders as an
  amber row that tells the user to refresh the trace for backfill.
- `app.tsx` wires the route; the slice-8 placeholder is gone, along
  with the now-unused `Placeholder` helper.

Tests (5 new, 17 total in `apps/console`): arrival-order rendering,
pause halts and resume continues, the dropped heartbeat surfaces,
clicking a row navigates with the right correlationId in the URL,
and the unmount path calls the unsubscribe returned by `subscribe`.
The `makeStream` helper wraps emissions in `act()` so React 18's
async state updates commit before the next assertion.
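
The 500-row cap with FIFO eviction can be expressed as a small pure helper that a state updater calls for each incoming event. The name `appendCapped` is hypothetical, and this is a sketch of the idea rather than the code in `views/Live.tsx`.

```typescript
// Returns a new array with `next` appended, dropping the oldest rows so
// the result never exceeds `cap` entries.
function appendCapped<T>(rows: readonly T[], next: T, cap = 500): T[] {
  const kept = rows.length >= cap ? rows.slice(rows.length - cap + 1) : rows.slice();
  kept.push(next);
  return kept;
}
```

Returning a fresh array keeps the helper compatible with React state, for example `setRows((rows) => appendCapped(rows, event))`.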

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`nx bundle prose-observer` now produces a self-contained npm package
that serves the Console UI from `dist/static/`.

- New `bundle` Nx target on `@celom/prose-observer` (project.json):
  `dependsOn: [build, @celom/console:build]`, copies
  `apps/console/dist/.` → `packages/prose-observer/dist/static/` via
  three sequential `nx:run-commands` (rm/mkdir/cp). `outputs` records
  the static dir so the result is cached.
- Why a separate target and not a build hook: the type imports of
  prose-observer in console's `api.ts` give Nx an inferred edge from
  console to prose-observer. Making prose-observer:build depend on
  console:build would close that loop into a cycle. `bundle` sits
  outside the inferred edges, so the cycle is gone.
- `apps/console/tsconfig.app.json` now writes typecheck `.d.ts` to
  `out-tsc/` instead of `dist/` — previously they polluted the SPA
  dist (alongside vite's index.html), and got copied verbatim into
  the npm static bundle.
- `apps/console/package.json` drops the `@celom/prose-observer` dep
  (the type-only imports still resolve through the workspace symlink
  and there's no `@nx/dependency-checks` rule on the console eslint
  config). Removes the package.json half of the cycle.
- `server.ts` gains `resolveDefaultStaticDir()`: production reads
  `./static/` next to `dist/index.js`; source mode falls back to
  `<workspaceRoot>/apps/console/dist/` (detected via `import.meta.url`
  containing `/src/lib/`). `startServer({ ..., staticDir? })` uses the
  resolved default when none is provided.

Verified: `nx bundle prose-observer` outputs `dist/static/index.html`,
and a fresh `node -e "import('@celom/prose-observer')…"` serves the
SPA at `GET /` (200, `text/html`).
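
The two-mode lookup in `resolveDefaultStaticDir()` might be approximated like this. The sketch takes the module URL as a parameter instead of reading `import.meta.url` directly, returns a `file:` URL rather than a filesystem path, and assumes the source file sits four directory levels below the workspace root (for example `packages/prose-observer/src/lib/server.ts`); all three choices, and the name `resolveStaticDirSketch`, are assumptions.

```typescript
function resolveStaticDirSketch(moduleUrl: string): string {
  if (moduleUrl.includes('/src/lib/')) {
    // Source mode: climb from src/lib/ to the assumed workspace root,
    // then point at the SPA's own build output.
    return new URL('../../../../apps/console/dist/', moduleUrl).href;
  }
  // Production mode: static/ sits next to the built entry file.
  return new URL('./static/', moduleUrl).href;
}
```

A caller in Node would pass `import.meta.url` and convert the result with `fileURLToPath` from `node:url` before handing it to the static file handler.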

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two ways to launch the standalone server are now wired up:

- `prose-console [--port=<n>] [--host=<host>] [--max-executions=<n>]`
  — direct bin from `@celom/prose-observer`. Dispatches the included
  arg parser, starts an empty `EventStream` + `startServer`, prints
  `Prose Console: <url>`, and handles SIGINT/SIGTERM with a clean
  `server.close()` before `process.exit(0)`.
- `prose console …` — `@celom/prose`'s CLI gains a `console`
  subcommand that dynamic-imports `@celom/prose-observer/cli.js` and
  calls `main(args.slice(1))`. Missing install surfaces the same
  ERR_MODULE_NOT_FOUND install-hint pattern used for `mcp`.

Implementation notes:

- `packages/prose-observer/src/cli.ts` exports `main(argv)` and only
  auto-runs when it's the entry point. Detection uses Node 22+'s
  `import.meta.main` with a `pathToFileURL` fallback — without this,
  the dynamic import from prose's CLI re-fires `main` on the wrong
  argv and the user sees "unknown argument: console".
- Vite config grows a dual `lib.entry` (`index` + `cli`) and a banner
  that emits `#!/usr/bin/env node` on `cli.js` only, mirroring the
  pattern in `packages/prose/vite.config.mts`.
- `packages/prose-observer/package.json` declares the bin and a new
  `./cli.js` exports entry (with the `@celom/source` condition so
  in-repo callers hit the TS source).
- `packages/prose/src/cli.ts` builds the import specifier from a
  string concatenation (`'@celom/prose-observer' + '/cli.js'`). Nx's
  static graph analyzer can't follow the join, so the prose-to-observer
  edge never enters the project graph and no cycle forms. Without the
  indirection, `nx run-many` fails with `prose → prose-observer → prose`.
  Declaring the package as an optional dependency would also re-form the
  cycle, so we leave it out and document the install in the error message.
- `packages/prose/eslint.config.mjs` adds `@celom/prose-observer` to
  the `@nx/dependency-checks` `ignoredDependencies` list so the
  dynamic import doesn't trip the lint rule.

Verified end-to-end: `node packages/prose-observer/dist/cli.js
--port=4321` and `node packages/prose/dist/cli.js console
--port=4324` both serve `{"ok":true}` at `/healthz`. Full
`nx run-many -t test typecheck lint -p prose prose-observer console`
clean.
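
The lazy import plus install hint could look roughly like the sketch below. The specifier concatenation and the `main(args.slice(1))` call mirror the description above, and `ERR_MODULE_NOT_FOUND` is the error code Node reports for an unresolved ESM import; the helper names and the hint wording are hypothetical.

```typescript
// Maps a failed import to a user-facing hint, or null for unrelated errors.
function installHint(err: unknown, packageName: string): string | null {
  const code =
    typeof err === 'object' && err !== null && 'code' in err
      ? (err as { code?: unknown }).code
      : undefined;
  return code === 'ERR_MODULE_NOT_FOUND'
    ? `The console requires ${packageName}. Install it with: npm install ${packageName}`
    : null;
}

// Hypothetical subcommand handler. The specifier is assembled at runtime so
// static import analysis does not see a literal module path.
async function runConsoleSketch(args: string[]): Promise<void> {
  const specifier = '@celom/prose-observer' + '/cli.js';
  try {
    const mod = await import(specifier);
    await mod.main(args.slice(1));
  } catch (err) {
    const hint = installHint(err, '@celom/prose-observer');
    if (hint === null) throw err; // not a missing-module problem
    console.error(hint);
  }
}
```

Separating the hint logic from the import keeps it unit-testable without having to uninstall a package.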

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…guard

Final pass for v1:

- `packages/prose-observer/README.md` rewrites the placeholder with a
  real npm-page README: install, two-line quickstart with
  `mergeObservers(pinoObserver, consoleObserver)`, `prose-console` and
  `prose console` CLI variants, options table, the default redaction
  list, security notes (loopback default, no-auth caveat), and the v1
  limitations called out in the implementation plan.
- `apps/docs/src/content/docs/guides/console.mdx` adds the docs-site
  guide alongside `mcp.mdx`. Walks the three views, mirrors the
  README's redaction + state capture + CLI sections, links back to
  the MCP server guide and the underlying Observability page. Picked
  up automatically by the Starlight `guides/` autogenerate.
- Hardening: `ConsoleObserverImpl` now logs a one-shot warning when
  `stateCapture: 'full'` snapshots cross ~1MB. The two other security
  warnings called for in the plan (non-loopback bind, missing
  `correlationId`) were already in place from slice 2 and slice 4.

Tests: one new spec exercises the oversized-state warn-once latch
(fires once across two flow runs with a 1.2MB blob in state). Full
`nx run-many -t test typecheck lint -p prose prose-observer console`
clean (66 tests across the workspace), and `nx build docs` produces
the new `/guides/console/` page.
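
A warn-once latch for oversized snapshots can be a closure over a boolean. In this sketch the size is approximated by the length of the JSON string, the threshold is taken as exactly 1 MiB, and the names `createOversizeGuard` and `SNAPSHOT_WARN_BYTES` are hypothetical; how the real observer measures size is not stated in the source.

```typescript
const SNAPSHOT_WARN_BYTES = 1024 * 1024; // assumed reading of "~1MB"

function createOversizeGuard(warn: (message: string) => void) {
  let warned = false;
  return function check(snapshot: unknown): void {
    if (warned) return; // latch: stay silent after the first warning
    // String length is a cheap approximation of the serialized size.
    // JSON.stringify throws on cyclic input; this sketch does not handle that.
    const approxBytes = (JSON.stringify(snapshot) ?? '').length;
    if (approxBytes > SNAPSHOT_WARN_BYTES) {
      warned = true;
      warn(
        `stateCapture: 'full' snapshot is about ${approxBytes} bytes; ` +
          'consider the default diff mode',
      );
    }
  };
}
```

Creating one guard per process (or per observer instance) determines the scope of "once"; the source describes the warning as one-shot without further detail.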

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>