feat(execution-history): add execution history tracking with runs UI #192

Closed
aryasaatvik wants to merge 37 commits into RhysSullivan:main from aryasaatvik:execution-history-runs
aryasaatvik (Contributor) commented Apr 11, 2026

Summary

CleanShot.2026-04-11.at.23.03.55-converted.mp4
  • Add execution history persistence across SDK, storage, and execution engine layers
  • Implement ExecutionStore interface with SQLite (file) and PostgreSQL (Drizzle) backends
  • Add REST API endpoints for listing and retrieving executions with filtering, cursor pagination, and chart metadata
  • Integrate execution recording into the engine lifecycle (start, interaction, completion)
  • Add full /runs observability UI page with infinite scroll, filter rail (status, time range, code search), timeline chart, and detail drawer
  • React Query provider added to ExecutorProvider for data fetching

Test plan

  • Verify execution history persists across engine restarts (completed, cancelled executions)
  • Test /runs page filtering (status, time range, code search)
  • Verify timeline chart range selection updates results
  • Test detail drawer for execution inspection
  • Verify both SQLite and PostgreSQL backends work correctly

Adds Execution, ExecutionStatus, ExecutionInteraction types and
ExecutionId/ExecutionInteractionId branded IDs. Introduces the
ExecutionStore interface with create, update, list, get, recordInteraction,
resolveInteraction, and sweep methods.
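
To make the shape of this contract concrete, here is a minimal sketch of what an `ExecutionStore` with those seven methods might look like. It is synchronous and in-memory for brevity; the real interface presumably returns Effect values, and every field name, the `"running"` and `"failed"` statuses, and the `makeSketchStore` helper are assumptions for illustration, not the PR's actual schema.

```typescript
// Sketch only: synchronous and in-memory. Field names other than the method
// names listed above are assumptions, not the PR's actual schema.
type ExecutionId = string & { readonly __brand: "ExecutionId" };
type ExecutionInteractionId = string & { readonly __brand: "ExecutionInteractionId" };

// "waiting_for_interaction", "completed" and "cancelled" appear in the PR text;
// "running" and "failed" are assumed here.
type ExecutionStatus =
  | "running"
  | "waiting_for_interaction"
  | "completed"
  | "failed"
  | "cancelled";

interface Execution {
  id: ExecutionId;
  status: ExecutionStatus;
  code: string;
  createdAt: number; // epoch milliseconds
}

interface ExecutionInteraction {
  id: ExecutionInteractionId;
  executionId: ExecutionId;
  response?: unknown;
}

interface ExecutionStore {
  create(execution: Execution): void;
  update(id: ExecutionId, patch: Partial<Omit<Execution, "id">>): void;
  get(id: ExecutionId): Execution | undefined;
  list(): Execution[];
  recordInteraction(interaction: ExecutionInteraction): void;
  resolveInteraction(id: ExecutionInteractionId, response: unknown): void;
  sweep(cutoff: number): number; // returns how many executions were removed
}

function makeSketchStore(): ExecutionStore {
  const executions = new Map<ExecutionId, Execution>();
  const interactions = new Map<ExecutionInteractionId, ExecutionInteraction>();
  return {
    create(execution) {
      executions.set(execution.id, execution);
    },
    update(id, patch) {
      const current = executions.get(id);
      if (current) executions.set(id, { ...current, ...patch });
    },
    get(id) {
      return executions.get(id);
    },
    list() {
      return Array.from(executions.values());
    },
    recordInteraction(interaction) {
      interactions.set(interaction.id, interaction);
    },
    resolveInteraction(id, response) {
      const current = interactions.get(id);
      if (current) interactions.set(id, { ...current, response });
    },
    sweep(cutoff) {
      let removed = 0;
      executions.forEach((execution, id) => {
        if (execution.createdAt >= cutoff) return;
        executions.delete(id);
        // Cascade: drop interactions that belonged to the removed execution.
        interactions.forEach((interaction, interactionId) => {
          if (interaction.executionId === id) interactions.delete(interactionId);
        });
        removed += 1;
      });
      return removed;
    },
  };
}
```

In this sketch the caller (the engine) is responsible for moving the status to `waiting_for_interaction` via `update` after calling `recordInteraction`, and `sweep` takes an explicit cutoff timestamp rather than hard-coding a retention window.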
…greSQL

Implements ExecutionStore for file-based storage (SQLite) and PostgreSQL
(Drizzle). Includes 30-day retention sweep for expired executions and their
interactions. New executions and execution_interactions tables with
appropriate indexes.
Create execution record at start of execute/executeWithPause. Record
interactions when elicitation occurs, update status to waiting_for_interaction.
Resolve interactions on resume, propagating response to store. Persist terminal
state (result/logs/error/status) on completion or cancellation.
Add ExecutionStoreService to ExecutorConfig and executor type. Wire
makeInMemoryExecutionStore into promiseExecutor. Call sweep() on shutdown.
Add REST API endpoints: GET /executions (list with filters, cursor pagination,
chart meta) and GET /executions/:id.
…etail drawer

Full observability-style /runs page with infinite scroll, filter rail
(status, time range, code search), timeline chart with clickable range,
and detail drawer for execution inspection. React Query provider added
to ExecutorProvider. Command palette gains Runs navigation. CSS color
tokens for execution status (success/warning/error/info).
Add /runs route to both cloud and local apps with sidebar navigation
entries. Cloud app includes execution history migration.
@aryasaatvik aryasaatvik marked this pull request as draft April 11, 2026 12:35
Add ExecutionToolCall model to record every tools.*.* invocation made during an execution. Adds triggerKind/triggerMetaJson fields to Execution for entry-point attribution, new list filters (triggerFilter/toolPathFilter/after), and toolFacets/triggerCounts to ExecutionListMeta for the runs UI timeline.
Add new execution_tool_calls table to record each tool invocation, trigger_kind/trigger_meta_json/tool_call_count to executions, triggerFilter and toolPathFilter list options, after cursor for timestamp-based pagination, recordToolCall/finishToolCall/listToolCalls methods, and toolFacets/triggerCounts in list metadata. Cascade delete tool_calls on sweep.
…n run

- Add ExecutionTrigger type to identify entry points (HTTP/MCP/CLI)
- Add withToolCallRecording wrapper that writes running→completed/failed
  rows for every sandbox tools.x.y invocation
- Thread toolCallCount through pause/resume boundary
- Persist triggerKind, triggerMetaJson, toolCallCount on execution record
- Expose triggerCounts and toolFacets in list metadata for UI faceting
…ions API

- Add triggerKind, triggerMetaJson, toolCallCount to ExecutionSummary
- Add trigger/tool filters and after cursor to ListExecutionsParams
- Add triggerCounts and toolFacets to ExecutionListMeta
- Add listToolCalls endpoint for per-execution tool call retrieval
- Accept x-executor-trigger header on execute endpoint
- Skip meta computation for live mode (after=) refetches
- Pass x-executor-trigger: "cli" header on CLI execution
- Pass trigger kind (mcp-inline, mcp-pause) on MCP execution
…runs page

- Add ExecutionToolCall type and listExecutionToolCalls API
- Add ToolCallsTab with flame-graph-lite timeline bars
- Add trigger/tool/after query params with accordion filter rail
- Add live mode with 5s polling, past-row divider, isPast flag
- Add keyboard shortcuts (j, r, /, ?, b) and filter command palette
- Convert RunsShell from children render to rows/renderRow props
- Add HoverCardTimestamp, ViewOptionsButton, RefreshButton
- Add via:/tools:/log: columns to run rows
- Replace CSS custom properties with Tailwind equivalents
- Add trigger, tool, live query parameters to runs route handlers
Add dependencies: @date-fns/utc for timestamp formatting and
react-hotkeys-hook for keyboard shortcut handling.
Adds sort by field/direction (createdAt, durationMs) and a hadElicitation
filter to ExecutionListOptions. Introduces interactionCounts in
ExecutionListMeta to power the /runs "Interactions" facet.

New exports: ExecutionSort, ExecutionSortField, ExecutionSortDirection,
pickExecutionSorter.
Removes mcp-inline and mcp-pause trigger kinds in favor of a single "mcp".
These were unnecessarily specific — both go through the same MCP host path
and only differed in whether elicitation was inline or paused. The
hadElicitation filter now captures that distinction independently.
… configurable sort

Adds ExecutionListOptions.hadElicitation to filter elicited vs autonomous runs.
Adds interactionCounts.withElicitation/withoutElicitation to list meta.
Adds sort: { field: "createdAt" | "durationMs", direction: "asc" | "desc" }.
Cursor pagination now uses identity-based slicing to support arbitrary sort keys.
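
One way to read "identity-based slicing" is that the cursor carries the ID of the last row returned, and the next page starts just after that row in the sorted list, whatever the sort key is. The sketch below illustrates that reading under stated assumptions: the `Row` shape, the `sortRows` and `page` helpers, and the tie-break on `id` are hypothetical and not taken from the PR.

```typescript
// Hypothetical row shape; only the sort fields and directions come from the
// commit message above.
interface Row {
  id: string;
  createdAt: number;
  durationMs: number;
}

interface Sort {
  field: "createdAt" | "durationMs";
  direction: "asc" | "desc";
}

interface Page {
  items: Row[];
  nextCursor: string | undefined;
}

function sortRows(rows: readonly Row[], sort: Sort): Row[] {
  const sign = sort.direction === "asc" ? 1 : -1;
  return rows.slice().sort((a, b) => {
    const delta = a[sort.field] - b[sort.field];
    if (delta !== 0) return sign * delta;
    // Assumed tie-break on id so the order is total and repeatable.
    return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
  });
}

function page(rows: readonly Row[], sort: Sort, limit: number, cursor?: string): Page {
  const sorted = sortRows(rows, sort);
  // Identity-based slicing: resume after the row whose id equals the cursor.
  // An unknown cursor yields findIndex === -1, so paging restarts from the top.
  const start = cursor === undefined ? 0 : sorted.findIndex((row) => row.id === cursor) + 1;
  const items = sorted.slice(start, start + limit);
  const last = items[items.length - 1];
  const hasMore = start + limit < sorted.length;
  return { items, nextCursor: hasMore && last ? last.id : undefined };
}
```

Because the cursor names a row rather than a sort-key value, the same paging code works for `createdAt` and `durationMs` alike; the trade-off is that each request re-sorts and searches the full list, which suits an in-memory store better than a large table.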
Elicitation is now tracked orthogonally via the Interactions facet
rather than being baked into the trigger kind. Consolidate trigger
schema to three kinds: mcp, http, cli.
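
As a rough illustration of the consolidated schema, the sketch below normalizes an `x-executor-trigger` header value into one of the three kinds. The `parseTriggerKind` helper and the fallback to `"http"` for missing or unrecognized values are assumptions made for this example; the PR does not state how unknown values are handled.

```typescript
// The three kinds come from the commit message above.
const TRIGGER_KINDS = ["mcp", "http", "cli"] as const;
type TriggerKind = (typeof TRIGGER_KINDS)[number];

// Hypothetical helper: maps a raw header value to a trigger kind.
function parseTriggerKind(header: string | null | undefined): TriggerKind {
  const value = (header ?? "").trim().toLowerCase();
  const match = TRIGGER_KINDS.find((kind) => kind === value);
  // Assumption: requests without a recognized header are attributed to HTTP.
  return match ?? "http";
}
```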
- Wire sort state (createdAt, durationMs) through URL, cycle none→desc→asc.
- Add Interactions facet (with/without elicitation) to filter rail.
- Convert filter command from Dialog overlay to always-visible inline input
  with dropdown, exposing a forwarded ref so "/" hotkey focuses it.
- Replace RunRowHeader with per-field RunsColumnHeader (sortable + static).
- Tighten responsive breakpoints: via/log shift to 2xl, tools to lg.
- Abbreviate 'waiting_for_interaction' status to 'waiting' in row label.
- Fix chart tooltip z-index and escape view box handling.
- Add optional sort parameter for ordering runs
- Add optional elicitation parameter for filtering by elicitation type
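
The none→desc→asc sort cycle mentioned above can be sketched as a small pure function. The `SortState` type and this `cycleSort` signature are assumptions for illustration, as is the choice to return to "none" after `asc` and to restart at `desc` when a different column is clicked.

```typescript
type SortField = "createdAt" | "durationMs";

// null stands for "none" (no explicit sort).
type SortState = { field: SortField; direction: "asc" | "desc" } | null;

function cycleSort(current: SortState, field: SortField): SortState {
  // Clicking a different column (or no sort yet) starts the cycle at desc.
  if (current === null || current.field !== field) {
    return { field, direction: "desc" };
  }
  if (current.direction === "desc") {
    return { field, direction: "asc" };
  }
  // Assumed: asc wraps back to no sort.
  return null;
}
```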
Normalize line wrapping and structure for consistency across core execution,
storage, and React runs components.
Use useEffectEvent so live-mode cutoff updates read the latest row
state only when live toggles, avoiding stale snapshots.
# Conflicts:
#	apps/cloud/src/routeTree.gen.ts
#	apps/cloud/src/web/shell.tsx
#	packages/react/src/styles/globals.css
Remove provenance headers ("ported from openstatus"), ASCII diagrams,
version history comments ("v1.3 aesthetic"), section dividers, and
over-documented JSDoc across SDK types, engine, API, and React layers.
…amp and splitCsv

- Extract encodeCursor/decodeCursor to @executor/sdk/cursor (was duplicated 3x)
- Delete dead formatTimestamp from row.tsx and detail-drawer.tsx
- Consolidate parseCsv/parseStatuses into splitCsv in runs.tsx
- Add splitCsv helper to handlers/executions.ts
- Inline truncateCode and statusWord in row.tsx
- Inline parseLogs into useMemo in detail-drawer.tsx
- Replace statusLabel() with direct STATUS_LABELS[] lookup, delete function
- Inline parseRange and cycleSort in runs.tsx
- Inline detectActiveKey and replaceTrailingValue in filter-command.tsx
- Inline toolPathsByExecution and hasInteraction in in-memory execution store
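
For reference, a consolidated CSV helper of the kind described above might look like the following. This is a guess at the behavior (trim whitespace, drop empty segments, treat a missing value as an empty list), not the PR's actual `splitCsv`.

```typescript
// Hypothetical reimplementation: turns "completed, failed,," into
// ["completed", "failed"] and a missing or empty value into [].
function splitCsv(value: string | null | undefined): string[] {
  if (!value) return [];
  return value
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}
```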
Add formatUnknownMessage for extracting messages from Error instances,
strings, objects with .message fields, and arbitrary values via
JSON.stringify fallback. Add formatCauseMessage to squash Effect
Causes into readable strings.
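
The fallback chain described for `formatUnknownMessage` can be approximated as below. This is a sketch of the described behavior rather than the code in @executor/codemode-core; the `formatUnknownMessageSketch` name and the final `String(value)` fallback for values that cannot be serialized are assumptions. `formatCauseMessage` is not sketched because it depends on Effect's `Cause` type.

```typescript
// Hypothetical approximation of the described fallback chain.
function formatUnknownMessageSketch(value: unknown): string {
  if (value instanceof Error) return value.message;
  if (typeof value === "string") return value;
  if (typeof value === "object" && value !== null && "message" in value) {
    const message = (value as { message: unknown }).message;
    if (typeof message === "string") return message;
  }
  try {
    const json = JSON.stringify(value);
    // JSON.stringify yields undefined for undefined, functions and symbols.
    return json === undefined ? String(value) : json;
  } catch {
    // Circular structures and BigInt values make JSON.stringify throw.
    return String(value);
  }
}
```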
- Add executionOutcomes counter and executionDuration histogram
- Add toolCallCounter and toolCallDuration metrics per tool invocation
- Annotate spans with execution ID, scope ID, trigger kind, interaction kind
- Introduce configurable runPromise for custom OTel tracer layer injection
Add Effect.withSpan tracing to deno-subprocess, dynamic-worker,
quickjs, and secure-exec runtimes with consistent executor.runtime.<name>
naming. Also deduplicate error formatting in runtime-secure-exec by
importing formatUnknownMessage and formatCauseMessage from @executor/codemode-core.
Replace console.error with Effect.logError + annotateLogs in autumn.ts
and protected.ts. Add ManagedRuntime(TelemetryLive) to provide OTel tracer
for withSpan calls so spans export to Axiom when AXIOM_TOKEN is set.
…mitation

Note that distributed tracing covers execution engine and kernel runtimes.
Effect Metrics (counters, histograms) are collected by the engine but not
exported yet due to Cloudflare Workers lacking Node.js APIs needed by
PeriodicExportingMetricReader.
QuickJS double-serializes tool results, requiring recursive JSON parsing.
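
A minimal sketch of how a double-serialized result could be unwrapped: keep calling `JSON.parse` while the value is still a string that parses. The `parseNestedJson` name and the depth limit are assumptions, and this version only unwraps the top-level value; whether the PR also parses nested fields is not stated.

```typescript
// Hypothetical helper: unwraps values that were JSON.stringify'd more than once.
function parseNestedJson(value: unknown, maxDepth = 8): unknown {
  let current: unknown = value;
  for (let depth = 0; depth < maxDepth; depth++) {
    if (typeof current !== "string") break;
    try {
      current = JSON.parse(current);
    } catch {
      // Not valid JSON: treat it as an ordinary string result.
      break;
    }
  }
  return current;
}
```

One caveat of this approach is that a tool result which is legitimately a string containing JSON (for example `"123"`) is also unwrapped, so callers that need to preserve such strings would need an explicit marker instead.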
pkg-pr-new Bot commented Apr 12, 2026

@executor/sdk

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/sdk@192

@executor/plugin-file-secrets

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-file-secrets@192

@executor/plugin-google-discovery

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-google-discovery@192

@executor/plugin-graphql

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-graphql@192

@executor/plugin-keychain

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-keychain@192

@executor/plugin-mcp

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-mcp@192

@executor/plugin-onepassword

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-onepassword@192

@executor/plugin-openapi

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-openapi@192

commit: fea0149

Only user-facing tool calls are intended for tool-call recording; engine plumbing paths like search, describe, and sources.list are intentionally excluded.
Main's `@executor/storage-core` + `@executor/storage-drizzle` refactor
introduces a generic `DBAdapter` contract that replaces the per-feature
`ToolRegistry` / `SecretStore` / `PolicyEngine` / `ExecutionStore` shape
this branch was built against. Resolution:

- Accept main's architecture wholesale (storage adapter, scoped executor,
  Effect-native engine, OtelTracer with Axiom exporter).
- Delete our custom execution-store implementations in storage-file and
  storage-postgres — superseded by the generic adapter.
- Delete our two custom apps/cloud drizzle migrations (wrong number +
  pre-DBAdapter schema); the port adds fresh migrations.
- Preserve core-schema execution tables (execution, execution_interaction,
  execution_tool_call) as a starting point for the port.
- Preserve HEAD-only execution-history files (sdk/executions.ts, runs UI,
  cursor helper) so the port can reuse their types/contracts against the
  new DBAdapter. These will not type-check until the port lands.

Conflict-level changes:
- Kernel runtimes: adopted main's `executor.code.exec.<runtime>` span
  taxonomy + `runPromise` threading.
- CLI: adopted main's restructured call/resume flow, preserved
  `x-executor-trigger: cli` header.
- MCP server: adopted main's Effect-native server with `runtime` capture
  and span tree, preserved `trigger: { kind: "mcp" }` on engine calls.
- Cloud protected API: dropped our per-request TelemetryLive
  ManagedRuntime in favor of main's globally-installed OTel tracer +
  shared `makeExecutionStack`.
- `makeTrackExecutionUsage` kept alongside main's `AutumnApiApp`.
- React ExecutorProvider: QueryClient wrapper + main's `fallback` prop.

Execution history is not functional at this commit. The port to the
DBAdapter-backed ExecutionStore follows in the next step(s).
Three files the wholesale-accept of main's storage refactor didn't sweep:
these are old-schema migrations and a paired snapshot that were unique
to this branch (main had already moved the postgres adapter to
@executor/storage-drizzle). Removing them now so the drizzle folder
matches the new storage-postgres surface — which no longer owns its own
migrations.
aryasaatvik (Contributor, Author) commented

Superseded by a 5-PR stack against current main — please review those instead:

  1. feat(sdk): add ExecutionStore backed by DBAdapter (#396)
  2. feat(execution): persist engine runs + tool calls via ExecutionStore (#398)
  3. feat(execution): propagate trigger context from CLI, HTTP, and MCP hosts (#399)
  4. feat(apps): add execution tables to local + cloud drizzle schemas (#400)
  5. feat(api): /executions list, get, tool-calls endpoints (#401)

The runs-page UI (the /runs React work from this PR) comes next as PR6 on top of #401. The stack is rebased against the post-merge architecture (DBAdapter / storage-drizzle), so the execution store is generic across sqlite, postgres, and memory instead of being hand-written per backend.
