feat(execution): persist engine runs + tool calls via ExecutionStore #398
Draft
aryasaatvik wants to merge 2 commits into RhysSullivan:main
Adds execution history persistence to the core SDK surface, wiring
three new tables (`execution`, `execution_interaction`,
`execution_tool_call`) into `coreSchema` and exposing an
`ExecutionStore` service on `executor.executions`.
Changes:
- `core-schema.ts`: three new tables with `scope_id` / `execution_id`
/ `tool_path` / `trigger_kind` / `created_at` indexes for the runs
UI's faceting + timeline queries.
- `ids.ts`: branded `ExecutionId`, `ExecutionInteractionId`,
`ExecutionToolCallId`.
- `executions.ts`: `Execution`, `ExecutionInteraction`,
`ExecutionToolCall` Schema classes, status enums,
create/update/filter/sort/meta input types, and the
`ExecutionStore` Context.Tag.
- `execution-store.ts`: `makeExecutionStore(core)` — an
adapter-backed `ExecutionStoreService` implementation. Wraps
`typedAdapter<CoreSchema>` for CRUD, handles cursor-based
pagination, filter predicates (status, trigger, tool-path glob,
time range, code substring, hadElicitation), and builds list meta
with facets + chart buckets.
- `cursor.ts`: base64url `{ createdAt, id }` pagination cursors.
- `executor.ts`: constructs the store once per executor, exposes via
`executor.executions`.
- `executions.test.ts`: round-trip + lifecycle coverage against the
in-memory adapter (no migrations needed).
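The cursor scheme described for `cursor.ts` can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Cursor` shape (epoch-ms `createdAt`, ASCII `id`), the function names, and the newest-first tie-break are all assumptions.

```typescript
// Hypothetical cursor shape: the PR only says cursors are base64url `{ createdAt, id }`.
interface Cursor {
  createdAt: number; // assumed to be epoch ms
  id: string; // assumed ASCII, since btoa only accepts Latin-1 input
}

// base64url is standard base64 with "+" -> "-", "/" -> "_", and padding stripped.
const encodeCursor = (cursor: Cursor): string =>
  btoa(JSON.stringify(cursor))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");

// Returns null for anything that is not a well-formed cursor.
const decodeCursor = (raw: string): Cursor | null => {
  try {
    const base64 = raw.replace(/-/g, "+").replace(/_/g, "/");
    const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
    const parsed: unknown = JSON.parse(atob(padded));
    if (typeof parsed !== "object" || parsed === null) return null;
    const candidate = parsed as { createdAt?: unknown; id?: unknown };
    if (typeof candidate.createdAt !== "number" || typeof candidate.id !== "string") {
      return null;
    }
    return { createdAt: candidate.createdAt, id: candidate.id };
  } catch {
    return null; // invalid base64 or invalid JSON
  }
};

// Keyset predicate for newest-first ordering (an assumed default):
// a row belongs to the next page when it is strictly older, ties broken by id.
const isAfterCursor = (row: Cursor, cursor: Cursor): boolean =>
  row.createdAt < cursor.createdAt ||
  (row.createdAt === cursor.createdAt && row.id < cursor.id);
```

Pairing `createdAt` with `id` keeps pagination stable when several rows share a timestamp, which is presumably why the cursor carries both fields.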
Follow-up work (future PRs in the stack):
- wire the engine to record runs + tool calls through this store,
- add `/executions` API endpoints, and
- land the runs UI.
aryasaatvik added a commit to aryasaatvik/executor that referenced this pull request on Apr 24, 2026:
Extends the existing `/executions` group with the three read endpoints the runs UI needs. Handlers delegate to `executor.executions.*` (added in RhysSullivan#396 / RhysSullivan#398) and scope each read to the innermost executor scope, the same rule the engine applies when writing.

**Endpoints:**
- `GET /executions`: list with filter + cursor + optional meta. Query params: `limit`, `cursor`, `status` (CSV), `trigger` (CSV), `tool` (CSV of paths/globs), `from`/`to` (epoch ms), `after`, `code` (substring), `sort` (`<field>,<dir>`), `elicitation` (`"true"` / `"false"`). Meta bundles facets + timeline buckets; the handler only asks for it when the request isn't paginated (no `cursor` / `after`), so cheap "first page, full facets" is the default call shape.
- `GET /executions/:id`: single execution detail + `pendingInteraction`. 404 on unknown id via `ExecutionNotFoundError` (already declared on the group).
- `GET /executions/:id/tool-calls`: tool-call timeline. 404 on unknown execution (guard rail so empty arrays don't mask typos).

**Response shape:** every `Date` is serialized to epoch ms at the handler edge (`.getTime()`) so the wire format stays numeric. The schemas in `api.ts` mirror the SDK's row projections one-to-one modulo that transform.

**CSV + enum handling:** `splitCsv`, `parseSortParam`, `parseElicitationParam` live in the handler file because they're edge concerns; the SDK takes typed arrays and enums. Invalid sort fields / directions drop back to defaults (no 400).

No new tests: the handlers are thin wrappers over the SDK store, which already has round-trip + filter + meta coverage in `packages/core/sdk/src/executions.test.ts`. The CSV/enum parsers are small enough to validate by inspection.
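As an illustration of the edge parsing described in that commit message, the three helpers might look something like the sketch below. Only the helper names come from the commit message; the bodies, the sort field and direction sets, the default sort, and the choice to fall back per half of the `sort` value are assumptions.

```typescript
// Hypothetical field and direction sets; the real SDK enums may differ.
const SORT_FIELDS = ["createdAt", "durationMs"] as const;
const SORT_DIRECTIONS = ["asc", "desc"] as const;
type SortField = (typeof SORT_FIELDS)[number];
type SortDirection = (typeof SORT_DIRECTIONS)[number];

interface Sort {
  field: SortField;
  direction: SortDirection;
}

// Assumed default: newest first.
const DEFAULT_SORT: Sort = { field: "createdAt", direction: "desc" };

// "a, b,,c" -> ["a", "b", "c"]; missing or empty input -> undefined (no filter).
const splitCsv = (raw: string | undefined): string[] | undefined => {
  if (raw === undefined) return undefined;
  const parts = raw
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
  return parts.length > 0 ? parts : undefined;
};

// "<field>,<dir>": each unrecognized half falls back to its default instead of a 400.
const parseSortParam = (raw: string | undefined): Sort => {
  if (raw === undefined) return DEFAULT_SORT;
  const [field, direction] = raw.split(",").map((part) => part.trim());
  return {
    field: SORT_FIELDS.find((f) => f === field) ?? DEFAULT_SORT.field,
    direction: SORT_DIRECTIONS.find((d) => d === direction) ?? DEFAULT_SORT.direction,
  };
};

// Only the literal strings "true" and "false" count; anything else means "no filter".
const parseElicitationParam = (raw: string | undefined): boolean | undefined =>
  raw === "true" ? true : raw === "false" ? false : undefined;

// Dates leave the handler as epoch ms so the wire format stays numeric.
const toEpochMs = (date: Date): number => date.getTime();
```

Keeping these at the handler edge means the store itself never sees raw query strings, only typed arrays and enums.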
Wires `executor.executions` into the Effect-native engine so every
`execute()` / `executeWithPause()` / `resume()` call writes an
`execution` row and its associated tool-call + interaction rows to
whichever `DBAdapter` backs the SDK.
Engine additions:
- `ExecutionTrigger` type + new `trigger?` option on `execute` and
`executeWithPause`. Callers attribute runs ("cli", "http", "mcp",
…); the kind + optional meta blob are persisted on the row.
- A stable `crypto.randomUUID()` execution id is minted at entry and
reused as `PausedExecution.id`, so callers and the DB share the
same identifier and counts line up across pause/resume.
- `makeRecordingInvoker` wraps the `SandboxToolInvoker` passed to the
code executor; each `invoke` writes a tool-call row (running →
completed|failed with duration). Storage errors are ignored so
bookkeeping failures can never fail the tool call itself.
- `persistTerminalState` runs once on fiber success or failure and
writes final status, result/error, logs, toolCallCount, completedAt.
- Pausable path: on elicitation, the execution transitions to
`waiting_for_interaction` and a pending interaction row is created;
`resume` resolves it (or cancels it if action === "cancel") before
unblocking the fiber. A `toolCallCounters` map keeps the same Ref
across pause/resume so the final count is accurate.
- Inline path: wraps the caller-supplied `onElicitation` so every
inline elicitation gets the same pending → resolved bookkeeping.
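The recording wrapper described above can be sketched in plain `async` TypeScript. The real engine is Effect-native, so this is only an approximation of the behavior: every interface here (`ToolInvoker`, `ToolCallStore`, `ToolCallRow`), the `bestEffort` helper, the id format, and the mutable `counter` standing in for the `Ref` are hypothetical.

```typescript
// All types are hypothetical stand-ins for the engine's Effect-based services.
interface ToolInvoker {
  invoke(path: string, args: unknown): Promise<unknown>;
}

interface ToolCallRow {
  id: string;
  executionId: string;
  toolPath: string;
  status: "running" | "completed" | "failed";
  durationMs?: number;
  errorText?: string;
}

interface ToolCallStore {
  create(row: ToolCallRow): Promise<void>;
  update(id: string, patch: Partial<ToolCallRow>): Promise<void>;
}

// Run a bookkeeping write and swallow any storage error.
const bestEffort = async (write: () => Promise<void>): Promise<void> => {
  try {
    await write();
  } catch {
    // Intentionally ignored: history must never fail the tool call.
  }
};

const makeRecordingInvoker = (
  inner: ToolInvoker,
  store: ToolCallStore,
  executionId: string,
  counter: { count: number }, // shared across pause/resume in this sketch
): ToolInvoker => ({
  async invoke(path, args) {
    counter.count += 1;
    const id = `${executionId}:${counter.count}`; // illustrative id scheme
    const startedAt = Date.now();
    await bestEffort(() =>
      store.create({ id, executionId, toolPath: path, status: "running" }),
    );
    try {
      const result = await inner.invoke(path, args);
      await bestEffort(() =>
        store.update(id, { status: "completed", durationMs: Date.now() - startedAt }),
      );
      return result;
    } catch (error) {
      await bestEffort(() =>
        store.update(id, {
          status: "failed",
          durationMs: Date.now() - startedAt,
          errorText: error instanceof Error ? error.message : String(error),
        }),
      );
      throw error; // the caller still sees the original tool failure
    }
  },
});
```

The key design point is the asymmetry: storage failures are swallowed, while tool failures are recorded and then rethrown unchanged.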
Tests (`engine-persistence.test.ts`, 5 cases) cover:
- completed run + tool call rows
- error result → status=failed, errorText captured
- toolCallCount rolls up correctly
- trigger kind + meta persist on the row
- failed tool call records status=failed with errorText
Force-pushed: 70e493d to 3bc7760
Stack
Depends on #396 — merge that first. Cross-fork GitHub PRs can't use a branch on a contributor fork as a base, so these five PRs display as independent in the UI. The real dependency chain is the commit graph.
- feat(sdk): ExecutionStore backed by DBAdapter
- feat(execution): persist engine runs + tool calls
- feat(execution): trigger propagation (CLI/HTTP/MCP)
- feat(apps): execution tables in drizzle schemas
- feat(api): /executions list/get/tool-calls endpoints
Until #396 lands, this diff includes its commits. After #396 merges, this diff shrinks to just the engine changes below.
Summary
Wires the `ExecutionStore` added in #396 into the Effect-native engine. Every `execute()` / `executeWithPause()` / `resume()` call now writes execution + tool-call + interaction rows through whichever `DBAdapter` backs the SDK, so sqlite, postgres, memory, or anything else that implements the contract gets history for free.
What ships in this PR (delta beyond #396)
Engine API:
- `ExecutionTrigger` type + new `trigger?` option on `execute` / `executeWithPause`. Callers attribute runs (`cli`, `http`, `mcp`, …); kind + optional meta blob persist on the row.
- `crypto.randomUUID()` minted at engine entry and reused as `PausedExecution.id`, so caller-visible ids and DB row ids are the same value.
Recording:
- `makeRecordingInvoker` wraps the `SandboxToolInvoker` passed to the code executor: each `invoke` writes a tool-call row (running → completed | failed) with `durationMs`. Storage failures are ignored so bookkeeping can never fail the tool call itself.
- `persistTerminalState` runs once on fiber success/failure and writes final `status`, `resultJson`, `errorText`, `logsJson`, `toolCallCount`, `completedAt`.
- On elicitation, the execution transitions to `waiting_for_interaction` and a pending `execution_interaction` row is created; on resume the row is resolved (or cancelled if `action === "cancel"`) before the fiber is unblocked.
- `toolCallCounters` keeps the same `Ref` across pause/resume so the final count is accurate even for multi-pause runs.
Test plan
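- `bun x vitest run` in `@executor/execution`: 15/15 tests pass (10 existing + 5 new in `engine-persistence.test.ts`).
- `bun x tsc --noEmit`: zero type errors.
- `bun x vitest run` in `@executor/sdk`: 97/97 tests still pass.
New test coverage (`engine-persistence.test.ts`)
- Completed run + tool call rows.
- Error result → status=`failed`, errorText captured.
- toolCallCount rolls up correctly.
- Trigger kind + meta persist on the row.
- Failed tool call records status=`failed` with errorText.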