M1 PR-5: generation adapters — Ollama structured outputs + Anthropic#3
Merged
Conversation
There was a problem hiding this comment.
Pull request overview
Introduces a stateless generation-adapter layer (ADR-9 seam) to unify model-provider calls behind a single GenerationAdapter interface, adding concrete adapters for Ollama (local structured outputs) and Anthropic (SDK structured outputs), along with offline/deterministic fixture-based tests and a non-CI Ollama smoke script.
Changes:
- Add
GenerationAdapter/ request+result types, typedAdapterOutputError, and shared parsing/helpers. - Implement
OllamaAdapter(/api/chatwithformat) andAnthropicAdapter(SDKoutput_config.format: json_schema) with typed failure surfacing. - Add deterministic adapter tests plus a live
smoke:ollamascript; add@anthropic-ai/sdkdependency.
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/adapters/types.ts | Defines the adapter seam types, typed adapter errors, JSON parsing helper, and --model ref parsing. |
| src/adapters/ollama.ts | Implements Ollama /api/chat adapter using structured outputs and returns parsed JSON + meta/usage. |
| src/adapters/anthropic.ts | Implements Anthropic SDK adapter using JSON-schema output formatting and typed stop-reason errors. |
| src/adapters/index.ts | Re-exports adapter APIs and provides adapterFor(--model ...) factory. |
| src/adapters/adapters.test.ts | Adds offline, deterministic fixture tests covering schema round-trip + typed failure cases + model identity rules. |
| scripts/smoke-ollama.ts | Adds a live (non-CI) Ollama smoke generator and runs S1–S3 surface linting on the artifact. |
| package.json | Adds smoke:ollama script and @anthropic-ai/sdk dependency. |
| package-lock.json | Locks @anthropic-ai/sdk and its transitive dependencies. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| throw new AdapterOutputError( | ||
| adapterId, | ||
| "model output is not valid JSON — constrained decoding was not applied or failed (see docs/spike-structured-outputs.md: some engines silently ignore output schemas)", | ||
| raw, |
| .join(""); | ||
| if (raw === "") throw new AdapterOutputError(this.id, "empty model output"); | ||
| if (message.stop_reason === "max_tokens") { | ||
| throw new AdapterOutputError(this.id, "output truncated (stop_reason: max_tokens) — raise params.maxTokens", raw); |
Comment on lines
+69
to
+71
| const data = (await response.json()) as OllamaChatResponse; | ||
| const raw = data.message?.content ?? ""; | ||
| if (raw === "") throw new AdapterOutputError(this.id, "empty model output"); |
… PR-5) - src/adapters/: stateless GenerationAdapter interface per ADR-9 (one attempt per call; the repair loop owns conversation state). Model identity is configuration: constructors require an explicit model id, no default model name exists in code — enforced by a source-scan test. - OllamaAdapter: /api/chat with format = the generation schema. Non-JSON output raises AdapterOutputError (the S0 spike found the mlx engine silently ignoring format; gates S1/S2 judge conformance over the artifact). - AnthropicAdapter: official SDK, output_config.format json_schema (the generation schema is compatible by construction: depth-unrolled, closed objects). No sampling params sent (removed on current models). refusal and max_tokens stop reasons surface as typed errors — never silently retried. - Offline deterministic tests via injected fetch fixtures: parsed results, schema round-trip verbatim into the request body, typed failures. - scripts/smoke-ollama.ts (live, non-CI): one real generation through the compiled context + S1-S3 lint. First live runs recorded in the spike addendum: the 8B model passes S1/S2 but fails S3 on every attempt (nested-interactive violations) even with rule steering in the prompt — live confirmation that the guarantee is the linter, not the prompt. Verify: npm test (44 tests, offline); npm run smoke:ollama -- --model <tag> Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this is
PR-5 of Milestone 1 (stacked on #2). The generation-adapter seam (ADR-9 as amended):
GenerationAdapterinterface — one attempt per call; the repair loop owns conversation state. Model identity is configuration: constructors require an explicit model id and a source-scan test asserts no model name exists in adapter code.OllamaAdapter—/api/chatwithformat= the generation schema. Non-JSON output raises a typedAdapterOutputError(never a silent retry): the S0 spike found the mlx engine silently ignoringformat, so adapters guarantee only parseability — conformance is judged by gates S1/S2 over the artifact.AnthropicAdapter— official SDK,output_config.format(json_schema); the generation schema is compatible by construction (depth-unrolled/non-recursive, closed objects). No sampling params sent.refusalandmax_tokensstop reasons surface as typed errors.scripts/smoke-ollama.ts— live, non-CI: one real generation through the compiled context + S1–S3 lint.Acceptance (in CI on this PR)
Hand-review focus
src/adapters/types.ts— theGenerationAdapter/GenerateRequest/GenerateResultinterface (M1's third public interface).ADRs: 9 (with 2/3 context).
🤖 Generated with Claude Code