Land the M1 stack on main (stacked PRs #2–#5 merged into their bases) by ryandmonk · Pull Request #8 · aestheticfunction/dspack-gen

ryandmonk · 2026-07-02T21:34:19Z

Mechanical landing PR — all content already reviewed and merged as #2–#5.

What happened (same failure mode as aestheticfunction/dspack-to-a2ui#7): the stacked PRs were merged into their stacked base branches, not main — #2→feat/context-compiler, #3→feat/surface-gates, #4→feat/adapters, #5→feat/pipeline. Only #1 and #6 actually reached main, so main currently has the PR-3 compiler plus docs/evidence and none of: the linter (S1–S3), adapters, orchestrator/audit report, serve, or the Playwright gate.

Additionally, #5 was merged at 814d425, just before the serve-hardening + pin-bump commit (a9dc0a9, the Copilot review response with all five threads resolved and both CI jobs green) landed on its head branch — so that commit is stranded too.

This PR's head is feat/demo-e2e: the complete linear stack including the hardening commit and the b47a2cf dep pin (= merged dspack-to-a2ui main). Content is identical to what was reviewed across #2–#5 plus the already-reviewed review-response commit. The #6 docs on main are untouched by this branch (no conflicts).

Suggested going forward: merge stacked PRs bottom-up with branch deletion enabled (GitHub then auto-retargets children to main), or retarget each child to main before merging.
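The bottom-up ordering suggested above can be sketched as a small helper. This is an illustration only: the `StackedPr` shape and both function names are hypothetical, and any head branches fed into it would have to be read off the actual PRs, since this description lists only the bases.

```typescript
// Hypothetical shape for one PR in a stack (not part of dspack-gen).
interface StackedPr {
  number: number;
  base: string; // branch the PR merges into
  head: string; // branch the PR merges from
}

// PRs whose base is not the trunk: merged as-is, they land on a stacked
// branch instead of the trunk.
function notTargetingTrunk(prs: StackedPr[], trunk: string): number[] {
  return prs.filter((pr) => pr.base !== trunk).map((pr) => pr.number);
}

// Bottom-up merge order: start at the trunk and repeatedly pick the PR whose
// base is the head branch that was just landed.
function bottomUpOrder(prs: StackedPr[], trunk: string): number[] {
  const byBase = new Map<string, StackedPr>();
  for (const pr of prs) byBase.set(pr.base, pr);
  const order: number[] = [];
  let next = byBase.get(trunk);
  // The length guard stops the walk if the input accidentally forms a cycle.
  while (next !== undefined && order.length < prs.length) {
    order.push(next.number);
    next = byBase.get(next.head);
  }
  return order;
}
```

For a linear stack in which the first PR targets main and each later PR targets the previous head, `bottomUpOrder` returns the PRs from the bottom of the stack upward, and `notTargetingTrunk` flags every PR that would need retargeting before a direct merge.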

🤖 Generated with Claude Code

…luators (M1 PR-4)

- src/core/lint/: gate runner (S1 generic surface schema, S2 contract vocabulary, S3 governance) — independently reported, never implicit in generation (the S0 spike found Ollama's mlx engine silently ignoring `format`, which is why S2 is a check on the artifact).
- rule-type registry + evaluators per spec/dspack-v0.3.md §5.3 semantics. Findings carry both severity faces (requirement: must|should, level: error|warn); rationales verbatim; locations as $.root… paths. Unknown rule types throw UnknownRuleTypeError (CLI exit 4) — never skip.
- DEVIATION FROM THE M1 DIRECTIVE, flagged for review: forbidden-composition is implemented now (not M2/PR-8). Forced by a conflict discovered in implementation: the v0.3 shadcn contract carries a UNIVERSAL forbidden-composition rule (rule.button-no-interactive-descendants) and spec §5.4 forbids skipping unimplemented types — a two-evaluator linter would exit 4 on every lint of the real contract. Fixture F5 activates with it (all five golden fixtures active).
- CLI `lint`: JSON report on stdout (golden-comparable), human rendering on stderr; exits 0 clean / 2 any S-gate error / 4 unknown rule type.
- fixtures/golden/violating/F1-F5 + clean golden + checked-in expected reports; core-boundary test now walks recursively (lint/ included), ajv allowed as the only non-node bare import.

Verify: npm test; npx tsx src/cli.ts lint --dspack fixtures/shadcn.v0_3.dspack.json --surface fixtures/golden/violating/F1-dialog-for-delete.dsurface.json (exit 2, stdout equals F1-dialog-for-delete.expected.json)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
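The exit-code contract described in this commit (0 clean, 2 on any S-gate error, 4 on an unknown rule type) can be sketched as below. Only the `requirement` and `level` fields and the `UnknownRuleTypeError` name come from the commit message; the remaining field names and the `lintExitCode` helper are hypothetical. The sketch also assumes that warn-level findings alone do not fail the run, which the commit message does not spell out.

```typescript
// Only `requirement` and `level` are taken from the commit message; the
// other fields are placeholders for this sketch.
interface Finding {
  gate: "S1" | "S2" | "S3";
  requirement: "must" | "should";
  level: "error" | "warn";
  message: string;
}

// Local stand-in for the error class named in the commit message.
class UnknownRuleTypeError extends Error {
  constructor(ruleType: string) {
    super(`unknown rule type: ${ruleType}`);
    this.name = "UnknownRuleTypeError";
    // Keeps `instanceof` working even when compiled down to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

type LintExitCode = 0 | 2 | 4;

// Maps one lint run to an exit code: 4 when an unknown rule type is met,
// 2 when any finding is error-level, 0 otherwise (assumption: warnings pass).
function lintExitCode(runLint: () => Finding[]): LintExitCode {
  try {
    const findings = runLint();
    return findings.some((f) => f.level === "error") ? 2 : 0;
  } catch (err) {
    if (err instanceof UnknownRuleTypeError) return 4;
    throw err; // anything else is not a lint outcome
  }
}
```

Treating the unknown rule type as an exception rather than a finding mirrors the "never skip" rule: the run cannot complete with a partial verdict.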

…9 review fix)

Semantic alignment with the now-normative spec v0.3 §5: S2 and rule resolution work by sub-component id alone, so duplicate ids across components fail loudly — naming the id and every declaring component — before any id-dependent check, instead of resolving by object iteration order. Same error shape as the dspack validate harness; covered by a new lint test. No golden outputs change (the shadcn contract has no duplicates).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
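A minimal sketch of the "fail loudly" check described above, assuming a simplified contract shape. `ComponentDecl`, both function names, and the error text are hypothetical; the real helper lives in src/core/contract.ts and its error shape follows the dspack validate harness, which is not reproduced here.

```typescript
// Hypothetical, simplified contract shape.
interface ComponentDecl {
  id: string;
  subComponents: { id: string }[];
}

// Returns each sub-component id declared more than once, mapped to every
// component that declares it (in declaration order).
function duplicateSubComponentIds(components: ComponentDecl[]): Map<string, string[]> {
  const declaredBy = new Map<string, string[]>();
  for (const component of components) {
    for (const sub of component.subComponents) {
      const owners = declaredBy.get(sub.id) ?? [];
      owners.push(component.id);
      declaredBy.set(sub.id, owners);
    }
  }
  const duplicates = new Map<string, string[]>();
  declaredBy.forEach((owners, id) => {
    if (owners.length > 1) duplicates.set(id, owners);
  });
  return duplicates;
}

// Throws before any id-dependent check can run, naming the id and every
// declaring component instead of silently picking one.
function assertUniqueSubComponentIds(components: ComponentDecl[]): void {
  const details: string[] = [];
  duplicateSubComponentIds(components).forEach((owners, id) => {
    details.push(`"${id}" declared by ${owners.join(", ")}`);
  });
  if (details.length > 0) {
    throw new Error(`duplicate sub-component ids: ${details.join("; ")}`);
  }
}
```

Running the assertion once, up front, removes any dependence on object iteration order in later lookups.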

… PR-5)

- src/adapters/: stateless GenerationAdapter interface per ADR-9 (one attempt per call; the repair loop owns conversation state). Model identity is configuration: constructors require an explicit model id, no default model name exists in code — enforced by a source-scan test.
- OllamaAdapter: /api/chat with format = the generation schema. Non-JSON output raises AdapterOutputError (the S0 spike found the mlx engine silently ignoring format; gates S1/S2 judge conformance over the artifact).
- AnthropicAdapter: official SDK, output_config.format json_schema (the generation schema is compatible by construction: depth-unrolled, closed objects). No sampling params sent (removed on current models). refusal and max_tokens stop reasons surface as typed errors — never silently retried.
- Offline deterministic tests via injected fetch fixtures: parsed results, schema round-trip verbatim into the request body, typed failures.
- scripts/smoke-ollama.ts (live, non-CI): one real generation through the compiled context + S1-S3 lint. First live runs recorded in the spike addendum: the 8B model passes S1/S2 but fails S3 on every attempt (nested-interactive violations) even with rule steering in the prompt — live confirmation that the guarantee is the linter, not the prompt.

Verify: npm test (44 tests, offline); npm run smoke:ollama -- --model <tag>

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
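The "no default model" rule pairs naturally with a strict model-reference parser. The sketch below assumes references look like `<provider>:<model>`, an inference from the `--model ollama:<tag>` flag used in this PR rather than the repository's confirmed grammar; `ModelRef` and `parseModelRef` are hypothetical names.

```typescript
// Hypothetical parsed form of a model reference.
interface ModelRef {
  provider: string;
  model: string;
}

// Splits on the first colon only, so model tags that themselves contain a
// colon (for example "qwen3:8b") survive intact. There is deliberately no
// fallback: a missing provider or model is an error, never a default.
function parseModelRef(ref: string): ModelRef {
  const separator = ref.indexOf(":");
  if (separator <= 0 || separator === ref.length - 1) {
    throw new Error(`model reference must look like <provider>:<model>, got "${ref}"`);
  }
  return {
    provider: ref.slice(0, separator),
    model: ref.slice(separator + 1),
  };
}
```

Under that assumed format, `parseModelRef("ollama:qwen3:8b")` yields provider `ollama` and model `qwen3:8b`, while a bare `ollama` is rejected.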

… PR-6)

- src/run/orchestrator.ts: generate → surface gates S1-S3 → bounded repair (default 2; system prompt immutable across attempts — the only delta is the model's own output + rendered repair feedback, snapshot per attempt) → emit via the pinned @aestheticfunction/dspack-to-a2ui git dependency → emitter gates A1-A3 (both A2UI versions) → audit report v1. Every outcome is a first-class artifact: passed / failed-lint-exhausted (exit 2) / failed-gate (exit 3) / failed-adapter (exit 1; added to the plan's enum — the S0 spike showed runtimes can fail to constrain, and that must be a reported outcome, never a silent retry).
- src/repair/render.ts (ADR-7): one findings object, two serializations — the repair message is rendered deterministically from the same findings embedded in the report, with linked examples verbatim as corrected references. Golden-file tested.
- src/audit/: report v1 + schemas/audit-report.v1.schema.json + markdown rendering + docs/AUDIT.md (additive-only guarantee, stable enums, reproducibility fields: contract sha256, schema sha256, adapter id, per-attempt model + provider meta).
- src/adapters/fake.ts: ScriptedAdapter — the deterministic instrument for CI, the demo's verification mode, and eval goldens.
- CLI `run` writes audit-report.json/.md + generated.surface.json; exit codes per the README table.

Live verification (recorded in out/ locally, not committed): qwen3:8b violates S3 on attempt 1 (5 findings), repairs to 1, exhausts honestly (failed-lint-exhausted); gpt-oss 20B passes S1-S3 + A1-A3 on attempt 1.

Verify: npm test (52 tests, offline/deterministic); npx tsx src/cli.ts run --dspack fixtures/shadcn.v0_3.dspack.json --intent destructive-action --prompt "a screen to delete my account" --model ollama:<tag> --out out

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
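The bounded repair loop at the front of this pipeline can be sketched as follows. This is a simplified, synchronous illustration, not the orchestrator's code: `runWithRepairs`, `RunResult`, and the callback signatures are hypothetical, real adapters are presumably asynchronous, and the emit step, the A1-A3 gates, and the failed-gate and failed-adapter outcomes are left out. Here "passed" therefore means only that the surface gates came back clean.

```typescript
type Outcome = "passed" | "failed-lint-exhausted";

interface RunResult {
  outcome: Outcome;
  attempts: number;
  transcript: string[]; // everything exchanged after the system prompt
}

function runWithRepairs(
  systemPrompt: string,
  generate: (systemPrompt: string, transcript: readonly string[]) => string,
  lint: (surface: string) => string[], // findings; an empty array means clean
  maxRepairs = 2,
): RunResult {
  const transcript: string[] = [];
  const maxAttempts = 1 + maxRepairs; // one initial attempt plus the repairs
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // The system prompt is passed unchanged on every attempt.
    const surface = generate(systemPrompt, transcript);
    const findings = lint(surface);
    if (findings.length === 0) {
      return { outcome: "passed", attempts: attempt, transcript };
    }
    // The only delta between attempts: the model's own output plus repair
    // feedback rendered from the findings (placeholder wording here).
    transcript.push(surface, `Repair these findings:\n${findings.join("\n")}`);
  }
  return { outcome: "failed-lint-exhausted", attempts: maxAttempts, transcript };
}
```

Because exhaustion is a returned outcome rather than an exception, a caller can still write a report for a run that never produced a clean surface.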

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…R-7)

- src/serve.ts: `dspack-gen serve` — localhost-only node:http endpoint (incidental plumbing, no framework). POST /run streams PipelineEvent NDJSON (start with applicable rule ids / attempt with S1-S3 gates + findings / repair message verbatim / emitted A1-A3 / done with the full audit report v1). fake:true runs the deterministic ScriptedAdapter (golden violating fixture F1 → the contract's worked example); live mode requires an explicit model reference — no default model in code.
- orchestrator: observational onEvent hook (the report stays the artifact).
- e2e/flagship.spec.ts (`npm run demo:e2e`): drives the demo app's Generate view against serve in fake mode and asserts the entire flagship trail — violation with verbatim rationale, exact repair message, clean attempt with rule.alertdialog-requires-cancel listed as VERIFIED, A1-A3 green for both A2UI versions, the AlertDialog rendering + opening with cancel-before-confirm (and closing on Cancel), and the downloaded audit report validating against schemas/audit-report.v1.schema.json. DEMO_DIR points at a dspack-to-a2ui checkout with the Generate view (CI: sibling checkout).

Verify: npm test; DEMO_DIR=<dspack-to-a2ui checkout> npm run demo:e2e

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
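A client reading the NDJSON stream has to cope with network chunks that end mid-line. The sketch below shows one way to buffer and split such a stream. The event names follow the commit message, but the `type` discriminator field and the `createNdjsonParser` helper are assumptions, not the serve endpoint's confirmed wire format.

```typescript
// Event names follow the commit message; the `type` field is an assumption.
interface PipelineEvent {
  type: "start" | "attempt" | "repair" | "emitted" | "done";
  [key: string]: unknown;
}

// Incremental NDJSON parser: feed it chunks as they arrive and it calls
// `onEvent` once per complete line, holding back any trailing partial line.
function createNdjsonParser(onEvent: (event: PipelineEvent) => void) {
  let buffered = "";
  const emit = (line: string): void => {
    if (line.trim() !== "") onEvent(JSON.parse(line) as PipelineEvent);
  };
  return {
    push(chunk: string): void {
      buffered += chunk;
      const lines = buffered.split("\n");
      buffered = lines.pop() ?? ""; // last piece may be an incomplete line
      lines.forEach((line) => emit(line));
    },
    end(): void {
      emit(buffered); // flush a final line that had no trailing newline
      buffered = "";
    },
  };
}
```

One JSON document per line keeps each event independently parseable, so a consumer can render the trail as it arrives instead of waiting for the final report.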

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ged main

Review fixes (Copilot on #5):

- CORS restricted to the demo dev-server origins (localhost/127.0.0.1:5173, reflected with vary: origin) with allow-methods on preflight — no more wildcard readable by arbitrary sites while the server runs.
- Request bodies capped at 64KB (413 + destroy) — no unbounded buffering.
- Fake mode selects the worked example BY INTENT and fails fast (400) when the contract has none — never scripts `undefined`.
- onEvent is observational by enforcement: a throwing hook (e.g. stream write after client disconnect) is swallowed and can never change a run's outcome. Covered by a pipeline test.
- --port validated (integer 1-65535) with a clear usage error.
- New serve.test.ts: CORS allow/deny, 413, fake-no-example 400, full NDJSON event sequence (59 tests total).

Also bumps the pinned @aestheticfunction/dspack-to-a2ui to b47a2cf (merged main incl. the #6 emitter review fixes and #7 Generate-view landing) — the single pin bump confirming the dep matches merged main, as directed.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
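Two of the fixes above, the origin allowlist and the port check, are small enough to sketch. The helper names are hypothetical, and the `http://` scheme on the allowed origins is an assumption (the commit message gives only host and port).

```typescript
// Assumed origins: the commit message names localhost/127.0.0.1:5173 but
// not the scheme.
const ALLOWED_ORIGINS = new Set(["http://localhost:5173", "http://127.0.0.1:5173"]);

// Reflects an allowed origin back (never "*") and marks the response as
// varying by origin; unknown or missing origins get no CORS headers at all.
function corsHeaders(origin: string | undefined): Record<string, string> {
  if (origin === undefined || !ALLOWED_ORIGINS.has(origin)) return {};
  return { "access-control-allow-origin": origin, vary: "origin" };
}

// Accepts only a plain decimal integer in 1-65535; anything else is a
// usage error rather than a value passed on to the server.
function parsePort(raw: string): number {
  if (!/^\d+$/.test(raw)) {
    throw new Error(`--port must be an integer between 1 and 65535, got "${raw}"`);
  }
  const port = Number(raw);
  if (port < 1 || port > 65535) {
    throw new Error(`--port must be an integer between 1 and 65535, got "${raw}"`);
  }
  return port;
}
```

Reflecting the origin instead of sending a wildcard means a page on any other site cannot read responses from the local server while it runs.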

Copilot

Pull request overview

This PR mechanically lands the full “M1 stack” onto main (previously merged into stacked base branches), bringing in the surface-gates linter (S1–S3), generation adapters, the pipeline orchestrator + audit report artifact, the local serve endpoint, and the Playwright flagship demo gate.

Changes:

- Adds surface gates S1–S3 (schema + vocabulary + rule evaluators) with golden fixtures and a dspack-gen lint CLI.
- Introduces generation adapters (Ollama/Anthropic + deterministic scripted adapter) and the end-to-end pipeline orchestrator that produces audit report v1.
- Adds dspack-gen serve plus a Playwright e2e gate and workflow wiring for CI.

Reviewed changes

Copilot reviewed 44 out of 46 changed files in this pull request and generated 6 comments.


| File | Description |
| --- | --- |
| vitest.config.ts | Configures Vitest to run unit/gate tests under `src/**`. |
| src/serve.ts | Adds localhost-only NDJSON streaming pipeline endpoint with basic hardening. |
| src/serve.test.ts | Tests CORS restrictions, body size cap, fake-mode behavior, and NDJSON sequence. |
| src/run/pipeline.test.ts | Deterministic acceptance tests for orchestrator success + failure paths and report validity. |
| src/run/orchestrator.ts | Implements generate→lint→repair→emit→validate pipeline and report construction. |
| src/repair/render.ts | Renders deterministic repair feedback from lint findings (ADR-7). |
| src/index.ts | Expands public root exports to include adapters/orchestrator/audit/repair. |
| src/core/surface-schema.ts | Vendors the surface v0.1 JSON schema used by gate S1. |
| src/core/lint/walk.ts | Adds surface tree traversal helpers shared by S2/S3. |
| src/core/lint/vocabulary.ts | Implements gate S2 (contract vocabulary validation). |
| src/core/lint/rules.ts | Implements gate S3 rule registry + evaluators and UnknownRuleTypeError. |
| src/core/lint/lint.test.ts | Adds golden-based acceptance tests for S1–S3 behavior and independence. |
| src/core/lint/index.ts | Wires S1–S3 linting with AJV and summarization. |
| src/core/lint/findings.ts | Defines Finding/LintReport shapes and deterministic text rendering. |
| src/core/index.ts | Exposes linting APIs from the `./core` subpath. |
| src/core/core-boundary.test.ts | Updates core-boundary scan to include recursive module traversal + allowed imports. |
| src/core/contract.ts | Adds duplicate sub-component id detection helper used by S2. |
| src/cli.ts | Adds `lint`, `run`, and `serve` commands; writes audit artifacts for `run`. |
| src/audit/report.ts | Adds audit report v1 types, hashing utilities, and markdown rendering. |
| src/adapters/types.ts | Defines adapter interface, typed adapter errors, and model-ref parsing. |
| src/adapters/ollama.ts | Implements Ollama adapter using structured outputs (`format`). |
| src/adapters/index.ts | Exports adapters and provides `adapterFor(modelRef)` factory. |
| src/adapters/fake.ts | Adds deterministic scripted adapter for tests/demo/e2e. |
| src/adapters/anthropic.ts | Implements Anthropic adapter using SDK structured outputs. |
| src/adapters/adapters.test.ts | Offline deterministic adapter tests with injected fetch fixtures. |
| scripts/smoke-ollama.ts | Adds non-CI live Ollama smoke script (generation + S1–S3). |
| schemas/audit-report.v1.schema.json | Introduces JSON schema for audit report v1 artifact. |
| playwright.config.ts | Adds Playwright config to run demo e2e against `serve` and sibling demo repo. |
| package.json | Adds pinned emitter dependency, Anthropic SDK, and Playwright scripts/deps. |
| package-lock.json | Locks new dependencies including git-pinned emitter and Playwright. |
| fixtures/golden/violating/F5-nested-interactive.expected.json | Adds golden expected report for F5 violating fixture. |
| fixtures/golden/violating/F5-nested-interactive.dsurface.json | Adds F5 violating surface fixture. |
| fixtures/golden/violating/F4-missing-title.expected.json | Adds golden expected report for F4 violating fixture. |
| fixtures/golden/violating/F4-missing-title.dsurface.json | Adds F4 violating surface fixture. |
| fixtures/golden/violating/F3-missing-cancel.expected.json | Adds golden expected report for F3 violating fixture. |
| fixtures/golden/violating/F3-missing-cancel.dsurface.json | Adds F3 violating surface fixture. |
| fixtures/golden/violating/F2-no-confirmation.expected.json | Adds golden expected report for F2 violating fixture. |
| fixtures/golden/violating/F2-no-confirmation.dsurface.json | Adds F2 violating surface fixture. |
| fixtures/golden/violating/F1-dialog-for-delete.expected.json | Adds golden expected report for F1 violating fixture. |
| fixtures/golden/violating/F1-dialog-for-delete.dsurface.json | Adds F1 violating surface fixture. |
| fixtures/golden/repair/F1.repair.txt | Adds golden repair-message text for F1. |
| fixtures/golden/clean/delete-account.dsurface.json | Adds clean golden surface fixture (worked example). |
| e2e/flagship.spec.ts | Adds Playwright flagship end-to-end test validating full UI trail + report schema. |
| docs/AUDIT.md | Documents audit report versioning and stability guarantees. |
| .gitignore | Updates ignores for out dirs and Playwright artifacts. |
| .github/workflows/test.yml | Adds CLI lint gate and a Playwright demo-e2e job that checks out sibling demo repo. |


…s, spec-conformant F5 location

Review fixes (Copilot on #8):

- serve: body cap now counts BYTES (Buffer chunk lengths), not UTF-16 code units; `fake`/`noSteering` are strict booleans (a JSON "false" string no longer flips behavior or mis-records generation.ruleSteering); maxRepairs validated as a non-negative integer (400).
- CLI: --max-repairs validated (NaN/negative previously reached the orchestrator and could produce a zero-attempt report).
- core-boundary test splits on both path separators (cross-platform).
- forbidden-composition: the SPEC (v0.3 §5.3, merged) locates the finding at the offending descendant — the evaluator did the opposite of its own comment and the spec. Now spec-conformant: located at the offender, the message names the matching origin node. F5 golden regenerated; the historical audit-report evidence in docs/evidence/ is untouched (it documents runs under the prior shape, as recorded).

59 tests + flagship e2e green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
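The gap between counting bytes and counting UTF-16 code units is easy to see with non-ASCII input. The sketch below illustrates that fix along with the strict-boolean and maxRepairs checks. All helper names are hypothetical, `Uint8Array` stands in for Node's `Buffer` chunks, and throwing on a non-boolean is an assumption about how strictness is enforced.

```typescript
const MAX_BODY_BYTES = 64 * 1024;

// Sums raw chunk sizes, which is what a byte cap should measure. Stops as
// soon as the cap is exceeded so nothing further needs to be buffered.
function exceedsBodyCap(chunks: Uint8Array[], capBytes = MAX_BODY_BYTES): boolean {
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.byteLength;
    if (total > capBytes) return true;
  }
  return false;
}

// Strict boolean: the JSON string "false" is rejected instead of being
// treated as truthy. Throwing here is this sketch's choice.
function readStrictBoolean(value: unknown, fallback: boolean): boolean {
  if (value === undefined) return fallback;
  if (typeof value !== "boolean") throw new Error("expected a JSON boolean");
  return value;
}

// Non-negative integer check; rejects NaN, negatives, fractions, strings.
function isValidMaxRepairs(value: unknown): boolean {
  return typeof value === "number" && Number.isInteger(value) && value >= 0;
}

// "é" is one UTF-16 code unit but two UTF-8 bytes, so this body slips under
// a 64KB cap measured with `.length` and exceeds it when measured in bytes.
const body = "é".repeat(40_000);
const bodyBytes = new TextEncoder().encode(body);
const passesCodeUnitCheck = body.length <= MAX_BODY_BYTES;
const failsByteCheck = exceedsBodyCap([bodyBytes]);
```

In this example `body.length` is 40,000 while the encoded body is 80,000 bytes, so only the byte-based check rejects it.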

ryandmonk and others added 8 commits July 2, 2026 15:20

chore: bump pinned dspack-to-a2ui to e68a17e (surface root id fix)

748b142

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

chore: demo-e2e checks out dspack-to-a2ui@main (post #4/#5 merge)

814d425

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings July 2, 2026 21:34

Copilot started reviewing on behalf of ryandmonk July 2, 2026 21:34

Copilot AI reviewed Jul 2, 2026


Comment thread src/serve.ts

Comment thread src/cli.ts

Comment thread src/core/core-boundary.test.ts

Comment thread src/core/lint/rules.ts

Comment thread src/serve.ts Outdated

Comment thread src/serve.ts

ryandmonk merged commit 691566a into main Jul 2, 2026
2 checks passed

ryandmonk deleted the feat/demo-e2e branch July 5, 2026 02:34

ryandmonk merged 9 commits into main from feat/demo-e2e

2 participants
