From 7e7c950fc1aea21782f5663637b6f1dce7bc36be Mon Sep 17 00:00:00 2001
From: anandgupta42 <anand@altimate.ai>
Date: Tue, 30 Jun 2026 14:21:14 -0700
Subject: [PATCH 1/2] feat: add `demo-creator` skill

Add `.claude/skills/demo-creator/`, a skill that turns a one-line problem
statement into a REAL recorded terminal demo plus a blog showing how to do it
with `altimate-code` and why that's worth it.

Highlights:
- Two modes: `showcase` (default, `altimate-code` alone) and optional
  `comparison` (vs `claude-code`); a baseline is not required.
- Two integrity laws: recordings reconcile with the session trace/`--format json`
  (no faking); value is shown, not asserted.
- Real-run recording via `asciinema` + `agg` (`record.sh`), with `vhs` as a
  scripted fallback; frame extraction + authenticity cross-check (`inspect.sh`).
- `extract_learnings.sh` mines a session into `learnings.md` (commands run,
  value-moment evidence, reusable lessons) feeding the blog and a lesson ledger.
- `build_artifact.sh` inlines clips as base64 and renders headless so the page
  can be visually inspected before publishing as a Claude artifact.
- Reuses `blog-writer`, `viral-tech-blog`, `humanizer`; `quality-bar.md` gate
  requires rendering and inspecting the published page, not just the clips.

Validated end-to-end by dogfooding it on a PySpark SQL-port demo (artifacts kept
out of this PR).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .claude/skills/demo-creator/SKILL.md          | 235 ++++++++++++++++++
 .../demo-creator/references/quality-bar.md    |  68 +++++
 .../demo-creator/references/recording.md      |  73 ++++++
 .../demo-creator/references/value-wedge.md    |  65 +++++
 .../scripts/baseline_vs_altimate.sh           |  80 ++++++
 .../demo-creator/scripts/build_artifact.sh    |  67 +++++
 .../demo-creator/scripts/extract_learnings.sh |  96 +++++++
 .../skills/demo-creator/scripts/inspect.sh    |  43 ++++
 .claude/skills/demo-creator/scripts/lib.sh    |  20 ++
 .claude/skills/demo-creator/scripts/record.sh |  56 +++++
 .../scripts/setup_pyspark_fixture.sh          |  62 +++++
 .../demo-creator/templates/blog.md.tmpl       |  52 ++++
 .../demo-creator/templates/demo.tape.tmpl     |  25 ++
 .../demo-creator/templates/replicate.md.tmpl  |  36 +++
 14 files changed, 978 insertions(+)
 create mode 100644 .claude/skills/demo-creator/SKILL.md
 create mode 100644 .claude/skills/demo-creator/references/quality-bar.md
 create mode 100644 .claude/skills/demo-creator/references/recording.md
 create mode 100644 .claude/skills/demo-creator/references/value-wedge.md
 create mode 100755 .claude/skills/demo-creator/scripts/baseline_vs_altimate.sh
 create mode 100755 .claude/skills/demo-creator/scripts/build_artifact.sh
 create mode 100755 .claude/skills/demo-creator/scripts/extract_learnings.sh
 create mode 100755 .claude/skills/demo-creator/scripts/inspect.sh
 create mode 100755 .claude/skills/demo-creator/scripts/lib.sh
 create mode 100755 .claude/skills/demo-creator/scripts/record.sh
 create mode 100755 .claude/skills/demo-creator/scripts/setup_pyspark_fixture.sh
 create mode 100644 .claude/skills/demo-creator/templates/blog.md.tmpl
 create mode 100644 .claude/skills/demo-creator/templates/demo.tape.tmpl
 create mode 100644 .claude/skills/demo-creator/templates/replicate.md.tmpl

diff --git a/.claude/skills/demo-creator/SKILL.md b/.claude/skills/demo-creator/SKILL.md
new file mode 100644
index 000000000..c4898f3e4
--- /dev/null
+++ b/.claude/skills/demo-creator/SKILL.md
@@ -0,0 +1,235 @@
+---
+name: demo-creator
+description: |
+  Turn a one-line problem statement into a REAL recorded terminal demo (a series of
+  small clips) plus a blog post showing HOW to do it with altimate-code and WHY that's
+  worth it. Use when the user asks to "make a demo", "record a demo", "create a demo +
+  blog", or "show off capability X" for altimate-code. Pipeline: deep research the real
+  pain -> critique -> design -> record a genuine `altimate-code run` -> extract what the
+  session teaches -> visually inspect the frames -> write the blog -> critique loop until
+  it clears the bar. Runs in showcase mode (altimate-code alone) by default, or comparison
+  mode (vs claude-code) when a head-to-head sharpens the point. Always ships "replicate it
+  yourself" steps.
+license: MIT
+compatibility: claude-code opencode
+allowed-tools:
+  - Read
+  - Write
+  - Edit
+  - Glob
+  - Grep
+  - Bash
+  - WebFetch
+  - WebSearch
+  - AskUserQuestion
+  - Skill
+  - Agent
+  - Task
+---
+
+# Demo Creator: real demos that prove why altimate-code is better
+
+You produce **demos people believe** and **blogs that earn the click**. A demo is worth
+nothing if it's staged, and a blog is worth nothing if it lists features instead of
+showing a win. This skill forces both to be real and both to be earned.
+
+The output is usually **a series of small, tight clips** — one value-moment each — not one
+long recording. Small clips are easier to record cleanly, easier to embed, and easier for
+a reader to actually watch.
+
+## The two laws (do not break these)
+
+1. **Recordings are of real runs.** You record an actual `altimate-code run`, never typed-up
+   fake output. After recording, verify what the clip shows against the session trace
+   (`altimate-code trace view <id>`) and/or the `--format json` event stream. If the clip
+   shows something the run didn't do, the clip is invalid — re-record.
+2. **Value is shown, not asserted.** The "why use altimate-code" must come from what the
+   recording actually shows the tool doing — never from adjectives. Don't narrate a benefit
+   the clip doesn't demonstrate. If, when you watch the real run, altimate-code didn't
+   actually do the valuable thing, you have no demo yet: change the task, or report it
+   honestly. We are selling trust; a staged or overclaimed demo destroys it.
+
+If you ever feel pressure to fudge either law to "make the demo look good," stop and tell
+the user the honest result instead.
+
+## Two modes — baseline is OPTIONAL
+
+Pick per angle (you can mix across a series):
+
+- **Showcase mode (default).** Just show how altimate-code does the task, and let the
+  *specific thing it does* — runs the SQL and diffs it, runs `check`, traces lineage,
+  proves equivalence — be the reason to use it. No second agent. This is the right mode
+  when the capability is self-evidently valuable on its own (e.g. "it verified the output,"
+  "it caught the defect"). The viewer sees the value directly; you don't need a foil.
+- **Comparison mode (optional).** Add a **claude-code** (`claude -p`) control on the same
+  prompt/fixture/model when a head-to-head genuinely sharpens the point — i.e. the value is
+  "altimate-code does X that a general agent skips." Only use it when altimate-code actually
+  comes out ahead on the real runs; if it ties or loses, drop the comparison (you may still
+  ship the showcase) and say so. An in-fork capability-disabled control is a fallback — see
+  value-wedge.md.
+
+Default to showcase. Reach for comparison only when the contrast is the point and it's real.
+
+## When to use / not use
+
+**Use it** when the user wants a demo and/or blog that shows an altimate-code capability on
+a real use case.
+
+**Don't use it** for: pure docs/READMEs, slide decks, or marketing copy with no running
+software behind it. Those don't need a recorded real run.
+
+## Inputs
+
+- A **problem statement** (e.g. *"a good demo on building spark pipelines using
+  altimate-code and pyspark"*). If missing, ask for it.
+- Optional: target audience (practitioner vs exec), where artifacts should go, whether one
+  clip or a series.
+
+## Output layout
+
+Create `demos/<topic-slug>/` at the repo root:
+
+```
+demos/<topic-slug>/
+  research.md          # phase 1: real pains + verbatim quotes + sources
+  demo-spec.md         # phase 3: angles, fixtures, prompts, baseline controls
+  fixture/             # phase 4: the real use-case project the agent operates on
+  clips/
+    <angle>.cast       # asciinema recording of the REAL run
+    <angle>.gif        # GIF rendered from that exact cast (agg)
+    <angle>.baseline.cast / .gif   # ONLY in comparison mode
+    frames/<angle>/*.png           # extracted frames for visual inspection
+  runs/<angle>.json    # --format json event stream (authenticity proof)
+  learnings.md         # phase 6.5: what each session actually did + taught (mined from the trace)
+  blog.md              # phase 7: the post, clips embedded
+  demo.html            # phase 7.5: self-contained page published as a Claude artifact
+  REPLICATE.md         # exact steps for a reader to reproduce it
+  MANIFEST.md          # phase 9: index of everything + trace ids
+```
+
+The blog markdown is ALSO copied to the Obsidian vault per the user's global rule:
+`/Users/anandgupta/obsidian/altimate/Research/<topic>/<Title>.md`.
+
+## Workflow
+
+Track these as TaskCreate items so progress is visible. Each phase has a gate — don't
+advance until it passes.
+
+### Phase 0 — Frame
+- Slugify the topic. Decide single clip vs **series of 2–4** (default: series).
+- Create the output dir. Write down the *claim* you intend to prove in one sentence.
+
+### Phase 1 — Deep research (grounded in real pain)
+- Use the `deep-research` skill or `parallel:parallel-web-search` (fallback: WebSearch +
+  WebFetch). Mine Reddit (r/dataengineering etc.), Hacker News, Stack Overflow, eng blogs.
+- For each pain capture: a label, **1–2 verbatim quotes (<25 words) with source URLs**,
+  a rough commonness signal, and a concrete scenario.
+- Write `research.md`. **Gate:** at least one pain with a real, traceable quote.
+- For a long/independent search, spawn a background `Agent` so you can author in parallel.
+
+### Phase 2 — Critique the research (honesty gate)
+- For each pain, name the **specific** altimate-code capability that addresses it (a real
+  tool/skill — e.g. `sql_analyze`, `altimate_core_rewrite` w/ `verify_equivalence`,
+  dbt lineage/cost skills, `check`, warehouse tools). Read `references/value-wedge.md`.
+- Drop any pain where we don't genuinely win, or where the help is really infra/ops, not a
+  terminal coding agent. Keep the **2–4 strongest "we win here"** angles.
+- **Gate:** if nothing genuinely wins, STOP and report that honestly. Don't fabricate.
+
+### Phase 3 — Design the demo(s)
+- For each angle write, in `demo-spec.md`: the real fixture, the exact prompt to
+  `altimate-code run`, the **value moment** (the single frame that makes the point), and the
+  **mode** — showcase (default) or comparison. Only choose comparison if a `claude -p`
+  control genuinely sharpens the point and altimate-code is expected to win; see
+  value-wedge.md.
+- Keep each clip scoped to ONE moment. If it needs a paragraph to explain, split it.
+
+### Phase 4 — Build the real fixture
+- A small but genuine project with a real bug/gap/decision the agent resolves. Not a toy
+  with the answer pre-written. For PySpark, `scripts/setup_pyspark_fixture.sh` bootstraps a
+  `local[*]` env (real execution, no cluster).
+- **Gate:** the fixture runs and actually exhibits the problem before the agent touches it.
+
+### Phase 5 — Record (real runs)
+- **Showcase mode:** `scripts/record.sh <topic> <angle> --cwd <fixture> --banner '<prompt>'
+  -- altimate-code run --yolo --trace '<prompt>'` records the real run with **asciinema** →
+  `.cast`, then renders to `.gif` with **agg**. Capture the json too:
+  `altimate-code run --format json … > runs/<angle>.json` (authenticity proof). `vhs` is the
+  optional scripted-polish fallback (see `references/recording.md`).
+- **Comparison mode (optional):** also run `scripts/baseline_vs_altimate.sh <topic> <angle>
+  --cwd <fixture> --prompt '...' --reset-from <pristine>` to capture the `claude -p` control
+  on the same prompt/model, and record its clip too.
+- **Gate:** `.cast` + `.gif` + `runs/<angle>.json` exist for the altimate run (plus the
+  baseline variant in comparison mode).
+
+### Phase 6 — Visual inspection (use your eyes)
+- `scripts/inspect.sh <topic> <angle>` extracts key PNG frames. **Read them** and confirm:
+  legible at embed size; the value moment is visible; no stack-trace garbage or dead air;
+  timing is watchable. In comparison mode, also confirm the baseline genuinely struggles
+  where altimate wins.
+- Cross-check the frames against `runs/<angle>.json` / `trace view` — nothing shown may be
+  absent from the real run. **Gate:** if any check fails, fix the fixture/tape and
+  re-record. Loop here, don't paper over it in the blog.
+- **If you also build an HTML demo page or publish an artifact, RENDER IT and look** —
+  `chrome --headless=new --screenshot --window-size=W,H file://…`, then Read the PNG.
+  Inspecting the clips is not inspecting the page; layout bugs (empty grid cells, GIFs that
+  are mostly empty terminal) only show in the rendered page. See `quality-bar.md`.
+
+### Phase 6.5 — Learn from the session
+- Run `scripts/extract_learnings.sh <topic> <angle>` to mine the recorded session
+  (`runs/<angle>.json` + the trace + the run log) into `learnings.md`: the tool/command
+  sequence altimate-code actually ran, the value moment(s), anything surprising (a defect it
+  caught, a bug it hit), and a one-line "why this matters." This is sourced from the REAL
+  run, so it's both demo material AND a check on law #2 — if the "value moment" isn't in the
+  extracted sequence, you don't have it.
+- Two outputs from the learnings:
+  1. **Into the blog** (phase 7): the "what it actually did" beat comes straight from here,
+     not from your memory of the run.
+  2. **Back into the skill / memory:** if the session taught something reusable — a product
+     bug, an env gotcha, a prompt that reliably triggers the value behavior, a fixture
+     pattern that works — append it to `MANIFEST.md`'s ledger and, if it should persist
+     across sessions, save it to memory. Each demo run makes the next one sharper.
+
+### Phase 7 — Write the blog
+- Invoke `blog-writer` (structure/voice) and `viral-tech-blog` (hook/title/shareability).
+  Use `templates/blog.md.tmpl` as the skeleton: open with the researched pain + a real
+  quote, then **show what altimate-code did** (sourced from `learnings.md`) with the clip,
+  and name the one capability that made it valuable. In comparison mode, add the
+  baseline → after contrast; in showcase mode, the value stands on its own. Close with
+  **Replicate it yourself** (exact commands — also write `REPLICATE.md`).
+- Embed clips (GIF inline; link the `.cast` for copy-paste). Save `blog.md`, then copy to
+  the Obsidian vault path above.
+
+### Phase 7.5 — Publish as a Claude artifact
+- **Load the `artifact-design` skill first** (required before authoring the page), and design
+  a real visual identity — don't ship the generic dark-terminal / mono-headline default.
+- Author the page as an HTML template (the blog content + the clips as
+  `<img src="clips/<angle>.gif">`). Then make it self-contained:
+  `scripts/build_artifact.sh <topic> <template.html> --render` inlines every clip as a base64
+  data URI (the artifact CSP blocks external hosts) → writes `demo.html` and renders a
+  screenshot.
+- **RENDER-INSPECT before publishing** (this is mandatory, not optional — see the artifact
+  checklist in `quality-bar.md`): Read the rendered PNG and confirm no empty grid cells, no
+  mostly-empty GIFs, no horizontal scroll, legible type. Fix and rebuild until clean.
+- Publish `demo.html` with the **Artifact** tool. On later edits, redeploy to the **same**
+  file path so it keeps the same URL; keep the favicon stable across redeploys. Record the
+  artifact URL in MANIFEST.md.
+
+### Phase 8 — Critique loop
+- Run the `humanizer` skill (kill AI-slop) and self-critique against
+  `references/quality-bar.md`. Optionally run `consensus:review` on the draft.
+- Loop back to the phase that failed (re-research, re-record, rewrite) until the bar is met
+  or the budget is spent. **State residual weaknesses honestly** in MANIFEST.md.
+
+### Phase 9 — Deliver
+- Write `MANIFEST.md`: every clip + cast + frame dir, the blog (local + vault paths),
+  REPLICATE.md, research sources, and the **trace ids** for each recorded run (so anyone
+  can re-verify authenticity). Summarize what was proven and what wasn't.
+
+## References (read when you reach the relevant phase)
+- `references/value-wedge.md` — finding the real "why us" and designing an honest baseline.
+- `references/recording.md` — asciinema/agg/vhs mechanics, Wait-for-completion, frames.
+- `references/quality-bar.md` — the gate rubric for the demo and the blog.
+
+## Reused skills (invoke, don't reimplement)
+`deep-research` / `parallel:parallel-web-search` (phase 1), `blog-writer` + `viral-tech-blog`
+(phase 7), `humanizer` + `consensus:review` (phase 8).
diff --git a/.claude/skills/demo-creator/references/quality-bar.md b/.claude/skills/demo-creator/references/quality-bar.md
new file mode 100644
index 000000000..ef85d2c2c
--- /dev/null
+++ b/.claude/skills/demo-creator/references/quality-bar.md
@@ -0,0 +1,68 @@
+# Quality bar — the gate for demo + blog
+
+Score each item. If any **must-pass** fails, loop back and fix; do not ship.
+
+## The demo (clip)
+
+Must-pass (both modes):
+- [ ] **Real.** The clip reconciles with the session trace / `--format json`. Nothing shown
+      is absent from the real run.
+- [ ] **Value is shown.** altimate-code visibly *does* the valuable thing on camera (runs the
+      diff, flags the defect, proves equivalence) — not narrated, not merely claimed. It's in
+      `learnings.md`'s value-moment evidence, or you don't have it.
+- [ ] **Legible.** Text is readable at the size it'll embed; no clipped lines.
+- [ ] **One moment.** A single freezable frame makes the point.
+
+Must-pass (comparison mode only):
+- [ ] **Fair baseline.** claude-code (`claude -p`), same model + prompt + fixture; the only
+      difference is altimate-code itself. Baseline failure is the *natural* result, not a
+      strawman (no sabotaged prompt).
+- [ ] **altimate actually comes out ahead.** Visible, not narrated. If it ties or loses, drop
+      the comparison and ship showcase-only (or report it) — see value-wedge.md.
+
+Should-pass:
+- [ ] ≤ ~45s of watchable content; dead air trimmed (idle compression only, no faking).
+- [ ] The value moment lands in the first half of the clip.
+
+## The blog
+
+Must-pass (anti-slop — the `humanizer` skill enforces most of these):
+- [ ] Opens with a **real pain + a real quote** (sourced), not "In today's data landscape…".
+- [ ] **Shows, doesn't claim.** Showcase: the clip shows altimate-code doing the valuable
+      thing. Comparison: the baseline → after contrast is in the clip. Either way, demonstrated.
+- [ ] Names the **specific capability** that made the difference (not "AI magic").
+- [ ] Has a **Replicate it yourself** section a reader could actually follow to the same
+      result (exact commands; matches REPLICATE.md / the fixture scripts).
+- [ ] **Honest about limits.** Says where it wouldn't help / what the baseline got right.
+- [ ] No feature-list dump, no rule-of-three filler, no "delve/leverage/seamless", minimal
+      em-dashes, no fabricated metrics.
+
+Should-pass (the `viral-tech-blog` layer):
+- [ ] A title that states the surprising result, not the topic.
+- [ ] One quotable stat or coined phrase a reader could repeat.
+- [ ] A first line that hooks (the surprising claim or the pain), not a throat-clear.
+- [ ] An ending that invites discussion (a question, a tradeoff, a "where this breaks").
+
+## The rendered page / HTML artifact (if you build one)
+Inspecting the clips is NOT inspecting the page. If you produce an HTML demo page or
+publish an artifact, you MUST render the actual page and look at it:
+- [ ] Render it (`chrome --headless=new --screenshot --window-size=W,H file://…`) and Read
+      the PNG. Don't trust that the markup "looks right" in source.
+- [ ] No empty/dead layout cells — a common bug is an N-column grid with content in only
+      one cell (check every `grid-template-columns` has a child per column).
+- [ ] Embedded GIFs aren't mostly empty space. A static screenshot shows GIF frame 1
+      (often a near-empty terminal); that's an artifact of the capture, but ALSO re-render
+      the GIF at a row count that matches its real content so the live clip isn't 80% void
+      (e.g. `claude -p` print-mode output is tiny — crop it with `agg --rows`).
+- [ ] Page body never scrolls sideways; wide blocks have their own `overflow-x:auto`.
+- [ ] Self-contained: assets embedded as data URIs (artifact CSP blocks external hosts).
+
+## Series-level (when shipping multiple clips)
+- [ ] Each clip is independent and proves a distinct capability.
+- [ ] Together they tell one coherent "this is why altimate-code" story.
+- [ ] No clip repeats another's value moment.
+
+## Honesty ledger (always write into MANIFEST.md)
+- What was proven, with which trace ids.
+- What was NOT proven / where altimate tied or lost.
+- Any idle-trimming or edits applied to clips (and confirmation no output was fabricated).
diff --git a/.claude/skills/demo-creator/references/recording.md b/.claude/skills/demo-creator/references/recording.md
new file mode 100644
index 000000000..c5ac11354
--- /dev/null
+++ b/.claude/skills/demo-creator/references/recording.md
@@ -0,0 +1,73 @@
+# Recording mechanics: asciinema + agg (primary), vhs (fallback)
+
+## Why this stack
+
+- **asciinema** records the **real pty session** — including when you run it inside iTerm.
+  Output is a `.cast` (JSON-lines of timing + bytes): copy-pasteable, replayable, and a
+  faithful record of what actually happened. This is the authenticity anchor.
+- **agg** renders a `.cast` to an animated `.gif`. Crucially it renders the **captured real
+  run**, so the GIF is the real session — not a re-execution.
+- **vhs** is a *scripted* recorder: it re-runs the command in its own synthetic terminal.
+  Use it only when you want a perfectly clean, deterministic "polished" version and you've
+  already proven the real behavior with an asciinema cast. Never let vhs output be the only
+  evidence a thing worked.
+
+Install once: `brew install asciinema agg` (vhs already present: `brew install vhs`).
+
+## Primary path — record a real run
+
+`scripts/record.sh` wraps this. The essence:
+
+```bash
+# Record the real command. asciinema runs it and captures everything.
+asciinema rec --overwrite --command "<the real command>" clips/<angle>.cast
+# Render that exact cast to a GIF.
+agg --cols 100 --rows 30 --font-size 22 clips/<angle>.cast clips/<angle>.gif
+```
+
+Keep the terminal small (≈100×30) so the GIF is legible when embedded. Set a readable
+font size. For long agent runs, prefer non-interactive `altimate-code run --yolo` so the
+session ends on its own (asciinema stops when the command exits).
+
+### Handling long / nondeterministic agent runs
+- Use `altimate-code run --yolo --max-turns N` so it can't run away.
+- Trim dead air after the fact if needed: edit the `.cast` (it's just timed events) or use
+  `asciinema` quantize-style trimming; agg has `--idle-time-limit` to cap long pauses:
+  `agg --idle-time-limit 2 ...` collapses gaps >2s. This keeps clips watchable without
+  faking anything — it only compresses waiting, not output.
+
+## Fallback path — vhs (scripted polish)
+
+A `.tape` drives a synthetic terminal. Useful patterns (vhs ≥0.11 supports these):
+
+```
+Require altimate-code
+Output clips/<angle>.gif
+Set FontSize 22
+Set Width 1000
+Set Height 600
+Type "altimate-code run --yolo 'PROMPT'"  Enter
+Wait@180s /Done|✔|session ended/        # wait for a completion marker, not a fixed sleep
+Screenshot clips/frames/<angle>/value-moment.png
+Sleep 2s
+```
+
+`Wait@<timeout> /regex/` blocks until output matches — essential because agent runs vary in
+length. `Screenshot` grabs a frame at that instant (no ffmpeg needed). Because vhs re-runs
+the command, treat its GIF as illustrative and keep the asciinema cast as the real record.
+
+## Extracting frames for visual inspection (`scripts/inspect.sh`)
+
+From a GIF:
+```bash
+ffmpeg -i clips/<angle>.gif -vf "fps=1" clips/frames/<angle>/f_%03d.png   # 1 frame/sec
+```
+Then `Read` the PNGs to verify legibility and the value moment. Also dump a few from the
+baseline GIF so you can eyeball the contrast side by side.
+
+## Authenticity cross-check
+After recording, reconcile the clip with the real run:
+- `altimate-code trace list` → find the session; `altimate-code trace view <id>`.
+- Or read `runs/<angle>.json` (the `--format json` event stream).
+The tools the clip appears to call, and the result it shows, must be present there. If the
+GIF shows a verified-equivalence badge, the json/trace must contain that verification event.
diff --git a/.claude/skills/demo-creator/references/value-wedge.md b/.claude/skills/demo-creator/references/value-wedge.md
new file mode 100644
index 000000000..963d582fe
--- /dev/null
+++ b/.claude/skills/demo-creator/references/value-wedge.md
@@ -0,0 +1,65 @@
+# Finding the wedge (and an optional, honest baseline)
+
+The whole demo hinges on one question: **what does altimate-code do here that's worth it —
+and can we make a viewer SEE that in 20 seconds?** Sometimes the clearest way to show it is
+altimate-code alone (showcase); sometimes it's a head-to-head with a generic agent
+(comparison). Decide per angle; default to showcase.
+
+## Step 1 — Map pain → a specific capability
+
+A vague "it's an AI agent for data" is not a wedge. Tie each researched pain to a concrete,
+named altimate-code capability that a generic agent lacks or does worse. Examples of real
+capabilities to reach for (verify the tool/skill exists before you claim it):
+
+- **SQL/transform equivalence** — `altimate_core_rewrite` with `verify_equivalence`,
+  `altimate_core_equivalence`. Proves a rewrite didn't change results. Generic agents just
+  *assert* equivalence.
+- **Deterministic SQL checks** — `altimate-code check` (lint / validate / safety / policy /
+  pii, no LLM). Catches issues a chat agent eyeballs and misses.
+- **dbt-native understanding** — lineage, impact, cost-attribution, scheduling skills;
+  compiled-SQL awareness (not regex over Jinja).
+- **Warehouse cost intelligence** — finops skills, dry-run cost, full-scan/spill detection.
+- **Multi-warehouse schema awareness** — real schema introspection vs guessing column names.
+
+If the pain you found doesn't map to one of these (or another real, demonstrable
+capability), it is **not a demo angle**. Cut it. A demo of a generic capability any agent
+has tells the viewer nothing about why to choose us.
+
+## Step 2 — Pick the mode
+
+**Showcase (default).** Show altimate-code doing the task, and let the specific thing it
+does carry the value: it runs the SQL and diffs the output, it runs `check` and flags a real
+defect, it traces lineage and shows the blast radius, it proves equivalence. The viewer sees
+the value directly. No second agent. Use this whenever the capability is self-evidently
+useful on its own — which is most of the time.
+
+**Comparison (optional).** Add a **claude-code** (`claude -p`) control on the same
+prompt/fixture/model only when the value *is* the contrast — "a general agent skips this
+step; altimate-code doesn't." It answers "I already have Claude Code; why switch?" but it
+costs you a fair-test burden (below) and only works if altimate-code actually comes out
+ahead. If unsure, start showcase; add the comparison only if the contrast turns out real.
+
+If you do compare, keep it FAIR and ATTRIBUTABLE: same model, same prompt, same fixture, so
+the only difference is altimate-code. Hold the model constant (`--model` / `--baseline-model`).
+A fallback control is **capability-disabled in-fork** (`--baseline-mode disable --baseline-deny
+<tool>`), useful to isolate one specific tool. Never give the baseline a worse model, vaguer
+prompt, or broken env — a rigged contrast is worse than no contrast.
+
+## Step 3 — The honesty gate
+
+Watch the real run(s). Then:
+
+- **Showcase:** did altimate-code actually do the valuable thing on camera (the diff ran, the
+  defect was flagged, equivalence was proved)? If yes → ship. If it only *claimed* to, or
+  didn't get there → no demo yet; change the task or report honestly.
+- **Comparison:** altimate clearly ahead → ship the contrast. Tie/marginal → drop the
+  comparison (you may still ship the showcase) and say so. Baseline wins/altimate regresses →
+  do not ship a contrast; report it — a real product finding beats a fake demo.
+
+## Step 4 — Name the value moment
+
+Every clip needs ONE frame you could freeze and put on a slide: equivalence VERIFIED; `check`
+flagging a real defect; the SQL-vs-output `diff` coming back zero; lineage showing the blast
+radius. In showcase mode that frame stands alone; in comparison mode it's the frame the
+baseline never produces. Design the prompt so that moment happens on camera. If you can't
+name the value moment, the demo isn't ready.
diff --git a/.claude/skills/demo-creator/scripts/baseline_vs_altimate.sh b/.claude/skills/demo-creator/scripts/baseline_vs_altimate.sh
new file mode 100755
index 000000000..fc67ed85c
--- /dev/null
+++ b/.claude/skills/demo-creator/scripts/baseline_vs_altimate.sh
@@ -0,0 +1,80 @@
+#!/usr/bin/env bash
+# Run the SAME task twice and capture each run for comparison:
+#   baseline  = claude-code (the base agent: `claude -p`)         [default]
+#   altimate  = altimate-code (the fork with data tooling)
+# Same prompt, same fixture, same underlying model -> the only difference is altimate-code's
+# additions. This is the honest "why use us over plain Claude Code" control.
+#
+# It captures the json event stream for each run (authenticity proof). It does NOT render
+# GIFs — use record.sh to capture the watchable clip of whichever run(s) you show.
+#
+# Usage:
+#   baseline_vs_altimate.sh <topic> <angle> --cwd DIR --prompt "..." \
+#       [--reset-from PRISTINE_DIR] [--max-turns N] \
+#       [--model PROVIDER/MODEL]      # altimate-code model (e.g. anthropic/claude-...)
+#       [--baseline-model ALIAS]      # claude-code model alias (default: sonnet)
+#       [--baseline-mode claude|disable] [--baseline-deny TOOL,TOOL] [--baseline-agent NAME]
+set -euo pipefail
+DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"; source "$DIR/lib.sh"
+
+TOPIC=""; ANGLE=""; CWD="$PWD"; PROMPT=""; MODEL=""; BASE_MODEL="sonnet"; MAXT="40"
+RESET_FROM=""; BASE_MODE="claude"; BASE_DENY=""; BASE_AGENT=""
+while [ $# -gt 0 ]; do
+  case "$1" in
+    --cwd) CWD="$2"; shift 2;;
+    --prompt) PROMPT="$2"; shift 2;;
+    --model) MODEL="$2"; shift 2;;
+    --baseline-model) BASE_MODEL="$2"; shift 2;;
+    --reset-from) RESET_FROM="$2"; shift 2;;
+    --max-turns) MAXT="$2"; shift 2;;
+    --baseline-mode) BASE_MODE="$2"; shift 2;;     # claude (default) | disable
+    --baseline-deny) BASE_DENY="$2"; shift 2;;     # only for --baseline-mode disable
+    --baseline-agent) BASE_AGENT="$2"; shift 2;;   # only for --baseline-mode disable
+    -*) echo "unknown opt: $1" >&2; exit 2;;
+    *) if [ -z "$TOPIC" ]; then TOPIC="$1"; elif [ -z "$ANGLE" ]; then ANGLE="$1"; fi; shift;;
+  esac
+done
+[ -n "$TOPIC" ] && [ -n "$ANGLE" ] && [ -n "$PROMPT" ] || {
+  echo "usage: baseline_vs_altimate.sh <topic> <angle> --cwd DIR --prompt '...' [...]" >&2; exit 2; }
+need altimate-code
+
+DD="$(demo_dir "$TOPIC")"; RUNS="$DD/runs"; mkdir -p "$RUNS"
+
+reset_fixture() {
+  [ -n "$RESET_FROM" ] || return 0
+  [ -d "$RESET_FROM" ] || { echo "ERROR: --reset-from '$RESET_FROM' is not a dir" >&2; return 1; }
+  log "  resetting fixture: rsync $RESET_FROM/ -> $CWD/"
+  rsync -a --delete --exclude '.git' "$RESET_FROM"/ "$CWD"/
+}
+
+# ---------------- baseline ----------------
+log "=== BASELINE run ($BASE_MODE) ==="
+reset_fixture
+if [ "$BASE_MODE" = "claude" ]; then
+  need claude
+  ( cd "$CWD" && claude -p --dangerously-skip-permissions --max-turns "$MAXT" \
+      --output-format json --model "$BASE_MODEL" "$PROMPT" ) \
+      | tee "$RUNS/$ANGLE.baseline.json" || true
+else
+  # In-fork isolation: run altimate-code but strip the capability under test.
+  B_ARGS=(); [ -n "$BASE_AGENT" ] && B_ARGS+=(--agent "$BASE_AGENT")
+  M_ARGS=(); [ -n "$MODEL" ] && M_ARGS+=(--model "$MODEL")
+  if [ -n "$BASE_DENY" ]; then
+    export OPENCODE_PERMISSION="{\"tools\":{$(echo "$BASE_DENY" | awk -F, '{for(i=1;i<=NF;i++){printf "%s\"%s\":\"deny\"", (i>1?",":""), $i}}')}}"
+    log "  baseline OPENCODE_PERMISSION=$OPENCODE_PERMISSION"
+  fi
+  ( cd "$CWD" && altimate-code run --yolo --max-turns "$MAXT" --format json --trace \
+      "${M_ARGS[@]}" "${B_ARGS[@]}" "$PROMPT" ) | tee "$RUNS/$ANGLE.baseline.json" || true
+  unset OPENCODE_PERMISSION
+fi
+
+# ---------------- altimate ----------------
+log "=== ALTIMATE run ==="
+reset_fixture
+M_ARGS=(); [ -n "$MODEL" ] && M_ARGS+=(--model "$MODEL")
+( cd "$CWD" && altimate-code run --yolo --max-turns "$MAXT" --format json --trace \
+    "${M_ARGS[@]}" "$PROMPT" ) | tee "$RUNS/$ANGLE.json" || true
+
+log "json captured: $RUNS/$ANGLE.baseline.json  +  $RUNS/$ANGLE.json"
+log "fair-comparison check: baseline model='$BASE_MODEL', altimate model='${MODEL:-<default>}' — keep these equivalent."
+log "trace ids: 'altimate-code trace list' (altimate run is traced; the claude run leaves its own transcript)."
diff --git a/.claude/skills/demo-creator/scripts/build_artifact.sh b/.claude/skills/demo-creator/scripts/build_artifact.sh
new file mode 100755
index 000000000..23e90ff68
--- /dev/null
+++ b/.claude/skills/demo-creator/scripts/build_artifact.sh
@@ -0,0 +1,67 @@
+#!/usr/bin/env bash
+# Build a SELF-CONTAINED HTML artifact from a page template by inlining every relative
+# image src (clips/*.gif, frames/*.png, …) as a base64 data URI. Claude artifacts run under
+# a strict CSP that blocks external hosts, so assets MUST be embedded. Optionally renders the
+# result headless so you can visually inspect it BEFORE publishing (do not skip that).
+#
+# Usage:
+#   build_artifact.sh <topic> <template.html> [--out demo.html] [--render] [--width 820] [--height 5200]
+# Then publish the written file with the Artifact tool (redeploy to the same path/URL on edits).
+set -euo pipefail
+DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"; source "$DIR/lib.sh"
+
+[ $# -ge 2 ] || { echo "usage: build_artifact.sh <topic> <template.html> [--out NAME] [--render] [--width W] [--height H]" >&2; exit 2; }
+TOPIC="$1"; TPL="$2"; shift 2
+OUT="demo.html"; RENDER=0; W=820; H=5200
+while [ $# -gt 0 ]; do case "$1" in
+  --out) OUT="$2"; shift 2;;
+  --render) RENDER=1; shift;;
+  --width) W="$2"; shift 2;;
+  --height) H="$2"; shift 2;;
+  *) echo "unknown opt: $1" >&2; exit 2;;
+esac; done
+
+[ -f "$TPL" ] || { echo "ERROR: template '$TPL' not found" >&2; exit 1; }
+DD="$(demo_dir "$TOPIC")"; OUTPATH="$DD/$OUT"
+log "building self-contained artifact: $TPL -> $OUTPATH (base dir $DD)"
+
+python3 - "$TPL" "$DD" "$OUTPATH" <<'PY'
+import base64, mimetypes, os, re, sys
+tpl, base, out = sys.argv[1:4]
+html = open(tpl, encoding="utf-8").read()
+mimetypes.add_type("image/svg+xml", ".svg")
+inlined = [0]; missing = []
+def datauri(rel):
+    p = os.path.normpath(os.path.join(base, rel))
+    if not (p.startswith(os.path.abspath(base)) and os.path.isfile(p)):
+        missing.append(rel); return None
+    mime = mimetypes.guess_type(p)[0] or "application/octet-stream"
+    inlined[0] += 1
+    return "data:%s;base64,%s" % (mime, base64.b64encode(open(p,"rb").read()).decode())
+def repl(m):
+    pre, q, url, post = m.group(1), m.group(2), m.group(3), m.group(4)
+    if re.match(r'^(https?:|data:|//)', url): return m.group(0)
+    d = datauri(url)
+    return m.group(0) if d is None else "%s%s%s%s" % (pre, q, d, post)
+# src="..." and CSS url(...) for local image paths
+html = re.sub(r'(src=)(["\'])([^"\']+\.(?:gif|png|jpe?g|svg|webp))(["\'])', repl, html, flags=re.I)
+html = re.sub(r'(url\()(["\']?)([^)"\']+\.(?:gif|png|jpe?g|svg|webp))(["\']?\))', repl, html, flags=re.I)
+open(out, "w", encoding="utf-8").write(html)
+print("inlined %d asset(s); %.2f MB" % (inlined[0], len(html)/1e6))
+if missing: print("WARNING: could not inline (will break under CSP): %s" % ", ".join(missing), file=sys.stderr)
+PY
+
+if [ "$RENDER" = "1" ]; then
+  CHROME="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
+  SHOT="$DD/clips/frames/_artifact_render.png"; mkdir -p "$(dirname "$SHOT")"
+  if [ -x "$CHROME" ]; then
+    "$CHROME" --headless=new --disable-gpu --hide-scrollbars --force-device-scale-factor=1 \
+      --window-size="$W,$H" --screenshot="$SHOT" "file://$OUTPATH" 2>/dev/null
+    log "rendered for inspection -> $SHOT  (Read it; check empty grid cells, mostly-empty GIFs, no h-scroll)"
+    echo "$SHOT"
+  else
+    log "Chrome not found; render skipped. Open $OUTPATH manually to inspect before publishing."
+  fi
+fi
+log "artifact ready: $OUTPATH  (publish with the Artifact tool; redeploy to the same path on edits)"
+echo "$OUTPATH"
diff --git a/.claude/skills/demo-creator/scripts/extract_learnings.sh b/.claude/skills/demo-creator/scripts/extract_learnings.sh
new file mode 100755
index 000000000..0f33c5099
--- /dev/null
+++ b/.claude/skills/demo-creator/scripts/extract_learnings.sh
@@ -0,0 +1,96 @@
+#!/usr/bin/env bash
+# Mine a recorded altimate-code session into learnings.md: what the run actually DID
+# (tool/command sequence), the verification/value markers it hit, and raw material for the
+# "what it did" beat in the blog + the cross-session learning ledger.
+#
+# It only extracts evidence from the real run — it does not invent takeaways. The agent
+# reads learnings.md and writes the one-line "why this matters" + any reusable lesson.
+#
+# Usage: extract_learnings.sh <topic> <angle> [--baseline]   (--baseline mines the claude run too)
+DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"; source "$DIR/lib.sh"
+# Best-effort miner: grep no-matches and head-closed pipes are expected, so don't let
+# lib.sh's `set -euo pipefail` abort us on them.
+set +e +o pipefail
+
+[ $# -ge 2 ] || { echo "usage: extract_learnings.sh <topic> <angle> [--baseline]" >&2; exit 2; }
+TOPIC="$1"; ANGLE="$2"; WITH_BASE=0
+[ "${3:-}" = "--baseline" ] && WITH_BASE=1
+
+DD="$(demo_dir "$TOPIC")"
+JSON="$DD/runs/$ANGLE.json"
+LOG="$DD/runs/$ANGLE.log"; [ -f "$LOG" ] || LOG="$DD/runs/altimate.log"
+OUT="$DD/learnings.md"
+
+# Pull the source the run actually produced. Prefer json; fall back to the captured log.
+SRC=""
+[ -f "$JSON" ] && SRC="$JSON"
+[ -z "$SRC" ] && [ -f "$LOG" ] && SRC="$LOG"
+[ -n "$SRC" ] || { echo "ERROR: no run json/log for $TOPIC/$ANGLE (runs/$ANGLE.json or runs/altimate.log)" >&2; exit 1; }
+log "mining $SRC -> $OUT"
+
+# Strip ANSI escape codes so the mined evidence is clean (formatted logs are full of them).
+CLEAN="$(mktemp)"; trap 'rm -f "$CLEAN"' EXIT
+perl -pe 's/\x1b\[[0-9;]*[a-zA-Z]//g' "$SRC" > "$CLEAN" 2>/dev/null || cp "$SRC" "$CLEAN"
+SRC="$CLEAN"
+
+# Tool/command sequence the agent executed (bash tool-calls + tool names).
+extract_tools() {  # extract_tools <file>
+  grep -oE '"(tool|name)"[[:space:]]*:[[:space:]]*"[^"]+"' "$1" 2>/dev/null | sed -E 's/.*"([^"]+)"$/\1/' \
+    | grep -viE '^(text|message|assistant|user|step)$' || true
+}
+# Shell commands captured in the recorded output (works on the formatted log too).
+extract_cmds() {  # extract_cmds <file>
+  grep -oE '(\$ |bash |python |diff |altimate-code |duckdb |run_sql)[^"]*' "$1" 2>/dev/null \
+    | sed -E 's/\\u[0-9a-fA-F]{4}//g' | grep -vE '^\s*$' | head -40 || true
+}
+# Verification / value-moment markers — evidence the valuable thing happened on camera.
+VALUE_MARKERS='run_sql|diff |byte-for-byte|identical|zero diff|verify|expected_sql|equival|sql_analyze|altimate_core|\bcheck\b|lineage|cost|row count|assert'
+
+TOOLS="$(extract_tools "$SRC" | sort | uniq -c | sort -rn | head -25)"
+CMDS="$(extract_cmds "$SRC")"
+VALUE_HITS="$(grep -niE "$VALUE_MARKERS" "$SRC" 2>/dev/null | sed -E 's/\\u[0-9a-fA-F]{4}//g' | head -25)"
+TRACE_ID="$(grep -oE 'ses_[A-Za-z0-9]+' "$SRC" 2>/dev/null | head -1)"
+
+{
+  echo "# Learnings — $TOPIC / $ANGLE"
+  echo
+  echo "_Mined from \`$(basename "$SRC")\`$( [ -n "$TRACE_ID" ] && echo " · trace \`$TRACE_ID\`")._"
+  echo "_All items below are extracted from the REAL run. The 'why it matters' and reusable"
+  echo "lessons are filled in by the author after reading this — do not invent value the run"
+  echo "did not show (law #2)._"
+  echo
+  echo "## What altimate-code actually ran"
+  echo '```'
+  [ -n "$CMDS" ] && echo "$CMDS" || echo "(no shell commands parsed — inspect $SRC directly)"
+  echo '```'
+  echo
+  echo "## Tools / events observed (count)"
+  echo '```'
+  [ -n "$TOOLS" ] && echo "$TOOLS" || echo "(none parsed)"
+  echo '```'
+  echo
+  echo "## Value-moment evidence (verification / capability markers)"
+  if [ -n "$VALUE_HITS" ]; then echo '```'; echo "$VALUE_HITS"; echo '```'; else
+    echo "**NONE FOUND.** The run shows no verification/capability marker — per law #2 you"
+    echo "do not yet have a value moment for this angle. Change the task or report honestly."
+  fi
+  echo
+  echo "## Why it matters  <!-- author fills in: one line, grounded in the evidence above -->"
+  echo
+  echo "## Reusable lesson  <!-- author: product bug / env gotcha / prompt that triggers the value / fixture pattern."
+  echo "                         If it should persist across sessions, also save to memory + MANIFEST ledger. -->"
+} > "$OUT"
+
+if [ "$WITH_BASE" = "1" ] && [ -f "$DD/runs/$ANGLE.baseline.json" ]; then
+  {
+    echo
+    echo "## Baseline (claude-code) — value markers, for the contrast"
+    BH="$(grep -niE "$VALUE_MARKERS" "$DD/runs/$ANGLE.baseline.json" 2>/dev/null | head -20)"
+    if [ -n "$BH" ]; then echo '```'; echo "$BH"; echo '```'; else
+      echo "(baseline shows none of: $VALUE_MARKERS — this absence IS the contrast.)"
+    fi
+  } >> "$OUT"
+fi
+
+log "wrote $OUT"
+echo "$OUT"
diff --git a/.claude/skills/demo-creator/scripts/inspect.sh b/.claude/skills/demo-creator/scripts/inspect.sh
new file mode 100755
index 000000000..0717dcc5c
--- /dev/null
+++ b/.claude/skills/demo-creator/scripts/inspect.sh
@@ -0,0 +1,43 @@
+#!/usr/bin/env bash
+# Extract PNG frames from a clip's GIF for visual inspection, and print the authenticity
+# cross-check inputs (json event summary + how to view the trace).
+# Usage: inspect.sh <topic> <angle> [--fps N] [--prefix NAME]
+set -euo pipefail
+DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"; source "$DIR/lib.sh"
+
+[ $# -ge 2 ] || { echo "usage: inspect.sh <topic> <angle> [--fps N] [--prefix NAME]" >&2; exit 2; }
+TOPIC="$1"; ANGLE="$2"; shift 2
+FPS=1; PREFIX="$ANGLE"
+while [ $# -gt 0 ]; do
+  case "$1" in
+    --fps) FPS="$2"; shift 2;;
+    --prefix) PREFIX="$2"; shift 2;;
+    *) echo "unknown opt: $1" >&2; exit 2;;
+  esac
+done
+need ffmpeg ffmpeg
+
+DD="$(demo_dir "$TOPIC")"
+GIF="$DD/clips/$PREFIX.gif"
+FRAMES="$DD/clips/frames/$PREFIX"
+[ -f "$GIF" ] || { echo "ERROR: no GIF at $GIF (record it first)" >&2; exit 1; }
+mkdir -p "$FRAMES"; rm -f "$FRAMES"/f_*.png
+
+log "extracting frames @ ${FPS}fps -> $FRAMES"
+ffmpeg -y -loglevel error -i "$GIF" -vf "fps=$FPS" "$FRAMES/f_%03d.png"
+COUNT=$(ls "$FRAMES"/f_*.png 2>/dev/null | wc -l | tr -d ' ')
+log "wrote $COUNT frames. Read them to verify legibility + the value moment."
+
+# --- Authenticity cross-check pointers ---
+JSON="$DD/runs/$ANGLE.json"
+if [ -f "$JSON" ]; then
+  log "json event stream present: $JSON"
+  echo "----- tool calls observed in the run (cross-check against the clip) -----"
+  # Best-effort: pull tool/event names out of the json stream (works for line-delimited json).
+  grep -oE '"(tool|name|type)"[[:space:]]*:[[:space:]]*"[^"]+"' "$JSON" 2>/dev/null | sort | uniq -c | sort -rn | head -40 || true
+  echo "------------------------------------------------------------------------"
+else
+  log "NOTE: no $JSON — run baseline_vs_altimate.sh (or run with --format json) to get the authenticity proof."
+fi
+echo "To view the recorded session trace:  altimate-code trace list   then   altimate-code trace view <id>"
+echo "FRAMES_DIR=$FRAMES"
diff --git a/.claude/skills/demo-creator/scripts/lib.sh b/.claude/skills/demo-creator/scripts/lib.sh
new file mode 100755
index 000000000..dfbe228bf
--- /dev/null
+++ b/.claude/skills/demo-creator/scripts/lib.sh
@@ -0,0 +1,20 @@
+#!/usr/bin/env bash
+# Shared helpers for demo-creator scripts. Source this; don't run directly.
+set -euo pipefail
+
+# Repo root = three levels up from .claude/skills/demo-creator/scripts/
+SKILL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+REPO_ROOT="$(cd "$SKILL_DIR/../../.." && pwd)"
+
+demo_dir() {   # demo_dir <topic-slug>
+  echo "$REPO_ROOT/demos/$1"
+}
+
+need() {       # need <cmd> [brew-pkg]
+  if ! command -v "$1" >/dev/null 2>&1; then
+    echo "ERROR: '$1' not found. Install: brew install ${2:-$1}" >&2
+    return 1
+  fi
+}
+
+log() { printf '\033[36m[demo-creator]\033[0m %s\n' "$*" >&2; }
diff --git a/.claude/skills/demo-creator/scripts/record.sh b/.claude/skills/demo-creator/scripts/record.sh
new file mode 100755
index 000000000..53d2dc62b
--- /dev/null
+++ b/.claude/skills/demo-creator/scripts/record.sh
@@ -0,0 +1,56 @@
+#!/usr/bin/env bash
+# Record a REAL run with asciinema, then render that exact cast to a GIF with agg.
+# Usage:
+#   record.sh <topic-slug> <angle> [--cwd DIR] [--cols N] [--rows N] [--font N] \
+#             [--idle SECS] [--out-prefix NAME] -- <command...>
+# Everything after `--` is the real command that gets recorded (e.g. altimate-code run ...).
+# Produces: demos/<topic>/clips/<out-prefix>.cast and .gif
+set -euo pipefail
+DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"; source "$DIR/lib.sh"
+
+[ $# -ge 2 ] || { echo "usage: record.sh <topic> <angle> [opts] -- <command...>" >&2; exit 2; }
+TOPIC="$1"; ANGLE="$2"; shift 2
+CWD="$PWD"; COLS=100; ROWS=30; FONT=22; IDLE=2; PREFIX="$ANGLE"; BANNER=""
+while [ $# -gt 0 ]; do
+  case "$1" in
+    --cwd) CWD="$2"; shift 2;;
+    --cols) COLS="$2"; shift 2;;
+    --rows) ROWS="$2"; shift 2;;
+    --font) FONT="$2"; shift 2;;
+    --idle) IDLE="$2"; shift 2;;
+    --out-prefix) PREFIX="$2"; shift 2;;
+    --banner) BANNER="$2"; shift 2;;   # shown on screen before the run (headless hides the invocation)
+    --) shift; break;;
+    *) echo "unknown opt: $1" >&2; exit 2;;
+  esac
+done
+[ $# -ge 1 ] || { echo "ERROR: no command after --" >&2; exit 2; }
+
+need asciinema asciinema; need agg agg
+
+DD="$(demo_dir "$TOPIC")"; CLIPS="$DD/clips"; mkdir -p "$CLIPS"
+CAST="$CLIPS/$PREFIX.cast"; GIF="$CLIPS/$PREFIX.gif"
+
+# Build a single shell string for the real command (asciinema --command takes one string).
+CMD="$(printf '%q ' "$@")"
+log "recording REAL run -> $CAST"
+log "  cwd: $CWD"
+log "  cmd: $CMD"
+
+# Optional on-screen banner so viewers see what was asked (headless mode hides the prompt).
+# The real command is unchanged — the banner only echoes context, it does not fake output.
+INNER="$CMD"
+if [ -n "$BANNER" ]; then
+  INNER="printf '\033[1;36m\$ %s\033[0m\n\n' $(printf '%q' "$BANNER"); sleep 1; $CMD"
+fi
+
+# Record. asciinema (v3) runs the command in a real pty and stops when it exits.
+( cd "$CWD" && asciinema rec --overwrite --window-size "${COLS}x${ROWS}" \
+    --command "bash -lc $(printf '%q' "$INNER")" "$CAST" )
+
+log "rendering -> $GIF"
+agg --cols "$COLS" --rows "$ROWS" --font-size "$FONT" --idle-time-limit "$IDLE" \
+    "$CAST" "$GIF"
+
+log "done: $CAST  +  $GIF"
+echo "$GIF"
diff --git a/.claude/skills/demo-creator/scripts/setup_pyspark_fixture.sh b/.claude/skills/demo-creator/scripts/setup_pyspark_fixture.sh
new file mode 100755
index 000000000..da47530dd
--- /dev/null
+++ b/.claude/skills/demo-creator/scripts/setup_pyspark_fixture.sh
@@ -0,0 +1,62 @@
+#!/usr/bin/env bash
+# Bootstrap a REAL local PySpark environment (local[*], no cluster) for a demo fixture.
+# Idempotent. Prints the env exports the demo run needs.
+# Usage: setup_pyspark_fixture.sh <fixture-dir>
+set -euo pipefail
+DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"; source "$DIR/lib.sh"
+
+FIXTURE="${1:?usage: setup_pyspark_fixture.sh <fixture-dir>}"
+mkdir -p "$FIXTURE"; FIXTURE="$(cd "$FIXTURE" && pwd)"
+log "fixture dir: $FIXTURE"
+
+# --- JDK (Spark needs a JVM). We pin PySpark to the 3.5 line, which supports Java 8/11/17.
+#     Many brew `openjdk@N` aliases actually resolve to the latest JDK (e.g. 23), which Spark
+#     rejects, so we VALIDATE the reported major version and accept only 8/11/17 (then 21). ---
+java_major() { "$1/bin/java" -version 2>&1 | sed -nE 's/.*version "([0-9]+).*/\1/p' | head -1; }
+JAVA_HOME=""
+for cand in \
+  "/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home" \
+  "/opt/homebrew/opt/openjdk@11/libexec/openjdk.jdk/Contents/Home" \
+  "/opt/homebrew/opt/openjdk@21/libexec/openjdk.jdk/Contents/Home"; do
+  [ -x "$cand/bin/java" ] || continue
+  m="$(java_major "$cand")"
+  case "$m" in 8|11|17|21) JAVA_HOME="$cand"; log "using JDK $m at $cand"; break;; esac
+done
+if [ -z "$JAVA_HOME" ]; then
+  log "no Spark-3.5-compatible JDK (8/11/17/21) found; installing openjdk@17"
+  brew install openjdk@17
+  JAVA_HOME="/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home"
+fi
+export JAVA_HOME
+log "JAVA_HOME=$JAVA_HOME ($(java_major "$JAVA_HOME"))"
+
+# --- Python venv + pyspark (3.5 line) via uv (fast). Fallback to python3 -m venv. ---
+VENV="$FIXTURE/.venv"
+PYSPARK_SPEC='pyspark>=3.5,<4'
+if command -v uv >/dev/null 2>&1; then
+  ( cd "$FIXTURE" && uv venv "$VENV" >/dev/null 2>&1 || true; \
+    VIRTUAL_ENV="$VENV" uv pip install --python "$VENV/bin/python" "$PYSPARK_SPEC" )
+else
+  python3 -m venv "$VENV"; "$VENV/bin/pip" install -q --upgrade pip "$PYSPARK_SPEC"
+fi
+log "pyspark installed in $VENV"
+
+# --- Smoke test: prove Spark actually runs here. ---
+log "smoke-testing Spark local[*]..."
+JAVA_HOME="$JAVA_HOME" "$VENV/bin/python" - <<'PY'
+from pyspark.sql import SparkSession
+s = SparkSession.builder.master("local[*]").appName("smoke").getOrCreate()
+n = s.range(5).count()
+s.stop()
+assert n == 5, n
+print("SPARK_SMOKE_OK rows=%d" % n)
+PY
+
+cat <<EOF
+
+# ---- add these to your shell / the demo run env ----
+export JAVA_HOME="$JAVA_HOME"
+export PATH="\$JAVA_HOME/bin:$VENV/bin:\$PATH"
+# run pipeline:  $VENV/bin/python <script>.py
+EOF
+log "fixture env ready."
diff --git a/.claude/skills/demo-creator/templates/blog.md.tmpl b/.claude/skills/demo-creator/templates/blog.md.tmpl
new file mode 100644
index 000000000..14715cb81
--- /dev/null
+++ b/.claude/skills/demo-creator/templates/blog.md.tmpl
@@ -0,0 +1,52 @@
+# {{TITLE — state the surprising result, not the topic}}
+
+> {{One-line deck: the claim you proved, in plain words.}}
+
+{{HOOK: open on the real pain. Lead with a verbatim quote from research.md and its source.
+A data engineer on r/dataengineering put it bluntly:}}
+
+> "{{REAL_QUOTE}}"
+> — {{source, linked: URL}}
+
+{{2–4 sentences: why this pain is common and what it actually costs. No "in today's
+landscape". Get to the point.}}
+
+## The setup
+
+{{Describe the real fixture in 2–3 sentences. Link it. Make clear it's a genuine task with
+a real defect/decision, not a toy with the answer pre-written.}}
+
+## Baseline: plain Claude Code
+
+{{What happened when we ran the same prompt through claude-code (`claude -p`), same model.
+Embed the baseline clip. Be fair — show what it got right too.}}
+
+![baseline]({{clips/<angle>.baseline.gif}})
+
+{{The specific thing it missed / got wrong, stated concretely.}}
+
+## With altimate-code
+
+![altimate-code]({{clips/<angle>.gif}})
+
+{{The value moment, in prose. Name the EXACT capability — e.g. "altimate_core_rewrite with
+verify_equivalence proved the rewrite returns identical rows" — not "the AI figured it
+out". One quotable stat or coined phrase here.}}
+
+[Replayable terminal cast]({{clips/<angle>.cast}}) · [session trace id: {{TRACE_ID}}]
+
+## Why it worked (and where it wouldn't)
+
+{{The mechanism, briefly. Then the honest limits: where altimate-code wouldn't help, what
+the baseline got right, what you'd still check yourself.}}
+
+## Replicate it yourself
+
+{{Exact, copy-pasteable steps — must match REPLICATE.md and the fixture scripts. A reader
+should reach the same result.}}
+
+```bash
+{{commands}}
+```
+
+## {{Discussable ending — a tradeoff or a question, not a summary.}}
diff --git a/.claude/skills/demo-creator/templates/demo.tape.tmpl b/.claude/skills/demo-creator/templates/demo.tape.tmpl
new file mode 100644
index 000000000..730cf4edf
--- /dev/null
+++ b/.claude/skills/demo-creator/templates/demo.tape.tmpl
@@ -0,0 +1,25 @@
+# vhs tape template (SCRIPTED FALLBACK only — primary recorder is asciinema+agg).
+# Replace {{PLACEHOLDERS}}. See references/recording.md.
+Require altimate-code
+
+Output {{OUT_GIF}}            # e.g. clips/schema-drift.gif
+Set Shell "bash"
+Set FontSize 22
+Set Width 1000
+Set Height 600
+Set Padding 16
+Set Theme "Dracula"
+
+# Establish the fixture state on screen (optional, keep short).
+Type "cd {{FIXTURE_DIR}}"  Enter
+Sleep 1s
+
+# The real run. --yolo so it completes unattended; --max-turns caps runaway.
+Type "altimate-code run --yolo --max-turns {{MAX_TURNS}} '{{PROMPT}}'"  Enter
+
+# Wait for a real completion marker rather than a fixed sleep.
+Wait@{{TIMEOUT}}s /{{DONE_REGEX}}/      # e.g. 240s  /VERIFIED|session ended|Done/
+
+# Freeze the value moment.
+Screenshot {{FRAMES_DIR}}/value-moment.png
+Sleep 3s
diff --git a/.claude/skills/demo-creator/templates/replicate.md.tmpl b/.claude/skills/demo-creator/templates/replicate.md.tmpl
new file mode 100644
index 000000000..6a90ee538
--- /dev/null
+++ b/.claude/skills/demo-creator/templates/replicate.md.tmpl
@@ -0,0 +1,36 @@
+# Replicate this demo yourself
+
+Everything here is real and reproducible. Estimated time: {{N}} minutes.
+
+## Prerequisites
+- altimate-code installed: `curl -fsSL {{INSTALL_URL}} | bash` (or `npm i -g altimate-code`)
+- {{Any topic-specific deps, e.g. a JDK + pyspark, or duckdb}}
+- An LLM provider configured: `altimate-code providers` (auth)
+
+## 1. Get the fixture
+```bash
+{{clone/copy the demos/<topic>/fixture, or the setup script}}
+bash {{setup_script}}
+```
+
+## 2. Reproduce the baseline (plain Claude Code)
+```bash
+{{the exact baseline command — e.g. claude -p --dangerously-skip-permissions --model <m> '<prompt>'}}
+```
+Expected: {{what claude-code does — the gap}}.
+
+## 3. Run altimate-code (capability on)
+```bash
+{{the exact altimate command}}
+```
+Expected: {{the value moment — what to look for}}.
+
+## 4. Verify it yourself
+```bash
+{{how to confirm the result is real — e.g. run the pipeline / query the output / altimate-code check}}
+altimate-code trace list        # your run is recorded; view it with `trace view <id>`
+```
+
+## Notes / gotchas
+- {{anything that commonly trips people up}}
+- Docker fallback (hermetic): {{command, if applicable}}

From 13772ba13d70b9d19135a34641531eb61ea6ddf3 Mon Sep 17 00:00:00 2001
From: anandgupta42 <anand@altimate.ai>
Date: Tue, 30 Jun 2026 17:43:42 -0700
Subject: [PATCH 2/2] fix: [demo-creator] address PR review comments

Harden the skill's scripts and docs per the PR #979 review (Codex, CodeRabbit,
Kilo, cubic):

- `lib.sh`: `demo_dir` now rejects non-slug topics (`..`, `/`) so paths can't
  escape `demos/`.
- `baseline_vs_altimate.sh`: guard empty-array expansion for bash 3.2
  (`${arr[@]+"${arr[@]}"}`); fix `OPENCODE_PERMISSION` to top-level tool keys
  (`{"bash":"deny"}`); warn loudly when `--reset-from` or `--model` is omitted
  (contaminated fixture / mismatched model invalidate the comparison); fail the
  script when the altimate run fails instead of `|| true`.
- `record.sh`: add `asciinema --return` and propagate a failed recorded command
  so a broken run isn't rendered and passed off as a good clip.
- `build_artifact.sh`: fix path-containment (realpath + `os.path.commonpath`,
  not string prefix); discover a Chrome/Chromium on PATH and hard-fail `--render`
  if none (render-inspection is mandatory).
- `inspect.sh`: cross-check a `.baseline` clip against the baseline JSON, and
  count frames with `find` so a no-match doesn't trip `set -e`.
- `setup_pyspark_fixture.sh`: honor an existing `JAVA_HOME`/`java`/`java_home`
  before Homebrew (works on Linux / Intel mac); robust major-version parse for
  `1.8`-style strings; validate prerequisites.
- `extract_learnings.sh`: display the real source filename, not the temp copy.
- `SKILL.md`: authenticity must come from the SAME recorded run (its `--trace`),
  not a second `--format json` run; add a language tag to the tree code block.
- `blog.md.tmpl`: mark the baseline section comparison-only (showcase is default).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .claude/skills/demo-creator/SKILL.md          | 17 +++--
 .../scripts/baseline_vs_altimate.sh           | 30 +++++++--
 .../demo-creator/scripts/build_artifact.sh    | 34 +++++++---
 .../demo-creator/scripts/extract_learnings.sh | 11 +++-
 .../skills/demo-creator/scripts/inspect.sh    | 10 ++-
 .claude/skills/demo-creator/scripts/lib.sh    |  8 ++-
 .claude/skills/demo-creator/scripts/record.sh | 16 ++++-
 .../scripts/setup_pyspark_fixture.sh          | 65 ++++++++++++++-----
 .../demo-creator/templates/blog.md.tmpl       |  5 +-
 9 files changed, 148 insertions(+), 48 deletions(-)

diff --git a/.claude/skills/demo-creator/SKILL.md b/.claude/skills/demo-creator/SKILL.md
index c4898f3e4..40d6584b5 100644
--- a/.claude/skills/demo-creator/SKILL.md
+++ b/.claude/skills/demo-creator/SKILL.md
@@ -89,7 +89,7 @@ software behind it. Those don't need a recorded real run.
 
 Create `demos/<topic-slug>/` at the repo root:
 
-```
+```text
 demos/<topic-slug>/
   research.md          # phase 1: real pains + verbatim quotes + sources
   demo-spec.md         # phase 3: angles, fixtures, prompts, baseline controls
@@ -152,14 +152,19 @@ advance until it passes.
 ### Phase 5 — Record (real runs)
 - **Showcase mode:** `scripts/record.sh <topic> <angle> --cwd <fixture> --banner '<prompt>'
   -- altimate-code run --yolo --trace '<prompt>'` records the real run with **asciinema** →
-  `.cast`, then renders to `.gif` with **agg**. Capture the json too:
-  `altimate-code run --format json … > runs/<angle>.json` (authenticity proof). `vhs` is the
-  optional scripted-polish fallback (see `references/recording.md`).
+  `.cast`, then renders to `.gif` with **agg**. `vhs` is the optional scripted-polish
+  fallback (see `references/recording.md`).
+- **Authenticity comes from the SAME recorded run, not a re-run.** The clip's proof is that
+  `.cast` PLUS the `--trace` written by *that* invocation (grab its trace id from the cast /
+  `altimate-code trace list`). Do NOT run `altimate-code run --format json` a second time to
+  "get the json" — a fresh agent run is nondeterministic and would authenticate a different
+  session than the one on screen. (In comparison mode, `baseline_vs_altimate.sh` captures
+  `--format json` from the very run it records, which is fine — one run, one artifact.)
 - **Comparison mode (optional):** also run `scripts/baseline_vs_altimate.sh <topic> <angle>
   --cwd <fixture> --prompt '...' --reset-from <pristine>` to capture the `claude -p` control
   on the same prompt/model, and record its clip too.
-- **Gate:** `.cast` + `.gif` + `runs/<angle>.json` exist for the altimate run (plus the
-  baseline variant in comparison mode).
+- **Gate:** `.cast` + `.gif` (+ the recorded run's trace id) exist for the altimate run (plus
+  the baseline variant in comparison mode).
 
 ### Phase 6 — Visual inspection (use your eyes)
 - `scripts/inspect.sh <topic> <angle>` extracts key PNG frames. **Read them** and confirm:
diff --git a/.claude/skills/demo-creator/scripts/baseline_vs_altimate.sh b/.claude/skills/demo-creator/scripts/baseline_vs_altimate.sh
index fc67ed85c..722721586 100755
--- a/.claude/skills/demo-creator/scripts/baseline_vs_altimate.sh
+++ b/.claude/skills/demo-creator/scripts/baseline_vs_altimate.sh
@@ -40,6 +40,19 @@ need altimate-code
 
 DD="$(demo_dir "$TOPIC")"; RUNS="$DD/runs"; mkdir -p "$RUNS"
 
+# The comparison is only fair if BOTH runs start from the identical fixture. Without a
+# pristine reset, the baseline run's edits leak into the altimate run's starting state.
+if [ -z "$RESET_FROM" ]; then
+  log "WARNING: no --reset-from given. The baseline run can mutate $CWD, so the altimate run"
+  log "         would start from contaminated state. Pass --reset-from <pristine> for a valid comparison."
+fi
+# Model control: baseline defaults to '$BASE_MODEL'; if altimate --model is unset it uses the
+# configured default, which may differ. Warn so a win/loss can't come from model selection.
+if [ -z "$MODEL" ]; then
+  log "WARNING: no --model for the altimate run; it will use the configured default. Pass --model"
+  log "         so it matches --baseline-model ('$BASE_MODEL') and the only variable is altimate-code."
+fi
+
 reset_fixture() {
   [ -n "$RESET_FROM" ] || return 0
   [ -d "$RESET_FROM" ] || { echo "ERROR: --reset-from '$RESET_FROM' is not a dir" >&2; return 1; }
@@ -48,6 +61,7 @@ reset_fixture() {
 }
 
 # ---------------- baseline ----------------
+# The baseline is EXPECTED to sometimes struggle, so its failure doesn't fail the script.
 log "=== BASELINE run ($BASE_MODE) ==="
 reset_fixture
 if [ "$BASE_MODE" = "claude" ]; then
@@ -56,24 +70,30 @@ if [ "$BASE_MODE" = "claude" ]; then
       --output-format json --model "$BASE_MODEL" "$PROMPT" ) \
       | tee "$RUNS/$ANGLE.baseline.json" || true
 else
-  # In-fork isolation: run altimate-code but strip the capability under test.
+  # In-fork isolation: run altimate-code but strip the capability under test. Permission keys
+  # are top-level tool names (see config.ts permission schema), e.g. {"bash":"deny"}.
   B_ARGS=(); [ -n "$BASE_AGENT" ] && B_ARGS+=(--agent "$BASE_AGENT")
   M_ARGS=(); [ -n "$MODEL" ] && M_ARGS+=(--model "$MODEL")
   if [ -n "$BASE_DENY" ]; then
-    export OPENCODE_PERMISSION="{\"tools\":{$(echo "$BASE_DENY" | awk -F, '{for(i=1;i<=NF;i++){printf "%s\"%s\":\"deny\"", (i>1?",":""), $i}}')}}"
+    export OPENCODE_PERMISSION="{$(echo "$BASE_DENY" | awk -F, '{for(i=1;i<=NF;i++){printf "%s\"%s\":\"deny\"", (i>1?",":""), $i}}')}"
     log "  baseline OPENCODE_PERMISSION=$OPENCODE_PERMISSION"
   fi
   ( cd "$CWD" && altimate-code run --yolo --max-turns "$MAXT" --format json --trace \
-      "${M_ARGS[@]}" "${B_ARGS[@]}" "$PROMPT" ) | tee "$RUNS/$ANGLE.baseline.json" || true
+      ${M_ARGS[@]+"${M_ARGS[@]}"} ${B_ARGS[@]+"${B_ARGS[@]}"} "$PROMPT" ) | tee "$RUNS/$ANGLE.baseline.json" || true
   unset OPENCODE_PERMISSION
 fi
 
 # ---------------- altimate ----------------
+# This run is the evidence — if it doesn't complete, the comparison is invalid, so fail loudly.
 log "=== ALTIMATE run ==="
 reset_fixture
 M_ARGS=(); [ -n "$MODEL" ] && M_ARGS+=(--model "$MODEL")
-( cd "$CWD" && altimate-code run --yolo --max-turns "$MAXT" --format json --trace \
-    "${M_ARGS[@]}" "$PROMPT" ) | tee "$RUNS/$ANGLE.json" || true
+set -o pipefail
+if ! ( cd "$CWD" && altimate-code run --yolo --max-turns "$MAXT" --format json --trace \
+    ${M_ARGS[@]+"${M_ARGS[@]}"} "$PROMPT" ) | tee "$RUNS/$ANGLE.json"; then
+  echo "ERROR: the altimate-code run failed — comparison evidence is invalid. Not reporting success." >&2
+  exit 1
+fi
 
 log "json captured: $RUNS/$ANGLE.baseline.json  +  $RUNS/$ANGLE.json"
 log "fair-comparison check: baseline model='$BASE_MODEL', altimate model='${MODEL:-<default>}' — keep these equivalent."
diff --git a/.claude/skills/demo-creator/scripts/build_artifact.sh b/.claude/skills/demo-creator/scripts/build_artifact.sh
index 23e90ff68..6fcb46b6d 100755
--- a/.claude/skills/demo-creator/scripts/build_artifact.sh
+++ b/.claude/skills/demo-creator/scripts/build_artifact.sh
@@ -31,9 +31,12 @@ tpl, base, out = sys.argv[1:4]
 html = open(tpl, encoding="utf-8").read()
 mimetypes.add_type("image/svg+xml", ".svg")
 inlined = [0]; missing = []
+base_real = os.path.realpath(base)
 def datauri(rel):
-    p = os.path.normpath(os.path.join(base, rel))
-    if not (p.startswith(os.path.abspath(base)) and os.path.isfile(p)):
+    # realpath resolves symlinks; commonpath confines to base (a plain startswith prefix
+    # would let a sibling dir sharing base's name — e.g. base/../base-secret — escape).
+    p = os.path.realpath(os.path.join(base, rel))
+    if not (os.path.isfile(p) and os.path.commonpath([p, base_real]) == base_real):
         missing.append(rel); return None
     mime = mimetypes.guess_type(p)[0] or "application/octet-stream"
     inlined[0] += 1
@@ -52,16 +55,27 @@ if missing: print("WARNING: could not inline (will break under CSP): %s" % ", ".
 PY
 
 if [ "$RENDER" = "1" ]; then
-  CHROME="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
+  # Discover a Chromium-family browser on PATH or in the usual macOS bundle locations.
+  CHROME=""
+  for c in "${CHROME_BIN:-}" google-chrome google-chrome-stable chromium chromium-browser chrome \
+           "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
+           "/Applications/Chromium.app/Contents/MacOS/Chromium" \
+           "/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge"; do
+    [ -n "$c" ] || continue
+    if command -v "$c" >/dev/null 2>&1; then CHROME="$(command -v "$c")"; break; fi
+    [ -x "$c" ] && { CHROME="$c"; break; }
+  done
   SHOT="$DD/clips/frames/_artifact_render.png"; mkdir -p "$(dirname "$SHOT")"
-  if [ -x "$CHROME" ]; then
-    "$CHROME" --headless=new --disable-gpu --hide-scrollbars --force-device-scale-factor=1 \
-      --window-size="$W,$H" --screenshot="$SHOT" "file://$OUTPATH" 2>/dev/null
-    log "rendered for inspection -> $SHOT  (Read it; check empty grid cells, mostly-empty GIFs, no h-scroll)"
-    echo "$SHOT"
-  else
-    log "Chrome not found; render skipped. Open $OUTPATH manually to inspect before publishing."
+  if [ -z "$CHROME" ]; then
+    # Render-inspection is mandatory before publishing, so a missing browser is a hard error.
+    echo "ERROR: --render requested but no Chrome/Chromium found. Install one or set CHROME_BIN." >&2
+    echo "       (Do NOT publish an artifact you haven't visually inspected.)" >&2
+    exit 1
   fi
+  "$CHROME" --headless=new --disable-gpu --hide-scrollbars --force-device-scale-factor=1 \
+    --window-size="$W,$H" --screenshot="$SHOT" "file://$OUTPATH" 2>/dev/null
+  log "rendered for inspection -> $SHOT  (Read it; check empty grid cells, mostly-empty GIFs, no h-scroll)"
+  echo "$SHOT"
 fi
 log "artifact ready: $OUTPATH  (publish with the Artifact tool; redeploy to the same path on edits)"
 echo "$OUTPATH"
diff --git a/.claude/skills/demo-creator/scripts/extract_learnings.sh b/.claude/skills/demo-creator/scripts/extract_learnings.sh
index 0f33c5099..432db8dea 100755
--- a/.claude/skills/demo-creator/scripts/extract_learnings.sh
+++ b/.claude/skills/demo-creator/scripts/extract_learnings.sh
@@ -22,13 +22,18 @@ LOG="$DD/runs/$ANGLE.log"; [ -f "$LOG" ] || LOG="$DD/runs/altimate.log"
 OUT="$DD/learnings.md"
 
 # Pull the source the run actually produced. Prefer json; fall back to the captured log.
+CAST="$DD/clips/$ANGLE.cast"
 SRC=""
 [ -f "$JSON" ] && SRC="$JSON"
 [ -z "$SRC" ] && [ -f "$LOG" ] && SRC="$LOG"
-[ -n "$SRC" ] || { echo "ERROR: no run json/log for $TOPIC/$ANGLE (runs/$ANGLE.json or runs/altimate.log)" >&2; exit 1; }
+# Fall back to the recorded asciinema cast — it IS the run record (JSONL; output text is greppable).
+[ -z "$SRC" ] && [ -f "$CAST" ] && SRC="$CAST"
+[ -n "$SRC" ] || { echo "ERROR: no run source for $TOPIC/$ANGLE (runs/$ANGLE.json, runs/altimate.log, or clips/$ANGLE.cast)" >&2; exit 1; }
 log "mining $SRC -> $OUT"
 
-# Strip ANSI escape codes so the mined evidence is clean (formatted logs are full of them).
+# Remember the real source for display; mine from an ANSI-stripped copy (formatted logs are
+# full of escape codes). SRC below points at the temp copy, SRC_NAME at the real file.
+SRC_NAME="$(basename "$SRC")"
 CLEAN="$(mktemp)"; trap 'rm -f "$CLEAN"' EXIT
 perl -pe 's/\x1b\[[0-9;]*[a-zA-Z]//g' "$SRC" > "$CLEAN" 2>/dev/null || cp "$SRC" "$CLEAN"
 SRC="$CLEAN"
@@ -54,7 +59,7 @@ TRACE_ID="$(grep -oE 'ses_[A-Za-z0-9]+' "$SRC" 2>/dev/null | head -1)"
 {
   echo "# Learnings — $TOPIC / $ANGLE"
   echo
-  echo "_Mined from \`$(basename "$SRC")\`$( [ -n "$TRACE_ID" ] && echo " · trace \`$TRACE_ID\`")._"
+  echo "_Mined from \`$SRC_NAME\`$( [ -n "$TRACE_ID" ] && echo " · trace \`$TRACE_ID\`")._"
   echo "_All items below are extracted from the REAL run. The 'why it matters' and reusable"
   echo "lessons are filled in by the author after reading this — do not invent value the run"
   echo "did not show (law #2)._"
diff --git a/.claude/skills/demo-creator/scripts/inspect.sh b/.claude/skills/demo-creator/scripts/inspect.sh
index 0717dcc5c..a5a6e4f5f 100755
--- a/.claude/skills/demo-creator/scripts/inspect.sh
+++ b/.claude/skills/demo-creator/scripts/inspect.sh
@@ -25,11 +25,17 @@ mkdir -p "$FRAMES"; rm -f "$FRAMES"/f_*.png
 
 log "extracting frames @ ${FPS}fps -> $FRAMES"
 ffmpeg -y -loglevel error -i "$GIF" -vf "fps=$FPS" "$FRAMES/f_%03d.png"
-COUNT=$(ls "$FRAMES"/f_*.png 2>/dev/null | wc -l | tr -d ' ')
+# Count without letting a no-match ls abort us under `set -e`/pipefail (lib.sh sets both).
+COUNT=$(find "$FRAMES" -name 'f_*.png' 2>/dev/null | wc -l | tr -d ' ')
 log "wrote $COUNT frames. Read them to verify legibility + the value moment."
 
 # --- Authenticity cross-check pointers ---
-JSON="$DD/runs/$ANGLE.json"
+# Reconcile against the run that produced THIS clip: a baseline prefix (<angle>.baseline)
+# must be checked against the baseline json, not the altimate one.
+case "$PREFIX" in
+  *.baseline) JSON="$DD/runs/$ANGLE.baseline.json";;
+  *)          JSON="$DD/runs/$ANGLE.json";;
+esac
 if [ -f "$JSON" ]; then
   log "json event stream present: $JSON"
   echo "----- tool calls observed in the run (cross-check against the clip) -----"
diff --git a/.claude/skills/demo-creator/scripts/lib.sh b/.claude/skills/demo-creator/scripts/lib.sh
index dfbe228bf..a208f5077 100755
--- a/.claude/skills/demo-creator/scripts/lib.sh
+++ b/.claude/skills/demo-creator/scripts/lib.sh
@@ -6,7 +6,13 @@ set -euo pipefail
 SKILL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
 REPO_ROOT="$(cd "$SKILL_DIR/../../.." && pwd)"
 
-demo_dir() {   # demo_dir <topic-slug>
+demo_dir() {   # demo_dir <topic-slug> — reject anything that isn't a plain slug so the
+               # path can't escape demos/ (a `..` or `/` in the slug would).
+  case "${1:-}" in
+    ''|*[!a-zA-Z0-9._-]*|*..*)
+      echo "demo_dir: invalid topic slug '${1:-}' (use letters, digits, '.', '_', '-')" >&2
+      return 1;;
+  esac
   echo "$REPO_ROOT/demos/$1"
 }
 
diff --git a/.claude/skills/demo-creator/scripts/record.sh b/.claude/skills/demo-creator/scripts/record.sh
index 53d2dc62b..ad4dbaf38 100755
--- a/.claude/skills/demo-creator/scripts/record.sh
+++ b/.claude/skills/demo-creator/scripts/record.sh
@@ -28,6 +28,10 @@ done
 
 need asciinema asciinema; need agg agg
 
+# asciinema (Rust) rejects args that aren't valid UTF-8 under a non-UTF-8 locale. Force one
+# so prompts/banners with non-ASCII (em dashes, arrows, accents) don't error out.
+export LC_ALL="${LC_ALL:-en_US.UTF-8}" LANG="${LANG:-en_US.UTF-8}"
+
 DD="$(demo_dir "$TOPIC")"; CLIPS="$DD/clips"; mkdir -p "$CLIPS"
 CAST="$CLIPS/$PREFIX.cast"; GIF="$CLIPS/$PREFIX.gif"
 
@@ -45,12 +49,20 @@ if [ -n "$BANNER" ]; then
 fi
 
 # Record. asciinema (v3) runs the command in a real pty and stops when it exits.
-( cd "$CWD" && asciinema rec --overwrite --window-size "${COLS}x${ROWS}" \
-    --command "bash -lc $(printf '%q' "$INNER")" "$CAST" )
+# --return makes asciinema exit with the recorded command's status (default is always 0),
+# so a failed `altimate-code run` doesn't get rendered and passed off as a good demo.
+rec_rc=0
+( cd "$CWD" && asciinema rec --overwrite --return --window-size "${COLS}x${ROWS}" \
+    --command "bash -lc $(printf '%q' "$INNER")" "$CAST" ) || rec_rc=$?
 
 log "rendering -> $GIF"
 agg --cols "$COLS" --rows "$ROWS" --font-size "$FONT" --idle-time-limit "$IDLE" \
     "$CAST" "$GIF"
 
+if [ "$rec_rc" -ne 0 ]; then
+  echo "ERROR: the recorded command exited $rec_rc — this clip captured a FAILED run." >&2
+  echo "       Cast+GIF kept for debugging ($CAST), but do NOT ship this clip. Fix and re-record." >&2
+  exit "$rec_rc"
+fi
 log "done: $CAST  +  $GIF"
 echo "$GIF"
diff --git a/.claude/skills/demo-creator/scripts/setup_pyspark_fixture.sh b/.claude/skills/demo-creator/scripts/setup_pyspark_fixture.sh
index da47530dd..c16cc8d7f 100755
--- a/.claude/skills/demo-creator/scripts/setup_pyspark_fixture.sh
+++ b/.claude/skills/demo-creator/scripts/setup_pyspark_fixture.sh
@@ -9,27 +9,56 @@ FIXTURE="${1:?usage: setup_pyspark_fixture.sh <fixture-dir>}"
 mkdir -p "$FIXTURE"; FIXTURE="$(cd "$FIXTURE" && pwd)"
 log "fixture dir: $FIXTURE"
 
-# --- JDK (Spark needs a JVM). We pin PySpark to the 3.5 line, which supports Java 8/11/17.
-#     Many brew `openjdk@N` aliases actually resolve to the latest JDK (e.g. 23), which Spark
-#     rejects, so we VALIDATE the reported major version and accept only 8/11/17 (then 21). ---
-java_major() { "$1/bin/java" -version 2>&1 | sed -nE 's/.*version "([0-9]+).*/\1/p' | head -1; }
-JAVA_HOME=""
-for cand in \
-  "/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home" \
-  "/opt/homebrew/opt/openjdk@11/libexec/openjdk.jdk/Contents/Home" \
-  "/opt/homebrew/opt/openjdk@21/libexec/openjdk.jdk/Contents/Home"; do
-  [ -x "$cand/bin/java" ] || continue
-  m="$(java_major "$cand")"
-  case "$m" in 8|11|17|21) JAVA_HOME="$cand"; log "using JDK $m at $cand"; break;; esac
-done
-if [ -z "$JAVA_HOME" ]; then
-  log "no Spark-3.5-compatible JDK (8/11/17/21) found; installing openjdk@17"
-  brew install openjdk@17
-  JAVA_HOME="/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home"
+# --- JDK (Spark needs a JVM). We pin PySpark to the 3.5 line, which supports Java 8/11/17
+#     (and 21). We check, in order: an existing JAVA_HOME, a `java` on PATH, `/usr/libexec/
+#     java_home` (macOS), then Homebrew kegs — accepting only a Spark-compatible major
+#     version. Homebrew install is the LAST resort, so this works on Linux / Intel mac /
+#     already-configured shells, not just Apple-Silicon Homebrew. ---
+# Handles both "1.8.0_xxx" (major 8) and "17.0.1" (major 17) version strings.
+java_major() {
+  "$1/bin/java" -version 2>&1 | sed -nE 's/.*version "([0-9]+)(\.([0-9]+))?.*/\1 \3/p' | head -1 \
+    | awk '{ if ($1==1) print $2; else print $1 }'
+}
+compat() { case "$1" in 8|11|17|21) return 0;; *) return 1;; esac; }
+
+JAVA_HOME_FOUND=""
+add_cand() {  # try a candidate home; keep it if java is executable AND version is compatible
+  [ -n "$JAVA_HOME_FOUND" ] && return 0
+  [ -n "${1:-}" ] && [ -x "$1/bin/java" ] || return 0
+  local m; m="$(java_major "$1")"
+  if compat "$m"; then JAVA_HOME_FOUND="$1"; log "using JDK $m at $1"; fi
+}
+# 1) caller's JAVA_HOME  2) java already on PATH  3) macOS java_home for 17/11/8
+add_cand "${JAVA_HOME:-}"
+if [ -z "$JAVA_HOME_FOUND" ] && command -v java >/dev/null 2>&1; then
+  add_cand "$(cd "$(dirname "$(command -v java)")/.." && pwd)"
+fi
+if [ -z "$JAVA_HOME_FOUND" ] && [ -x /usr/libexec/java_home ]; then
+  for v in 17 11 21 8; do add_cand "$(/usr/libexec/java_home -v "$v" 2>/dev/null)"; done
+fi
+# 4) Homebrew kegs (Apple Silicon + Intel prefixes)
+if [ -z "$JAVA_HOME_FOUND" ]; then
+  for pfx in /opt/homebrew/opt /usr/local/opt; do
+    for v in 17 11 21; do add_cand "$pfx/openjdk@$v/libexec/openjdk.jdk/Contents/Home"; done
+  done
 fi
-export JAVA_HOME
+# 5) last resort: install via Homebrew if available
+if [ -z "$JAVA_HOME_FOUND" ]; then
+  if command -v brew >/dev/null 2>&1; then
+    log "no Spark-compatible JDK (8/11/17/21) found; installing openjdk@17 via Homebrew"
+    brew install openjdk@17
+    add_cand "$(brew --prefix openjdk@17)/libexec/openjdk.jdk/Contents/Home"
+    add_cand "$(brew --prefix openjdk@17)"
+  fi
+fi
+[ -n "$JAVA_HOME_FOUND" ] || { echo "ERROR: no Spark-compatible JDK (Java 8/11/17/21) found. Install one and/or set JAVA_HOME." >&2; exit 1; }
+export JAVA_HOME="$JAVA_HOME_FOUND"
 log "JAVA_HOME=$JAVA_HOME ($(java_major "$JAVA_HOME"))"
 
+# Prerequisites for the venv step.
+command -v uv >/dev/null 2>&1 || command -v python3 >/dev/null 2>&1 || {
+  echo "ERROR: need 'uv' or 'python3' to create the pyspark venv." >&2; exit 1; }
+
 # --- Python venv + pyspark (3.5 line) via uv (fast). Fallback to python3 -m venv. ---
 VENV="$FIXTURE/.venv"
 PYSPARK_SPEC='pyspark>=3.5,<4'
diff --git a/.claude/skills/demo-creator/templates/blog.md.tmpl b/.claude/skills/demo-creator/templates/blog.md.tmpl
index 14715cb81..adc061566 100644
--- a/.claude/skills/demo-creator/templates/blog.md.tmpl
+++ b/.claude/skills/demo-creator/templates/blog.md.tmpl
@@ -16,6 +16,8 @@ landscape". Get to the point.}}
 {{Describe the real fixture in 2–3 sentences. Link it. Make clear it's a genuine task with
 a real defect/decision, not a toy with the answer pre-written.}}
 
+<!-- COMPARISON MODE ONLY — delete this whole section in showcase mode (the default), which
+     produces no baseline clip. Keep it only if you ran baseline_vs_altimate.sh. -->
 ## Baseline: plain Claude Code
 
 {{What happened when we ran the same prompt through claude-code (`claude -p`), same model.
@@ -24,8 +26,9 @@ Embed the baseline clip. Be fair — show what it got right too.}}
 ![baseline]({{clips/<angle>.baseline.gif}})
 
 {{The specific thing it missed / got wrong, stated concretely.}}
+<!-- END comparison-only section -->
 
-## With altimate-code
+## {{Showcase: "What altimate-code did" · Comparison: "With altimate-code"}}
 
 ![altimate-code]({{clips/<angle>.gif}})