diff --git a/CHANGELOG.md b/CHANGELOG.md index 7cf9ecd..1200826 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -18,6 +18,27 @@ Everything below is opt-in. Notes for existing consumers: - `StepError` messages are richer (multi-line, with artifact paths). If you parsed them, prefer the new structured `error.artifacts` field. - The timing manifest gained optional fields (`lang`, `capture`). Old kept work dirs still post-process fine. +## 0.10.0 — teardown safety & authoring state + +A second dogfooding pass (umami again, on 0.9.0) surfaced a cluster of setup/teardown lifecycle gaps and the ergonomic hole the new per-tutorial setup opened. Additive for normal use — existing tutorials and adapters render unchanged. One type-only caveat: `StepContext` gained a required `state` field, so if you construct a `StepContext` object literal yourself (e.g. a test harness) it now needs that field; code that merely *receives* `ctx` in a callback is unaffected. + +Bug fixes (teardown coverage): + +- **Setup-phase failures no longer leak.** A throw in `adapter.setup` or `tutorial.setup` now runs the full teardown chain (step `onTeardown` thunks → `tutorial.teardown` → `adapter.teardown`) before rethrowing, instead of only `browser.close()`. `ctx.onTeardown` registered inside a `setup()` now means what it looks like it means, on the path most likely to seed-then-throw (a flaky sign-in, a warm-up `goto` timeout) (#15). +- **`preview` no longer leaks the adapter seed.** It now runs the *full* teardown chain on every exit path, not just the step thunks. `preview` is the run-repeatedly iterate tool, so the previous behavior (clean up step data, leak the adapter seed) quietly filled a shared test DB. Teardown hooks should tolerate the partial, mid-tutorial state `preview` reaches (#16). +- **Partial contact sheet on failure.** A failed render with `--contact-sheet`/`contactSheet` now emits a partial sheet of the steps that completed plus the failure frame as the last cell, instead of nothing — the at-a-glance view you most want for a failing run (#20). +- The `ctx.onTeardown` callback return type is widened to `() => unknown | Promise` (the result is awaited and discarded), so value-returning cleanups like `() => Promise.all(...)` typecheck without a wrapper (#21). + +New (additive API): + +- **`ctx.state` — a typed, per-render state channel.** `adapter.setup`'s return value lands on `ctx.state`, which `tutorial.setup` and steps read — replacing the module-global + `!`-assertion handoff with something scoped to one render (parallel-safe). Steps can also stash a live-created id on it for their own `onTeardown`. Typed end-to-end via `TutorialAdapter` / `tutorial` / `step` (#17). +- **`tutorial-forge doctor --setup`** actually runs `adapter.setup` once and tears it down, catching the "reachable but pointed at the wrong database" class of failure — a green reachability check followed by a guaranteed sign-in failure — before you wait out a whole render. Exposed programmatically as `probeAdapterSetup(adapter)`. Off by default (it seeds + signs in for real) (#19). + +Docs: + +- The "Settling" section now warns that `settleUntil: 'networkidle'` races React `startTransition`-deferred Server Actions (the standard Next.js App Router mutation) and steers those to `waitFor` on the committed UI (#18). +- Adapters docs gain a `ctx.state` section and a teardown-coverage matrix spelling out which hooks run on each path — clean finish, step failure, setup failure, `preview`, `doctor --setup` (#23). + ## 0.9.0 — authoring loop Ergonomics from a real-world dogfooding pass (authoring tutorials for the umami app). Additive for normal use — existing tutorials and adapters render unchanged. One type-only caveat: `StepContext` gained a required `onTeardown` method, so if you construct a `StepContext` object literal yourself (e.g. in a test harness) it now needs that field; code that merely *receives* `ctx` in a callback is unaffected. diff --git a/ISSUE-ACTION-REPORT.md b/ISSUE-ACTION-REPORT.md index fa35779..b5d406e 100644 --- a/ISSUE-ACTION-REPORT.md +++ b/ISSUE-ACTION-REPORT.md @@ -1,210 +1,77 @@ # tutorial-forge — Issue Action Report -_Prepared 2026-06-13 against `main` (v0.8.0). Author: maintainer review._ +_Prepared 2026-06-13 against `main` (v0.9.0 → v0.10.0). Author: maintainer review._ -All eight open issues except #1 came out of a single real-world authoring session: -building the umami "steward how-to" tutorials (`run-an-event`, `send-a-broadcast`) on -v0.8.0. They are field notes from someone actually authoring a non-trivial tutorial, which -makes them unusually high-signal. #1 predates that session and is a feature proposal. +Issues #15–#21 came out of a second real-world dogfooding session: authoring the umami +"steward how-to" tutorials (`run-an-event`, `send-a-broadcast`) on **v0.9.0**, the release +that added per-tutorial setup/teardown (#8). #22–#23 are gaps filed during this review. +(The prior round — #8–#14, the authoring inner loop — shipped in 0.9.0; this report +supersedes that one.) --- ## Executive summary -The umami session exposed one dominant theme: **the authoring inner loop is slow and -blind.** An author writes steps, runs a full render, and then cannot easily tell whether -the result is correct without scrubbing video or hand-extracting frames with ffmpeg. -Concretely, the pain clusters into three groups: +The dominant theme: **the setup/teardown lifecycle #8 introduced has a correctness hole +and an ergonomic hole.** Six of seven issues cluster there. -1. **Slow, blind iteration (the core pain).** Re-recording all steps to tune step 11 of 13 - (#11), no way to see per-step output without ffmpeg (#9), and no feedback that a feature - (idleSpeedup) even did anything (#13). These are about closing the edit→verify loop. +- **Correctness:** teardown didn't run on the setup-failure path (#15) or fully on the + `preview` path (#16), so a shared test DB silently accumulated leaked rows. +- **Ergonomics:** there was no first-class channel for `tutorial.setup`/steps to read what + `adapter.setup` established, forcing a module-global + `!` handoff (#17). +- Plus a settling foot-gun in the standard Next.js mutation pattern (#18), a doctor blind + spot (#19), a missing failure-time diagnostic (#20), and a type papercut (#21). -2. **Choreography gaps on non-click interactions (the most visible output-quality bug).** - `selectOption` / `fill` / `check` on off-screen or native controls produce "visually - dead frames" — narration talking about a control the camera never moves to or scrolls - into view (#10). This directly degrades the rendered video, which is the product. - -3. **Setup/teardown and readiness ergonomics (workarounds the author had to invent).** - A single global adapter forced state hacks and data leaks across tutorials (#8); fixed - `settleMs` magic numbers stood in for a real "wait until settled" signal (#14); and - `doctor` didn't catch the single most common failure — the app not running (#12). - -`forge publish` (#1) is a separate, forward-looking distribution feature, not session -friction. - -A notable finding from reading the code: **#10 and #13 are already partly built**, which -changes their effort and framing (details below). #13 is essentially a one-line fix. - ---- - -## Per-issue assessment - -Importance: Critical / High / Medium / Low. Effort: S (hours) / M (a day or two) / L (multi-day). - -| # | Title (restated) | Importance | Effort | Notes / dependencies | -|---|---|---|---|---| -| 13 | Log idleSpeedup even when it compresses nothing | Medium | **S** | One-line fix; already 90% there. Quick win. | -| 12 | `doctor` probes adapter `baseURL` reachability | High | S–M | Needs doctor to load config (it currently doesn't). Big friction-per-effort. | -| 9 | Keep per-step screenshots on success + contact sheet | High | M | Reuses existing `--debug` screenshot path. Core to the verify loop. | -| 10 | Cursor/callout/auto-scroll for fill/select/check | High | M | **Partly done already** — see below. Real gap is auto-scroll. Output quality. | -| 11 | Render a step range / single-step preview | High | M–L | Hardest of the loop trio (state deps). Biggest time saver if solved. | -| 14 | Docs: settleMs vs waitFor + a network-idle settle | Medium | S (docs) / M (helper) | Split: docs are S; `settleUntil` helper is M. | -| 8 | Per-tutorial setup/teardown hooks + step teardown | High | M | Type + pipeline change; backward-compatible. Removes real workarounds. | -| 1 | `forge publish` to hosting (Vimeo first) | Low (now) | L | Net-new surface, external API auth, well-specced but premature. | - -### #13 — idleSpeedup logs nothing on a no-op · Medium · S -**Verified in code.** `packages/core/src/pipeline/post.ts` lines 85–98 only log inside -`if (segments.length > 0)`. When idleSpeedup runs but finds no span over the threshold, it -emits nothing — exactly the reported symptom. By contrast zoom always logs (line 132). Fix -is to add an `else` branch (`idle: no spans over ms`) and, ideally, normalize the -existing line to the issue's suggested phrasing. Trivial, high-confidence, no risk. **Best -first quick win.** - -### #12 — `doctor` doesn't check the app is up · High · S–M -**Verified.** `packages/cli/src/doctor.ts` checks node/ffmpeg/ffprobe/chromium/TTS env vars -but never loads the config or touches `adapter.baseURL`. "Forgot to start the dev server" is -the single most common real failure and currently surfaces as a Playwright navigation -timeout deep inside a render. Effort is S–M only because `doctor` must now load -`forge.config.ts` (the render path already does this via `cli/src/load.ts`, so it's -reusable). Keep it best-effort: skip cleanly if no config is resolvable, so `doctor` stays -useful outside a project. The issue's pnpm aside (`pnpm exec ... doctor` fails on an -ignored-builds pre-check) is a real, separate docs nit worth a one-liner. - -### #9 — Keep success screenshots + contact sheet · High · M -**Verified gap.** Screenshots are written only on failure (`captureFailure`) or in -`--debug` (`debugScreenshot`, `packages/core/src/pipeline/record.ts`). The infrastructure -already exists — the work is (a) optionally keep an end-of-step screenshot on success, and -(b) assemble a labeled contact-sheet PNG (ffmpeg `tile` or montage of the per-step shots). -This is the cheapest direct answer to "a passing render doesn't prove the right thing -rendered," which the author called their *entire verification loop*. Recommend `--contact- -sheet` opt-in (default-on risks slowing every render). Pairs naturally with #11. - -### #10 — Choreography for non-click interactions · High · M (reduced) -**Partially implemented already — re-scope before building.** In -`packages/core/src/browser/instrument.ts`: `ACTION_METHODS` already includes `fill`, -`check`, `uncheck`, `selectOption`, so the **cursor already travels** to those targets; and -`CLICK_METHODS` already includes `selectOption`/`check`/`uncheck`, so they **already get the -ring + pulse**. So the issue is partly stale. The genuine remaining gaps: - - **No auto-scroll-into-view anywhere.** `presentAction` reads `boundingBox()` but never - scrolls the element into frame — this is the actual umami `