Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 21 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,27 @@ Everything below is opt-in. Notes for existing consumers:
- `StepError` messages are richer (multi-line, with artifact paths). If you parsed them, prefer the new structured `error.artifacts` field.
- The timing manifest gained optional fields (`lang`, `capture`). Old kept work dirs still post-process fine.

## 0.10.0 — teardown safety & authoring state

A second dogfooding pass (umami again, on 0.9.0) surfaced a cluster of setup/teardown lifecycle gaps and the ergonomic hole the new per-tutorial setup opened. Additive for normal use — existing tutorials and adapters render unchanged. One type-only caveat: `StepContext` gained a required `state` field, so if you construct a `StepContext` object literal yourself (e.g. a test harness) it now needs that field; code that merely *receives* `ctx` in a callback is unaffected.

Bug fixes (teardown coverage):

- **Setup-phase failures no longer leak.** A throw in `adapter.setup` or `tutorial.setup` now runs the full teardown chain (step `onTeardown` thunks → `tutorial.teardown` → `adapter.teardown`) before rethrowing, instead of only `browser.close()`. `ctx.onTeardown` registered inside a `setup()` now means what it looks like it means, on the path most likely to seed-then-throw (a flaky sign-in, a warm-up `goto` timeout) (#15).
- **`preview` no longer leaks the adapter seed.** It now runs the *full* teardown chain on every exit path, not just the step thunks. `preview` is the run-repeatedly iterate tool, so the previous behavior (clean up step data, leak the adapter seed) quietly filled a shared test DB. Teardown hooks should tolerate the partial, mid-tutorial state `preview` reaches (#16).
- **Partial contact sheet on failure.** A failed render with `--contact-sheet`/`contactSheet` now emits a partial sheet of the steps that completed plus the failure frame as the last cell, instead of nothing — the at-a-glance view you most want for a failing run (#20).
- The `ctx.onTeardown` callback return type is widened to `() => unknown | Promise<unknown>` (the result is awaited and discarded), so value-returning cleanups like `() => Promise.all(...)` typecheck without a wrapper (#21).

New (additive API):

- **`ctx.state` — a typed, per-render state channel.** `adapter.setup`'s return value lands on `ctx.state`, which `tutorial.setup` and steps read — replacing the module-global + `!`-assertion handoff with something scoped to one render (parallel-safe). Steps can also stash a live-created id on it for their own `onTeardown`. Typed end-to-end via `TutorialAdapter<S>` / `tutorial<S>` / `step<S>` (#17).
- **`tutorial-forge doctor --setup`** actually runs `adapter.setup` once and tears it down, catching the "reachable but pointed at the wrong database" class of failure — a green reachability check followed by a guaranteed sign-in failure — before you wait out a whole render. Exposed programmatically as `probeAdapterSetup(adapter)`. Off by default (it seeds + signs in for real) (#19).

Docs:

- The "Settling" section now warns that `settleUntil: 'networkidle'` races React `startTransition`-deferred Server Actions (the standard Next.js App Router mutation) and steers those to `waitFor` on the committed UI (#18).
- Adapters docs gain a `ctx.state` section and a teardown-coverage matrix spelling out which hooks run on each path — clean finish, step failure, setup failure, `preview`, `doctor --setup` (#23).

## 0.9.0 — authoring loop

Ergonomics from a real-world dogfooding pass (authoring tutorials for the umami app). Additive for normal use — existing tutorials and adapters render unchanged. One type-only caveat: `StepContext` gained a required `onTeardown` method, so if you construct a `StepContext` object literal yourself (e.g. in a test harness) it now needs that field; code that merely *receives* `ctx` in a callback is unaffected.
Expand Down
261 changes: 64 additions & 197 deletions ISSUE-ACTION-REPORT.md

Large diffs are not rendered by default.

57 changes: 55 additions & 2 deletions docs/adapters.md
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,59 @@ export default tutorial('Create your first event', steps, {
});
```

Run order is **adapter.setup → tutorial.setup** going in, and **step `onTeardown` thunks (LIFO) → tutorial.teardown → adapter.teardown** coming out. Teardown runs even when a step fails mid-render, so data created before the failure is still cleaned up. Tutorials without hooks keep working through the adapter alone.
Run order is **adapter.setup → tutorial.setup** going in, and **step `onTeardown` thunks (LIFO) → tutorial.teardown → adapter.teardown** coming out. Tutorials without hooks keep working through the adapter alone.

The teardown chain runs on **every** exit path of a render — clean finish, a step failure, *and* a failure inside `adapter.setup`/`tutorial.setup` — so data created (and any `ctx.onTeardown` registered) before a throw is always cleaned up, never leaked into a shared test DB. Each hook is guarded: a teardown that runs against half-built state (e.g. after a setup failure) logs a warning instead of masking the original error. Because of this, **write teardown hooks to tolerate partial setup** (null-check what you delete).

### Sharing state between the adapter and a tutorial

`tutorial.setup` and steps usually need what `adapter.setup` established — the signed-in identity, a seeded id. Return it from `adapter.setup` and it lands on **`ctx.state`**, a per-render bag (scoped to one render, so it's parallel-safe). No module-global handoff, no `!` assertions:

```ts
interface Seed { steward: Person }

export const adapter: TutorialAdapter<Seed> = {
baseURL: 'http://localhost:3000',
async setup(page) {
const steward = await seedSteward();
await signIn(page, steward);
return { steward }; // → ctx.state
},
async teardown(page, ctx) {
await deletePerson(ctx.state.steward.id);
},
};

// Read it in the tutorial — typed via tutorial<Seed>/step<Seed>:
export default tutorial<Seed>('Send a broadcast', [
step<Seed>('Create an event.', async (page, ctx) => {
await createEvent(page, ctx.state.steward);
const id = new URL(page.url()).pathname.split('/').pop()!;
ctx.state.eventId = id; // steps can stash live-created ids…
ctx.onTeardown(() => deleteEvent(id)); // …for their own cleanup
}),
], {
async setup(page, ctx) {
await seedEventFor(ctx.state.steward); // reads what the adapter established
},
});
```

A teardown thunk's return value is awaited and discarded, so value-returning one-liners work directly: `ctx.onTeardown(() => Promise.all(people.map((p) => deletePerson(p.id))))`.

### Teardown coverage matrix

Which hooks run on each path:

| Path | step `onTeardown` | `tutorial.teardown` | `adapter.teardown` |
|---|---|---|---|
| Render — clean finish | ✓ | ✓ | ✓ |
| Render — a step fails | ✓ (for completed steps) | ✓ | ✓ |
| Render — `adapter.setup`/`tutorial.setup` throws | ✓ (registered before the throw) | ✓ | ✓ |
| `preview <step>` (any outcome) | ✓ | ✓ | ✓ |
| `doctor --setup` | ✓ | n/a (no tutorial) | ✓ |

`preview` reaches partial, mid-tutorial state yet still runs the full chain — it's run repeatedly while tuning a step, so leaving the adapter seed behind would quietly fill the DB. (This is why teardown hooks must tolerate partial setup.)

For data a *step* creates mid-tutorial, register cleanup inline with `ctx.onTeardown()` instead of tracking it in the adapter:

Expand All @@ -67,4 +119,5 @@ Thunks run in reverse registration order, so the last thing created is the first
- **Seed deterministic state.** Same inputs should produce the same video. Create fixture data with fixed names/dates; avoid "3 minutes ago"-style relative content where you can.
- **Run your app yourself.** The pipeline does not start your dev server; do that before `tutorial-forge render` (or in your CI job before the render step).
- **Keep secrets in env vars.** The adapter is plain code in your repo; read credentials from the environment, as in any e2e test.
- **Teardown failures are non-fatal.** They log a warning and the render still succeeds — teardown runs after the manifest is final.
- **Teardown failures are non-fatal, and must tolerate partial setup.** They log a warning and the render still succeeds — teardown runs after the manifest is final, and also after a *failed* setup, so null-check what you delete.
- **Verify setup before a full render.** `tutorial-forge doctor` checks the app is reachable; add `--setup` to actually run `adapter.setup` once and tear it down. It catches the "reachable but pointed at the wrong database" case — a green reachability check followed by a guaranteed sign-in failure — before you wait out a whole render.
2 changes: 1 addition & 1 deletion docs/getting-started.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ You also need `ffmpeg` and `ffprobe` (≥ 6) on PATH. Verify everything with:
pnpm exec tutorial-forge doctor
```

Run this from your project directory so `doctor` can read `forge.config.ts` and probe that the app at your adapter's `baseURL` is up — the most common render failure is simply forgetting to start the dev server.
Run this from your project directory so `doctor` can read `forge.config.ts` and probe that the app at your adapter's `baseURL` is up — the most common render failure is simply forgetting to start the dev server. Add `--setup` (`tutorial-forge doctor --setup`) to go one step further and actually run `adapter.setup` once, tearing it down afterward — this catches a reachable-but-mispointed server (e.g. the dev server on the wrong database) that would otherwise pass reachability and then fail every render at sign-in.

> **pnpm note:** if `pnpm exec tutorial-forge …` trips pnpm's ignored-builds pre-check, invoke the bin directly (`./node_modules/.bin/tutorial-forge …`) or approve the build with `pnpm approve-builds`.

Expand Down
20 changes: 18 additions & 2 deletions docs/writing-tutorials.md
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@ After a step's action, the recording needs to wait until the app has visually ca
| | What it waits on | Use when |
|---|---|---|
| **`waitFor`** | A DOM signal you choose (a locator appears, text changes, a spinner detaches) | There's a specific element/state that marks "done". This is the principled default — it waits exactly as long as needed, no more. |
| **`settleUntil`** | A page load-state signal (`networkidle` / `load` / `domcontentloaded`) | There's *no* clean DOM signal — e.g. a `router.refresh()` that just repaints, or a navigation whose result you don't want to assert on. `'networkidle'` waits for in-flight requests to quiesce. |
| **`settleUntil`** | A page load-state signal (`networkidle` / `load` / `domcontentloaded`) | There's *no* clean DOM signal — e.g. a `router.refresh()` that just repaints, or a navigation whose result you don't want to assert on. `'networkidle'` waits for in-flight requests to quiesce. **Not** for `startTransition`-deferred Server Actions — see the warning below. |
| **`settleMs`** | A fixed number of milliseconds | A last resort, or a deliberate on-screen beat *after* readiness (e.g. let an animation finish, or hold the final frame a touch longer). |

Rule of thumb: **if you can name the thing you're waiting for, use `waitFor`.** If the only signal is "the network went quiet," use `settleUntil: 'networkidle'`. Only fall back to `settleMs` for a deliberate pause or when nothing else applies — and keep it small.
Expand All @@ -86,6 +86,22 @@ step('The list refreshes with your new row.', async (page) => {

`settleUntil` is **best-effort and bounded** (~5s): a page that never goes idle — websockets, polling, server-sent events — logs and proceeds rather than failing the render, so it's safe to use even when you're not sure the app quiesces. `settleUntil` and `settleMs` compose: the signal-based wait happens first, then `settleMs` adds its on-screen hold.

> **`networkidle` races React `startTransition` — prefer `waitFor` for deferred mutations.** The standard Next.js App Router mutation — a controlled input whose `onChange` runs `startTransition(async () => { await someServerAction(); router.refresh() })` — does **not** dispatch its request synchronously. At the instant your `selectOption`/`click` returns, React hasn't fired the action's fetch yet, so the page is *momentarily* network-idle and `settleUntil: 'networkidle'` can resolve against the **pre-mutation** UI (the screenshot shows the old value). It often lands correctly by luck (the request is usually in flight by the time Playwright polls), but it's timing-dependent. Don't reach for `networkidle` here — wait on the committed result instead:
>
> ```ts
> // ❌ races the transition — can settle before the server action even dispatches
> step('Switch the status to Tickets open.', async (page) => {
> await page.getByLabel('Status').selectOption('open');
> }, { settleUntil: 'networkidle' })
>
> // ✅ wait on the committed UI — exactly as long as the repaint takes, no race
> step('Switch the status to Tickets open.', async (page) => {
> await page.getByLabel('Status').selectOption('open');
> }, { waitFor: (page) => page.getByText('Tickets open').waitFor() })
> ```
>
> Reserve `settleUntil: 'networkidle'` for cases with no committed-state signal to name — a `router.refresh()`/reload that only repaints existing data, or a navigation whose result you don't assert on.

## Iterating on a step

A passing render only proves your selectors *resolved* — not that the right thing was on screen. A step that locates a wrong-but-valid element (or one scrolled off-screen) succeeds silently. Two helpers close that gap without re-recording the whole tutorial every cycle:
Expand All @@ -99,7 +115,7 @@ tutorial-forge preview set-status --only my-tutorial

> Prior-step state is reached by replaying earlier `run()`/`waitFor()` callbacks in order — exactly what a real render does — so anything those steps set up (navigation, form state, seeded data) is present. Only the target step's `settleMs` hold is honored; intermediate pacing is skipped for speed.

**`tutorial-forge render --contact-sheet`** keeps a settled screenshot per step and emits a labeled grid PNG next to the video (`<name>-contact-sheet.png`), one thumbnail per step tagged with its id and narration. Scan it to confirm every step framed the right thing at a glance, instead of scrubbing the video. Enable it persistently with `contactSheet: true` in `forge.config.ts`.
**`tutorial-forge render --contact-sheet`** keeps a settled screenshot per step and emits a labeled grid PNG next to the video (`<name>-contact-sheet.png`), one thumbnail per step tagged with its id and narration. Scan it to confirm every step framed the right thing at a glance, instead of scrubbing the video. Enable it persistently with `contactSheet: true` in `forge.config.ts`. If a step fails mid-render, you still get a **partial** sheet of the steps that completed plus the failure frame as the last cell — the at-a-glance view of how the run got to the failure.

## Timing manifest

Expand Down
2 changes: 1 addition & 1 deletion packages/cli/package.json
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"name": "tutorial-forge-cli",
"version": "0.9.0",
"version": "0.10.0",
"description": "CLI for tutorial-forge: render scripted Playwright walkthroughs into narrated tutorial videos",
"type": "module",
"license": "MIT",
Expand Down
34 changes: 30 additions & 4 deletions packages/cli/src/doctor.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import { existsSync } from 'node:fs';
import { ffmpegVersion } from 'tutorial-forge';
import { ffmpegVersion, probeAdapterSetup, type ForgeConfig } from 'tutorial-forge';
import { loadConfig } from './load.js';

interface Check {
Expand Down Expand Up @@ -36,7 +36,7 @@ async function probeReachable(baseURL: string, timeoutMs = 3000): Promise<Check>
}
}

export async function doctorCommand(opts: { config?: string } = {}): Promise<void> {
export async function doctorCommand(opts: { config?: string; setup?: boolean } = {}): Promise<void> {
const checks: Check[] = [];

const nodeMajor = parseInt(process.versions.node.split('.')[0]!, 10);
Expand Down Expand Up @@ -81,17 +81,43 @@ export async function doctorCommand(opts: { config?: string } = {}): Promise<voi
// App reachability: only when we can resolve a baseURL from the config.
// A missing/broken config is the render command's problem to report, not
// doctor's — here it just means we skip the probe rather than fail.
let config: ForgeConfig | null = null;
try {
const config = await loadConfig(process.cwd(), opts.config);
checks.push(await probeReachable(config.adapter.baseURL));
config = await loadConfig(process.cwd(), opts.config);
} catch {
/* no config — handled below */
}
if (config) {
checks.push(await probeReachable(config.adapter.baseURL));
} else {
checks.push({
name: 'app reachable',
ok: true,
detail: 'skipped (no forge.config found — run from a project to probe baseURL)',
});
}

// Opt-in (--setup): actually run adapter.setup and tear it down, catching the
// wrong-database / unseedable-target class of failure that a reachable but
// mispointed dev server hides behind a green check. Off by default because it
// seeds + signs in for real (#19).
if (opts.setup) {
if (config) {
try {
await probeAdapterSetup(config.adapter);
checks.push({ name: 'adapter.setup', ok: true, detail: 'ran and tore down cleanly' });
} catch (err) {
checks.push({
name: 'adapter.setup',
ok: false,
detail: `${err instanceof Error ? err.message : String(err)} — is the dev server pointed at the right database?`,
});
}
} else {
checks.push({ name: 'adapter.setup', ok: true, detail: 'skipped (no forge.config found)' });
}
}

let failed = false;
for (const c of checks) {
console.log(`${c.ok ? '✓' : '✗'} ${c.name.padEnd(22)} ${c.detail}`);
Expand Down
1 change: 1 addition & 0 deletions packages/cli/src/main.ts
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,7 @@ program
.command('doctor')
.description('check the environment (node, ffmpeg, playwright, TTS env vars, app reachability)')
.option('--config <path>', 'path to forge.config.ts')
.option('--setup', "also run the adapter's setup() and tear it down, to catch a wrong-database/unseedable target")
.action(doctorCommand);

program
Expand Down
2 changes: 1 addition & 1 deletion packages/core/package.json
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"name": "tutorial-forge",
"version": "0.9.0",
"version": "0.10.0",
"description": "Turn scripted Playwright walkthroughs into narrated tutorial videos (MP4). Tutorials are source code: re-render instead of re-recording when your UI changes.",
"type": "module",
"license": "MIT",
Expand Down
4 changes: 2 additions & 2 deletions packages/core/src/i18n.ts
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ import { logger } from './util/logger.js';
* - A language with no table at all throws: rendering a wholly untranslated
* tutorial under a language suffix would silently produce a mislabeled video.
*/
export function localizeTutorial(tutorial: Tutorial, lang: string, defaultLang = 'en'): Tutorial {
export function localizeTutorial<S = unknown>(tutorial: Tutorial<S>, lang: string, defaultLang = 'en'): Tutorial<S> {
if (lang === defaultLang) return tutorial;

const table = tutorial.translations?.[lang];
Expand Down Expand Up @@ -53,6 +53,6 @@ export function localizeTutorial(tutorial: Tutorial, lang: string, defaultLang =
}

/** Languages a tutorial can render in: the default language plus every translation table. */
export function availableLanguages(tutorial: Tutorial, defaultLang = 'en'): string[] {
export function availableLanguages(tutorial: Pick<Tutorial, 'translations'>, defaultLang = 'en'): string[] {
return [defaultLang, ...Object.keys(tutorial.translations ?? {}).filter((l) => l !== defaultLang)];
}
1 change: 1 addition & 0 deletions packages/core/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ export {
type ContactSheetStyle,
} from './pipeline/contact-sheet.js';
export { previewStep, type PreviewOptions, type PreviewResult } from './pipeline/preview.js';
export { probeAdapterSetup } from './pipeline/preflight.js';

export { SilentProvider } from './tts/silent.js';
export { Piper, type PiperOptions } from './tts/piper.js';
Expand Down
Loading
Loading