diff --git a/docs/getting-started.md b/docs/getting-started.md index 1d163a5..75e1c02 100644 --- a/docs/getting-started.md +++ b/docs/getting-started.md @@ -77,6 +77,8 @@ tutorial-forge render # render everything tutorial-forge render --only my-id # iterate on one tutorial tutorial-forge render --headed # watch the browser while recording tutorial-forge render --phase post # re-merge without re-recording (uses .forge//) +tutorial-forge render --out-dir dist # override config.outDir for this run +tutorial-forge render --concurrency 2 # cap parallel TTS synthesis (overrides config.ttsConcurrency) tutorial-forge clean --cache # remove work dirs and the TTS cache ``` diff --git a/ISSUE-ACTION-REPORT.md b/docs/reviews/2026-06-13-issue-action-report.md similarity index 100% rename from ISSUE-ACTION-REPORT.md rename to docs/reviews/2026-06-13-issue-action-report.md diff --git a/docs/reviews/2026-06-13-pedagogy-review.md b/docs/reviews/2026-06-13-pedagogy-review.md new file mode 100644 index 0000000..8428cfd --- /dev/null +++ b/docs/reviews/2026-06-13-pedagogy-review.md @@ -0,0 +1,207 @@ +# Pedagogy review — tutorial-forge (2026-06-13) + +Reviewer: instructional-designer (advisory). Scope: do the docs and authoring model help +authors produce tutorials that *teach*, and where should the tool go next to make the +*tutorials it produces* more effective at training people on complex software? Video craft +and factual accuracy are out of scope (audited separately). Recommendations go to the +product-manager; no issues filed. + +Reviewed: README.md, docs/getting-started.md, docs/writing-tutorials.md, docs/adapters.md, +packages/example-app/tutorials/getting-started.tutorial.ts, tutorial-forge-spec.md (context). + +Evaluated against established multimedia-learning and cognitive-load principles +(Mayer's segmenting, signaling, modality, redundancy, coherence, pre-training, +temporal-contiguity; Sweller's intrinsic/extraneous/germane load; Merrill's +worked-example→practice progression). + +--- + +## Headline + +tutorial-forge is **engineering-excellent and pedagogy-silent.** The *machine* embodies +several of the strongest evidence-based learning principles almost by accident of good +engineering — but the *documentation* teaches authors only how to drive the machine, not +how to write something that teaches. The result: the tool makes it easy to produce a +technically-correct video and gives almost no help producing an *instructionally* good one. +The single biggest lever is that step boundaries are already first-class in the timing +manifest, so most high-impact pedagogy features are cheap additions, not rewrites. + +--- + +## Part 1 — Pedagogy review of the docs + +### What the architecture gets right (credit where due) + +- **Narration-first pacing honors temporal contiguity and coherence.** + writing-tutorials.md:59-61 — narration is synthesized and measured first, and each step + holds on screen at least as long as its line. Audio and the corresponding visual + co-occur (temporal-contiguity principle), and there's no dead air or rushed-past visual. + This is a genuinely strong pedagogical foundation that most screen-recording tools get + wrong (they retime voiceover onto a fixed recording). +- **Signaling is first-class and well-built.** Fake cursor, click callouts, `--zoom` + toward targets, and `opts.focus` to anchor attention (writing-tutorials.md:30, 44-54; + getting-started.md:85) are a textbook implementation of the signaling/cueing principle — + directing the learner's eye to the essential element and suppressing everything else. +- **Extraneous load reduction.** `--idle-speedup` (getting-started.md:89) removes dead + waiting; pre-roll trimming removes setup. Both cut extraneous cognitive load. +- **Modality principle honored (silently).** Narration is audio, not on-screen prose, so + the visual channel isn't competing with on-screen text. Good — but unstated, so an author + who starts pasting paragraphs of on-screen text wouldn't know they're breaking it. +- **Localization respects speech-rate differences** (writing-tutorials.md:155): each + language re-derives pacing from its actual narration duration. Equity *and* correctness. +- **Determinism / seeded state** (adapters.md:119) gives consistent, repeatable worked + examples — the same demo every time. + +### Where the docs steer authors away from teaching (the gaps) + +1. **Narration guidance is one sentence — the highest-leverage teaching surface is + undocumented.** writing-tutorials.md:29 defines narration as "the line spoken over this + step. Plain text (no SSML). May be `''`." That's it. There is no guidance on *how to + write a line that teaches*: explain the *why* not just the *what*, one idea per step, + use signaling language ("notice…", "here…"), write for the ear (short sentences), not + for reading. Ironically the example models this well — + "Give the event a descriptive name. **This is what attendees will see on their + invitations**" (getting-started.tutorial.ts:30) explains *why* — but the docs never + extract that as a principle, so it's luck, not guidance. + +2. **No guidance on step granularity / segmenting** — the single most impactful + multimedia principle. The docs treat a "step" as a purely mechanical unit (a narration + line + an action) and never tell authors how much one step should carry. The example + shows the inconsistency this produces: "open the Events page" is one click + (getting-started.tutorial.ts:14-20) while the Settings step does nav + heading-wait + + checkbox in a single step/line (getting-started.tutorial.ts:60-67). Segmenting theory + says one conceptual chunk per segment, learner able to absorb before the next. Authors + get no help chunking a complex procedure. + +3. **No learning-objective / advance-organizer pattern.** The example opens with a decent + advance organizer in narration ("In this short tour, we will create your first event and + adjust a workspace setting," getting-started.tutorial.ts:5) and closes with a recap + ("Your event is drafted, your workspace is configured…", :70). Both are good practice — + and both are invisible to the docs. There's no `Tutorial.objectives`, no documented + "open with what they'll be able to do, close with a recap" pattern. Pre-training and + advance-organizer effects go unsupported. + +4. **Captions are framed purely as subtitles, with no redundancy nuance.** + getting-started.md:83 — captions are verbatim narration, burn or sidecar. For + accessibility this is correct and should stay the default. But the docs present no + awareness that verbatim captions + identical audio is the *redundancy* condition (can + add load for sound-on learners), and offer no alternative (e.g. key-term captions). The + tension between the redundancy principle and accessibility is real and worth naming. + +5. **The README frames the value proposition as maintenance, not learning.** "How it + compares" (README.md:62-67) and the whole pitch center on *freshness* — tutorials-as- + code, regenerate in CI, never go stale. That's a real and excellent DevEx win, but it + reveals the project's center of gravity: "keep the video from going stale," not "make a + video that teaches well." Nothing here is wrong; it's the strategic gap that Part 2 + addresses. A maintained-but-mediocre tutorial is still mediocre. + +### Inherent limit (not a doc bug, but the ceiling) + +The output is a linear MP4: a **worked example** the learner *watches*. There is no +practice, no retrieval, no learner pacing within the video, no faded guidance — all of +which the evidence says drive durable skill acquisition for procedural software tasks. +This is the natural ceiling of "video," and it's where the most ambitious Part 2 ideas aim. + +--- + +## Part 2 — New directions (prioritized by pedagogical impact) + +Architecture in play: narration-first pipeline; a **timing manifest** with first-class +per-step boundaries, action windows, audio durations, and callout boxes; a post-stage +ffmpeg filter graph; the adapter model. Step boundaries already being in the manifest is +the key lever — several high-impact features are new *consumers* of data that already +exists. + +| # | Direction | Impact | Feasibility | Why it teaches | +|---|---|---|---|---| +| 1 | **Chapters / segmenting** | High | High | Segmenting principle — the strongest, cheapest win | +| 2 | **Teaching-narration guidance + germane lints** | High | High | Multiplies quality of every tutorial authored | +| 3 | **Objective + recap cards (first-class)** | High | Medium | Advance-organizer, pre-training, summary | +| 4 | **Interactive HTML output w/ knowledge checks** | High | Low | Retrieval practice / generative learning — highest ceiling | +| 5 | **Caption modes (verbatim vs key-term)** | Medium | High | Resolves redundancy vs accessibility | +| 6 | **Accessibility-as-pedagogy pass** | Medium | Medium | Universal design helps all learners | +| 7 | **Effectiveness instrumentation** | Medium | Low (core) | Closes the loop: did it actually teach? | +| 8 | **Faded-practice "your turn" mode** | High | Low | Watch→do; the spec is already the checker | + +### 1. Chapters / segmenting — *do this first* +The manifest already has per-step start/end. Emit **MP4 chapter markers** (ffmpeg metadata +/ chapter file) and/or a **WebVTT chapters track** plus a YouTube-style timestamp list in +output. Optionally let authors group steps into named sections (a `section?: string` on +step, or a `chapter('name', [...steps])` grouping). This gives learners navigation and +self-pacing *within* a single video — the segmenting effect, one of the largest and most +reliable in the literature — at near-zero cost. Highest impact-to-effort ratio in the list. + +### 2. Teaching-narration guidance + germane lints +Add a "Writing narration that teaches" section to writing-tutorials.md (right after the +one-line narration bullet at :29): one idea per step; explain *why*, not only *what*; use +signaling language; open with an objective, close with a recap; write for the ear. Then +back the most objective of these with **optional load-time warnings** — the validation +infra already runs at spec load (writing-tutorials.md:182-183): warn on over-long +narration per step (overload), steps bundling multiple unrelated actions (segmenting), and +a missing intro/recap step. Cheap; multiplies the quality of every tutorial anyone authors. + +### 3. Objective + recap cards as first-class artifacts +Add `objectives?: string[]` (and/or `summary?`) to the `Tutorial` type and render an +optional **title/objective card** at the start and a **recap card** at the end. The spec +already lists intro/outro cards as a "deferred-but-designed-for" post stage +(tutorial-forge-spec.md:223), and the caption system already renders browser HTML → +composited frames, so the rendering path exists. Serves the advance-organizer, +pre-training, and summary principles, and makes the good practice the example already +models (getting-started.tutorial.ts:5, :70) explicit and reusable rather than incidental. + +### 4. Interactive HTML output with knowledge checks — *the high-ceiling bet* +A pure MP4 can't do retrieval practice, the single biggest driver of durable learning. A +new **HTML output target** that wraps the rendered video with chapter navigation (from #1) +and authored **knowledge-check questions between chapters** would add generative/retrieval +activity — the thing video structurally can't. The v1 spec explicitly excluded HTML demos +(tutorial-forge-spec.md:21); this is the right time to revisit that as a *complement*, not +a replacement. Biggest architectural lift here, so frame it as a longer-term initiative — +but the timing manifest makes the chaptered player itself very feasible; the questions are +the new authoring surface. + +### 5. Caption modes: verbatim vs. key-term +Offer `captions: 'full' | 'keyterms' | 'off'` (today it's effectively full-or-off, +getting-started.md:83). `keyterms` surfaces only signaled labels/vocabulary instead of the +full sentence — reducing redundancy load for sound-on learners while still teaching the +vocabulary (a pre-training flavor), with `full` remaining the accessible default. Feasible: +caption text is already per-step. + +### 6. Accessibility-as-pedagogy pass +Guidance + light defaults: never narrate "click here" — name *what* and *where* (also +serves sound-off and low-vision learners); legible caption defaults (styling already +exists via `captionStyle`); flag steps whose narration is purely deictic. Universal-design +moves measurably help *all* learners, not only those who need them. + +### 7. Effectiveness instrumentation +The project today optimizes "the video doesn't go stale" but can't answer "did it teach?" +If output ships with chapter markers (#1) and an analytics-friendly player (#4), consumers +can measure per-chapter drop-off, replays, and completion — turning segment boundaries into +a learning signal. Out of scope for the core MP4 encoder, but the natural strategic +destination: close the loop from *maintained* to *effective*. + +### 8. Faded-practice "your turn" mode — *the striking architectural synergy* +Procedural skill comes from watch→do, not watch alone. Because a tutorial is *already an +executable Playwright spec*, the same spec could generate a **guided-practice mode**: run +the real app, prompt the learner to perform each step themselves, and validate with the +same locators the demo uses. The spec is simultaneously the demonstration *and* the checker +— a synergy no recording-based tool can match. Big lift, research-grade, but the highest +learning ceiling in this list, paired with #4. + +--- + +## Recommendation to the product-manager + +- **Now / cheap / high-impact:** #1 (chapters) and #2 (narration guidance + lints). Both + are small, ride existing infrastructure (manifest, load-time validation, docs), and lift + every tutorial. Recommend filing these as near-term issues. +- **Next:** #3 (objective/recap cards) and #5 (caption modes) — modest post-stage and + config work, clear pedagogical payoff. +- **Strategic bets (RFC / spike first):** #4 (interactive HTML + knowledge checks), + #8 (faded-practice mode), #7 (instrumentation). These move the product from "videos that + don't go stale" to "training that demonstrably works," and #4/#8 are where the largest + learning gains live. They warrant a design doc before committing. + +The throughline: the engine already does the hard, load-bearing pedagogy (pacing, +signaling, coherence). The cheapest wins are teaching authors to use it well (#2) and +surfacing the structure the manifest already knows (#1, #3). The biggest wins are adding +the one thing video fundamentally lacks — learner activity (#4, #8). diff --git a/docs/reviews/2026-06-13-report-for-john.md b/docs/reviews/2026-06-13-report-for-john.md new file mode 100644 index 0000000..f15dfcc --- /dev/null +++ b/docs/reviews/2026-06-13-report-for-john.md @@ -0,0 +1,114 @@ +# Report for John — docs cleanup, pedagogy review & backlog triage + +**Date:** 2026-06-13 +**Prepared by:** Claude Code, with the product-manager and instructional-designer agents + +This session ran end-to-end: a documentation audit → fixes → an instructional-design +review of the docs → backlog triage of the resulting ideas. Here's everything that +happened and where it leaves you. + +--- + +## 1. Documentation audit & fixes + +A product-manager audit found the **user-facing docs are current and accurate** with the +shipped 0.10.0 surface (README, the three `docs/` guides, package READMEs, CHANGELOG, +agents README — no drift). Two stray, non-user-facing artifacts were the only real gaps, +both now fixed: + +| Fix | What changed | Issue | +|---|---|---| +| Historical-spec banner | Added a prominent "Historical design document" banner to `tutorial-forge-spec.md` — marks it pre-v0.1, settles final names (`tutorial-forge` / `tutorial-forge-cli` / bin `tutorial-forge`+`tforge`), points to README/docs/CHANGELOG as authoritative | #33 (closed) | +| Stray review report moved | `ISSUE-ACTION-REPORT.md` → `docs/reviews/2026-06-13-issue-action-report.md` | #34 (closed) | +| Render-flag coverage | Documented `--out-dir` and `--concurrency` in `docs/getting-started.md` (verified against `render.ts`) | — | + +Repo root is now clean: `README.md`, `CHANGELOG.md`, `tutorial-forge-spec.md`. +**These changes are in the working tree, uncommitted.** + +--- + +## 2. Pedagogy review (instructional-designer) + +Full review: [`docs/reviews/2026-06-13-pedagogy-review.md`](2026-06-13-pedagogy-review.md). + +**Headline: the engine is pedagogy-strong; the docs are pedagogy-silent.** The pipeline +already embodies several of the strongest evidence-based learning principles by good +engineering — but the documentation teaches authors how to drive the machine, not how to +write something that *teaches*. + +**Already right (credit):** narration-first pacing (temporal contiguity + coherence); +cursor/callouts/zoom/focus (signaling); idle-speedup + pre-roll trim (extraneous-load +reduction); per-language pacing (equity). + +**Gaps:** +1. Narration guidance is *one sentence* — the highest-leverage teaching surface is + undocumented. +2. No guidance on **step granularity / segmenting** (the most impactful principle); the + example tutorial is itself inconsistent. +3. No **objective / recap** pattern documented (the example models both but never names + them). +4. Captions framed purely as verbatim subtitles — no redundancy-vs-accessibility nuance. +5. The README pitch centers on *freshness/maintenance*, not *learning* — the strategic gap. + +**Inherent ceiling:** the output is a linear MP4 — a worked example you *watch*. No +practice, no retrieval, no learner pacing. That ceiling is where the boldest ideas aim. + +--- + +## 3. Backlog triage (product-manager) + +The PM spot-checked the review's feasibility claims against the code (all three held: the +manifest carries per-step boundaries `types.ts:188-203`; load-time validation exists +`spec.ts:21-60`; intro/outro cards are pre-designed `spec.md:223` + the card-render path +exists in `post/captions.ts`) and **filed 5 issues**: + +| # | Title | Bucket | Labels | +|---|---|---|---| +| **#35** | Chapters / segmenting from per-step manifest | **Now** | enhancement | +| **#36** | Teaching-narration guidance + germane load-time lints | **Now** | documentation, enhancement | +| **#37** | Objective + recap cards as first-class artifacts | **Next** | enhancement | +| **#38** | Caption modes: verbatim vs. key-term | **Next** | enhancement | +| **#39** | RFC / roadmap: from "maintained" to "effective" | **Later** | enhancement, question | + +**PM judgment calls:** +- The three strategic bets (interactive HTML output, faded-practice mode, effectiveness + instrumentation) were **merged into one RFC/roadmap issue (#39)** — they're + interdependent (all presuppose a shared chaptered web player), not parallel, and reduce + to one upstream decision: *does the project expand beyond video?* +- Accessibility-as-pedagogy (#6 in the review) was **folded into #36** rather than filed + separately (heavy overlap; `captionStyle` already exists). Can be split out if you want + it tracked as its own audit workstream. +- An API correction was baked into the issues: the review said `captions:`; the real field + is `subtitles: 'burn' | 'sidecar' | 'off'` (`types.ts:130`). + +**Recommended sequencing:** #35 chapters first (it's the substrate for #37 card timing, +#39's HTML player, and #39's analytics) → #36 narration docs+lints in parallel (no code +dependency) → #37 cards → #38 caption modes → #39 only after a design doc. + +--- + +## 4. Decisions needed from you (these block work) + +1. **Does tutorial-forge's scope expand beyond video?** Gates #39. An HTML player and a + live practice runner both reverse the explicit v1 non-goal "No interactive/HTML demos" + (`spec.md:21`). Product-identity call — nothing in #39 starts until you decide. +2. **Lint default-on vs. opt-in** (#36). PM rec: over-long-narration on by default + (objective, high signal); missing-intro/recap behind opt-in strict mode (heuristic, + noisier). +3. **Caption-modes API shape** (#38). PM rec: a new orthogonal option + (`captionText: 'verbatim' | 'keyterms'`) keeping delivery (`subtitles`) separate from + content, vs. overloading the `subtitles` enum. +4. **Cards: visual-only or narrated?** (#37). PM rec: visual-only in v1 — authors already + speak the organizer in step-1 narration, so narrating cards too would manufacture the + redundancy the review warns against. + +--- + +## Suggested next step + +If you're happy with the direction, the cheapest high-impact move is to greenlight +**#35 (chapters)** and **#36 (narration guidance + lints)** — both ride existing +infrastructure and lift every tutorial. #39 is parked behind decision #1. + +Also pending: the documentation fixes in §1 are **uncommitted** — say the word and I'll +commit them (and these review/report files). diff --git a/tutorial-forge-spec.md b/tutorial-forge-spec.md index d333f15..34094c3 100644 --- a/tutorial-forge-spec.md +++ b/tutorial-forge-spec.md @@ -1,6 +1,8 @@ # tutorial-forge — Specification -> **Working name:** `tutorial-forge` (npm scope TBD, e.g. `@jb/tutorial-forge`). Rename freely; nothing below depends on the name. +> **⚠️ Historical design document.** This is the pre-v0.1 design spec, kept for context on *why* the system is shaped the way it is. It is **not** current user documentation and some details have since changed (final package names, the full CLI flag set, the dropped Umami framing). For authoritative, up-to-date guidance see **[README.md](README.md)**, the **[docs/](docs/)** guides (`getting-started`, `writing-tutorials`, `adapters`), and **[CHANGELOG.md](CHANGELOG.md)**. Where this document and those disagree, those win. +> +> **Names as shipped:** library `tutorial-forge`, CLI package `tutorial-forge-cli`, binary `tutorial-forge` (alias `tforge`). The original spec used a `forge` binary throughout; read the `forge ` examples below as `tutorial-forge `. ## 1. What this is