fix(docs-site): stop ignoring version snapshots; assert it stays that way #2313

Merged: jeffredodd merged 3 commits into main from fix/publish-docs-stage-version-snapshots on Jun 30, 2026

@jeffredodd jeffredodd commented Jun 30, 2026

Summary

docs-site/.gitignore lists versions.json, versioned_docs/, and versioned_sidebars/ as ignored. The entries were added in #2167 to keep local docusaurus docs:version scratch out of SDK PRs. But that same .gitignore syncs downstream into Gusto/embedded-sdk-docs, where those files are the tracked archive, and the sync workflow's preserve, restore, then git add flow has silently dropped them on every run since.

Net effect: the version picker on sdk.gusto.com has been gone since June 16. Snapshot data has been silently deleted from embedded-sdk-docs/main for two weeks.

How docs versioning is split between repos

Latest in the SDK, archive in the docs repo. The SDK is purely the content producer; Gusto/embedded-sdk-docs carries the version archive + deployment.

| File / directory | SDK repo (embedded-react-sdk) | Docs repo (embedded-sdk-docs) |
| --- | --- | --- |
| docs/ (live current) | ✅ Source of truth (what the team edits) | ✅ Mirror, refreshed on every sync |
| docs-site/ (config, plugins, theme) | ✅ Source of truth | ✅ Mirror |
| docs-site/versions.json | ❌ (never created in normal SDK dev) | ✅ The archive index |
| docs-site/versioned_docs/version-X.Y/ | ❌ (never created in normal SDK dev) | ✅ Frozen snapshots, one per cut minor |
| docs-site/versioned_sidebars/version-X.Y-sidebars.json | ❌ (never created in normal SDK dev) | ✅ Frozen sidebars |
| Built gh-pages site | n/a | ✅ Output of docusaurus build |

docs-site/docusaurus.config.ts:11–25 is identical in both repos. It checks versions.json at runtime — if absent, single-version site; if present, multi-version with picker. Snapshots are created by npx docusaurus docs:version <X.Y>, which publish-docs.yaml invokes inside the docs repo's working tree during sync — never in the SDK.

Root cause

The sync workflow does this each run:

  1. Preserve versioned_docs/ / versioned_sidebars/ / versions.json to /tmp
  2. git rm -rf docs-site — removes them from the index ✓
  3. cp -R ../docs-site . — brings in the SDK's .gitignore 💥
  4. Restore the preserved files into the working tree ✓
  5. git add docs docs-site .nvmrc — silently skips them (gitignored after git rm, now untracked) 💥
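
The five steps above can be reproduced in a throwaway repository. The following is a minimal sketch, not the workflow itself: the temp directories, the demo git identity, and the one-line .gitignore are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the silent drop in a scratch repo (not the real workflow).
set -eu

repo="$(mktemp -d)"    # hypothetical stand-in for the docs repo
stash="$(mktemp -d)"   # hypothetical stand-in for the /tmp preserve dir
cd "$repo"
git init -q .
git config user.name "demo"
git config user.email "demo@example.invalid"
git config commit.gpgsign false

# Starting state: the docs repo tracks the archive index.
mkdir docs-site
echo '["0.47"]' > docs-site/versions.json
git add docs-site
git commit -q -m "track versions.json"

mv docs-site/versions.json "$stash/"              # 1. preserve
git rm -rf --quiet --ignore-unmatch docs-site     # 2. drop from the index
mkdir -p docs-site
printf 'versions.json\n' > docs-site/.gitignore   # 3. upstream .gitignore arrives
mv "$stash/versions.json" docs-site/              # 4. restore into the working tree
git add docs-site                                 # 5. exits 0, skips the ignored file

staged="$(git diff --cached --name-status)"
echo "$staged"
```

The staged set should list versions.json as a deletion and .gitignore as the only addition, even though versions.json is present on disk. `git add` on a directory does not report ignored files inside it, which is why the workflow saw no error.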

Smoking gun: cbb5e90d — first sync after #2167 — deleted versions.json and every file under versioned_docs/version-0.47/.

Fix (two parts)

1. Remove the ignore entries from docs-site/.gitignore (root cause)

The protection they provided (preventing accidental local-scratch commits in SDK PRs) was guarding a workflow that's already architecturally wrong — versioning lives downstream. If someone does run docusaurus docs:version in the SDK locally, git status shows 138+ untracked files and review catches it.

2. Defense-in-depth: assert the rules can't come back undetected

Added a git check-ignore precondition at step 3a in publish-docs.yaml. If anyone ever re-introduces ignore rules for the three downstream-owned paths, the sync fails loudly with a pointer to the fix instead of silently corrupting the archive:

for path in docs-site/versions.json docs-site/versioned_docs docs-site/versioned_sidebars; do
  if git check-ignore -q "$path"; then
    echo "ERROR: $path is gitignored by the synced docs-site/.gitignore." >&2
    echo "Cross-repo sync requires this path to be tracked downstream." >&2
    echo "Remove the corresponding entry from the SDK's docs-site/.gitignore." >&2
    exit 1
  fi
done

Considered force-add in the workflow as an alternative; rejected because force-add silently recovers from gitignore drift, which is the same failure mode that caused this incident (silent data loss). The assertion makes the architectural invariant explicit and surfaces regressions immediately.
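
One detail of the precondition worth noting: `git check-ignore -q` exits 0 when a path is ignored, 1 when it is not, and 128 on a fatal error, so a bare `if` treats a fatal error the same as "not ignored". A possible hardening, sketched below, separates the three cases. The helper name `assert_not_ignored` and the scratch repo are hypothetical, and this is not what the PR ships.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: fail if git would ignore the path, and also fail
# if check-ignore itself breaks (for example, when run outside a repo).
assert_not_ignored() {
  local path="$1" rc=0
  git check-ignore -q "$path" || rc=$?
  case "$rc" in
    1) return 0 ;;                                        # not ignored
    0) echo "ERROR: $path is gitignored" >&2; return 1 ;;
    *) echo "ERROR: git check-ignore exited $rc for $path" >&2; return 2 ;;
  esac
}

# Demo in a scratch repo (paths are illustrative).
repo="$(mktemp -d)"
cd "$repo"
git init -q .
mkdir docs-site

clean_rc=0
assert_not_ignored docs-site/versions.json || clean_rc=$?

printf 'versions.json\n' > docs-site/.gitignore
broken_rc=0
assert_not_ignored docs-site/versions.json 2>/dev/null || broken_rc=$?

echo "clean=$clean_rc broken=$broken_rc"
```

With no ignore rule the helper returns 0; once the rule is present it returns 1. The path does not need to exist on disk for the check to work, which matches the workflow, where the check runs while the snapshots are still set aside.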

Why this approach over the original PR shape

The first version added a workflow git add -f block to bypass the gitignore. That worked, but Marie noted: the downstream .gitignore would still claim to ignore paths the repo tracks. That's exactly the kind of "rule that doesn't match reality" incoherence that caused this incident. Cleaner to fix at the source AND add a check that surfaces any future regression.
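
The incoherence described here can be seen directly in a scratch repo. This is an illustrative sketch under assumed paths: after a force-add the file is tracked, yet the ignore rule still matches it. `--no-index` is used because `git check-ignore` does not report tracked files by default.

```shell
#!/usr/bin/env bash
set -eu

repo="$(mktemp -d)"   # scratch repo, illustrative only
cd "$repo"
git init -q .
mkdir docs-site
printf 'versions.json\n' > docs-site/.gitignore
echo '["0.48"]' > docs-site/versions.json

git add -f docs-site/versions.json   # bypasses the ignore rule
tracked="$(git ls-files docs-site/versions.json)"

# The rule still matches; --no-index makes check-ignore disregard tracking.
rule_matches=no
if git check-ignore -q --no-index docs-site/versions.json; then
  rule_matches=yes
fi

echo "tracked=$tracked rule_matches=$rule_matches"
```

Both facts hold at once: the index contains the file and the .gitignore claims to ignore it. That is the state the force-add version of this PR would have left downstream.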

Dry-run evidence

Ran the workflow body line-for-line locally against Gusto/embedded-sdk-docs@origin/main, using the SDK at this PR's commit (gitignore cleaned, assertion in place). Plain git add docs docs-site .nvmrc — no force-add.

| Scenario | Trigger | MINOR | Snapshot action | Result |
| --- | --- | --- | --- | --- |
| Bootstrap (run manually after merge) | workflow_dispatch | 0.48 | create | Assertion OK → 287 files staged → 139 version-related → versions.json = ["0.48"] |
| Next minor (simulated 0.49.0) | workflow_run | 0.49 | create | Assertion OK → 139 new staged → versions.json = ["0.49", "0.48"] |
| Patch release (simulated 0.48.4) | workflow_run | 0.48 | refresh | Assertion OK → refresh ran (137 files walked, 0 orphans, 0 net changes) |
| Negative case (gitignore re-broken) | workflow_dispatch | 0.48 | n/a | Assertion fires with exact error: "ERROR: docs-site/versions.json is gitignored…" → workflow aborts before staging |
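
The Snapshot action column follows from two inputs: the trigger and whether MINOR is already listed in versions.json. A small sketch of that decision, restating the logic of the dry-run script as functions; the names `snapshot_action` and `minor_of` are hypothetical and do not appear in the workflow.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper mirroring the create/refresh/none choice.
#   $1 = trigger name
#   $2 = "true" if MINOR is already listed in versions.json, else "false"
snapshot_action() {
  local trigger="$1" minor_in_versions="$2"
  case "$trigger" in
    workflow_run|workflow_dispatch) ;;
    *) echo none; return 0 ;;          # any other trigger never snapshots
  esac
  if [ "$minor_in_versions" = "false" ]; then
    echo create                        # first time this minor is seen
  elif [ "$trigger" = "workflow_run" ]; then
    echo refresh                       # patch release of a known minor
  else
    echo none                          # manual run, minor already archived
  fi
}

# MINOR is the first two dot-separated fields of the SDK version.
minor_of() { printf '%s\n' "$1" | cut -d. -f1,2; }

echo "$(minor_of 0.48.0): $(snapshot_action workflow_dispatch false)"
echo "$(minor_of 0.49.0): $(snapshot_action workflow_run false)"
echo "$(minor_of 0.48.4): $(snapshot_action workflow_run true)"
```

The three calls correspond to the bootstrap, next-minor, and patch-release rows. A manual workflow_dispatch against an already archived minor yields none, so re-running the bootstrap is harmless.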

Reproduction

# Dry-run script — mirror of publish-docs.yaml's bash, no token, no push
cat > /tmp/publish-docs-dryrun.sh <<'SCRIPT'
#!/usr/bin/env bash
set -euo pipefail
SDK_SOURCE="${SDK_SOURCE:-$HOME/workspace/embedded-react-sdk}"
DOCS_REMOTE="${DOCS_REMOTE:-$HOME/workspace/embedded-sdk-docs}"
WORK_DIR="${WORK_DIR:-/tmp/dryrun-work}"
TRIGGER="${TRIGGER:-workflow_dispatch}"
SDK_VERSION="${SDK_VERSION:-$(node -p "require('${SDK_SOURCE}/package.json').version")}"
MINOR="${MINOR:-$(echo "$SDK_VERSION" | cut -d. -f1,2)}"

rm -rf "$WORK_DIR" && mkdir -p "$WORK_DIR" && cd "$WORK_DIR"
git clone "$DOCS_REMOTE" docs-repo && cd docs-repo
git checkout origin/main 2>/dev/null || git checkout main 2>/dev/null || true
git config user.name "gusto-embedded-docs[bot]"
git config user.email "gusto-embedded-docs[bot]@users.noreply.github.com"

git rm -rf --quiet --ignore-unmatch docs .nvmrc
mkdir -p /tmp/preserved && rm -rf /tmp/preserved/*
for v in versioned_docs versioned_sidebars versions.json; do
  [ -e "docs-site/$v" ] && mv "docs-site/$v" "/tmp/preserved/$v"
done
git rm -rf --quiet --ignore-unmatch docs-site

cp -R "$SDK_SOURCE/docs" .
cp -R "$SDK_SOURCE/docs-site" .
cp "$SDK_SOURCE/.nvmrc" .
rm -rf docs-site/node_modules docs-site/build

# Step 3a precondition (the new defense-in-depth check)
for path in docs-site/versions.json docs-site/versioned_docs docs-site/versioned_sidebars; do
  if git check-ignore -q "$path"; then
    echo "ERROR: $path is gitignored" >&2; exit 1
  fi
done

for v in versioned_docs versioned_sidebars versions.json; do
  [ -e "/tmp/preserved/$v" ] && mv "/tmp/preserved/$v" "docs-site/$v"
done

MINOR_IN_VERSIONS=false
[ -f docs-site/versions.json ] && jq -e --arg key "${MINOR}" 'index($key) != null' docs-site/versions.json >/dev/null 2>&1 && MINOR_IN_VERSIONS=true
SNAPSHOT_ACTION=none
if [ "$TRIGGER" = "workflow_run" ] || [ "$TRIGGER" = "workflow_dispatch" ]; then
  [ "$MINOR_IN_VERSIONS" = "false" ] && SNAPSHOT_ACTION=create || { [ "$TRIGGER" = "workflow_run" ] && SNAPSHOT_ACTION=refresh; }
fi
echo "Snapshot action: $SNAPSHOT_ACTION"
if [ "$SNAPSHOT_ACTION" = "create" ]; then
  (cd docs-site && npm ci --silent && npx docusaurus docs:version "${MINOR}")
elif [ "$SNAPSHOT_ACTION" = "refresh" ]; then
  VERSIONED_DIR="docs-site/versioned_docs/version-${MINOR}"
  while IFS= read -r f; do
    rel="${f#${VERSIONED_DIR}/}"; live="docs/${rel}"
    [ -f "$live" ] && cp "$live" "$f"
  done < <(find "$VERSIONED_DIR" -type f)
fi

git add docs docs-site .nvmrc
unexpected=$(git diff --cached --name-only | grep -vE '^(docs/|docs-site/|\.nvmrc$)' || true)
[[ -n "$unexpected" ]] && { echo "ERROR: staged outside allowlist:"; echo "$unexpected"; exit 1; }

echo "Total staged: $(git diff --cached --name-only | wc -l)"
echo "Version-related staged: $(git diff --cached --name-only | grep -cE '^docs-site/(versions\.json|versioned_)')"
echo "versions.json:"; git show :docs-site/versions.json 2>/dev/null
SCRIPT
chmod +x /tmp/publish-docs-dryrun.sh

# Positive: assertion passes, sync proceeds
TRIGGER=workflow_dispatch /tmp/publish-docs-dryrun.sh

# Next minor (commit bootstrap result, simulate 0.49.0):
cd /tmp/dryrun-work/docs-repo && git commit -q -m "dry-run: bootstrap"
DOCS_REMOTE=/tmp/dryrun-work/docs-repo TRIGGER=workflow_run \
  SDK_VERSION=0.49.0 MINOR=0.49 WORK_DIR=/tmp/dryrun-work2 \
  /tmp/publish-docs-dryrun.sh

# Patch refresh:
DOCS_REMOTE=/tmp/dryrun-work/docs-repo TRIGGER=workflow_run \
  SDK_VERSION=0.48.4 MINOR=0.48 WORK_DIR=/tmp/dryrun-work3 \
  /tmp/publish-docs-dryrun.sh

# Negative: temporarily re-add the bad entries, expect assertion to fire
cd "$HOME/workspace/embedded-react-sdk"  # or your SDK clone
printf '\nversions.json\nversioned_docs/\nversioned_sidebars/\n' >> docs-site/.gitignore
TRIGGER=workflow_dispatch /tmp/publish-docs-dryrun.sh  # should fail with the assertion error
git checkout docs-site/.gitignore  # revert
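
The refresh loop in the dry-run script copies live files over the snapshot but does not print the orphan count that the results table mentions. One way to surface it is sketched below, assuming an orphan means a snapshot file with no live counterpart; that reading, the helper name `refresh_snapshot`, and the sample files are assumptions, not taken from publish-docs.yaml.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: refresh a frozen snapshot from the live docs.
#   $1 = snapshot dir, $2 = live docs dir
# Copies every snapshot file that still exists live; prints the rest as orphans.
refresh_snapshot() {
  local versioned_dir="$1" live_dir="$2" f rel
  find "$versioned_dir" -type f | sort | while IFS= read -r f; do
    rel="${f#"$versioned_dir"/}"
    if [ -f "$live_dir/$rel" ]; then
      cp "$live_dir/$rel" "$f"
    else
      echo "orphan: $rel"
    fi
  done
}

# Demo with made-up files in a temp dir.
work="$(mktemp -d)"
mkdir -p "$work/docs/guides" "$work/snap/guides"
echo "live v2" > "$work/docs/guides/intro.md"
echo "frozen"  > "$work/snap/guides/intro.md"
echo "frozen"  > "$work/snap/removed.md"

orphans="$(refresh_snapshot "$work/snap" "$work/docs")"
echo "$orphans"
```

Files that exist only in the live docs are deliberately left alone here, as in the dry-run loop, since a refresh walks the snapshot rather than the live tree.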

After merge

gh workflow run publish-docs.yaml -R Gusto/embedded-react-sdk -r main

Runs the sync as workflow_dispatch. Since the docs repo currently has no versions.json, SNAPSHOT_ACTION=create, so npx docusaurus docs:version 0.48 runs, plain git add stages everything, and the sync commits and pushes. You'll see a "docs: sync ... — created version-0.48 snapshot" commit downstream, then a Buildkite publish to gh-pages.

Picker won't show yet (only one version, self-hides on versions.length <= 1) — appears on the next release when version-0.49 lands alongside.

Test plan

  • Repro the bug locally: fresh embedded-sdk-docs@origin/main worktree, docusaurus docs:version 0.48, git add docs-site, confirm nothing staged with current SDK gitignore
  • Verify fix locally: SDK with gitignore entries removed → git add docs-site stages all snapshot files
  • Build downstream with versions.json = ["0.47", "0.48"]: confirm versionSelect__TMD class and Version trigger render in build/docs/index.html
  • Full workflow dry run, workflow_dispatch + MINOR=0.48 (bootstrap): assertion OK, 139 version files staged
  • Full workflow dry run, workflow_run + MINOR=0.49 (next minor): assertion OK, both 0.48 and 0.49 tracked
  • Full workflow dry run, workflow_run + MINOR=0.48 (patch refresh): assertion OK, refresh path executes
  • Negative case: re-add ignore entries → assertion fires with exact error message → workflow aborts before staging
  • After merge, run manual workflow_dispatch and verify the next sync commit in embedded-sdk-docs re-creates versioned_docs/version-0.48/ and versions.json
  • Verify sdk.gusto.com still loads docs (only one version, so picker still hides — picker reappears on the next release)

🤖 Generated with Claude Code

The publish-docs workflow preserves the downstream-owned versioning files
(versions.json, versioned_docs/, versioned_sidebars/) before wiping
docs-site/, then restores them after copying the SDK source over the top.

But the SDK's docs-site/.gitignore (added in PR #2167 alongside the
SidebarVersionSelect component, to keep local `docusaurus docs:version`
scratch from accidentally being committed in the SDK) is itself copied
down with the sync. So the subsequent `git add docs docs-site .nvmrc`
silently skips the restored or newly-created snapshots, and the
already-issued `git rm -rf docs-site` drops them from the index — net
effect: every sync after the .gitignore landed deleted the snapshots
from the docs repo.

Force-add the three known snapshot paths to stage them regardless of
the synced .gitignore. Mirrors the pattern Marie used in PR #2143 for
docs/api/ (`git add -f docs/api`) before that path stopped needing it.

After merging this, a manual `workflow_dispatch` is needed to
re-create the version-0.48 snapshot downstream — subsequent NPM
publishes will then keep snapshots intact automatically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jeffredodd jeffredodd marked this pull request as ready for review June 30, 2026 16:57
@jeffredodd jeffredodd requested a review from a team as a code owner June 30, 2026 16:57
Comment thread .github/workflows/publish-docs.yaml Outdated
Comment on lines +238 to +239
# accidentally land in upstream commits. That same .gitignore is then
# synced down with the rest of docs-site/, so the bare `git add` above

Contributor

How does this work if the gitignore synced to the docs repo also includes the paths we're force adding?

Contributor Author

Good call — your question pointed at the real issue, not just a follow-up.

I scrapped the force-add and instead removed the version-snapshot entries from docs-site/.gitignore entirely. The downstream .gitignore now reflects what the repo actually tracks, no incoherence. The local-scratch protection those entries provided was already guarding a workflow that shouldn't happen in the SDK (versioning lives downstream) — and git status showing 138 untracked files plus PR review covers any accidental commit.

Also added a git check-ignore precondition at step 3a as defense-in-depth: if anyone ever re-adds those entries (or analogous ones for paths the downstream owns), the sync fails loudly with a pointer to the fix rather than silently dropping snapshots. Tested both directions — passes when gitignore is clean, fires with the exact error when entries are re-added.

PR description updated with full dry-run matrix including the negative case. Thanks for catching it.

…around

Per review feedback (#2313 — thanks Marie), addressing the root cause
rather than working around it.

The previous commit added `git add -f` to the publish-docs workflow to
bypass docs-site/.gitignore for versions.json / versioned_docs/ /
versioned_sidebars/. That worked, but left the downstream `.gitignore`
"lying" — claiming to ignore paths the docs repo actually tracks.
That kind of incoherence is exactly what caused this incident in the
first place (Aaron's ignore looked correct in source isolation, but
broke a sync invariant nobody had encoded).

Cleaner fix: just don't ignore them. The protection these entries
provided (preventing local `docusaurus docs:version` scratch from
accidentally landing in SDK PRs) was guarding a workflow that's
architecturally out-of-place anyway — versioning lives downstream, so
nobody should be generating snapshots locally in the SDK. If someone
does, `git status` will show 138+ untracked files and review catches it.

Re-ran all three dry-run scenarios against the actual embedded-sdk-docs
@origin/main without force-add — all pass:
  - workflow_dispatch + MINOR=0.48 (bootstrap): 287 staged, 139 version-related
  - workflow_run + MINOR=0.49 (next minor): versions.json = ["0.49", "0.48"]
  - workflow_run + MINOR=0.48 (patch refresh): refresh path executes cleanly

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jeffredodd jeffredodd changed the title from "fix(ci): force-stage version snapshots in publish-docs sync" to "fix(docs-site): stop ignoring version snapshots so the cross-repo sync can carry them" Jun 30, 2026
Defense-in-depth check in publish-docs.yaml that runs `git check-ignore`
against the synced docs-site/.gitignore for the three downstream-owned
snapshot paths (versions.json, versioned_docs/, versioned_sidebars/).

If a future SDK change re-introduces ignore rules for any of these (as
#2167 did, silently deleting snapshots from the docs repo for two weeks),
the sync now fails loudly with a clear error message pointing at the
fix — instead of `git add` quietly skipping the restored snapshots and
the workflow happily pushing the deletion downstream.

Fails fast: runs in step 3a, before snapshot creation/refresh, so a
gitignore regression never gets the chance to corrupt the archive.

Tested both directions:
- Positive (gitignore clean, this PR's state): check passes, all 3
  dry-run scenarios (bootstrap, new minor, patch refresh) succeed
- Negative (gitignore entries re-added): check fires with the exact
  error message before any data is staged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment on lines +172 to +186
# ---- Step 3a: gitignore precondition check ----
# This sync's correctness depends on docs-site/.gitignore NOT
# ignoring the downstream-owned snapshot paths. If the SDK ever
# re-introduces such rules (as #2167 did, causing snapshots to
# silently disappear from the docs repo for two weeks), fail
# loudly here rather than letting `git add` quietly skip the
# restored snapshots below.
for path in docs-site/versions.json docs-site/versioned_docs docs-site/versioned_sidebars; do
  if git check-ignore -q "$path"; then
    echo "ERROR: $path is gitignored by the synced docs-site/.gitignore." >&2
    echo "Cross-repo sync requires this path to be tracked downstream." >&2
    echo "Remove the corresponding entry from the SDK's docs-site/.gitignore." >&2
    exit 1
  fi
done

Contributor

❤️

@jeffredodd jeffredodd changed the title from "fix(docs-site): stop ignoring version snapshots so the cross-repo sync can carry them" to "fix(docs-site): stop ignoring version snapshots; assert it stays that way" Jun 30, 2026
@jeffredodd jeffredodd added this pull request to the merge queue Jun 30, 2026
Merged via the queue into main with commit 5bea4b3 Jun 30, 2026
38 checks passed
@jeffredodd jeffredodd deleted the fix/publish-docs-stage-version-snapshots branch June 30, 2026 20:16