lancedb · prrao87 · May 6, 2026 · May 6, 2026 · May 6, 2026
diff --git a/workflows/docs-audit/AGENTS.md b/workflows/docs-audit/AGENTS.md
@@ -6,7 +6,7 @@ This workspace orchestrates a docs-gap audit across external local repos. It doe
 
 - Deterministic scripts live in `scripts/`.
 - Area manifests live in `manifests/`.
-- Codex prompt templates live in `prompts/`.
+- Agent prompt templates live in `prompts/`.
 - Run state lives in `state/`.
 - Generated run artifacts live in `artifacts/`.
 

diff --git a/workflows/docs-audit/README.md b/workflows/docs-audit/README.md
@@ -11,7 +11,7 @@ The goal is to find what is missing from the docs, especially conceptual and imp
 This is a research workflow, not a production service. The design favors:
 
 - compact deterministic preprocessing
-- page-scoped LLM work inside the Codex app
+- page-scoped LLM work by the running agent
 - saved local artifacts for inspection and reuse
 - simple extension through manifests
 
@@ -20,7 +20,7 @@ This is a research workflow, not a production service. The design favors:
 This workspace does not:
 
 - clone or vendor source code from the watched repos
-- attempt to enforce a hard token quota in Codex
+- attempt to enforce a hard token quota in the agent runtime
 - produce doc fixes automatically
 - behave like a production CI system
 
@@ -39,27 +39,28 @@ The audit runner only reads those repos and records refresh status. This workspa
 Each weekly run follows the same sequence:
 
 1. Refresh watched repos with safe fast-forward pulls.
-2. Read the enabled area manifests.
-3. Build deterministic evidence bundles for each page in the selected area.
-4. Compare current evidence fingerprints to the last completed run.
-5. Select pages to audit:
+2. Select a bounded set of enabled area manifests for the weekly run.
+3. Read the selected area manifests.
+4. Build deterministic evidence bundles for each page in each selected area.
+5. Compare current evidence fingerprints to the last completed run.
+6. Select pages to audit:
    - always include pages whose mapped evidence changed
-   - then include one rotating extra page for broader coverage
-   - if no pages changed, the rotating extra page becomes the only selected page
-   - the rotation walks through the pages in manifest order and advances one slot after each completed run
-6. Use Codex LLM passes on the selected page bundles to extract:
+   - then include rotating extra pages for broader coverage
+   - if no pages changed, the rotating extra pages become the selected pages
+   - the rotation walks through the pages in manifest order and advances as rotating pages are added
+7. Use page-scoped LLM passes on the selected page bundles to extract:
    - code claims
    - doc claims
    - candidate gaps and final markdown observations
-7. Save artifacts under a timestamped run directory.
-8. Mark the run complete and update state.
-9. Surface the final markdown report through a Codex inbox item.
+8. Save artifacts under a timestamped run directory.
+9. Mark the run complete and update state.
+10. Surface the final markdown report through an inbox item.
 
 ## Workspace Layout
 
 - `config.toml`: repo paths, enabled areas, selection rules, and output paths
 - `manifests/`: docs-area manifests
-- `prompts/`: reusable Codex prompt templates
+- `prompts/`: reusable agent prompt templates
 - `scripts/`: deterministic extraction, refresh, selection, and state utilities
 - `state/`: lightweight run state and rotation cursor
 - `artifacts/`: per-run evidence bundles, LLM outputs, and reports
@@ -72,18 +73,19 @@ The deterministic layer is responsible for the parts that should not require sem
 
 - refreshing repos with `git pull --ff-only`
 - reading manifests
+- selecting changed enabled areas first, then filling the weekly area budget by rotation
 - selecting source files per page
 - hashing file contents and detecting changed pages
 - extracting compact raw signals from docs pages and code-side surfaces
 - writing page-scoped evidence bundles
-- selecting changed pages plus one rotating extra page
+- selecting changed pages plus rotating extra pages
 - updating local state after a completed run
 
 The deterministic layer intentionally keeps evidence compact so the LLM does not need to read entire files or repos.
 
 ## LLM-Assisted Layer
 
-The semantic layer runs inside Codex through the automation prompt. For each selected page bundle, the LLM should:
+The semantic layer runs through the automation prompt. For each selected page bundle, the LLM should:
 
 1. infer normalized code claims from the evidence bundle
 2. infer normalized doc claims from the docs bundle
@@ -102,7 +104,16 @@ The saved artifacts should include:
 From this workspace root:
 
 ```bash
-uv run python scripts/run_audit.py prepare --area indexing --refresh
+uv run python scripts/run_audit.py select-areas --refresh --advance
+```
+
+This chooses a bounded list of enabled area manifests for the weekly run. The selector uses
+`[area_selection]` in `config.toml`: changed enabled areas are considered first, then any remaining
+weekly slots are filled by rotating through `enabled_areas`. Use the printed `selected_areas` list
+for the per-area `prepare` commands.
+
+```bash
+uv run python scripts/run_audit.py prepare --area indexing
 ```
 
 `--area` is the manifest name, not a hardcoded value in the script. The runner loads:
@@ -112,11 +123,36 @@ uv run python scripts/run_audit.py prepare --area indexing --refresh
 So `--area indexing` maps to `manifests/indexing.toml`. If you add `manifests/search.toml`, you would run:
 
 ```bash
-uv run python scripts/run_audit.py prepare --area search --refresh
+uv run python scripts/run_audit.py prepare --area search
 ```
 
 This creates a pending run directory under `artifacts/pending/<run_id>/` and prints a JSON summary to stdout.
 
+When running after `select-areas --refresh`, omit `--refresh` from `prepare`; the repos were already
+refreshed once for the weekly selection.
+For a standalone one-area audit where you skip `select-areas`, pass `--refresh` to `prepare`.
+
+## Area Selection
+
+`enabled_areas` is the full pool of manifests the weekly automation may audit. The `[area_selection]`
+block controls how many of those enabled manifests are selected for a single weekly run:
+
+```toml
+[area_selection]
+mode = "changed_first_rotate"
+areas_per_run = 2
+```
+
+Supported modes:
+
+- `all`: select every enabled area.
+- `rotate`: ignore changed-area detection and select only by rotating through `enabled_areas`.
+- `changed_first_rotate`: select changed enabled areas first, up to `areas_per_run`, then fill any remaining slots by rotation.
+
+The area rotation cursor is stored in `state/state.json` under `area_selection.rotation_index` when
+you run `select-areas --advance`. Page-level rotation still happens independently inside each
+selected area through `[selection].rotation_extra_pages`.
+
 After the LLM phase writes the expected outputs into that pending run directory, complete the run with:
 
 ```bash
@@ -307,7 +343,7 @@ The runner is designed so new docs areas should generally require a new manifest
 
 ## Weekly Automation
 
-The weekly Codex automation should use this workspace as its cwd and follow `prompts/weekly_automation.md`.
+The weekly automation should use this workspace as its cwd and follow `prompts/weekly_automation.md`.
 
 The automation should:
 

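The three modes documented in the README's Area Selection section can be sketched as below. This is a minimal illustration under stated assumptions, not the code in `scripts/run_audit.py`: the function name `select_areas`, the `AreaSelection` container, and the choice to move the cursor past areas that were already picked as changed are all hypothetical. The area names are taken from the `enabled_areas` entries visible in this diff.

```python
from dataclasses import dataclass


@dataclass
class AreaSelection:
    # Hypothetical result container; the real runner prints a JSON summary.
    selected_areas: list
    next_rotation_index: int


def select_areas(enabled_areas, changed_areas, mode, areas_per_run, rotation_index):
    """Sketch of the `all`, `rotate`, and `changed_first_rotate` modes."""
    if not enabled_areas:
        return AreaSelection([], 0)
    if mode == "all":
        # Every enabled area is selected; the cursor is left untouched.
        return AreaSelection(list(enabled_areas), rotation_index)

    selected = []
    if mode == "changed_first_rotate":
        # Changed enabled areas come first, capped at areas_per_run.
        changed = set(changed_areas)
        selected = [a for a in enabled_areas if a in changed][:areas_per_run]
    elif mode != "rotate":
        raise ValueError(f"unknown mode: {mode}")

    # Fill any remaining slots by walking enabled_areas from the cursor.
    # Assumption: the cursor also moves past areas that are already selected.
    cursor = rotation_index % len(enabled_areas)
    for _ in range(len(enabled_areas)):
        if len(selected) >= areas_per_run:
            break
        area = enabled_areas[cursor]
        cursor = (cursor + 1) % len(enabled_areas)
        if area not in selected:
            selected.append(area)
    return AreaSelection(selected, cursor)


# Illustrative call: one changed area, two slots per weekly run.
enabled = ["table-operations", "reranking", "embeddings", "storage", "namespaces"]
result = select_areas(enabled, ["storage"], "changed_first_rotate", 2, 0)
print(result.selected_areas, result.next_rotation_index)
```

With `--advance`, the returned cursor would be what gets written to `area_selection.rotation_index`. Without it, a caller could discard the cursor and leave state unchanged.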
diff --git a/workflows/docs-audit/config.toml b/workflows/docs-audit/config.toml
@@ -5,10 +5,14 @@ enabled_areas = [
     "table-operations",
     "reranking",
     "embeddings",
-    "storage",,
+    "storage",
     "namespaces",
 ]
 
+[area_selection]
+mode = "changed_first_rotate"
+areas_per_run = 2
+
 [repos.lancedb]
 path = "../../../lancedb"
 
@@ -19,7 +23,7 @@ path = "../.."
 path = "../../../sophon"
 
 [selection]
-rotation_extra_pages = 10
+rotation_extra_pages = 5
 prefer_changed_pages = true
 
 [paths]

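Page-level selection inside one area ("changed pages plus rotating extra pages", sized by `rotation_extra_pages`) might look roughly like the sketch below. It assumes that fingerprints are SHA-256 digests over a page's mapped files and that a page with no previous fingerprint counts as changed. The function names and page names are hypothetical, and `prefer_changed_pages` is not modeled.

```python
import hashlib


def fingerprint(file_contents):
    """Hash a page's mapped evidence files (path -> bytes) into one digest."""
    digest = hashlib.sha256()
    for path in sorted(file_contents):  # sorted so the digest is order-independent
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(file_contents[path])
        digest.update(b"\0")
    return digest.hexdigest()


def select_pages(pages, current, previous, rotation_extra_pages, cursor):
    """Changed pages first, then rotating extras in manifest order."""
    # A page is "changed" when its fingerprint differs from the last completed run.
    selected = [p for p in pages if current.get(p) != previous.get(p)]
    if not pages:
        return selected, 0
    cursor %= len(pages)
    added = 0
    for _ in range(len(pages)):
        if added >= rotation_extra_pages:
            break
        page = pages[cursor]
        cursor = (cursor + 1) % len(pages)
        if page not in selected:
            selected.append(page)
            added += 1
    return selected, cursor


# Illustrative data: three pages, only the second one's evidence changed.
pages = ["overview.mdx", "create-index.mdx", "tuning.mdx"]
previous = {p: fingerprint({"src.py": b"v1"}) for p in pages}
current = dict(previous, **{"create-index.mdx": fingerprint({"src.py": b"v2"})})
print(select_pages(pages, current, previous, rotation_extra_pages=1, cursor=1))
```

When no fingerprints differ, the changed list is empty and the rotating extras become the whole selection, which matches the behavior the README describes.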
diff --git a/workflows/docs-audit/prompts/weekly_automation.md b/workflows/docs-audit/prompts/weekly_automation.md
@@ -16,12 +16,17 @@ This workflow also includes manifest maintenance. Before each audit run, review
 - `prompts/page_audit_guidelines.md`
 - `skills/area-manifest-authoring/SKILL.md`
 
-Then read the manifest file for each area listed in `enabled_areas` in `config.toml`.
+Then select the area manifests for this run using the deterministic area selector.
 
 ## Required workflow
 
 1. Read `config.toml` and determine the enabled areas from `enabled_areas`.
-2. For each enabled area, run a manifest maintenance pass before `prepare`.
+2. Select the areas for this weekly run:
+   - `uv run python scripts/run_audit.py select-areas --refresh --advance`
+   - Use the printed `selected_areas` list for the rest of this workflow.
+   - The selector refreshes watched repos once, detects changed enabled areas, and fills the remaining weekly slots by area rotation.
+   - Do not run unselected enabled areas in this weekly pass.
+3. For each selected area, run a manifest maintenance pass before `prepare`.
    - Use `skills/area-manifest-authoring/SKILL.md` as the procedure.
    - Read the current `manifests/<area>.toml`.
    - Check whether the docs area boundary has changed:
@@ -34,33 +39,31 @@ Then read the manifest file for each area listed in `enabled_areas` in `config.t
      - source blocks whose `applies_to` mapping is now too broad or too narrow
    - Keep the manifest compact. Do not add files just because they mention the topic; add them only if they are likely to expose user-visible behavior the docs may be missing.
    - If the manifest changes, save the updated `manifests/<area>.toml` before preparing the run.
-3. Run the deterministic prepare step for each enabled area.
-   - For the first area, refresh the watched repos:
-     - `uv run python scripts/run_audit.py prepare --area <first-area> --refresh`
-   - For subsequent areas in the same weekly run, skip the refresh to avoid repeating `git pull`:
-     - `uv run python scripts/run_audit.py prepare --area <next-area>`
-4. Read the JSON summary printed by each `prepare` command and locate each pending run directory.
+4. Run the deterministic prepare step for each selected area.
+   - Repos were already refreshed by `select-areas`, so skip `--refresh` here:
+     - `uv run python scripts/run_audit.py prepare --area <area>`
+5. Read the JSON summary printed by each `prepare` command and locate each pending run directory.
    - Use the printed `run_dir`; it should point under `artifacts/pending/<run_id>`.
    - Do not create or write directly under `artifacts/runs/<run_id>` before completion.
-5. For each pending run directory, read `selected_pages.json` and the corresponding files in `page_bundles/`.
-6. For each selected page bundle:
+6. For each pending run directory, read `selected_pages.json` and the corresponding files in `page_bundles/`.
+7. For each selected page bundle:
    - apply `prompts/page_audit_guidelines.md` as the page-level review rubric
    - infer normalized code claims from the evidence bundle
    - infer normalized doc claims from the docs bundle
    - identify only the missing documentation
-7. Write semantic outputs under `llm_outputs/` in each pending run directory.
+8. Write semantic outputs under `llm_outputs/` in each pending run directory.
    - one file per page for code claims
    - one file per page for doc claims
    - one file per page for candidate gaps
-8. Write `report.md` in each pending run directory.
+9. Write `report.md` in each pending run directory.
    - `report.md` is the docs-gap summary only.
    - Do not include refresh status, manifest-maintenance notes, selected-pages bookkeeping, or any other workflow narration in `report.md`.
    - Include operational notes only if they materially affected audit quality, such as an unrefreshable repo, missing source files, or a manifest ambiguity that changes confidence in the findings.
-9. Complete each run:
+10. Complete each run:
    - `uv run python scripts/run_audit.py complete --run-id <run_id>`
    - Completion publishes the pending directory to `artifacts/runs/<run_id>` and updates `artifacts/latest_run.json`.
    - Only completed runs with `report.md` should appear under `artifacts/runs/`.
-10. Return a concise markdown summary suitable for the Codex inbox item.
+11. Return a concise markdown summary suitable for the inbox item.
 
 ## Manifest maintenance rules