2 changes: 1 addition & 1 deletion workflows/docs-audit/AGENTS.md
@@ -6,7 +6,7 @@ This workspace orchestrates a docs-gap audit across external local repos. It doe

- Deterministic scripts live in `scripts/`.
- Area manifests live in `manifests/`.
- Codex prompt templates live in `prompts/`.
- Agent prompt templates live in `prompts/`.
- Run state lives in `state/`.
- Generated run artifacts live in `artifacts/`.

74 changes: 55 additions & 19 deletions workflows/docs-audit/README.md
@@ -11,7 +11,7 @@ The goal is to find what is missing from the docs, especially conceptual and imp
This is a research workflow, not a production service. The design favors:

- compact deterministic preprocessing
- page-scoped LLM work inside the Codex app
- page-scoped LLM work by the running agent
- saved local artifacts for inspection and reuse
- simple extension through manifests

@@ -20,7 +20,7 @@ This is a research workflow, not a production service. The design favors:
This workspace does not:

- clone or vendor source code from the watched repos
- attempt to enforce a hard token quota in Codex
- attempt to enforce a hard token quota in the agent runtime
- produce doc fixes automatically
- behave like a production CI system

@@ -39,27 +39,28 @@ The audit runner only reads those repos and records refresh status. This workspa
Each weekly run follows the same sequence:

1. Refresh watched repos with safe fast-forward pulls.
2. Read the enabled area manifests.
3. Build deterministic evidence bundles for each page in the selected area.
4. Compare current evidence fingerprints to the last completed run.
5. Select pages to audit:
2. Select a bounded set of enabled area manifests for the weekly run.
3. Read the selected area manifests.
4. Build deterministic evidence bundles for each page in the selected area.
5. Compare current evidence fingerprints to the last completed run.
6. Select pages to audit:
- always include pages whose mapped evidence changed
- then include one rotating extra page for broader coverage
- if no pages changed, the rotating extra page becomes the only selected page
- the rotation walks through the pages in manifest order and advances one slot after each completed run
6. Use Codex LLM passes on the selected page bundles to extract:
- then include rotating extra pages for broader coverage
- if no pages changed, the rotating extra pages become the selected pages
- the rotation walks through the pages in manifest order and advances as rotating pages are added
7. Use page-scoped LLM passes on the selected page bundles to extract:
- code claims
- doc claims
- candidate gaps and final markdown observations
7. Save artifacts under a timestamped run directory.
8. Mark the run complete and update state.
9. Surface the final markdown report through a Codex inbox item.
8. Save artifacts under a timestamped run directory.
9. Mark the run complete and update state.
10. Surface the final markdown report through an inbox item.

## Workspace Layout

- `config.toml`: repo paths, enabled areas, selection rules, and output paths
- `manifests/`: docs-area manifests
- `prompts/`: reusable Codex prompt templates
- `prompts/`: reusable agent prompt templates
- `scripts/`: deterministic extraction, refresh, selection, and state utilities
- `state/`: lightweight run state and rotation cursor
- `artifacts/`: per-run evidence bundles, LLM outputs, and reports
@@ -72,18 +73,19 @@ The deterministic layer is responsible for the parts that should not require sem

- refreshing repos with `git pull --ff-only`
- reading manifests
- selecting changed enabled areas first, then filling the weekly area budget by rotation
- selecting source files per page
- hashing file contents and detecting changed pages
- extracting compact raw signals from docs pages and code-side surfaces
- writing page-scoped evidence bundles
- selecting changed pages plus one rotating extra page
- selecting changed pages plus rotating extra pages
- updating local state after a completed run

The deterministic layer intentionally keeps evidence compact so the LLM does not need to read entire files or repos.
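The fingerprinting and page-selection steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/`: the function names, the SHA-256 choice, and the rule that the cursor advances past every page it examines are all assumptions.

```python
import hashlib


def page_fingerprint(file_contents: dict[str, bytes]) -> str:
    """Hash a page's mapped source files into one stable fingerprint (hypothetical helper)."""
    digest = hashlib.sha256()
    for path in sorted(file_contents):  # sorted so dict order never changes the hash
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(hashlib.sha256(file_contents[path]).digest())
    return digest.hexdigest()


def changed_pages(current: dict[str, str], previous: dict[str, str]) -> list[str]:
    """Pages whose fingerprint is new or differs from the last completed run."""
    return [page for page, fp in current.items() if previous.get(page) != fp]


def select_pages(
    pages: list[str], changed: list[str], rotation_extra_pages: int, cursor: int
) -> tuple[list[str], int]:
    """Changed pages first (manifest order), then rotating extras starting at the cursor."""
    changed_set = set(changed)
    selected = [p for p in pages if p in changed_set]
    added = 0
    for _ in range(len(pages)):  # at most one full lap over the manifest
        if added >= rotation_extra_pages:
            break
        candidate = pages[cursor % len(pages)]
        cursor = (cursor + 1) % len(pages)  # assumption: cursor moves past every page examined
        if candidate not in changed_set:
            selected.append(candidate)
            added += 1
    return selected, cursor
```

With `rotation_extra_pages = 5` from `config.toml`, a run with no changed pages would select only the five pages starting at the stored cursor.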

## LLM-Assisted Layer

The semantic layer runs inside Codex through the automation prompt. For each selected page bundle, the LLM should:
The semantic layer runs through the automation prompt. For each selected page bundle, the LLM should:

1. infer normalized code claims from the evidence bundle
2. infer normalized doc claims from the docs bundle
@@ -102,7 +104,16 @@ The saved artifacts should include:
From this workspace root:

```bash
uv run python scripts/run_audit.py prepare --area indexing --refresh
uv run python scripts/run_audit.py select-areas --refresh --advance
```

This chooses a bounded list of enabled area manifests for the weekly run. The selector uses
`[area_selection]` in `config.toml`: changed enabled areas are considered first, then any remaining
weekly slots are filled by rotating through `enabled_areas`. Use the printed `selected_areas` list
for the per-area `prepare` commands.

```bash
uv run python scripts/run_audit.py prepare --area indexing
```

`--area` is the manifest name, not a hardcoded value in the script. The runner loads:
@@ -112,11 +123,36 @@ uv run python scripts/run_audit.py prepare --area indexing --refresh
So `--area indexing` maps to `manifests/indexing.toml`. If you add `manifests/search.toml`, you would run:

```bash
uv run python scripts/run_audit.py prepare --area search --refresh
uv run python scripts/run_audit.py prepare --area search
```

This creates a pending run directory under `artifacts/pending/<run_id>/` and prints a JSON summary to stdout.
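The `--area` to manifest lookup described above amounts to a path join. A minimal sketch, assuming a hypothetical `manifest_path` helper (the real runner may resolve and validate names differently):

```python
from pathlib import Path


def manifest_path(area: str, manifests_dir: Path = Path("manifests")) -> Path:
    """Map an area name such as "indexing" to manifests/indexing.toml (hypothetical helper)."""
    # Reject empty names and anything containing path separators, e.g. "../secrets".
    if not area or Path(area).name != area:
        raise ValueError(f"invalid area name: {area!r}")
    return manifests_dir / f"{area}.toml"
```

Because the name is only interpolated into a path, adding a new area needs a new manifest file rather than a script change.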

When running after `select-areas --refresh`, omit `--refresh` from `prepare`; the repos were already
refreshed once for the weekly selection.
For a standalone one-area audit where you skip `select-areas`, pass `--refresh` to `prepare`.

## Area Selection

`enabled_areas` is the full pool of manifests the weekly automation may audit. The `[area_selection]`
block controls how many of those enabled manifests are selected for a single weekly run:

```toml
[area_selection]
mode = "changed_first_rotate"
areas_per_run = 2
```

Supported modes:

- `all`: select every enabled area.
- `rotate`: ignore changed-area detection and select only by rotating through `enabled_areas`.
- `changed_first_rotate`: select changed enabled areas first, up to `areas_per_run`, then fill any remaining slots by rotation.

The area rotation cursor is stored in `state/state.json` under `area_selection.rotation_index` when
you run `select-areas --advance`. Page-level rotation still happens independently inside each
selected area through `[selection].rotation_extra_pages`.
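The three modes can be sketched as one small function. This is an illustrative sketch under stated assumptions, not the selector in `scripts/run_audit.py`: the `select_areas` signature is hypothetical, and so is the rule that the rotation cursor moves past every area it examines, including ones already selected as changed.

```python
from typing import Iterable


def select_areas(
    enabled_areas: list[str],
    changed_areas: Iterable[str],
    mode: str,
    areas_per_run: int,
    rotation_index: int = 0,
) -> tuple[list[str], int]:
    """Return (selected_areas, next_rotation_index). Hypothetical helper."""
    if not enabled_areas or mode == "all":
        return list(enabled_areas), rotation_index

    selected: list[str] = []
    if mode == "changed_first_rotate":
        changed = set(changed_areas)
        # Changed areas first, in enabled_areas order, capped at the weekly budget.
        selected = [a for a in enabled_areas if a in changed][:areas_per_run]
    elif mode != "rotate":
        raise ValueError(f"unsupported mode: {mode}")

    # Fill any remaining slots by walking enabled_areas from the cursor.
    cursor = rotation_index % len(enabled_areas)
    steps = 0
    while len(selected) < areas_per_run and steps < len(enabled_areas):
        candidate = enabled_areas[cursor]
        cursor = (cursor + 1) % len(enabled_areas)
        steps += 1
        if candidate not in selected:
            selected.append(candidate)
    return selected, cursor
```

The returned cursor corresponds to the value that would be written to `area_selection.rotation_index` when `--advance` is passed.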

After the LLM phase writes the expected outputs into that pending run directory, complete the run with:

```bash
@@ -307,7 +343,7 @@ The runner is designed so new docs areas should generally require a new manifest

## Weekly Automation

The weekly Codex automation should use this workspace as its cwd and follow `prompts/weekly_automation.md`.
The weekly automation should use this workspace as its cwd and follow `prompts/weekly_automation.md`.

The automation should:

8 changes: 6 additions & 2 deletions workflows/docs-audit/config.toml
@@ -5,10 +5,14 @@ enabled_areas = [
"table-operations",
"reranking",
"embeddings",
"storage",,
"storage",
"namespaces",
]

[area_selection]
mode = "changed_first_rotate"
areas_per_run = 2

[repos.lancedb]
path = "../../../lancedb"

@@ -19,7 +23,7 @@ path = "../.."
path = "../../../sophon"

[selection]
rotation_extra_pages = 10
rotation_extra_pages = 5
prefer_changed_pages = true

[paths]
31 changes: 17 additions & 14 deletions workflows/docs-audit/prompts/weekly_automation.md
@@ -16,12 +16,17 @@ This workflow also includes manifest maintenance. Before each audit run, review
- `prompts/page_audit_guidelines.md`
- `skills/area-manifest-authoring/SKILL.md`

Then read the manifest file for each area listed in `enabled_areas` in `config.toml`.
Then select the area manifests for this run using the deterministic area selector.

## Required workflow

1. Read `config.toml` and determine the enabled areas from `enabled_areas`.
2. For each enabled area, run a manifest maintenance pass before `prepare`.
2. Select the areas for this weekly run:
- `uv run python scripts/run_audit.py select-areas --refresh --advance`
- Use the printed `selected_areas` list for the rest of this workflow.
- The selector refreshes watched repos once, detects changed enabled areas, and fills the remaining weekly slots by area rotation.
- Do not run unselected enabled areas in this weekly pass.
3. For each selected area, run a manifest maintenance pass before `prepare`.
- Use `skills/area-manifest-authoring/SKILL.md` as the procedure.
- Read the current `manifests/<area>.toml`.
- Check whether the docs area boundary has changed:
@@ -34,33 +39,31 @@ Then read the manifest file for each area listed in `enabled_areas` in `config.t
- source blocks whose `applies_to` mapping is now too broad or too narrow
- Keep the manifest compact. Do not add files just because they mention the topic; add them only if they are likely to expose user-visible behavior the docs may be missing.
- If the manifest changes, save the updated `manifests/<area>.toml` before preparing the run.
3. Run the deterministic prepare step for each enabled area.
- For the first area, refresh the watched repos:
- `uv run python scripts/run_audit.py prepare --area <first-area> --refresh`
- For subsequent areas in the same weekly run, skip the refresh to avoid repeating `git pull`:
- `uv run python scripts/run_audit.py prepare --area <next-area>`
4. Read the JSON summary printed by each `prepare` command and locate each pending run directory.
4. Run the deterministic prepare step for each selected area.
- Repos were already refreshed by `select-areas`, so skip `--refresh` here:
- `uv run python scripts/run_audit.py prepare --area <area>`
5. Read the JSON summary printed by each `prepare` command and locate each pending run directory.
- Use the printed `run_dir`; it should point under `artifacts/pending/<run_id>`.
- Do not create or write directly under `artifacts/runs/<run_id>` before completion.
5. For each pending run directory, read `selected_pages.json` and the corresponding files in `page_bundles/`.
6. For each selected page bundle:
6. For each pending run directory, read `selected_pages.json` and the corresponding files in `page_bundles/`.
7. For each selected page bundle:
- apply `prompts/page_audit_guidelines.md` as the page-level review rubric
- infer normalized code claims from the evidence bundle
- infer normalized doc claims from the docs bundle
- identify only the missing documentation
7. Write semantic outputs under `llm_outputs/` in each pending run directory.
8. Write semantic outputs under `llm_outputs/` in each pending run directory.
- one file per page for code claims
- one file per page for doc claims
- one file per page for candidate gaps
8. Write `report.md` in each pending run directory.
9. Write `report.md` in each pending run directory.
- `report.md` is the docs-gap summary only.
- Do not include refresh status, manifest-maintenance notes, selected-pages bookkeeping, or any other workflow narration in `report.md`.
- Include operational notes only if they materially affected audit quality, such as an unrefreshable repo, missing source files, or a manifest ambiguity that changes confidence in the findings.
9. Complete each run:
10. Complete each run:
- `uv run python scripts/run_audit.py complete --run-id <run_id>`
- Completion publishes the pending directory to `artifacts/runs/<run_id>` and updates `artifacts/latest_run.json`.
- Only completed runs with `report.md` should appear under `artifacts/runs/`.
10. Return a concise markdown summary suitable for the Codex inbox item.
11. Return a concise markdown summary suitable for the inbox item.
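
The publish step of `complete` described above (pending directory moved under `artifacts/runs/`, then `artifacts/latest_run.json` updated) can be pictured with the sketch below. It is an assumption-laden illustration, not the runner's code: the `complete_run` name and the JSON fields written to `latest_run.json` are hypothetical.

```python
import json
import shutil
from pathlib import Path


def complete_run(artifacts: Path, run_id: str) -> Path:
    """Publish artifacts/pending/<run_id> to artifacts/runs/<run_id> (hypothetical helper)."""
    pending = artifacts / "pending" / run_id
    final = artifacts / "runs" / run_id
    # Only runs that already have a report may be published.
    if not (pending / "report.md").is_file():
        raise FileNotFoundError(f"missing report.md in {pending}")
    if final.exists():
        raise FileExistsError(f"run already published: {final}")
    final.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(pending), str(final))
    # Assumed shape for latest_run.json; the real file may carry other fields.
    latest = {"run_id": run_id, "run_dir": str(final)}
    (artifacts / "latest_run.json").write_text(json.dumps(latest, indent=2))
    return final
```

Checking for `report.md` before the move is what keeps incomplete runs out of `artifacts/runs/`.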

## Manifest maintenance rules
