[Bugfix] Auto-refresh quest_root SUMMARY.md on every artifact.record #82

Open
droidlyx wants to merge 1 commit into ResearAI:main from droidlyx:fix_auto_refresh_summary_on_record

droidlyx commented May 4, 2026

[Bugfix] Auto-refresh quest_root SUMMARY.md on every artifact.record

What changed

All changes are confined to one file, src/deepscientist/artifact/service.py:

| # | Change | Effect |
|---|--------|--------|
| 1 | refresh_summary gains a record_artifact: bool = True parameter | When False, the SUMMARY.md files are written but no summary_refresh audit artifact is recorded. Default True preserves existing behavior for explicit callers. |
| 2 | record(...) calls refresh_summary(record_artifact=False) automatically after every successful record (with a recursion guard for the summary_refresh report itself) | quest_root/SUMMARY.md now stays current as a side effect of normal artifact writes. Agents no longer need to remember to call refresh_summary for the snapshot to reflect reality. |

Why

SUMMARY.md at the quest root is the canonical compact quest snapshot — the file an external observer (operator inspecting progress without parsing thousands of events.jsonl entries) or a cross-quest agent (bash_exec ls -t ~/DeepScientist/quests/*/SUMMARY.md under shared memory mode) reads to see what a quest is doing. To be useful it has to be current.
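For readers who prefer to browse snapshots from Python rather than the shell, the `ls -t` listing above can be approximated as follows. This is a minimal sketch: the helper name `summaries_newest_first` is hypothetical, and the `~/DeepScientist/quests` location is taken from the shell example rather than from any documented configuration.

```python
from pathlib import Path


def summaries_newest_first(quests_dir: Path) -> list[Path]:
    """Return each quest's root-level SUMMARY.md, most recently modified first."""
    found = [p for p in quests_dir.glob("*/SUMMARY.md") if p.is_file()]
    # Sort by modification time, newest first, which mirrors `ls -t`.
    return sorted(found, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    # Assumed location, copied from the shell example; adjust to the local install.
    quests = Path.home() / "DeepScientist" / "quests"
    for path in summaries_newest_first(quests):
        print(path.parent.name, path, sep="\t")
```

Quests that have no SUMMARY.md are simply skipped, so the listing reflects only quests with a snapshot on disk.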

The intended update path was an idea / experiment / analysis-campaign / write / review / decision / finalize skill calling artifact.refresh_summary(...) at meaningful checkpoints. There are 5 such prose nudges in src/prompts/system.md.

In practice, agents almost never call it. Across four observed quests on the same studio:

| Quest | refresh_summary calls in 9-24h of active work | quest_root SUMMARY.md state |
|-------|-----------------------------------------------|-----------------------------|
| 015 | 0 | default ("No completed milestones yet.") |
| 016 | 1 | default (write went to worktree, not quest_root; fixed by upstream refresh_summary mirror PR) |
| 017 | 1 | actual state (single snapshot, taken near end of quest) |
| 018 | 0 | default |

Agents have richer state in events.jsonl, mcp__artifact__get_quest_state(detail=summary), memory cards, and their own context window, so the SUMMARY.md write is never on their critical path. The artifact reaches its intended audience (external observers and cross-quest browsers) only when an agent happens to remember.

The structural fix is to stop relying on agents to maintain a side-file and let the daemon do it as a side effect of artifact writes.

After this change:

  • Every successful artifact.record(...) updates quest_root/SUMMARY.md automatically. No agent action.
  • events.jsonl is not polluted with summary_refresh reports for each auto-refresh — only the file is touched.
  • The auto-refresh only writes to quest_root/SUMMARY.md, not to the active worktree's SUMMARY.md. This preserves the worktree's clean working-tree state (a worktree-side write without commit would block subsequent git switch / git worktree operations during normal artifact flows).
  • Explicit artifact.refresh_summary() calls still record the audit artifact and still write both worktree + quest_root SUMMARY.md (default record_artifact=True), preserving the existing surface for callers who want the audit trail and the branch-versioned summary.
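The four behaviors listed above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in service.py: `ToyArtifactService`, its in-memory `events` list (standing in for events.jsonl), and the summary text format are all invented for illustration, and the real `refresh_summary` takes `quest_root` as an argument rather than reading it from the instance.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


class ToyArtifactService:
    """Illustrative stand-in; names and storage layout are hypothetical."""

    def __init__(self, quest_root: Path, worktree: Path) -> None:
        self.quest_root = quest_root
        self.worktree = worktree
        self.events: list[dict] = []  # stands in for events.jsonl

    def record(self, kind: str, report_type: str | None = None) -> dict:
        entry = {"kind": kind, "report_type": report_type, "id": f"a{len(self.events) + 1}"}
        self.events.append(entry)
        # Skip the hook for the audit report that an explicit refresh emits.
        if not (kind == "report" and report_type == "summary_refresh"):
            try:
                self.refresh_summary(
                    reason=f"auto after {kind} {entry['id']}", record_artifact=False
                )
            except Exception:
                pass  # a failed refresh must not abort the record
        return entry

    def refresh_summary(self, reason: str = "manual", *, record_artifact: bool = True) -> None:
        text = f"artifacts: {len(self.events)}\nreason: {reason}\n"
        (self.quest_root / "SUMMARY.md").write_text(text)
        if record_artifact:
            # Explicit path only: also write the worktree copy and log the audit artifact.
            (self.worktree / "SUMMARY.md").write_text(text)
            self.record("report", report_type="summary_refresh")


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        worktree = root / "worktree"
        worktree.mkdir()
        svc = ToyArtifactService(root, worktree)
        svc.record("report", report_type="analysis")
        after_auto = {
            "events": len(svc.events),
            "root_summary": (root / "SUMMARY.md").exists(),
            "worktree_summary": (worktree / "SUMMARY.md").exists(),
        }
        svc.refresh_summary()
        after_explicit = {
            "events": len(svc.events),
            "worktree_summary": (worktree / "SUMMARY.md").exists(),
        }
        return {"after_auto": after_auto, "after_explicit": after_explicit}


if __name__ == "__main__":
    print(demo())
```

In this model a plain record leaves one event and only the quest-root file, while the explicit call adds exactly one `summary_refresh` event and the worktree copy.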

Recursion guard

refresh_summary(record_artifact=True) calls record(...) to log a summary_refresh report. Without a guard, the auto-hook in record(...) would then call refresh_summary again; if that nested call also recorded, the two would recurse without end.

Guarded explicitly:

if not (record["kind"] == "report" and record.get("report_type") == "summary_refresh"):
    try:
        self.refresh_summary(quest_root, reason=f"auto after {record['kind']} {artifact_id}", record_artifact=False)
    except Exception:
        pass

The record_artifact=False path also avoids the recursion entirely — the auto-hook never records, so even without the kind guard, depth is bounded at one. The kind guard makes the bound explicit and survives future refactors.

Performance

refresh_summary reads up to 20 recent artifact JSONs and writes one markdown file. ~10ms per call locally; for a typical 70-100-record quest that's roughly 0.7 to 1 second of cumulative overhead. The auto-hook is wrapped in try / except so a refresh failure cannot abort an artifact record.
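The overhead estimate is a straight multiplication, since the hook fires once per successful record. A quick sketch, assuming the ~10ms local figure quoted above holds (the function name is hypothetical):

```python
def cumulative_overhead_seconds(records: int, ms_per_refresh: float = 10.0) -> float:
    """One auto-refresh per successful record, so overhead grows linearly."""
    return records * ms_per_refresh / 1000.0


if __name__ == "__main__":
    for n in (70, 100):
        print(n, cumulative_overhead_seconds(n))
```

For 70 and 100 records this gives 0.7 and 1.0 seconds respectively. Slower disks or larger artifact JSONs would scale the per-call figure, not the shape of the estimate.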

Tests

Two new tests in tests/test_memory_and_artifact.py:

  • test_record_auto_refreshes_quest_root_summary_without_extra_artifact — records a report artifact, asserts quest_root/SUMMARY.md is updated to mention the record (auto after report <id>), and asserts no summary_refresh audit artifact was emitted as a side effect.
  • test_record_skips_auto_refresh_when_record_is_summary_refresh — calls refresh_summary() explicitly (default record_artifact=True), asserts exactly one summary_refresh audit artifact exists (no recursion-driven duplication).

Existing refresh_summary tests still pass:

  • test_refresh_summary_mirrors_to_quest_root_when_active_workspace_is_worktree
  • test_refresh_summary_writes_once_when_workspace_equals_quest_root
  • test_activate_branch_preserves_head_and_redirects_main_run (regression caught and fixed during development: an earlier draft wrote SUMMARY.md inside the worktree without committing, leaving the working tree dirty and blocking git switch during activate_branch. The final design writes only to quest_root in auto-refresh mode.)

Local sweep:

pytest tests/test_memory_and_artifact.py tests/test_mcp_servers.py

Outcome on this branch: same pre-existing failures as origin/main at the same commit, no new failures introduced (4 pre-existing failures unrelated to this PR remain on main).

Documentation

No docs change. refresh_summary's contract (it refreshes the quest's compact state) is unchanged; this PR only changes when it fires and whether it records an audit trail.

The 5 artifact.refresh_summary(...) prose nudges in src/prompts/system.md (experiment / analysis-campaign / write / review / finalize) become redundant after this PR — they were the prose half of the same goal, now subsumed by the server-side auto-hook. Removing those prose lines is a small follow-up cleanup and out of scope here.

Compatibility / migration

  • refresh_summary's signature gains a keyword-only record_artifact parameter, defaulting to True. All existing callers (the MCP tool, tests, manual invocations) preserve their previous behavior without changes.
  • record(...)'s return shape and existing side effects are unchanged. The one added side effect is a single file write per record (idempotent: the same content is reproducible from the recorded artifacts).
  • Independent of the read_visibility_mode cross-quest gating recently merged: SUMMARY.md maintenance happens regardless of whether agents are allowed to read sibling quests' SUMMARY.md across quest boundaries. The two changes are orthogonal.
  • No new MCP namespace, no new artifact kind, no schema migration, no config flag.

AI assistance disclosure

Prepared with AI assistance. Each change was authored, reviewed line by line, and verified end-to-end via pytest tests/test_memory_and_artifact.py tests/test_mcp_servers.py before submission. No commit is unreviewed AI output.

`SUMMARY.md` at the quest root is the canonical compact snapshot for
external observers (operators inspecting quest progress without
parsing events.jsonl, cross-quest agents browsing prior quests). To
be useful it has to be current.

The intended path was `refresh_summary` called from stage skill
prose at meaningful checkpoints. In practice agents almost never
call it: across 4 observed quests on the same studio, the count was
0 / 1 / 1 / 0 over 9-24h of active work. Agents have richer state in
events.jsonl, `get_quest_state`, and memory cards, so the SUMMARY
write is never on their critical path.

Stop relying on agents to maintain a side-file. After this change,
every successful `record(...)` writes a fresh SUMMARY.md to
`quest_root` as a side effect (recursion-guarded against the
`summary_refresh` report itself). The auto-refresh:

- only writes `quest_root/SUMMARY.md` (NOT the worktree's), so it
  cannot leave a worktree dirty and block subsequent
  `git switch` / `git worktree` operations during normal artifact
  flows;
- does not record a `summary_refresh` audit artifact (would
  pollute events.jsonl with one auto-refresh report per record).
  Explicit `refresh_summary()` calls still record the audit
  artifact (default `record_artifact=True`) and still write both
  worktree + quest_root SUMMARY.md, preserving the existing
  contract.

The auto-refresh is wrapped in try/except so a refresh failure cannot
abort the record.