docs(state): finalize feat_studies_list_trial_convergence_columns (PR #438) #439
…438) Prepend the #438 entry to Last 5 merges (noting it restored the lost PR #421 Story 1.2 columns + corrected the doc-drift); drop the now-6th entry to the older-entries pointer; update branch context + in-flight to reflect all three 2026-06-03 features merged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
Code Review
This pull request updates state.md to reflect the merge of PR #438 (feat_studies_list_trial_convergence_columns), which restores lost work on the studies list trials and convergence columns. It updates the active branch context, the list of recent merges, and the in-flight status. The reviewer pointed out a duplication issue where PR #426 (bug_llm_capability_cache_no_refresh) is listed both in the detailed recent merges and at the beginning of the older entries list, providing a suggestion to remove it from the older entries.
- **2026-06-02** — `bug_llm_capability_cache_no_refresh` (PR #426, squash-merged `432dcf59`). The OpenAI capability check ran exactly once at api startup (`main.py:94`, fire-and-forget lifespan task) + cached in Redis with a 24h TTL (`capability_check.py:48`); nothing repopulated it, so any stack up >24h silently lost all LLM-dependent capability — `POST /judgments/generate` returned `503 LLM_PROVIDER_INCAPABLE "cache miss"` until an api restart. Confirmed live at 34h uptime (zero `openai:capabilities:*` keys; `docker compose restart api` fixed it). **Fix (Option A, locked at preflight D-1):** new `read_or_recompute_capability_result()` helper reads the cache, recomputes inline via `check_capabilities()` on miss (writes back), returns `None` on empty key (preserves the `/healthz` "no key" semantic). `agent_judgments_dispatch._check_llm_preflight` opts in; `/healthz` (200ms SLO, Rule #11) + chat orchestrator stay read-only (D-5). A per-worker `asyncio.Lock` single-flight + in-lock double-checked read collapses concurrent in-worker recompute bursts to 1 probe (D-4, refined after GPT-5.5 caught the original "WEB_CONCURRENCY × probes" bound undercounting concurrent requests); defensive try/except returns `None` on unexpected failure (→ caller's existing 503 envelope, not a bare 500). Options B (background refresh) + C (stale-but-usable) rejected (D-2/D-3). Shipped via `/bug-fix --ship` → `/impl-execute --ad-hoc`. No `backend/app/` source beyond the helper + call-site swap, no migration (head stays `0022`). 7 unit tests (`TestReadOrRecomputeCapabilityResult`) + 1 integration test (`test_generate_recovers_after_capability_cache_expiry`); test-fixture monkeypatch sites updated to the new symbol. 2194 unit pass, 330 contract pass. Cross-model: Gemini 4 (1 accepted — `api_key: str | None`; 3 rejected as hunk-isolated false positives on `AsyncMock.assert_not_awaited`, stdlib since 3.8), GPT-5.5 final 2 (both accepted — the asyncio.Lock single-flight + the exception wrapper, each with a new regression test). Ride-along: `/idea-preflight` SKILL.md routing fix (no longer hard-codes `/pipeline --auto` — routes to `/bug-fix`/`/impl-execute --ad-hoc` by prefix+scope). All 12 `pr.yml` checks green.
- **2026-06-02** — `infra_smoke_reseed_runtime_budget` (PR #424, squash-merged `035d7941`). Clears the last of the three-PR Solr-CI debt chain (`infra_solr_ci_readiness` backend half → `infra_solr_smoke_stability` Solr boot → this, the reseed-runtime half). The smoke job's `demo-ubi.spec.ts` `beforeAll` reseed exceeded the 25-min `timeout-minutes` cap once Solr actually booted (AC-8 of `feat_demo_ubi_study_comparison` bounds the in-flight reseed at 1140s/~19 min hard ceiling, ~28 min worst case per §14 — Playwright + setup overhead pushed total past 25 min; PR #383 run 26790636716 hit it at 25:18). **Fix (Option A, locked at idea-preflight):** extend `ui/playwright.config.ts`'s `testIgnore` CI-gated branch by one entry (`'**/demo-ubi.spec.ts'`, the 7th alongside the 6 pre-existing demo-data-dependent specs) — the `process.env.CI ? [...] : []` ternary gates it to GHA runs, so local `make up` smoke (`CI=` unset) keeps full demo-ubi coverage. Option B (timeout bump → 35 min) rejected (D-3: <7 min margin against §14 worst case); Option C (env-var reseed scenario filter, ~2-3h multi-file) deferred per operator (D-2). New vitest regression guard `ui/src/__tests__/playwright-config-test-ignore.test.ts` (3 assertions: demo-ubi in CI branch, all 7 entries present, demo-ubi not outside the ternary). Runbook `docs/03_runbooks/smoke-solr-stability.md` §5 documents the exclusion + the reseed-runtime-vs-Solr-stability split; pr.yml + state.md stale "exceeds the cap" framing refreshed to "runtime block cleared, flip `SMOKE_TEST=true` after the §16 `playwright test --list` verification". 5 stories / 1 epic. No `backend/app/` source, no migration (head stays `0022`). §16 manual verification confirmed AC-1 (`CI=true` → 86 tests/30 files, 0 demo-ubi) + AC-2 (`CI=` unset → 110 tests/37 files, demo-ubi discovered). Cross-model: spec GPT-5.5 3 cycles (13 findings, all applied), plan GPT-5.5 3 cycles (11 findings, all applied), Gemini 2 (both accepted — `import.meta.url` path resolution + CRLF normalization), GPT-5.5 final 3 (2 accepted: §4→§5 pointer + runbook markdown links; 1 rejected: AC-7 file-shape re-raise, counter-evidence cited). All 12 `pr.yml` checks green.
_(older entries — full narrative in [`state_history.md`](state_history.md): `feat_studies_convergence_visibility` PR #421/#422, `bug/cli-seed-ubi-missing-engine-type` PR #419, `chore_template_library_expansion` PR #416, `infra_smoke_reseed_runtime_budget` PR #424, `infra_solr_smoke_stability` PR #383, `infra_solr_ci_readiness` Phase 1 PR #367, MVP2 backlog batch PR #364, `feat_study_convergence_indicator` PR #352, `feat_overnight_autopilot` PR #343, `infra_adapter_solr` PR #336, …)_
_(older entries — full narrative in [`state_history.md`](state_history.md): `bug_llm_capability_cache_no_refresh` PR #426, `infra_smoke_reseed_runtime_budget` PR #424, `feat_studies_convergence_visibility` PR #421/#422, `bug/cli-seed-ubi-missing-engine-type` PR #419, `chore_template_library_expansion` PR #416, `infra_solr_smoke_stability` PR #383, `infra_solr_ci_readiness` Phase 1 PR #367, MVP2 backlog batch PR #364, `feat_study_convergence_indicator` PR #352, `feat_overnight_autopilot` PR #343, `infra_adapter_solr` PR #336, …)_
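For readers unfamiliar with the single-flight pattern named in the PR #426 entry quoted above, here is a minimal sketch of a per-worker `asyncio.Lock` with an in-lock double-checked read. It is not the project's `read_or_recompute_capability_result()` helper: the `SingleFlightCache` class, the in-memory dict standing in for Redis, and the `probe` callable are all hypothetical names chosen for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional


class SingleFlightCache:
    """Hypothetical stand-in: an in-memory dict replaces the Redis cache."""

    def __init__(self) -> None:
        self._store: Dict[str, dict] = {}
        self._lock = asyncio.Lock()  # one lock per worker process

    async def read_or_recompute(
        self, key: str, probe: Callable[[], Awaitable[dict]]
    ) -> Optional[dict]:
        if not key:
            return None  # empty key: keep the "no key" semantic
        cached = self._store.get(key)
        if cached is not None:
            return cached  # cache hit, no lock needed
        try:
            async with self._lock:
                # Double-checked read: another coroutine may have filled
                # the cache while this one waited for the lock.
                cached = self._store.get(key)
                if cached is not None:
                    return cached
                result = await probe()
                self._store[key] = result  # write back
                return result
        except Exception:
            # Unexpected failure maps to None so the caller can answer
            # with its own error envelope instead of an unhandled error.
            return None


async def demo() -> tuple:
    cache = SingleFlightCache()
    calls = 0

    async def fake_probe() -> dict:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # simulate probe latency
        return {"capable": True}

    # Ten concurrent requests arrive while the cache is empty.
    results = await asyncio.gather(
        *(cache.read_or_recompute("demo-key", fake_probe) for _ in range(10))
    )
    return calls, len(results)
```

Running `asyncio.run(demo())` issues ten concurrent reads against an empty cache; only the first coroutine to take the lock runs the probe, and the rest find the value on the in-lock re-read.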
The entry bug_llm_capability_cache_no_refresh (PR #426) is duplicated. It is currently listed both in the "Last 5 merges" list (line 33) and at the beginning of the "older entries" list (line 34). Since it is still one of the last 5 merges, it should be removed from the older entries list. Only infra_smoke_reseed_runtime_budget (PR #424) should be prepended to the older entries list.
_(older entries — full narrative in [`state_history.md`](state_history.md): `bug_llm_capability_cache_no_refresh` PR #426, `infra_smoke_reseed_runtime_budget` PR #424, `feat_studies_convergence_visibility` PR #421/#422, `bug/cli-seed-ubi-missing-engine-type` PR #419, `chore_template_library_expansion` PR #416, `infra_solr_smoke_stability` PR #383, `infra_solr_ci_readiness` Phase 1 PR #367, MVP2 backlog batch PR #364, `feat_study_convergence_indicator` PR #352, `feat_overnight_autopilot` PR #343, `infra_adapter_solr` PR #336, …)_
_(older entries — full narrative in [`state_history.md`](state_history.md): `infra_smoke_reseed_runtime_budget` PR #424, `feat_studies_convergence_visibility` PR #421/#422, `bug/cli-seed-ubi-missing-engine-type` PR #419, `chore_template_library_expansion` PR #416, `infra_solr_smoke_stability` PR #383, `infra_solr_ci_readiness` Phase 1 PR #367, MVP2 backlog batch PR #364, `feat_study_convergence_indicator` PR #352, `feat_overnight_autopilot` PR #343, `infra_adapter_solr` PR #336, …)_
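The duplication flagged in this review could be caught mechanically. The following is a rough sketch, assuming each "Last 5 merges" bullet names its own PR as the first `PR #NNN` reference and the older-entries pointer lists one `PR #NNN` per item; the function names and the abbreviated sample strings are hypothetical, not taken from the repository.

```python
import re
from typing import List, Set

PR_REF = re.compile(r"PR #(\d+)")


def own_pr(entry: str) -> Set[int]:
    """First PR reference in a detailed entry (its own PR number)."""
    match = PR_REF.search(entry)
    return {int(match.group(1))} if match else set()


def duplicated_prs(recent_entries: List[str], older_pointer: str) -> Set[int]:
    """PR numbers that appear both as a recent entry and in the pointer."""
    recent: Set[int] = set()
    for entry in recent_entries:
        recent |= own_pr(entry)
    older = {int(n) for n in PR_REF.findall(older_pointer)}
    return recent & older


# Abbreviated, illustrative samples mirroring the shape of state.md.
recent = [
    "`feat_studies_list_trial_convergence_columns` (PR #438)",
    "`bug_llm_capability_cache_no_refresh` (PR #426). See also PR #383.",
]
older = (
    "`bug_llm_capability_cache_no_refresh` PR #426, "
    "`infra_smoke_reseed_runtime_budget` PR #424, "
    "`feat_studies_convergence_visibility` PR #421/#422"
)

print(duplicated_prs(recent, older))
```

Only the first reference per detailed entry is used because the narrative text of an entry can mention other PRs (the #424 entry cites PR #383, for example), which would otherwise produce false positives.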
Summary
Finalization for #438 (`feat_studies_list_trial_convergence_columns`, squash-merged `03976c5e`). Docs-only: `state.md` "Last 5 merges" + branch/in-flight context.
The #438 work itself already carried its doc corrections (the `state_history.md` + implemented-plan `CORRECTION` annotations about the lost PR #421 columns). This just rolls the merge into the one-page snapshot.
Test plan
`state.md` only; under the 60KB gate (23.5KB)