Merged
1 change: 1 addition & 0 deletions CLAUDE.md
@@ -161,6 +161,7 @@ Full rationale in `docs/DESIGN.md`.
- **Cycle detection covers both chains**: `CircularImportError` for the import chain, `CircularImplementationError` for the implementation chain.
- **FK constraints scope evidence from implementation children**: SVCs/MVRs/annotations referencing out-of-scope requirements are rejected by SQLite FK checks on insert — no explicit filtering needed.
- **Test results need explicit scoping**: no FK (keyed by FQN), so a scope check is required when inserting test results from implementation children.
- **The requirement "complete" verdict has ONE source of truth**: `StatisticsService`'s per-requirement computation (producing `RequirementStatus`). Every consumer — `status`/`report`/`export`, the MCP tools (`get_status`, `get_requirement_status`, `get_requirements_status`), and LSP — MUST derive completeness from that same computation. Do NOT re-implement "is this requirement met / are its automated tests satisfied" anywhere else (e.g. a parallel helper in `common/queries/details.py`). Two parallel verdict paths silently drifted and caused bug #411 (consolidation tracked in #412); a private re-derivation is the regression to guard against. When the verdict logic must be reused, extract/call the shared predicate — never copy its traversal.

## Key Conventions

18 changes: 11 additions & 7 deletions docs/modules/ROOT/pages/mcp.adoc
@@ -168,13 +168,16 @@ Overall traceability status across all requirements — completion counts, test

==== `get_requirement_status`

Quick status for a single requirement.
Status for a single requirement, derived from the same verdict computation as the `status` CLI
command — `get_status`, `get_requirement_status`, and `get_requirements_status` always agree.

*Parameters:*

* `id` _(string, required)_
* `include_post_build` _(boolean, optional, default `false`)_ — also scope to post-build-phase
SVCs, for parity with `status --with-post-tests`

*Returns:* `{ id, lifecycle_state, implementation, test_summary: {passed, failed, skipped, missing}, meets_requirements }`
*Returns:* `{ id, lifecycle_state, completed, implementations, implementation_type, automated_tests: {total, passed, failed, skipped, missing, not_applicable}, manual_tests: {total, passed, failed, skipped, missing, not_applicable} }`

==== `get_requirements_status`

@@ -185,13 +188,14 @@ requirements without N+1 individual calls.
*Parameters:*

* `urn` _(string, optional)_ — scope to a single project node
* `include_post_build` _(boolean, optional, default `false`)_ — same as `get_requirement_status`

*Returns:* array of `{ id, urn, lifecycle_state, implementation, test_summary, meets_requirements }`
*Returns:* array of `{ id, urn, lifecycle_state, completed, implementations, implementation_type, automated_tests, manual_tests }`

== Example: Finding Incomplete Requirements

The following client-side filter finds requirements that have started but are not yet done — they
have an implementation and at least one passing test, but `meets_requirements` is still false
have an implementation and at least one passing automated test, but are still not `completed`
(e.g. due to missing or failing tests for other SVCs):

[source,python]
@@ -200,9 +204,9 @@ statuses = get_requirements_status()

in_progress = [
r for r in statuses
if not r["meets_requirements"]
and r["implementation"] != "not_implemented"
and r["test_summary"]["passed"] > 0
if not r["completed"]
and r["implementation_type"] != "N/A"
and r["automated_tests"]["passed"] > 0
]
----

6 changes: 6 additions & 0 deletions docs/reqstool/requirements.yml
@@ -242,6 +242,12 @@ requirements:
description: The system shall report a clear, actionable error when the optional dependencies required for the MCP server are not installed.
categories: ["reliability"]
revision: "0.11.0"
- id: MCP_0005
title: MCP status tool verdict/shape consistency
significance: shall
description: The system shall report an identical completion verdict and output structure for a given requirement across the status CLI command and the MCP get_requirement_status and get_requirements_status tools, in both build-only and post-build scoping modes, derived from a single shared verdict computation and a single shared serializer.
categories: ["functional-suitability"]
revision: "0.11.0"

# --- data-sources capability (derived from openspec/specs/data-sources) ---
- id: SOURCE_0001
6 changes: 6 additions & 0 deletions docs/reqstool/software_verification_cases.yml
@@ -236,6 +236,12 @@ cases:
description: "GIVEN the MCP dependencies are not installed WHEN the command runs THEN it reports how to install them and does not start the server"
verification: automated-test
revision: "0.11.0"
- id: SVC_MCP_0005
requirement_ids: ["MCP_0005"]
title: "MCP status tools agree with status command"
description: "GIVEN the same dataset WHEN get_requirement_status and get_requirements_status are called THEN they report the same completed verdict and output structure as the status command for the same requirement, in both build-only and post-build modes"
verification: automated-test
revision: "0.11.0"

# --- data-sources ---
- id: SVC_SOURCE_0001
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-06-21
112 changes: 112 additions & 0 deletions openspec/changes/consolidate-requirement-verdict/design.md
@@ -0,0 +1,112 @@
## Context

Two code paths compute whether a requirement is "complete":

- `StatisticsService._calculate_requirement_stats` (`src/reqstool/services/statistics_service.py`)
produces a `RequirementStatus` (with `TestStats` for automated and manual evidence) and is the
authoritative path behind `status`, `report`, `export`, and the MCP `get_status` tool.
- `common/queries/details.py` (`_compute_meets` + `_build_automated_test_summary`) re-derives the
same verdict via a different traversal and powers the MCP `get_requirement_status` and
`get_requirements_status` tools.

The two are not equivalent, even after #411's patch:

| Aspect | `StatisticsService` | `details.py` |
| --- | --- | --- |
| SVC phase scoping | build-phase only unless `include_post_build` | all SVCs, no phase filter |
| "applies at all" gate | `not_applicable` when no SVC expects automated/MVR evidence | always requires all-passing |
| automated test source | walks test annotations, matches by FQN, marks MISSING per annotation | reads `get_test_results_for_svc` directly |
| no-qualifying-SVC | `completed=False` unless some SVC expects evidence | only requires a non-empty SVC list |

`get_status` already uses `StatisticsService` directly, so the inconsistency is isolated to the
two per-requirement/scoped MCP tools. One fact widens the solution space: MCP tool output carries
no backwards-compatibility requirement, so the fix can change both the schema and the verdict values.

The goal is stronger than "make the two agree": there must be **one** place that computes the
"complete" verdict and **one** place that serializes it, so that the CLI (`status`), the MCP
tools, and any future LSP completion display all return identical results for identical input.

## Goals / Non-Goals

**Goals:**
- One per-requirement verdict computation, called by both `StatisticsService` and `details.py`.
- One per-requirement serializer, called by both `StatisticsService.to_status_dict()` and the MCP
status tools — same verdict *and* same output shape, not two shapes that happen to match.
- `status` (CLI), the MCP status tools, and any future LSP completion display report the same
`completed` verdict and output structure for the same input, in **both** build-only and
`--with-post-tests` (post-build) modes.
- Delete `_compute_meets` and `_build_automated_test_summary`.

**Non-Goals:**
- Changing the `status`, `report`, or `export` command behavior or output.
- Changing `get_status` (already unified).
- Adding new MCP tools or changing transports / dataset resolution.
- Implementing an LSP completion display now (none exists yet) — see the forward-constraint below.

## Decisions

### Extract a per-requirement predicate that queries the repository directly

Introduce a single function `compute_requirement_status(req, repo, *, include_post_build) ->
RequirementStatus` that computes the verdict for one requirement by **querying the repository**
through its scoped, index-backed per-requirement getters (`get_svcs_for_req`,
`get_annotations_impls_for_req`, `get_annotations_tests_for_svc`, `get_test_results_for_svc`,
`get_effective_mvr_for_svc`). `StatisticsService._calculate_requirement_stats` calls it inside its
loop; `details.py` calls it per id. `StatisticsService` keeps owning global aggregation
(`_calculate_global_totals`) and totals accumulation (`_update_requirement_totals`), fed by the
`RequirementStatus` the predicate returns.

- **Why query the repo rather than thread a pre-fetched data bundle**: the repository layer exists
precisely so business logic asks the database for what it needs. Passing the four bulk tables
(`get_all_svcs`, `get_annotations_impls`, `get_annotations_tests`, `get_automated_test_results`)
into the predicate would leak the repo's job onto every caller and couple the signature to
`StatisticsService`'s fetch strategy.
- **Why this is not a perf regression**: the per-req getters are backed by primary keys and FK
indexes (`schema.py`) on an in-memory SQLite database; total work across a `status` run is
comparable to today's four bulk `SELECT *` calls. If query volume ever matters at scale, the fix
is repository-level caching — a separate concern, not a reason to complicate this signature.
- **Trade-off**: requires untangling `_calculate_requirement_stats` from
`_update_requirement_totals`, which today run in the same pass.
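
The shape of this decision can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the getter signatures, the `phase` values, the `InMemoryRepo` fake, the `aggregate`/`lookup` callers, and the "all results passed" rule are all invented for the example, and the predicate takes a bare `req_id` where the real one takes `req`. The point is only that one function decides the verdict by asking the repository, and both callers go through it.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class RequirementStatus:
    """Simplified stand-in: the real type also carries per-evidence stats."""
    req_id: str
    completed: bool


class Repo(Protocol):
    """Hypothetical signatures for two of the scoped per-req getters."""
    def get_svcs_for_req(self, req_id: str) -> list[dict]: ...
    def get_test_results_for_svc(self, svc_id: str) -> list[str]: ...


def compute_requirement_status(
    req_id: str, repo: Repo, *, include_post_build: bool
) -> RequirementStatus:
    """The one place the verdict is decided (toy rule, not the real one)."""
    svcs = [
        svc for svc in repo.get_svcs_for_req(req_id)
        if include_post_build or svc["phase"] == "build"
    ]
    results = [r for svc in svcs for r in repo.get_test_results_for_svc(svc["id"])]
    completed = bool(results) and all(r == "passed" for r in results)
    return RequirementStatus(req_id, completed)


def aggregate(req_ids, repo, *, include_post_build=False):
    """Aggregating caller (the StatisticsService role): loop, then total."""
    statuses = [
        compute_requirement_status(r, repo, include_post_build=include_post_build)
        for r in req_ids
    ]
    return statuses, sum(s.completed for s in statuses)


def lookup(req_id, repo, *, include_post_build=False):
    """Per-id caller (the details.py role): same predicate, no re-derivation."""
    return compute_requirement_status(req_id, repo, include_post_build=include_post_build)


class InMemoryRepo:
    """Dict-backed fake standing in for the SQLite-backed repository."""
    def __init__(self, svcs, results):
        self._svcs, self._results = svcs, results

    def get_svcs_for_req(self, req_id):
        return self._svcs.get(req_id, [])

    def get_test_results_for_svc(self, svc_id):
        return self._results.get(svc_id, [])


# Invented data: one build-phase SVC that passes, one post-build SVC that fails.
repo = InMemoryRepo(
    svcs={"REQ_1": [{"id": "SVC_A", "phase": "build"},
                    {"id": "SVC_B", "phase": "post-build"}]},
    results={"SVC_A": ["passed"], "SVC_B": ["failed"]},
)
```

Because the predicate depends only on the getter interface, a caller never needs to know how `StatisticsService` fetches data, and swapping in a cached repository later would not change the signature.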

### One shared per-requirement serializer

Extract `_requirement_to_dict(status: RequirementStatus) -> dict` (the per-requirement body of
`StatisticsService.to_status_dict()`, producing `completed`, `implementation_type`,
`automated_tests`/`manual_tests` with `total` and `not_applicable`). `to_status_dict()` calls it,
and the `details.py` MCP status functions call it. No mapping back to the old `meets_requirements`
/ flat `test_summary` dict.

- **Why**: a unified verdict that is serialized two different ways still diverges from the
consumer's point of view — same value, different JSON — which is how the original drift began.
Sharing the serializer keeps all status surfaces on one schema by construction, not by
coincidence. Backwards compatibility is not required (project decision), so preserving the legacy
dict would only perpetuate a second shape.
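
A hedged sketch of what "one serializer, two callers" might look like. The key names follow the documented MCP return shape; everything else is an assumption made for illustration: the dataclass field names and types, `EvidenceStats` (a stand-in for `TestStats`), treating `not_applicable` as a boolean, the `"requirements"` wrapper key, and the `mcp_get_requirement_status` caller.

```python
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class EvidenceStats:
    """Hypothetical stand-in for TestStats; field types are assumed."""
    total: int = 0
    passed: int = 0
    failed: int = 0
    skipped: int = 0
    missing: int = 0
    not_applicable: bool = False


@dataclass(frozen=True)
class RequirementStatus:
    """Assumed field layout, mirroring the documented output keys."""
    id: str
    lifecycle_state: str
    completed: bool
    implementations: int
    implementation_type: str
    automated_tests: EvidenceStats = field(default_factory=EvidenceStats)
    manual_tests: EvidenceStats = field(default_factory=EvidenceStats)


def _requirement_to_dict(status: RequirementStatus) -> dict:
    """The only place a RequirementStatus is turned into output."""
    return {
        "id": status.id,
        "lifecycle_state": status.lifecycle_state,
        "completed": status.completed,
        "implementations": status.implementations,
        "implementation_type": status.implementation_type,
        "automated_tests": asdict(status.automated_tests),
        "manual_tests": asdict(status.manual_tests),
    }


def to_status_dict(statuses: list[RequirementStatus]) -> dict:
    """CLI-side caller; the wrapper key is invented for this sketch."""
    return {"requirements": [_requirement_to_dict(s) for s in statuses]}


def mcp_get_requirement_status(status: RequirementStatus) -> dict:
    """MCP-side caller: returns the shared dict unchanged."""
    return _requirement_to_dict(status)


example = RequirementStatus(
    id="REQ_1",
    lifecycle_state="effective",
    completed=True,
    implementations=1,
    implementation_type="in-code",
    automated_tests=EvidenceStats(total=2, passed=2),
)
```

With this arrangement a new output key is added in exactly one function, so the CLI and MCP shapes cannot diverge without someone deliberately bypassing `_requirement_to_dict`.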

### Expose the post-build scoping flag on every status surface

The predicate's `include_post_build` flag is plumbed through to the MCP tools as an optional
parameter (default `False`, matching the `status` default) so the MCP tools have parity with
`status --with-post-tests`. The same parameter is the contract for any future LSP completion
display. This closes the last verdict-divergence gap: CLI = MCP = LSP in **both** modes, not just
the default.

### LSP is a forward-constraint, not implemented here

There is no LSP completion display today (no verdict code in `src/reqstool/lsp/`). This change does
not add one. It does bind the future: when LSP gains a completion display it MUST call
`compute_requirement_status` + `_requirement_to_dict` (with the same `include_post_build`
contract) and MUST NOT re-derive the verdict. `MCP_0005` is worded to cover all status surfaces so
the third consumer cannot silently fork later. This mirrors the convention note in `CLAUDE.md`.

## Risks / Trade-offs

- **Untangling totals from per-req stats introduces regressions in `status`/`report`/`export`** →
Mitigation: keep totals accumulation in `StatisticsService`; the extracted predicate returns the
same `RequirementStatus` the loop already builds, so totals read identical inputs. Guard with the
existing statistics unit tests and the CLAUDE.md regression smoke diffs (must be byte-identical for
`status`/`report`).
- **MCP verdict/shape change surprises consumers** → Mitigation: documented as intentional BREAKING
in the proposal; the new values are the correct, `status`-consistent ones.
- **Hidden behavioral difference between the two old paths becomes visible** → Mitigation: add MCP
tests asserting `get_requirement_status` / `get_requirements_status` agree with
`StatisticsService` on the same fixtures (including the `REQ_ext002_300` divergence case).
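
The agreement tests in the last mitigation could take roughly the following form. This is a sketch under assumptions, not the project's test: `Surface`, `find_disagreements`, the `shared`/`drifted` functions, and the `VERDICTS` data are hypothetical. It compares every status surface for every requirement in both scoping modes, and the `drifted` surface imitates a parallel path with no phase filter to show the kind of mismatch the check would catch.

```python
from typing import Callable

# A "surface" maps (req_id, include_post_build) to a serialized status dict.
Surface = Callable[[str, bool], dict]


def find_disagreements(surfaces: dict[str, Surface], req_ids: list[str]) -> list[tuple]:
    """Return (req_id, include_post_build, outputs) wherever surfaces differ."""
    mismatches = []
    for req_id in req_ids:
        for include_post_build in (False, True):  # build-only, then post-build
            outputs = {
                name: surface(req_id, include_post_build)
                for name, surface in surfaces.items()
            }
            first = next(iter(outputs.values()), None)
            if any(out != first for out in outputs.values()):
                mismatches.append((req_id, include_post_build, outputs))
    return mismatches


# Invented data: the verdict per requirement in build-only (False) and
# post-build (True) mode.
VERDICTS = {"REQ_A": {False: True, True: False}}


def shared(req_id: str, include_post_build: bool) -> dict:
    """A surface that honors the scoping flag."""
    return {"id": req_id, "completed": VERDICTS[req_id][include_post_build]}


def drifted(req_id: str, include_post_build: bool) -> dict:
    """A surface that ignores the flag and always considers all SVCs."""
    return {"id": req_id, "completed": VERDICTS[req_id][True]}


agreeing = find_disagreements({"cli": shared, "mcp": shared}, ["REQ_A"])
diverging = find_disagreements({"cli": shared, "mcp": drifted}, ["REQ_A"])
```

Here `agreeing` is empty, while `diverging` holds a single entry for `REQ_A` in build-only mode, which is the mode where ignoring the phase filter changes the verdict.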
61 changes: 61 additions & 0 deletions openspec/changes/consolidate-requirement-verdict/proposal.md
@@ -0,0 +1,61 @@
## Why

The "is this requirement complete?" verdict is computed by two independent code paths that
have already drifted once: `StatisticsService` (powering `status`/`report`/`export` and the
MCP `get_status` tool) and a separate set of helpers in `common/queries/details.py`
(`_compute_meets`, `_build_automated_test_summary`) powering the MCP `get_requirement_status`
and `get_requirements_status` tools. Issue #411 patched a symptom of that drift; this change
removes the second implementation so the two can never disagree again.

## What Changes

- Extract a single per-requirement verdict computation
(`compute_requirement_status(req, repo, *, include_post_build)`) that produces one
`RequirementStatus` value, used by both `StatisticsService` (in its per-requirement loop) and
the `details.py` MCP status functions. The predicate queries the repository through its scoped
per-req getters rather than reusing bulk fetches, keeping the repository layer as the data
boundary.
- Extract a single per-requirement serializer (`_requirement_to_dict`) called by both
`StatisticsService.to_status_dict()` and the MCP status functions, so all status surfaces share
one output shape by construction (not two shapes that happen to match).
- Delete the parallel predicate from `details.py` (`_compute_meets`,
`_build_automated_test_summary`).
- Expose `include_post_build` (default `False`) on the MCP `get_requirement_status` /
`get_requirements_status` tools, for parity with `status --with-post-tests`, so CLI and MCP
agree in both build-only and post-build modes.
- **Forward-constraint (LSP):** no LSP completion display exists today and none is added here, but
any future one MUST consume `compute_requirement_status` + `_requirement_to_dict` and MUST NOT
re-derive the verdict. `MCP_0005` is worded to cover all status surfaces (CLI, MCP, LSP).
- **BREAKING (MCP output):** the MCP `get_requirement_status` and `get_requirements_status`
tools emit the unified status shape directly (`completed`, `implementation_type`,
`automated_tests`/`manual_tests` objects with `total` and `not_applicable`), replacing the
old `meets_requirements` / flat `test_summary` shape. Backwards compatibility for MCP client
output is explicitly not required.
- **BREAKING (verdict values):** routing the MCP tools through the real predicate changes
reported verdicts for requirements with post-build-phase-only SVCs, "not applicable" cases,
and test annotations without recorded executions — these now match the `status` command
exactly. The MCP tools default to build-phase-only scoping to match the `status` default.

## Capabilities

### New Capabilities
<!-- none -->

### Modified Capabilities
- `mcp`: adds a requirement that all per-requirement status surfaces (the `status` CLI, the MCP
status tools, and any future LSP completion display) report an identical completion verdict and
output structure for the same input, derived from a single shared verdict computation and a
single shared serializer, in both build-only and post-build scoping modes.

## Impact

- `src/reqstool/common/queries/details.py` — removes the duplicate predicate; status functions
delegate to the shared computation.
- `src/reqstool/services/statistics_service.py` — per-requirement verdict logic extracted so it
can be shared (totals accumulation stays here).
- `src/reqstool/mcp/server.py` — `get_requirement_status` / `get_requirements_status` tool
output shape changes.
- reqstool SSOT (`docs/reqstool/requirements.yml`, `software_verification_cases.yml`) — new
`MCP_0005` requirement and `SVC_MCP_0005`.
- MCP clients consuming `get_requirement_status` / `get_requirements_status` — output schema and
some verdict values change (intentional).
@@ -0,0 +1,7 @@
## ADDED Requirements

### Requirement: MCP_0005
The system SHALL implement MCP_0005.

#### Scenario: SVC_MCP_0005
The system SHALL pass SVC_MCP_0005.
32 changes: 32 additions & 0 deletions openspec/changes/consolidate-requirement-verdict/tasks.md
@@ -0,0 +1,32 @@
## 1. reqstool SSOT

- [x] 1.1 Add `MCP_0005` (per-requirement status tool verdict/shape consistency) to `docs/reqstool/requirements.yml` under the mcp capability block
- [x] 1.2 Add `SVC_MCP_0005` (verifies `get_requirement_status`/`get_requirements_status` agree with `status`) to `docs/reqstool/software_verification_cases.yml`
- [x] 1.3 Run `openspec validate consolidate-requirement-verdict --type change --strict` and confirm it passes

## 2. Extract the shared verdict predicate

- [x] 2.1 Add `compute_requirement_status(req, repo, *, include_post_build) -> RequirementStatus` that encapsulates the per-requirement implementation/automated/manual verdict logic currently in `StatisticsService._calculate_requirement_stats`. It MUST obtain its data by querying the repository through the scoped per-req getters (`get_svcs_for_req`, `get_annotations_impls_for_req`, `get_annotations_tests_for_svc`, `get_test_results_for_svc`, `get_effective_mvr_for_svc`) — do NOT thread pre-fetched bulk tables through the signature
- [x] 2.2 Refactor `StatisticsService._calculate_requirement_stats` to call the extracted predicate, keeping `_calculate_global_totals` and `_update_requirement_totals` (global aggregation + totals accumulation) in `StatisticsService`
- [x] 2.3 Verify `status`/`report`/`export` output is unchanged (statistics unit tests pass; CLAUDE.md regression smoke diffs are byte-identical)

## 3. Share the serializer and route MCP status tools through the predicate

- [x] 3.1 Extract `_requirement_to_dict(status: RequirementStatus) -> dict` (the per-requirement body of `to_status_dict()`: `completed`, `implementation_type`, `automated_tests`/`manual_tests` with `total` and `not_applicable`) and make `to_status_dict()` call it
- [x] 3.2 Rewrite `get_requirement_status` / `get_requirements_status_all` in `details.py` to call `compute_requirement_status` and serialize via the shared `_requirement_to_dict` — same code, not a re-implemented matching shape
- [x] 3.3 Expose `include_post_build` (default `False`) as an optional parameter on the MCP `get_requirement_status` / `get_requirements_status` tools, for parity with `status --with-post-tests`
- [x] 3.4 Delete `_compute_meets` and `_build_automated_test_summary` from `details.py`
- [x] 3.5 Add `@Requirements("MCP_0005")` to the implementing function(s) for the consolidated MCP status path

## 4. Tests

- [x] 4.1 Add a test asserting `get_requirement_status` / `get_requirements_status` agree with `StatisticsService` per requirement on the `test_standard/baseline/ms-001` fixture, covering the previously divergent `REQ_ext002_300`
- [x] 4.2 Assert agreement in **both** modes: `include_post_build=False` (default) and `True` (parity with `status --with-post-tests`)
- [x] 4.3 Add `@SVCs("SVC_MCP_0005")` to the test method from 4.1
- [x] 4.4 Update any existing tests asserting the old `meets_requirements` / flat `test_summary` MCP shape to the new shape

## 5. Verification

- [x] 5.1 Run `hatch run dev:pytest --cov=reqstool` and `hatch run dev:flake8`
- [x] 5.2 Run `reqstool status local -p docs/reqstool` (via `hatch run python src/reqstool/command.py`) and confirm all requirements complete with `SVC_MCP_0005` covered
- [x] 5.3 Run `openspec validate --all --strict`
20 changes: 20 additions & 0 deletions openspec/config.yaml
@@ -0,0 +1,20 @@
schema: spec-driven

# Project context (optional)
# This is shown to AI when creating artifacts.
# Add your tech stack, conventions, style guides, domain knowledge, etc.
# Example:
# context: |
# Tech stack: TypeScript, React, Node.js
# We use conventional commits
# Domain: e-commerce platform

# Per-artifact rules (optional)
# Add custom rules for specific artifacts.
# Example:
# rules:
# proposal:
# - Keep proposals under 500 words
# - Always include a "Non-goals" section
# tasks:
# - Break tasks into chunks of max 2 hours