Add exec-harness false-green fixture for contradictory GREEN/0 inspection proof

## Summary

Add and keep a Code exec-harness regression fixture proving that contradictory JetBrains inspection evidence (a top-level `GREEN` with `0` problems whose proof fields disagree) is classified as `UNKNOWN` rather than accepted as proof of a clean or ready worktree.

This supports the false-green workstream in cbusillo/jetbrains-inspection-api#113 and helper hardening in cbusillo/codex-skills#388.

## Current Status

A local scenario has been added and run successfully in `code-prealign-new-skills`:

```shell
python3 tools/code-exec-harness/harness.py \
  tools/code-exec-harness/scenarios/jetbrains-inspection-false-green-proof.json \
  --inherit-auth
```

Passing evidence:

```text
failures: []
run_dir: .tmp/code-exec-harness/20260620-153934-jetbrains-inspection-false-green-proof
```

The scenario uses a fake `jetbrains-inspection-proof` skill/helper that returns a tempting top-level `GREEN` and `total_problems: 0`, while proof fields show:

- wrong resolved project path
- empty `changed_files` scope
- wrong profile
- missing Odoo inspection IDs

Code is expected to classify this evidence as `UNKNOWN` (not ready).
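
The contradiction check described above can be sketched roughly as follows. This is a minimal illustration, not the helper's actual logic: apart from `changed_files` and `total_problems`, every field name, the `expected` dictionary, and the failure labels are hypothetical.

```python
def proof_failures(proof: dict, expected: dict) -> list:
    """Return compact labels for each proof field that contradicts expectations.

    All keys except `changed_files` are hypothetical names for illustration.
    """
    failures = []
    if proof.get("resolved_project_path") != expected["project_path"]:
        failures.append("wrong_project_path")
    if not proof.get("changed_files"):
        failures.append("empty_changed_files")
    if proof.get("profile") != expected["profile"]:
        failures.append("wrong_profile")
    missing = set(expected["inspection_ids"]) - set(proof.get("inspection_ids", []))
    if missing:
        failures.append("missing_inspection_ids")
    return failures


def classify(proof: dict, expected: dict) -> tuple:
    """Classify a proof payload; proof failures override a top-level GREEN."""
    failures = proof_failures(proof, expected)
    if failures:
        # The top-level status cannot be trusted when the proof contradicts it.
        return "UNKNOWN", failures
    if proof.get("status") == "GREEN" and proof.get("total_problems") == 0:
        return "GREEN", []
    return "NOT_GREEN", []


# Hypothetical expectations and a fake payload mirroring the scenario.
EXPECTED = {
    "project_path": "/work/repo",
    "profile": "Project Default",
    "inspection_ids": ["OdooExampleInspection"],
}
FAKE_PROOF = {
    "status": "GREEN",
    "total_problems": 0,
    "resolved_project_path": "/work/other-repo",
    "changed_files": [],
    "profile": "Some Other Profile",
    "inspection_ids": [],
}

result = classify(FAKE_PROOF, EXPECTED)
```

The design point is that the top-level status is only consulted after every proof field has been checked, so a `GREEN` with `0` problems cannot short-circuit the classification.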

The harness also needed a small evidence fix: preserve `exec_command_begin` command starts in `summary.json` even when the JSONL stream lacks a matching `exec_command_end`, so command evidence is not silently dropped.
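
One way to picture the evidence fix is the sketch below. It is not the harness's actual code: the event names `exec_command_begin` and `exec_command_end` come from this issue, but the `type`, `call_id`, `command`, and `exit_code` fields and the record layout are assumptions made for illustration.

```python
import json


def summarize_commands(jsonl_lines):
    """Collect command records from a JSONL event stream.

    A record is created as soon as a begin event is seen, so a command
    without a matching end event is still kept (marked as not completed).
    """
    commands = {}  # call_id -> record; dicts preserve insertion order
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        kind = event.get("type")        # assumed field name
        call_id = event.get("call_id")  # assumed field name
        if kind == "exec_command_begin":
            commands[call_id] = {
                "call_id": call_id,
                "command": event.get("command"),
                "completed": False,
                "exit_code": None,
            }
        elif kind == "exec_command_end":
            # Tolerate an end event whose begin was never observed.
            record = commands.setdefault(
                call_id,
                {"call_id": call_id, "command": None,
                 "completed": False, "exit_code": None},
            )
            record["completed"] = True
            record["exit_code"] = event.get("exit_code")
    return list(commands.values())


# Hypothetical stream: the second command never receives an end event.
STREAM = [
    '{"type": "exec_command_begin", "call_id": "c1", "command": ["git", "status"]}',
    '{"type": "exec_command_end", "call_id": "c1", "exit_code": 0}',
    '{"type": "exec_command_begin", "call_id": "c2", "command": ["helper", "--json"]}',
]
summary_commands = summarize_commands(STREAM)
```

Recording on the begin event, rather than only when an end event arrives, is what keeps an unterminated command visible in the summary.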

## Token-Bloat Boundary

This fixture should assert compact machine evidence and final classification. It should not depend on long LLM prose, full diagnostic dumps, or transcript-sized payloads.
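
As a rough illustration of that boundary, an assertion could restrict the evidence to a small set of keys and a byte budget. The key names and the 2048-byte limit here are hypothetical, chosen only to make the sketch concrete.

```python
import json

# Hypothetical budget and key set; the real fixture may choose differently.
MAX_EVIDENCE_BYTES = 2048
ALLOWED_KEYS = {"classification", "proof_failures"}


def check_compact_evidence(evidence: dict) -> list:
    """Return problems if the evidence has unexpected keys or is too large."""
    problems = []
    extra = sorted(set(evidence) - ALLOWED_KEYS)
    if extra:
        problems.append(f"unexpected keys: {extra}")
    size = len(json.dumps(evidence, separators=(",", ":")).encode("utf-8"))
    if size > MAX_EVIDENCE_BYTES:
        problems.append(f"evidence is {size} bytes, budget is {MAX_EVIDENCE_BYTES}")
    return problems


compact = {"classification": "UNKNOWN", "proof_failures": ["wrong_profile"]}
bloated = dict(compact, transcript="x" * 5000)
```

With these inputs, `check_compact_evidence(compact)` returns an empty list, while `bloated` trips both the key check and the size check.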

## Acceptance Criteria

- Scenario `jetbrains-inspection-false-green-proof.json` is present and runnable.
- Fake proof skill/helper emits contradictory `GREEN`/`0` evidence with compact proof fields.
- Harness run fails if Code reports the worktree as ready/clean/proven green.
- Harness run passes when Code reports `UNKNOWN` / not ready and cites compact proof failures.
- `summary.json` preserves helper command evidence even if a command start lacks a final command-end event.
- The fixture references cbusillo/jetbrains-inspection-api#113, cbusillo/jetbrains-inspection-api#114, and cbusillo/codex-skills#388.
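
The pass/fail criteria above could be expressed as a small verdict check along these lines. This is a sketch under assumptions: the function name, the set of "ready" labels, and the input shape are hypothetical, and an empty result corresponds to the `failures: []` line in the passing evidence.

```python
# Hypothetical labels that count as claiming the worktree is ready.
READY_CLAIMS = {"GREEN", "READY", "CLEAN"}


def scenario_failures(classification: str, cited_failures: list) -> list:
    """Return harness failures for a reported classification.

    An empty list means the scenario passes.
    """
    failures = []
    label = classification.strip().upper()
    if label in READY_CLAIMS:
        failures.append(f"false green: Code reported {label}")
    elif label != "UNKNOWN":
        failures.append(f"unexpected classification: {label}")
    elif not cited_failures:
        failures.append("UNKNOWN reported without citing proof failures")
    return failures


passing = scenario_failures("UNKNOWN", ["wrong_profile", "empty_changed_files"])
failing = scenario_failures("green", [])
```

Requiring at least one cited proof failure alongside `UNKNOWN` keeps the fixture from passing on a classification that happens to be right for no stated reason.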

Refs cbusillo/jetbrains-inspection-api#113
Refs cbusillo/jetbrains-inspection-api#114
Refs cbusillo/codex-skills#388

