feat: add verification fixtures and graph.json schema validator #4
Open
Ujikintoki wants to merge 4 commits into
These are pure Python scripts that verify the Reafiner harness using mock LLM outputs.
Still working...
Add synthetic test data and tools for verifying the Reafiner harness (Level 0 trace validation + Level 1.5 evidence audit) without requiring LLM, graphify, or any agent platform.
- fixtures/synth-minimal.json: 10-node KG conforming to graphify schema
- fixtures/trace-valid*.json: valid traces for early-exit and refinement paths
- fixtures/trace-invalid-order.json: malformed trace for negative testing
- fixtures/refinement-*.txt: refinement blocks (good, ambiguous, invalid)
- scripts/validate_graph_schema.py: reusable schema conformance checker
- docs/verification-data.md: design rationale + quick-start commands
Co-Authored-By: Claude <noreply@anthropic.com>
Phase B verification: installed graphifyy 0.8.42, ran 'graphify update .' on the DeepRefine-Skill codebase (361 nodes, 646 edges), and compared against synthetic fixtures.
Changes:
- Accept 'links' as edge key (graphify uses 'links', not 'edges')
- Make input_tokens/output_tokens optional (absent in AST-only mode)
- Add .venv/ and graphify-out/ to .gitignore (local dev artifacts)
- Add docs/phase-b-report.md with full comparison and findings
Key finding: 'graphify update' is AST-only (all EXTRACTED edges). INFERRED edges require 'graphify extract' + LLM API key. Synthetic fixtures include both types, covering both scenarios for harness testing.
Co-Authored-By: Claude <noreply@anthropic.com>
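The two schema changes listed in this commit (accepting 'links' as the edge key and treating the token counters as optional) could look roughly like the sketch below. This is not the code in scripts/validate_graph_schema.py: the function name `check_graph` and the required keys (`id`, `source`, `target`) are assumptions made for illustration, and the real graphify schema may require more fields.

```python
from typing import Any

# Hypothetical required keys; the real graphify schema may differ.
REQUIRED_NODE_KEYS = {"id"}
REQUIRED_EDGE_KEYS = {"source", "target"}
# Token counters are optional because they are absent in AST-only mode.
OPTIONAL_COUNTERS = ("input_tokens", "output_tokens")


def check_graph(graph: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the graph passed."""
    problems: list[str] = []

    nodes = graph.get("nodes")
    if not isinstance(nodes, list):
        problems.append("'nodes' is missing or not a list")
        nodes = []

    # Accept 'links' (what graphify writes) or 'edges' as the edge key.
    edge_key = "links" if "links" in graph else "edges"
    edges = graph.get(edge_key)
    if not isinstance(edges, list):
        problems.append("edge list is missing: expected 'links' or 'edges'")
        edges = []

    node_ids = set()
    for i, node in enumerate(nodes):
        if not isinstance(node, dict) or not REQUIRED_NODE_KEYS.issubset(node):
            problems.append(f"nodes[{i}] lacks keys {sorted(REQUIRED_NODE_KEYS)}")
            continue
        node_ids.add(node["id"])

    for i, edge in enumerate(edges):
        if not isinstance(edge, dict) or not REQUIRED_EDGE_KEYS.issubset(edge):
            problems.append(f"{edge_key}[{i}] lacks keys {sorted(REQUIRED_EDGE_KEYS)}")
            continue
        # Every endpoint must point at a node that exists in the graph.
        for end in ("source", "target"):
            if edge[end] not in node_ids:
                problems.append(f"{edge_key}[{i}].{end} refers to unknown node {edge[end]!r}")

    # Absent counters are fine; present ones must be non-negative integers.
    for key in OPTIONAL_COUNTERS:
        if key in graph:
            value = graph[key]
            if isinstance(value, bool) or not isinstance(value, int) or value < 0:
                problems.append(f"'{key}' must be a non-negative integer when present")

    return problems
```

Returning a list of problems instead of raising on the first one lets a single run report every schema difference, which suits a comparison between synthetic fixtures and a real graph.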
- Add docs/feat-verification-fixtures.md: submission notes covering all three phases (synthetic data, real KG schema, Copilot CLI end-to-end) with quick-start guide for collaborators
- Add .github/skills/ to .gitignore (deployment artifact, not source)
Co-Authored-By: Claude <noreply@anthropic.com>
35dac75 to 49d1b49
What
Add a verification harness (fixtures + schema validator + docs) that lets anyone confirm the Python CLI and Reafiner loop work correctly in 2 minutes, no graphify required.
Why
Before: test the harness → install graphify → configure API key → generate KG → run CLI blind. No fixtures, no schema checks, no documented procedure.
After:
What's Included
Each Fixture → Python Gate
Every fixture maps to a hard-coded rule in
Quick Start (2 min)
pip install -e .
# Schema
python scripts/validate_graph_schema.py fixtures/synth-minimal.json
# Trace validation
python -m deeprefine_skill.cli loop validate --trace-file fixtures/trace-valid-early-exit.json
python -m deeprefine_skill.cli loop validate --trace-file fixtures/trace-valid.json --refinement-file fixtures/refinement-good.txt
python -m deeprefine_skill.cli loop validate --trace-file fixtures/trace-invalid-order.json # should fail
# Review + Apply gates
mkdir -p /tmp/test_kb/graphify-out
cp fixtures/synth-minimal.json /tmp/test_kb/graphify-out/graph.json
python -m deeprefine_skill.cli review --refinement-file fixtures/refinement-good.txt --trace-file fixtures/trace-valid.json --project-root /tmp/test_kb
python -m deeprefine_skill.cli apply --refinement-file fixtures/refinement-good.txt --trace-file fixtures/trace-valid.json --project-root /tmp/test_kb
python -m deeprefine_skill.cli apply --refinement-file fixtures/refinement-ambiguous.txt --trace-file fixtures/trace-valid.json --project-root /tmp/test_kb # should fail
rm -rf /tmp/test_kb
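The trace-validation commands above, including the one marked "should fail", can also be checked automatically. The following is a minimal sketch, assuming the CLI signals a failed validation with a non-zero exit code (the quick start does not state this). The helper names `run_cases` and `default_runner` are hypothetical and are not part of deeprefine_skill.

```python
import subprocess
import sys
from typing import Callable, Sequence

CLI = [sys.executable, "-m", "deeprefine_skill.cli"]

# (arguments, expected_to_pass), mirroring the trace-validation commands above.
CASES = [
    (["loop", "validate", "--trace-file", "fixtures/trace-valid-early-exit.json"], True),
    (["loop", "validate", "--trace-file", "fixtures/trace-valid.json",
      "--refinement-file", "fixtures/refinement-good.txt"], True),
    (["loop", "validate", "--trace-file", "fixtures/trace-invalid-order.json"], False),
]


def default_runner(argv: Sequence[str]) -> int:
    """Run one command and return its exit code (assumed non-zero on failure)."""
    return subprocess.run(list(argv), check=False).returncode


def run_cases(
    cases=CASES,
    runner: Callable[[Sequence[str]], int] = default_runner,
) -> list[str]:
    """Return the commands whose pass/fail outcome differs from the expectation."""
    mismatches = []
    for args, should_pass in cases:
        passed = runner(CLI + args) == 0
        if passed != should_pass:
            mismatches.append(" ".join(args))
    return mismatches
```

Calling `run_cases()` from the repository root after `pip install -e .` returns an empty list when every command behaves as annotated. The `runner` parameter exists so the expectation logic can be exercised without the CLI installed.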
Verification Summary
- Phase A: All CLI commands verified against the synthetic 10-node KG
- Phase B: Validated against graphifyy 0.8.42 (361 nodes, 646 edges); found and fixed 3 schema differences
- Phase C: End-to-end Copilot CLI run; 4 Reafiner paths confirmed: refinement, early exit, apply, LOW gate
Summary
Add synthetic test data and tools for verifying the Reafiner harness (Level 0 trace validation + Level 1.5 evidence audit) without requiring LLM, graphify, or any agent platform.
What's Included
- fixtures/synth-minimal.json
- fixtures/trace-valid-early-exit.json
- fixtures/trace-valid.json
- fixtures/trace-invalid-order.json
- fixtures/refinement-good.txt
- fixtures/refinement-ambiguous.txt
- fixtures/refinement-invalid.txt: <refinement> block → parse error
- scripts/validate_graph_schema.py
- docs/verification-data.md
How to Verify
No dependencies beyond the deeprefine CLI (already installed with pip install -e .):