[DX-4017] Skill chainlink-test-diagnosis + Improved Harness#22280
[DX-4017] Skill chainlink-test-diagnosis + Improved Harness#22280
chainlink-test-diagnosis + Improved Harness#22280Conversation
|
👋 kalverra, thanks for creating this pull request! To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team. Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks! |
|
✅ No conflicts with other open PRs targeting |
There was a problem hiding this comment.
Pull request overview
Risk Rating: MEDIUM
This PR extends the tools/test harness to better support diagnosing flaky/slow tests (including parallelized diagnose iterations) and adds persistent test DB lifecycle commands, while also adding the new chainlink-test-diagnosis skill and syncing it into .claude/skills via go generate.
Changes:
- Add skill content under
tools/test/.agents/skills/chainlink-test-diagnosisand sync intotools/test/.claude/skillsviago generate. - Enhance
diagnosewith--parallel-iterations, improved progress/output plumbing (internal/output), and more robust results directory naming/creation. - Add
setup-testdb/remove-testdbcommands and supporting persistent DB utilities.
Reviewed changes
Copilot reviewed 33 out of 34 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/test/main.go | Adds go:generate skill sync helper (--sync-skills) to copy skills into .claude/skills. |
| tools/test/internal/runner/runner.go | Refactors Diagnose execution, adds parallel-iteration runner, new progress digest output, and output plumbing. |
| tools/test/internal/runner/runner_test.go | Updates tests for new Diagnose signature/output and adds coverage for parallel iteration runner behavior. |
| tools/test/internal/runner/diagnose_results_dir.go | Includes p<N> in results dir naming for parallel iterations (with shortening phases). |
| tools/test/internal/runner/diagnose_progress.go | Improves progress rendering, adds parallel-progress aggregation, and uses slices.Reverse. |
| tools/test/internal/runner/diagnose_progress_test.go | Updates progress rendering assertions and adds coverage for parallel progress line rendering. |
| tools/test/internal/runner/analyze.go | Refactors analysis pipeline, adds iteration digests, changes log-file reporting schema (logs vs log_files), and filename truncation/hash. |
| tools/test/internal/runner/analyze_test.go | Adds DigestIterationJSONL tests and adjusts expectations due to report schema changes. |
| tools/test/internal/runner/analyze_files_test.go | Updates log materialization tests for new ProblemLog schema and adds slow/log filename truncation tests. |
| tools/test/internal/runner/analyze_bench_test.go | Adds benchmarks for Analyze/DigestIterationJSONL/AnalyzeResults using real-ish iteration logs. |
| tools/test/internal/runner/testdata/iteration-0.log.jsonl | Adds real-ish gotest JSONL data for benchmarks. |
| tools/test/internal/runner/testdata/iteration-1.log.jsonl | Adds real-ish gotest JSONL data for benchmarks. |
| tools/test/internal/runner/testdata/iteration-2.log.jsonl | Adds real-ish gotest JSONL data for benchmarks. |
| tools/test/internal/output/output.go | Introduces a centralized printer for human vs AI output and inline-progress rules. |
| tools/test/internal/output/output_test.go | Adds unit tests for the new output printer behaviors. |
| tools/test/internal/db/db.go | Refactors DB setup to accept an output printer, adds DB pool support for parallel diagnose, and introduces runner-facing Resource. |
| tools/test/internal/db/db_test.go | Updates Ensure/EnsurePool tests for new signatures and adds/adjusts pool behavior assertions. |
| tools/test/internal/db/persistent.go | Adds persistent Postgres container start + preparetest + removal helpers (docker label based). |
| tools/test/internal/db/persistent_test.go | Adds tests for docker label filter and shell export escaping. |
| tools/test/internal/config/config.go | Adds ParallelIterations config and default binding. |
| tools/test/internal/config/config_test.go | Ensures parallel-iterations flag is bound/unmarshaled correctly. |
| tools/test/internal/cmd/root.go | Updates CLI display name and skips DB pre-run for diagnose/testdb commands; wires output into DB Ensure. |
| tools/test/internal/cmd/root_test.go | Updates command path expectations and adds config validation tests for parallel-iterations. |
| tools/test/internal/cmd/run.go | Uses new output printer for teardown error reporting. |
| tools/test/internal/cmd/gotestsum.go | Uses output printer for teardown error reporting (while still handling out being nil). |
| tools/test/internal/cmd/diagnose.go | Adds --parallel-iterations, uses EnsurePool, and passes resources/output into runner. |
| tools/test/internal/cmd/testdb.go | Adds setup-testdb and remove-testdb commands (including --eval/--shell). |
| tools/test/README.md | Updates skill documentation link/name. |
| tools/test/AGENTS.md | Updates goals/rules and documents internal/output convention. |
| tools/test/.claude/skills/chainlink-test-diagnosis/SKILL.md | Adds skill doc (synced copy), currently references old report schema in places. |
| tools/test/.claude/skills/chainlink-test-diagnosis/eval-plan.md | Adds skill evaluation harness plan (synced copy). |
| tools/test/.agents/skills/chainlink-test-diagnosis/SKILL.md | Adds skill doc (source), currently references old report schema in places. |
| tools/test/.agents/skills/chainlink-test-diagnosis/eval-plan.md | Adds skill evaluation harness plan (source). |
| .gitignore | Ignores .claude/* while allowing tools/test/.claude/skills/ to be tracked. |
Areas needing scrupulous human review
runDiagnoseIterationsand DB pool lifecycle (tools/test/internal/runner/runner.go,tools/test/internal/db/db.go): concurrency/cancellation semantics and resource cleanup correctness under failures.- Report/log schema change (
tools/test/internal/runner/analyze.go): ensures any downstream consumers (including the skill docs) are updated to the newlogsshape and filename patterns.
There was a problem hiding this comment.
Pull request overview
Risk Rating: HIGH
Adds a new chainlink-test-diagnosis agent skill and significantly expands the tools/test harness to improve diagnose throughput and operator ergonomics (parallel iterations + persistent Postgres helpers), while centralizing harness-owned output behavior.
Changes:
- Adds
--parallel-iterationssupport todiagnose, including new progress rendering and iteration digests. - Introduces persistent Postgres lifecycle commands (
setup-testdb,remove-testdb) and a DB pool abstraction for parallel diagnose workers. - Adds skill files (source + generated copy), a
go:generatesync path, and refactors harness output tointernal/output.
Reviewed changes
Copilot reviewed 32 out of 33 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/test/main.go | Adds go:generate skill sync and a --sync-skills mode to copy skill content into .claude/skills. |
| tools/test/internal/runner/testdata/iteration-0.log.jsonl | Adds real-world JSONL fixture for analysis/benchmarks. |
| tools/test/internal/runner/testdata/iteration-1.log.jsonl | Adds real-world JSONL fixture for analysis/benchmarks. |
| tools/test/internal/runner/testdata/iteration-2.log.jsonl | Adds real-world JSONL fixture for analysis/benchmarks. |
| tools/test/internal/runner/runner_test.go | Updates Diagnose signature usage and adds tests for parallel iterations + output header/footer behaviors. |
| tools/test/internal/runner/runner.go | Implements parallel diagnose iteration execution, per-iteration digests, results dir creation, and output plumbing via output.Printer. |
| tools/test/internal/runner/diagnose_results_dir.go | Includes parallel-iteration count in results dir naming (when applicable). |
| tools/test/internal/runner/diagnose_progress_test.go | Updates progress-line expectations and adds tests for parallel progress rendering. |
| tools/test/internal/runner/diagnose_progress.go | Adds parallel-progress tracking/rendering and refactors count formatting helpers. |
| tools/test/internal/runner/analyze_test.go | Adds tests for DigestIterationJSONL and adjusts report-entry comparisons. |
| tools/test/internal/runner/analyze_files_test.go | Updates log-file emission expectations (new ProblemLog format) and adds filename truncation tests. |
| tools/test/internal/runner/analyze_bench_test.go | Adds benchmarks for Analyze/AnalyzeResults/Digest using real JSONL fixtures. |
| tools/test/internal/runner/analyze.go | Refactors Analyze pipeline, adds ProblemLog, implements digest + compact iteration/log emission + filename truncation. |
| tools/test/internal/output/output_test.go | Adds unit tests for the new output.Printer. |
| tools/test/internal/output/output.go | Introduces output.Printer to centralize human vs --ai-output behavior and inline-progress capability. |
| tools/test/internal/db/persistent_test.go | Adds tests for persistent DB label filtering and shell export quoting helpers. |
| tools/test/internal/db/persistent.go | Implements persistent Postgres start/remove utilities and shell-export helpers. |
| tools/test/internal/db/db_test.go | Updates DB Ensure API usage and adds tests for pool behavior and handle env. |
| tools/test/internal/db/db.go | Adds DB pool abstraction, per-handle env exposure, and refactors Ensure to accept an output.Printer. |
| tools/test/internal/config/config_test.go | Verifies parallel-iterations is bound into config. |
| tools/test/internal/config/config.go | Adds ParallelIterations config field and default. |
| tools/test/internal/cmd/testdb.go | Adds setup-testdb and remove-testdb commands. |
| tools/test/internal/cmd/run.go | Routes teardown errors through output.Printer. |
| tools/test/internal/cmd/root_test.go | Updates expected root command path and adds validation tests for parallel-iterations config. |
| tools/test/internal/cmd/root.go | Skips global DB setup for diagnose/testdb commands and passes output.Printer into DB ensure. |
| tools/test/internal/cmd/gotestsum.go | Routes teardown errors through output.Printer when available. |
| tools/test/internal/cmd/diagnose.go | Adds --parallel-iterations, uses DB pool + new Diagnose API, and validates diagnose config. |
| tools/test/README.md | Updates AI skill documentation pointer. |
| tools/test/AGENTS.md | Updates skill path and documents internal/output as the canonical harness output mechanism. |
| tools/test/.claude/skills/chainlink-test-diagnosis/SKILL.md | Adds generated Claude skill content for diagnose workflow. |
| tools/test/.agents/skills/chainlink-test-diagnosis/SKILL.md | Adds source skill content for diagnose workflow. |
| README.md | Documents the new tools/test flow and fixes a typo. |
| .gitignore | Adjusts .claude ignore rules while allowing the tools/test skill copy to be committed. |
| ) | ||
|
|
||
| func init() { | ||
| func requireANSIColor(t *testing.T) { |
There was a problem hiding this comment.
Otherwise these tests fail when run in terminals that don't support color (like ai-agent terminals)
|





/chainlink-test-diagnosisSkilltools/test/.agents/skills/chainlink-test-diagnosisskill to rundiagnose, find and fix unstable or slow tests in /chainlink.go generateto copy the skill over to.claude/skills/Example results: #22281
Improve Test Harness
--parallel-iterationstodiagnosecommand, so you can run non-CPU-intensive iterations much faster. Helpful for single tests.setup-testdbandremove-testdbcommands to launch and kill persistent test DBs