Skip to content

[DX-4017] Skill chainlink-test-diagnosis + Improved Harness#22280

Open
kalverra wants to merge 12 commits intodevelopfrom
chainlinkTestDiagnosis
Open

[DX-4017] Skill chainlink-test-diagnosis + Improved Harness#22280
kalverra wants to merge 12 commits intodevelopfrom
chainlinkTestDiagnosis

Conversation

@kalverra
Copy link
Copy Markdown
Collaborator

@kalverra kalverra commented May 1, 2026

/chainlink-test-diagnosis Skill

  • Add the tools/test/.agents/skills/chainlink-test-diagnosis skill to run diagnose, find and fix unstable or slow tests in /chainlink.
  • Use go generate to copy the skill over to .claude/skills/

Example results: #22281

Improve Test Harness

  • Adds --parallel-iterations to diagnose command, so you can run non-CPU-intensive iterations much faster. Helpful for single tests.
  • Adds setup-testdb and remove-testdb commands to launch and kill persistent test DBs

Copilot AI review requested due to automatic review settings May 1, 2026 18:18
@kalverra kalverra requested review from a team as code owners May 1, 2026 18:18
@kalverra kalverra requested a review from matYang May 1, 2026 18:18
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

👋 kalverra, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

✅ No conflicts with other open PRs targeting develop

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Risk Rating: MEDIUM

This PR extends the tools/test harness to better support diagnosing flaky/slow tests (including parallelized diagnose iterations) and adds persistent test DB lifecycle commands, while also adding the new chainlink-test-diagnosis skill and syncing it into .claude/skills via go generate.

Changes:

  • Add skill content under tools/test/.agents/skills/chainlink-test-diagnosis and sync into tools/test/.claude/skills via go generate.
  • Enhance diagnose with --parallel-iterations, improved progress/output plumbing (internal/output), and more robust results directory naming/creation.
  • Add setup-testdb / remove-testdb commands and supporting persistent DB utilities.

Reviewed changes

Copilot reviewed 33 out of 34 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tools/test/main.go Adds go:generate skill sync helper (--sync-skills) to copy skills into .claude/skills.
tools/test/internal/runner/runner.go Refactors Diagnose execution, adds parallel-iteration runner, new progress digest output, and output plumbing.
tools/test/internal/runner/runner_test.go Updates tests for new Diagnose signature/output and adds coverage for parallel iteration runner behavior.
tools/test/internal/runner/diagnose_results_dir.go Includes p<N> in results dir naming for parallel iterations (with shortening phases).
tools/test/internal/runner/diagnose_progress.go Improves progress rendering, adds parallel-progress aggregation, and uses slices.Reverse.
tools/test/internal/runner/diagnose_progress_test.go Updates progress rendering assertions and adds coverage for parallel progress line rendering.
tools/test/internal/runner/analyze.go Refactors analysis pipeline, adds iteration digests, changes log-file reporting schema (logs vs log_files), and filename truncation/hash.
tools/test/internal/runner/analyze_test.go Adds DigestIterationJSONL tests and adjusts expectations due to report schema changes.
tools/test/internal/runner/analyze_files_test.go Updates log materialization tests for new ProblemLog schema and adds slow/log filename truncation tests.
tools/test/internal/runner/analyze_bench_test.go Adds benchmarks for Analyze/DigestIterationJSONL/AnalyzeResults using real-ish iteration logs.
tools/test/internal/runner/testdata/iteration-0.log.jsonl Adds real-ish gotest JSONL data for benchmarks.
tools/test/internal/runner/testdata/iteration-1.log.jsonl Adds real-ish gotest JSONL data for benchmarks.
tools/test/internal/runner/testdata/iteration-2.log.jsonl Adds real-ish gotest JSONL data for benchmarks.
tools/test/internal/output/output.go Introduces a centralized printer for human vs AI output and inline-progress rules.
tools/test/internal/output/output_test.go Adds unit tests for the new output printer behaviors.
tools/test/internal/db/db.go Refactors DB setup to accept an output printer, adds DB pool support for parallel diagnose, and introduces runner-facing Resource.
tools/test/internal/db/db_test.go Updates Ensure/EnsurePool tests for new signatures and adds/adjusts pool behavior assertions.
tools/test/internal/db/persistent.go Adds persistent Postgres container start + preparetest + removal helpers (docker label based).
tools/test/internal/db/persistent_test.go Adds tests for docker label filter and shell export escaping.
tools/test/internal/config/config.go Adds ParallelIterations config and default binding.
tools/test/internal/config/config_test.go Ensures parallel-iterations flag is bound/unmarshaled correctly.
tools/test/internal/cmd/root.go Updates CLI display name and skips DB pre-run for diagnose/testdb commands; wires output into DB Ensure.
tools/test/internal/cmd/root_test.go Updates command path expectations and adds config validation tests for parallel-iterations.
tools/test/internal/cmd/run.go Uses new output printer for teardown error reporting.
tools/test/internal/cmd/gotestsum.go Uses output printer for teardown error reporting (while still handling out being nil).
tools/test/internal/cmd/diagnose.go Adds --parallel-iterations, uses EnsurePool, and passes resources/output into runner.
tools/test/internal/cmd/testdb.go Adds setup-testdb and remove-testdb commands (including --eval/--shell).
tools/test/README.md Updates skill documentation link/name.
tools/test/AGENTS.md Updates goals/rules and documents internal/output convention.
tools/test/.claude/skills/chainlink-test-diagnosis/SKILL.md Adds skill doc (synced copy), currently references old report schema in places.
tools/test/.claude/skills/chainlink-test-diagnosis/eval-plan.md Adds skill evaluation harness plan (synced copy).
tools/test/.agents/skills/chainlink-test-diagnosis/SKILL.md Adds skill doc (source), currently references old report schema in places.
tools/test/.agents/skills/chainlink-test-diagnosis/eval-plan.md Adds skill evaluation harness plan (source).
.gitignore Ignores .claude/* while allowing tools/test/.claude/skills/ to be tracked.

Areas needing scrupulous human review

  • runDiagnoseIterations and DB pool lifecycle (tools/test/internal/runner/runner.go, tools/test/internal/db/db.go): concurrency/cancellation semantics and resource cleanup correctness under failures.
  • Report/log schema change (tools/test/internal/runner/analyze.go): ensures any downstream consumers (including the skill docs) are updated to the new logs shape and filename patterns.

Comment thread tools/test/internal/db/db_test.go Outdated
Comment thread tools/test/.claude/skills/chainlink-test-diagnosis/SKILL.md
Comment thread tools/test/.agents/skills/chainlink-test-diagnosis/SKILL.md
Comment thread tools/test/main.go
Comment thread tools/test/main.go Outdated
@trunk-io
Copy link
Copy Markdown

trunk-io Bot commented May 1, 2026

Static BadgeStatic BadgeStatic BadgeStatic Badge

View Full Report ↗︎Docs

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Risk Rating: HIGH

Adds a new chainlink-test-diagnosis agent skill and significantly expands the tools/test harness to improve diagnose throughput and operator ergonomics (parallel iterations + persistent Postgres helpers), while centralizing harness-owned output behavior.

Changes:

  • Adds --parallel-iterations support to diagnose, including new progress rendering and iteration digests.
  • Introduces persistent Postgres lifecycle commands (setup-testdb, remove-testdb) and a DB pool abstraction for parallel diagnose workers.
  • Adds skill files (source + generated copy), a go:generate sync path, and refactors harness output to internal/output.

Reviewed changes

Copilot reviewed 32 out of 33 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tools/test/main.go Adds go:generate skill sync and a --sync-skills mode to copy skill content into .claude/skills.
tools/test/internal/runner/testdata/iteration-0.log.jsonl Adds real-world JSONL fixture for analysis/benchmarks.
tools/test/internal/runner/testdata/iteration-1.log.jsonl Adds real-world JSONL fixture for analysis/benchmarks.
tools/test/internal/runner/testdata/iteration-2.log.jsonl Adds real-world JSONL fixture for analysis/benchmarks.
tools/test/internal/runner/runner_test.go Updates Diagnose signature usage and adds tests for parallel iterations + output header/footer behaviors.
tools/test/internal/runner/runner.go Implements parallel diagnose iteration execution, per-iteration digests, results dir creation, and output plumbing via output.Printer.
tools/test/internal/runner/diagnose_results_dir.go Includes parallel-iteration count in results dir naming (when applicable).
tools/test/internal/runner/diagnose_progress_test.go Updates progress-line expectations and adds tests for parallel progress rendering.
tools/test/internal/runner/diagnose_progress.go Adds parallel-progress tracking/rendering and refactors count formatting helpers.
tools/test/internal/runner/analyze_test.go Adds tests for DigestIterationJSONL and adjusts report-entry comparisons.
tools/test/internal/runner/analyze_files_test.go Updates log-file emission expectations (new ProblemLog format) and adds filename truncation tests.
tools/test/internal/runner/analyze_bench_test.go Adds benchmarks for Analyze/AnalyzeResults/Digest using real JSONL fixtures.
tools/test/internal/runner/analyze.go Refactors Analyze pipeline, adds ProblemLog, implements digest + compact iteration/log emission + filename truncation.
tools/test/internal/output/output_test.go Adds unit tests for the new output.Printer.
tools/test/internal/output/output.go Introduces output.Printer to centralize human vs --ai-output behavior and inline-progress capability.
tools/test/internal/db/persistent_test.go Adds tests for persistent DB label filtering and shell export quoting helpers.
tools/test/internal/db/persistent.go Implements persistent Postgres start/remove utilities and shell-export helpers.
tools/test/internal/db/db_test.go Updates DB Ensure API usage and adds tests for pool behavior and handle env.
tools/test/internal/db/db.go Adds DB pool abstraction, per-handle env exposure, and refactors Ensure to accept an output.Printer.
tools/test/internal/config/config_test.go Verifies parallel-iterations is bound into config.
tools/test/internal/config/config.go Adds ParallelIterations config field and default.
tools/test/internal/cmd/testdb.go Adds setup-testdb and remove-testdb commands.
tools/test/internal/cmd/run.go Routes teardown errors through output.Printer.
tools/test/internal/cmd/root_test.go Updates expected root command path and adds validation tests for parallel-iterations config.
tools/test/internal/cmd/root.go Skips global DB setup for diagnose/testdb commands and passes output.Printer into DB ensure.
tools/test/internal/cmd/gotestsum.go Routes teardown errors through output.Printer when available.
tools/test/internal/cmd/diagnose.go Adds --parallel-iterations, uses DB pool + new Diagnose API, and validates diagnose config.
tools/test/README.md Updates AI skill documentation pointer.
tools/test/AGENTS.md Updates skill path and documents internal/output as the canonical harness output mechanism.
tools/test/.claude/skills/chainlink-test-diagnosis/SKILL.md Adds generated Claude skill content for diagnose workflow.
tools/test/.agents/skills/chainlink-test-diagnosis/SKILL.md Adds source skill content for diagnose workflow.
README.md Documents the new tools/test flow and fixes a typo.
.gitignore Adjusts .claude ignore rules while allowing the tools/test skill copy to be committed.

Comment thread tools/test/internal/cmd/diagnose.go
Comment thread tools/test/main.go Outdated
Comment thread tools/test/internal/runner/analyze.go Outdated
)

func init() {
func requireANSIColor(t *testing.T) {
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise these tests fail when run in terminals that don't support color (like ai-agent terminals)

@cl-sonarqube-production
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants