Skip to content

PDX-490: feat(mcp) — error_category + retryable on test-run failures (Thread D)#186

Merged
mrdailey99 merged 2 commits into
developfrom
feature/PDX-490-error-category-retryable
May 20, 2026
Merged

PDX-490: feat(mcp) — error_category + retryable on test-run failures (Thread D)#186
mrdailey99 merged 2 commits into
developfrom
feature/PDX-490-error-category-retryable

Conversation

@mrdailey99
Copy link
Copy Markdown
Collaborator

Summary

Adds two optional, additive fields to both provar_automation_testrun.steps[] (JUnitStepResult) and provar_testrun_rca.failures[] (FailureReport):

  • error_category — one of INFRASTRUCTURE | ASSERTION | LOCATOR | TIMEOUT | OTHER. Absent when no known pattern matches.
  • retryabletrue only for INFRASTRUCTURE (Connection reset / ECONNRESET / socket hang up / Failed to read client socket message) and TIMEOUT (TimeoutException). false for LOCATOR (NoSuchElementException), ASSERTION (AssertionException), and OTHER (SessionNotCreatedException / WebDriverException / ClassNotFoundException / LicenseException / InvalidPasswordException). Absent when error_category is absent.

Identical classifier logic lives in both files (classifyErrorCategory in rcaTools.ts, classifyStepErrorCategory in antTools.ts) so a consumer sees the same label regardless of whether they consume the testrun output or the RCA report.

The existing root_cause_category (17 fine-grained buckets, drives recommendation text) is unchanged — error_category is an additional coarse-grained classifier that exists to drive automated retry policy.

Why

Today an agent calling provar_automation_testrun or provar_testrun_rca has no machine-friendly retry hint — to decide whether a failure is transient or deterministic, it has to re-parse human-readable strings. That leads to two failure modes:

  • agents retry locator failures forever (waste of time, the DOM has not changed),
  • agents give up on a single ECONNRESET blip that would succeed on the next attempt.

error_category + retryable give the agent one machine-checkable boolean to gate retries on.

Scope

  • src/mcp/tools/rcaTools.ts — adds ErrorCategory type, classifyErrorCategory(), isRetryable(), fields on FailureReport, population inside buildFailureReports, tool description update.
  • src/mcp/tools/antTools.ts — adds JUnitErrorCategory type, classifyStepErrorCategory(), isStepRetryable(), fields on JUnitStepResult, population inside extractStepsFromJUnit.
  • src/mcp/tools/automationTools.ts — tool-description update for provar_automation_testrun.
  • docs/mcp.md — documents the new fields on both tools.
  • test/unit/mcp/rcaTools.test.ts — 8 new tests covering all five categories, the "no match" case, the OTHER case, and the additive-no-breaking-change case.
  • test/unit/mcp/antTools.test.ts — 6 new tests on parseJUnitResults covering all categories plus the no-match and pass-step cases.

Cross-thread coordination

[touches antTools.ts] — Thread B (PDX-491 H1) will also touch antTools.ts in a sibling PR — specifically the body of parseJUnitResults / extractStepsFromJUnit (the parser body). The edits in this PR are isolated to:

  • the JUnitStepResult interface declaration (one new union type + two optional fields), and
  • the field-population block inside the per-testcase loop in extractStepsFromJUnit (an additive if (errorMessage) { ... step.error_category = ...; step.retryable = ...; }).

Both edits are additive and confined to small contiguous spans, so a clean rebase with Thread B's PR should be straightforward.

Test plan

  • node ../../../node_modules/typescript/bin/tsc -p . --pretty --noEmit — clean (no errors)
  • node ../../../node_modules/eslint/bin/eslint.js on every changed file — clean (no warnings)
  • node_modules/.bin/nyc node_modules/.bin/mocha "test/**/*.test.ts" — 1169 passing, 1 pending
  • node scripts/lint-script-names.cjs — OK
  • No external/customer-facing docs touched (docs/provar-mcp-public-docs.md, docs/university-of-provar-mcp-course.md) — the additive field rollout will be flagged for the Provar team to mirror in customer docs.

🤖 Generated with Claude Code

…run failures

RCA: Consumers of provar_automation_testrun.steps[] and provar_testrun_rca.failures[] currently have no machine-friendly retry hint and must re-parse human-readable strings to decide whether a failure is transient or deterministic, so agents either retry locator failures forever or give up on a single ECONNRESET blip.

Fix: Add optional error_category (INFRASTRUCTURE|ASSERTION|LOCATOR|TIMEOUT|OTHER) and retryable (boolean) fields to both FailureReport (rcaTools.ts) and JUnitStepResult (antTools.ts), populated by an additive pattern-match classifier; retryable=true only for INFRASTRUCTURE and TIMEOUT, undefined when no pattern matches — non-breaking — with tool descriptions and docs/mcp.md updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 19, 2026 21:20
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 19, 2026

Quality Orchestrator

🟢 LOW · 6 / 100 · All changed files have mapped tests.


🧪 Tests to Run · Running 3 of 51 tests

  • unit/mcp/antTools.test.ts
  • unit/mcp/automationTools.test.ts
  • unit/mcp/rcaTools.test.ts
▶ Run command
npx vitest run \
  unit/mcp/antTools.test.ts \
  unit/mcp/automationTools.test.ts \
  unit/mcp/rcaTools.test.ts

⚡ quality-orchestrator  ·  /qo stub <file>  ·  qo analyze-local

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds two additive, optional fields (error_category, retryable) to MCP tool outputs for test-run failures so downstream consumers can make machine-friendly retry decisions without re-parsing human-readable error text.

Changes:

  • Add coarse-grained error classification + retryability flags to provar_testrun_rca failure reports and to provar_automation_testrun step results.
  • Update MCP tool descriptions and docs to describe the new fields.
  • Add unit test coverage for all categories + “no match” behavior in both RCA and JUnit parsing.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/mcp/tools/rcaTools.ts Adds error_category/retryable to FailureReport and populates them from failure text; updates tool description.
src/mcp/tools/antTools.ts Adds error_category/retryable to JUnitStepResult and populates them during JUnit step extraction.
src/mcp/tools/automationTools.ts Updates provar_automation_testrun tool description to mention the new step fields.
docs/mcp.md Documents the new fields for both tools and updates example output.
test/unit/mcp/rcaTools.test.ts Adds tests validating RCA failure error_category/retryable behavior.
test/unit/mcp/antTools.test.ts Adds tests validating JUnit step error_category/retryable behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/mcp/tools/rcaTools.ts Outdated
Comment on lines +751 to +753
'Each failure includes optional error_category (INFRASTRUCTURE|ASSERTION|LOCATOR|TIMEOUT|OTHER)',
'and retryable (boolean) fields when the failure text matches a known pattern — INFRASTRUCTURE/TIMEOUT',
'are flagged retryable, others are not.',
Comment thread docs/mcp.md Outdated
| `test_case` | Test case filename from JUnit `<testcase name>` |
| `error_class` | Extracted exception class name |
| `error_message` | First 500 chars of failure/error text |
| `root_cause_category` | One of 12 categories (see table below) |
…ix root_cause_category count

RCA: Copilot review on PR #186 flagged two doc/description inconsistencies — (1) the provar_testrun_rca tool description implied error_category/retryable appear in mode="failures" output, but mode="failures" only emits { testItemId, title, errorMessage } and the new fields live on FailureReport entries in mode="rca" only; (2) docs/mcp.md FailureReport table said root_cause_category is "One of 12 categories" while the enumerated list contains 17 buckets (and the new paragraph added in this PR explicitly says "17 buckets"). These are doc/description mismatches, not runtime behaviour bugs.

Fix: Reword the rcaTools.ts tool description so error_category/retryable are explicitly scoped to failures[] entries in mode="rca", with a "NOT included in mode=failures output" clarifier so agents reading the description cold pick the right mode. Update docs/mcp.md root_cause_category row from "One of 12 categories (see table below)" to "One of 17 categories (see list below)" — count now matches the enumerated list (DRIVER_VERSION_MISMATCH, LOCATOR_STALE, TIMEOUT, ASSERTION_FAILED, CREDENTIAL_FAILURE, MISSING_CALLABLE, METADATA_CACHE, PAGE_OBJECT_COMPILE, CONNECTION_REFUSED, DATA_SETUP, LICENSE_INVALID, SALESFORCE_VALIDATION, SALESFORCE_PICKLIST, SALESFORCE_REFERENCE, SALESFORCE_ACCESS, SALESFORCE_TRIGGER, UNKNOWN = 17). Also corrected "see table below" → "see list below" since the categories are an inline comma-separated list, not a table. Verified with yarn compile (clean), yarn lint (clean), and full mocha suite (1169 passing).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator Author

@mrdailey99 mrdailey99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 669b517: rca tool description now correctly scopes error_category/retryable to mode='rca'; root_cause_category count corrected to 17.

@mrdailey99 mrdailey99 merged commit c08bc72 into develop May 20, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants