Merged
37 changes: 23 additions & 14 deletions docs/mcp.md
@@ -1386,12 +1386,17 @@ After each run, the tool scans the results directory for JUnit XML files and add
```json
"steps": [
{ "testItemId": "1", "title": "TC-Login-001-LoginAndVerify.testcase", "status": "pass" },
{ "testItemId": "2", "title": "TC-Login-002-ForgotPassword.testcase", "status": "fail", "errorMessage": "Execution failed: Element not found" }
{ "testItemId": "2", "title": "TC-Login-002-ForgotPassword.testcase", "status": "fail", "errorMessage": "TimeoutException: page did not load", "error_category": "TIMEOUT", "retryable": true }
]
```

Each entry represents one test case. `status` is `"pass"`, `"fail"`, or `"skip"`. If the results directory cannot be located or contains no JUnit XML, `details.warning` explains why and `steps` is absent.

Failed steps may include two optional classification fields:

- `error_category` — one of `INFRASTRUCTURE`, `ASSERTION`, `LOCATOR`, `TIMEOUT`, `OTHER`, set when the failure text matches a known pattern.
- `retryable` — `true` when `error_category` is `INFRASTRUCTURE` or `TIMEOUT` (transient causes), `false` for `ASSERTION`/`LOCATOR`/`OTHER`. Absent when no pattern matched.
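
A consumer could use the two optional fields to decide which failed steps to re-run. The following is a minimal sketch, not part of the tool: the `Step` and `RetryPlan` types and the `planRetries` helper are hypothetical names declared locally to mirror the documented step shape, and the second sample title and message are invented for illustration.

```typescript
// Hypothetical consumer-side types. They mirror the documented step shape
// but are declared locally rather than imported from the tool.
type ErrorCategory = 'INFRASTRUCTURE' | 'ASSERTION' | 'LOCATOR' | 'TIMEOUT' | 'OTHER';

interface Step {
  testItemId: string;
  title: string;
  status: 'pass' | 'fail' | 'skip';
  errorMessage?: string;
  error_category?: ErrorCategory;
  retryable?: boolean;
}

interface RetryPlan {
  retry: string[]; // failed steps flagged retryable: true
  investigate: string[]; // failed steps flagged retryable: false
  unclassified: string[]; // failed steps where no known pattern matched
}

// Sort failed steps into three buckets based on the optional `retryable` flag.
function planRetries(steps: Step[]): RetryPlan {
  const plan: RetryPlan = { retry: [], investigate: [], unclassified: [] };
  for (const step of steps) {
    if (step.status !== 'fail') continue;
    if (step.retryable === true) plan.retry.push(step.title);
    else if (step.retryable === false) plan.investigate.push(step.title);
    else plan.unclassified.push(step.title);
  }
  return plan;
}

const examplePlan = planRetries([
  { testItemId: '1', title: 'TC-Login-001-LoginAndVerify.testcase', status: 'pass' },
  {
    testItemId: '2',
    title: 'TC-Login-002-ForgotPassword.testcase',
    status: 'fail',
    errorMessage: 'TimeoutException: page did not load',
    error_category: 'TIMEOUT',
    retryable: true,
  },
  // Invented sample: a failure whose text matched no known pattern.
  { testItemId: '3', title: 'Example-Unclassified.testcase', status: 'fail', errorMessage: 'unrecognised failure' },
]);
console.log(examplePlan);
```

Keeping a separate `unclassified` bucket avoids treating an absent `retryable` as `false`; the absence only means that no known pattern matched.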

**Error codes:** `AUTOMATION_TESTRUN_FAILED`, `SF_NOT_FOUND`

---
@@ -1535,22 +1540,26 @@ Use `mode="failures"` when you only need the list of failing test case names wit

**`FailureReport` fields (mode=rca only):**

| Field | Description |
| --------------------- | -------------------------------------------------------- |
| `test_case` | Test case filename from JUnit `<testcase name>` |
| `error_class` | Extracted exception class name |
| `error_message` | First 500 chars of failure/error text |
| `root_cause_category` | One of 12 categories (see table below) |
| `root_cause_summary` | Human-readable cause description |
| `recommendation` | Suggested fix action |
| `page_object` | Extracted from `Page Object: ...` pattern, or `null` |
| `operation` | Extracted from `operation: ...` pattern, or `null` |
| `report_html` | Path to per-test HTML report if found, else `null` |
| `screenshot_dir` | Path to `Artifacts/` directory if it exists, else `null` |
| `pre_existing` | `true` if the same test failed in a prior Increment run |
| Field | Description |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `test_case` | Test case filename from JUnit `<testcase name>` |
| `error_class` | Extracted exception class name |
| `error_message` | First 500 chars of failure/error text |
| `root_cause_category` | One of 17 categories (see list below) |
| `root_cause_summary` | Human-readable cause description |
| `recommendation` | Suggested fix action |
| `page_object` | Extracted from `Page Object: ...` pattern, or `null` |
| `operation` | Extracted from `operation: ...` pattern, or `null` |
| `report_html` | Path to per-test HTML report if found, else `null` |
| `screenshot_dir` | Path to `Artifacts/` directory if it exists, else `null` |
| `pre_existing` | `true` if the same test failed in a prior Increment run |
| `error_category` | Optional. One of `INFRASTRUCTURE` \| `ASSERTION` \| `LOCATOR` \| `TIMEOUT` \| `OTHER`. Absent when no known pattern matched. |
| `retryable` | Optional. `true` when `error_category` is `INFRASTRUCTURE` or `TIMEOUT` (transient causes); `false` otherwise. Absent when `error_category` is absent. |

**Root cause categories:** `DRIVER_VERSION_MISMATCH`, `LOCATOR_STALE`, `TIMEOUT`, `ASSERTION_FAILED`, `CREDENTIAL_FAILURE`, `MISSING_CALLABLE`, `METADATA_CACHE`, `PAGE_OBJECT_COMPILE`, `CONNECTION_REFUSED`, `DATA_SETUP`, `LICENSE_INVALID`, `SALESFORCE_VALIDATION`, `SALESFORCE_PICKLIST`, `SALESFORCE_REFERENCE`, `SALESFORCE_ACCESS`, `SALESFORCE_TRIGGER`, `UNKNOWN`

**Error category vs. root cause category:** `root_cause_category` is fine-grained (17 buckets) and drives the human-readable `recommendation`. `error_category` is coarse-grained (5 buckets) and drives automated retry policy via `retryable`. The two are independent classifiers over the same failure text — both may be set on the same failure.
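
Because the two classifiers are independent, a consumer can read them side by side: `retryable` for the automated decision and `root_cause_category` plus `recommendation` for the human follow-up. The sketch below assumes a locally declared subset of the `FailureReport` fields. `FailureLike`, `Triage`, and `triageFailures` are hypothetical names, and the sample test cases and recommendation strings are placeholders, not real tool output.

```typescript
// Hypothetical subset of the FailureReport fields, declared locally.
interface FailureLike {
  test_case: string;
  root_cause_category: string;
  recommendation: string;
  error_category?: 'INFRASTRUCTURE' | 'ASSERTION' | 'LOCATOR' | 'TIMEOUT' | 'OTHER';
  retryable?: boolean;
}

interface Triage {
  retryNow: string[];
  needsFix: Array<{ test_case: string; root_cause_category: string; recommendation: string }>;
}

// `retryable` drives the automated branch; everything else is routed to a
// human together with the fine-grained category and its recommendation.
function triageFailures(failures: FailureLike[]): Triage {
  const triage: Triage = { retryNow: [], needsFix: [] };
  for (const f of failures) {
    if (f.retryable === true) {
      triage.retryNow.push(f.test_case);
    } else {
      triage.needsFix.push({
        test_case: f.test_case,
        root_cause_category: f.root_cause_category,
        recommendation: f.recommendation,
      });
    }
  }
  return triage;
}

// Placeholder data for illustration only.
const exampleTriage = triageFailures([
  { test_case: 'A.testcase', root_cause_category: 'TIMEOUT', recommendation: 'placeholder', error_category: 'TIMEOUT', retryable: true },
  { test_case: 'B.testcase', root_cause_category: 'LOCATOR_STALE', recommendation: 'placeholder', error_category: 'LOCATOR', retryable: false },
  // No error_category: the coarse classifier found no match, but the
  // fine-grained category is still present.
  { test_case: 'C.testcase', root_cause_category: 'SALESFORCE_VALIDATION', recommendation: 'placeholder' },
]);
console.log(exampleTriage);
```

Sending unclassified failures to `needsFix` is a conservative choice made for this sketch; a different policy could retry them once before escalating.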

Salesforce DML error categories (`SALESFORCE_*`) represent test-data failures — they appear in `failures[].root_cause_category` but are **not** included in `infrastructure_issues`.

**Error codes:** `RESULTS_NOT_CONFIGURED`, `PATH_NOT_ALLOWED`, `PATH_TRAVERSAL`
43 changes: 42 additions & 1 deletion src/mcp/tools/antTools.ts
@@ -979,11 +979,46 @@ function finalizeAnt(

// ── JUnit XML step parsing ────────────────────────────────────────────────────

export type JUnitErrorCategory = 'INFRASTRUCTURE' | 'ASSERTION' | 'LOCATOR' | 'TIMEOUT' | 'OTHER';

export interface JUnitStepResult {
  testItemId: string;
  title: string;
  status: 'pass' | 'fail' | 'skip';
  errorMessage?: string;
  error_category?: JUnitErrorCategory;
  retryable?: boolean;
}

/**
 * Classify a failure message into a coarse-grained category used for retry decisions.
 * Mirrors the classifier in rcaTools.ts (PDX-490) so a downstream consumer sees the
 * same labelling whether they consume `provar_automation_testrun.steps[]` or
 * `provar_testrun_rca.failures[]`.
 *
 * Returns `undefined` when no pattern matches.
 */
export function classifyStepErrorCategory(errorText: string): JUnitErrorCategory | undefined {
  if (/Connection reset|Failed to read client socket message|socket hang up|ECONNRESET/i.test(errorText)) {
    return 'INFRASTRUCTURE';
  }
  if (/NoSuchElementException/i.test(errorText)) return 'LOCATOR';
  if (/TimeoutException/i.test(errorText)) return 'TIMEOUT';
  if (/AssertionException/i.test(errorText)) return 'ASSERTION';
  if (
    /SessionNotCreatedException|WebDriverException|ClassNotFoundException|LicenseException|InvalidPasswordException/i.test(
      errorText
    )
  ) {
    return 'OTHER';
  }
  return undefined;
}

/** Only transient categories (INFRASTRUCTURE, TIMEOUT) are retryable. */
export function isStepRetryable(category: JUnitErrorCategory | undefined): boolean | undefined {
  if (category === undefined) return undefined;
  return category === 'INFRASTRUCTURE' || category === 'TIMEOUT';
}

export interface JUnitParseResult {
@@ -1043,7 +1078,13 @@ function extractStepsFromJUnit(parsed: Record<string, unknown>): JUnitStepResult

const errorMessage = extractFailureText(tc['failure'] ?? tc['error']);
const step: JUnitStepResult = { testItemId: String(idx), title, status };
if (errorMessage) step.errorMessage = errorMessage;
if (errorMessage) {
step.errorMessage = errorMessage;
const error_category = classifyStepErrorCategory(errorMessage);
const retryable = isStepRetryable(error_category);
if (error_category !== undefined) step.error_category = error_category;
if (retryable !== undefined) step.retryable = retryable;
}
steps.push(step);
}
}
2 changes: 2 additions & 0 deletions src/mcp/tools/automationTools.ts
@@ -241,6 +241,8 @@ export function registerAutomationTestRun(server: McpServer, config: ServerConfi
'Output buffer: a 50 MB maxBuffer is set so ENOBUFS on verbose Provar runs is now rare.',
'If ENOBUFS still occurs (extremely verbose logging), run `sf provar automation test run --json` directly in the terminal and pipe or tail the output instead of retrying this tool.',
'Typical local AI loop: config.load → compile → testrun → inspect results.',
'Each failed step in `steps[]` may include optional error_category (INFRASTRUCTURE|ASSERTION|LOCATOR|TIMEOUT|OTHER)',
'and retryable (boolean) fields when the failure text matches a known pattern — use these to drive automated retry policy.',
].join(' '),
'Run local Provar tests via sf CLI; requires config_load first.'
),
57 changes: 55 additions & 2 deletions src/mcp/tools/rcaTools.ts
@@ -32,6 +32,8 @@ interface LocateResult {
resolution_source: string;
}

type ErrorCategory = 'INFRASTRUCTURE' | 'ASSERTION' | 'LOCATOR' | 'TIMEOUT' | 'OTHER';

interface FailureReport {
test_case: string;
error_class: string | null;
@@ -44,6 +46,48 @@ interface FailureReport {
report_html: string | null;
screenshot_dir: string | null;
pre_existing: boolean;
error_category?: ErrorCategory;
retryable?: boolean;
}

/**
 * Classify a failure message into a structured error category for retry decisions.
 *
 * Categories are coarse-grained and intended to drive automated retry policy
 * downstream (e.g. retry INFRASTRUCTURE/TIMEOUT, never retry ASSERTION/LOCATOR).
 *
 * Returns `undefined` when no pattern matches — callers should leave the field unset.
 */
export function classifyErrorCategory(errorText: string): ErrorCategory | undefined {
  // INFRASTRUCTURE — transient network / socket / browser-process failures.
  if (/Connection reset|Failed to read client socket message|socket hang up|ECONNRESET/i.test(errorText)) {
    return 'INFRASTRUCTURE';
  }
  // LOCATOR — page object selector no longer matches the rendered DOM.
  if (/NoSuchElementException/i.test(errorText)) return 'LOCATOR';
  // TIMEOUT — element or operation did not complete in time.
  if (/TimeoutException/i.test(errorText)) return 'TIMEOUT';
  // ASSERTION — explicit assertion failure raised by the test or framework.
  if (/AssertionException/i.test(errorText)) return 'ASSERTION';
  // OTHER — known exception class but not fitting the four primary buckets.
  if (
    /SessionNotCreatedException|WebDriverException|ClassNotFoundException|LicenseException|InvalidPasswordException/i.test(
      errorText
    )
  ) {
    return 'OTHER';
  }
  return undefined;
}

/**
 * A failure is retryable only when the underlying cause is transient.
 * INFRASTRUCTURE (network blips) and TIMEOUT (slow page) can succeed on retry;
 * ASSERTION / LOCATOR / OTHER are deterministic and should not be retried.
 */
export function isRetryable(category: ErrorCategory | undefined): boolean | undefined {
  if (category === undefined) return undefined;
  return category === 'INFRASTRUCTURE' || category === 'TIMEOUT';
}

// ── Root cause classification ─────────────────────────────────────────────────
@@ -667,7 +711,9 @@ function buildFailureReports(
const poMatch = /Page Object:\s*([\w.]+)/i.exec(failureText);
const opMatch = /operation:\s*(\w+)/i.exec(failureText);
const matchingHtml = htmlFiles.find((f) => path.basename(f) === `${tc.name}.html`);
reports.push({
const error_category = classifyErrorCategory(failureText);
const retryable = isRetryable(error_category);
const report: FailureReport = {
test_case: tc.name,
error_class,
error_message: failureText.slice(0, 500),
@@ -679,7 +725,10 @@ function buildFailureReports(
report_html: matchingHtml ?? null,
screenshot_dir: screenshotDir,
pre_existing: priorFailed.has(tc.name),
});
};
if (error_category !== undefined) report.error_category = error_category;
if (retryable !== undefined) report.retryable = retryable;
reports.push(report);
}
return reports;
}
@@ -699,6 +748,10 @@ export function registerTestRunRca(server: McpServer, config: ServerConfig): voi
'Use mode="failures" to get a lightweight array of failed test cases',
'([{ testItemId, title, errorMessage }]) without the full RCA classification — useful when you',
'need failure names quickly without loading the HTML report.',
'In mode="rca" (default), each entry in failures[] additionally includes optional error_category',
'(INFRASTRUCTURE|ASSERTION|LOCATOR|TIMEOUT|OTHER) and retryable (boolean) fields when the failure',
'text matches a known pattern — INFRASTRUCTURE/TIMEOUT are flagged retryable, others are not.',
'These fields are NOT included in mode="failures" output.',
].join(' '),
'Parse a Provar test run JUnit.xml and produce an RCA report with failure classification.'
),
51 changes: 51 additions & 0 deletions test/unit/mcp/antTools.test.ts
@@ -847,4 +847,55 @@ describe('parseJUnitResults', () => {
    assert.ok(result.steps[0].errorMessage?.includes('Execution failed'));
    assert.ok(result.steps[0].errorMessage?.includes('stack trace here'));
  });

  // ── PDX-490: error_category + retryable on step results ─────────────────────

  function writeFailureJunit(dir: string, failureBody: string): void {
    const xml = `<?xml version="1.0"?><testsuite><testcase name="T1"><failure message="fail">${failureBody}</failure></testcase></testsuite>`;
    fs.writeFileSync(path.join(dir, 'JUnit.xml'), xml);
  }

  it('populates error_category=INFRASTRUCTURE and retryable=true for Connection reset', () => {
    writeFailureJunit(junitTmpDir, 'Connection reset by peer while reading response');
    const result = parseJUnitResults(junitTmpDir);
    assert.equal(result.steps[0].error_category, 'INFRASTRUCTURE');
    assert.equal(result.steps[0].retryable, true);
  });

  it('populates error_category=LOCATOR and retryable=false for NoSuchElementException', () => {
    writeFailureJunit(junitTmpDir, 'NoSuchElementException: Unable to locate element');
    const result = parseJUnitResults(junitTmpDir);
    assert.equal(result.steps[0].error_category, 'LOCATOR');
    assert.equal(result.steps[0].retryable, false);
  });

  it('populates error_category=TIMEOUT and retryable=true for TimeoutException', () => {
    writeFailureJunit(junitTmpDir, 'TimeoutException: operation did not complete');
    const result = parseJUnitResults(junitTmpDir);
    assert.equal(result.steps[0].error_category, 'TIMEOUT');
    assert.equal(result.steps[0].retryable, true);
  });

  it('populates error_category=ASSERTION and retryable=false for AssertionException', () => {
    writeFailureJunit(junitTmpDir, 'AssertionException: expected X but was Y');
    const result = parseJUnitResults(junitTmpDir);
    assert.equal(result.steps[0].error_category, 'ASSERTION');
    assert.equal(result.steps[0].retryable, false);
  });

  it('leaves error_category and retryable undefined when no pattern matches', () => {
    writeFailureJunit(junitTmpDir, 'something completely unrecognised XYZ_BANANA');
    const result = parseJUnitResults(junitTmpDir);
    assert.equal(result.steps[0].error_category, undefined);
    assert.equal(result.steps[0].retryable, undefined);
  });

  it('does not set error_category or retryable on passing steps', () => {
    const xml = '<?xml version="1.0"?><testsuite><testcase name="OK"/></testsuite>';
    fs.writeFileSync(path.join(junitTmpDir, 'JUnit.xml'), xml);
    const result = parseJUnitResults(junitTmpDir);
    assert.equal(result.steps[0].status, 'pass');
    assert.equal(result.steps[0].error_category, undefined);
    assert.equal(result.steps[0].retryable, undefined);
  });
});