PDX-492: feat(mcp) — provar_org_describe tool, H2a (Thread F)#188

Merged
mrdailey99 merged 3 commits into develop from feature/PDX-492-org-describe-tool
May 20, 2026

Conversation

@mrdailey99
Collaborator

Context

Adds provar_org_describe, a read-only MCP tool that surfaces cached Salesforce describe data from the Provar IDE workspace .metadata directory. This is H2a of the PDX-H2 plan (Thread F, single PR). The sibling H2b PR in Thread C will consume this tool's output to populate provar_testcase_generate field-type hints.

Why

provar_testcase_generate currently has no source of truth for which fields on a Salesforce object are required and what their types are. Agents either guess (brittle), hard-code names, or call the live SF API (slow + auth-dependent). The Provar IDE already caches describe data on disk after a connection is loaded — this PR exposes that cache as a read-only MCP tool so it becomes useful outside the IDE.

Workspace discovery heuristic

The tool walks three candidate directories in order; the first one that exists wins:

  1. <parent-of-project>/workspace-<basename>/ — sibling workspace pattern (default for Provar IDE in this layout).
  2. <parent-of-project>/Provar_Workspaces/workspace-<name-dashes>/ — shared Provar_Workspaces directory.
  3. ~/Provar/workspace-<name-dashes>/ — user-home fallback.

<name-dashes> = basename(project_path).trim().replace(/\s+/g, '-').toLowerCase().
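
The candidate order and the <name-dashes> rule can be sketched as below. This is a minimal illustration rather than the code in orgDescribeTools.ts: nameDashes and the homeDir parameter are hypothetical, the name workspaceCandidates is borrowed from the review excerpt but its real signature is not shown in this PR, and treating <basename> as the untransformed directory name is an assumption.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// <name-dashes>: trimmed basename, whitespace runs collapsed to '-', lower-cased.
export function nameDashes(projectPath: string): string {
  return path.basename(projectPath).trim().replace(/\s+/g, "-").toLowerCase();
}

// Returns the three candidate workspace directories in priority order.
// homeDir is injectable only so the sketch is easy to test (assumption).
export function workspaceCandidates(
  projectPath: string,
  homeDir: string = os.homedir()
): string[] {
  const parent = path.dirname(projectPath);
  const base = path.basename(projectPath); // assumed: used as-is for candidate 1
  const dashed = nameDashes(projectPath);
  return [
    path.join(parent, `workspace-${base}`), // 1. sibling pattern
    path.join(parent, "Provar_Workspaces", `workspace-${dashed}`), // 2. shared dir
    path.join(homeDir, "Provar", `workspace-${dashed}`), // 3. user-home fallback
  ];
}
```

Only the ordering is modelled here; which candidate "wins" still depends on an existence check against the filesystem.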

Cache schema (for H2b consumers)

No prior .metadata reader exists in the codebase, so this PR designs the schema H2b should produce/consume. It is intentionally simple — one file per object, JSON preferred:

// <workspace>/.metadata/<connection_name>/<ObjectApiName>.json
{
  "name": "Account",
  "fields": [
    { "name": "Name",  "type": "string", "defaultValue": null, "nillable": false },
    { "name": "Phone", "type": "phone",  "defaultValue": null, "nillable": true  }
  ]
}

The tool also falls back to .xml and .object (legacy CustomObject metadata) files for migration paths.
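
For H2b consumers, a reader for the JSON form of this schema might look like the sketch below. The type names CachedField and CachedObject appear in the review excerpts, but their exact definitions are not shown in this PR, so the shapes here are inferred from the example file; isCachedField and parseCachedObject are hypothetical helpers.

```typescript
// Shapes inferred from the example cache file above (assumption).
interface CachedField {
  name: string;
  type: string;
  defaultValue: string | null;
  nillable: boolean;
}

interface CachedObject {
  name: string;
  fields: CachedField[];
}

// Runtime check that one entry of "fields" has the expected shape.
function isCachedField(value: unknown): value is CachedField {
  if (typeof value !== "object" || value === null) return false;
  const f = value as Record<string, unknown>;
  return (
    typeof f["name"] === "string" &&
    typeof f["type"] === "string" &&
    (f["defaultValue"] === null || typeof f["defaultValue"] === "string") &&
    typeof f["nillable"] === "boolean"
  );
}

// Parses the text of <ObjectApiName>.json and rejects anything off-schema.
function parseCachedObject(raw: string): CachedObject {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("cache file is not a JSON object");
  }
  const obj = data as Record<string, unknown>;
  const name = obj["name"];
  const fields = obj["fields"];
  if (typeof name !== "string" || !Array.isArray(fields)) {
    throw new Error("cache file must have a string 'name' and an array 'fields'");
  }
  if (!fields.every(isCachedField)) {
    throw new Error("cache file contains a malformed field entry");
  }
  return { name, fields: fields as CachedField[] };
}
```

Validating at the boundary keeps a half-written or hand-edited cache file from silently producing an empty field list.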

Output shape

{
  workspace_path: string | null;       // null when no workspace discovered
  cache_age_ms: number | null;         // mtime delta of cache dir
  objects: Array<{
    name: string;
    exists: boolean | null;            // null when cache missing entirely
    required_fields: Array<{ name, type, default_value, nillable }>;
    field_count: number;
  }>;
  details?: { suggestion: string };    // populated on cache-miss
}
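
As a rough illustration of how one objects[] entry could be derived from a cached object, the sketch below assumes that "required" means nillable === false and that field_count counts every cached field, not only the required ones. Neither rule is spelled out in this description, and toObjectResult is a hypothetical helper name.

```typescript
interface CachedField {
  name: string;
  type: string;
  defaultValue: string | null;
  nillable: boolean;
}

interface RequiredField {
  name: string;
  type: string;
  default_value: string | null;
  nillable: boolean;
}

interface ObjectResult {
  name: string;
  exists: boolean | null;
  required_fields: RequiredField[];
  field_count: number;
}

// Maps a cached object to the output shape: camelCase cache keys become
// snake_case output keys, and only non-nillable fields are kept (assumption).
function toObjectResult(name: string, fields: CachedField[]): ObjectResult {
  const required_fields = fields
    .filter((f) => !f.nillable)
    .map((f) => ({
      name: f.name,
      type: f.type,
      default_value: f.defaultValue,
      nillable: f.nillable,
    }));
  return { name, exists: true, required_fields, field_count: fields.length };
}
```

With the Account example from the cache schema, this yields one required field (Name) and a field_count of 2.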

Cache-miss behaviour

Returns a structured response (not isError) with details.suggestion telling the agent how to recover:

"Open this project in Provar IDE and load the 'MyOrg' connection, or pass field-type hints inline to provar_testcase_generate."

This is the same advisory shape provar_properties_read uses for divergence warnings, so consumers don't need a special error path.
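
On the consumer side, this means a caller can branch on the payload instead of catching an error. The sketch below is one way an agent-side helper might do that. OrgDescribeResponse is a trimmed, hypothetical view of the output shape, and interpretDescribe is not part of this PR.

```typescript
// Trimmed view of the tool output; only the parts this helper reads.
interface OrgDescribeResponse {
  workspace_path: string | null;
  objects: Array<{ name: string; exists: boolean | null }>;
  details?: { suggestion: string };
}

type DescribeOutcome =
  | { kind: "hit"; cached: string[]; missing: string[] }
  | { kind: "miss"; suggestion: string };

// Classifies a response without any try/catch: a cache miss is ordinary data.
function interpretDescribe(resp: OrgDescribeResponse): DescribeOutcome {
  const suggestion = resp.details?.suggestion;
  if (suggestion !== undefined) {
    return { kind: "miss", suggestion };
  }
  return {
    kind: "hit",
    cached: resp.objects.filter((o) => o.exists === true).map((o) => o.name),
    missing: resp.objects.filter((o) => o.exists !== true).map((o) => o.name),
  };
}
```

A "miss" outcome would typically be surfaced to the user verbatim, while a "hit" with a non-empty missing list could fall back to inline hints for just those objects.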

Path safety

  • assertPathAllowed is called on both project_path and the composed connection_dir so a connection_name like ../../etc cannot escape the workspace.
  • connection_name is additionally rejected outright if it contains a path separator or .. segment (returns PATH_TRAVERSAL).
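
The two layers above can be sketched as follows. The project's assertPathAllowed is not shown in this PR, so isInside is a hypothetical stand-in for a containment check, and the PathPolicyError class here is a local illustration that only mirrors the name and the PATH_TRAVERSAL code mentioned in the description.

```typescript
import * as path from "node:path";

// Local illustration; the real class lives elsewhere in the codebase.
class PathPolicyError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.code = code;
    this.name = "PathPolicyError";
  }
}

// Layer 1: reject separators and a bare '..' before any path is composed.
function assertSafeConnectionName(connectionName: string): void {
  if (/[\\/]/.test(connectionName) || connectionName === "..") {
    throw new PathPolicyError(
      "PATH_TRAVERSAL",
      "connection_name must not contain path separators or directory-traversal segments ('..')"
    );
  }
}

// Layer 2 (stand-in for assertPathAllowed): true when candidate resolves
// to root itself or to something underneath it.
function isInside(root: string, candidate: string): boolean {
  const rel = path.relative(path.resolve(root), path.resolve(candidate));
  return (
    rel === "" ||
    (rel !== ".." && !rel.startsWith(`..${path.sep}`) && !path.isAbsolute(rel))
  );
}
```

Checking the name first gives a precise error for obviously hostile input; the containment check on the composed directory is the backstop for anything the name check does not anticipate.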

Files changed

  • src/mcp/tools/orgDescribeTools.ts — new tool
  • src/mcp/server.ts — registered under existing inspect tool group
  • test/unit/mcp/orgDescribeTools.test.ts — 14 unit tests, all 7 plan scenarios
  • scripts/mcp-smoke.cjs — 2 new RPC calls (happy path + cache miss)
  • docs/mcp.md — new "Org metadata access" section with TOC entry
  • docs/mcp-pilot-guide.md — new Scenario 13 (org-aware generation)

Test plan

  • node_modules/.bin/tsc -p . — clean compile
  • yarn lint — clean
  • node_modules/.bin/mocha "test/**/*.test.ts" — 1169 passing, 1 pending (baseline)
  • node_modules/.bin/mocha "test/unit/mcp/orgDescribeTools.test.ts" — 14 passing covering scenarios (a)–(g)
  • node scripts/mcp-smoke.cjs --profile inspect — both new entries PASS
  • Full smoke run — 57 passing, 0 failed
  • Reviewer to verify docs render in GitHub preview

Out of scope (H2b — sibling Thread C PR)

  • Adding field_type_hints / required_fields_hint to provar_testcase_generate
  • Wiring provar_org_describe output into the generator

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…etadata cache)

Introduces provar_org_describe, a read-only MCP tool that surfaces cached
Salesforce describe data from the Provar IDE workspace .metadata directory
so authoring tools (provar_testcase_generate) can produce field-correct
data steps without a live SF API call.

Workspace discovery walks three candidates in order:
  1. <parent>/workspace-<basename>/            (sibling pattern)
  2. <parent>/Provar_Workspaces/workspace-<name-dashes>/
  3. ~/Provar/workspace-<name-dashes>/         (user-home fallback)

Returns a structured cache-miss response with details.suggestion when the
connection cache is absent, so the agent can either prompt the user to
load the connection in Provar IDE or fall back to inline field hints.

Registered under the existing 'inspect' tool group. H2b (sibling thread)
consumes this tool's output to populate generator hints.

RCA: provar_testcase_generate had no source of truth for which fields on
a Salesforce object are required and what their types are. Agents either
guessed (producing brittle tests), hard-coded names, or called the live
SF API (slow + auth-dependent). The Provar IDE already caches describe
data on disk after a connection is loaded — this PR exposes that cache
as a read-only MCP tool so the cache becomes useful outside the IDE.

Fix: New tool src/mcp/tools/orgDescribeTools.ts with strict path-policy
checks on both project_path and connection_name (separator/traversal
rejected). Cache schema is one file per object (.json preferred, .xml /
.object accepted) so the existing IDE writer needs no change. Cache miss
returns a stable shape with suggestion rather than an isError response,
so callers do not need a try/catch path. 14 unit tests cover all 7 plan
scenarios (workspace discovery, fallback, cache miss, path policy, happy
path, field_filter, objects filter). Two smoke entries cover happy + miss.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 19, 2026 21:46
@github-actions

github-actions Bot commented May 19, 2026

Quality Orchestrator

🟢 LOW · 4 / 100 · All changed files have mapped tests.


🧪 Tests to Run · Running 2 of 54 tests

  • unit/mcp/server.test.ts
  • unit/mcp/orgDescribeTools.test.ts
▶ Run command
npx vitest run \
  unit/mcp/server.test.ts \
  unit/mcp/orgDescribeTools.test.ts

⚡ quality-orchestrator  ·  /qo stub <file>  ·  qo analyze-local

Contributor

Copilot AI left a comment

Pull request overview

Adds a new read-only MCP inspection tool (provar_org_describe) to surface Salesforce object/field describe metadata from the Provar IDE workspace .metadata cache, enabling downstream authoring tools to avoid live SF API calls and rely on on-disk cached schema.

Changes:

  • Implemented provar_org_describe tool with workspace discovery, path-policy enforcement, and JSON/XML/Object cache readers.
  • Registered the tool under the existing inspect tool group in the MCP server.
  • Added unit tests, smoke-script coverage, and documentation updates (MCP reference + pilot guide scenario).

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/mcp/tools/orgDescribeTools.ts | New MCP tool implementation: workspace discovery, cache reading/parsing, response shaping, path policy checks. |
| src/mcp/server.ts | Registers org-describe tools under the inspect tool group. |
| test/unit/mcp/orgDescribeTools.test.ts | Unit tests for workspace discovery, cache-miss behavior, path policy, filters, and happy path. |
| scripts/mcp-smoke.cjs | Adds smoke calls for cache-miss and happy-path flows. |
| docs/mcp.md | Adds “Org metadata access” section and provar_org_describe reference docs. |
| docs/mcp-pilot-guide.md | Adds Scenario 13 describing org-aware generation workflow using the new tool. |

Comment thread src/mcp/tools/orgDescribeTools.ts Outdated
Comment on lines +145 to +151
    fields.push({
      name,
      type: (f['type'] as string | undefined) ?? 'unknown',
      defaultValue: (f['defaultValue'] as string | undefined) ?? null,
      // XML defaults: required = !nillable. In the .object format, "required" is rare,
      // so we default to nillable=true (optional) unless explicitly required.
      nillable: f['required'] === 'true' ? false : true,
Comment thread src/mcp/tools/orgDescribeTools.ts Outdated
Comment on lines +101 to +112
/** Returns the first candidate workspace that exists, or null. */
export function discoverWorkspace(projectPath: string): string | null {
  for (const candidate of workspaceCandidates(projectPath)) {
    try {
      if (fs.existsSync(candidate) && fs.statSync(candidate).isDirectory()) {
        return candidate;
      }
    } catch {
      // Permission errors etc. — skip and try next candidate
    }
  }
  return null;
Comment thread src/mcp/tools/orgDescribeTools.ts Outdated
    cached = path.extname(file) === '.json' ? readJsonCacheFile(file) : readXmlCacheFile(file);
  } catch (e) {
    log('warn', 'org_describe: failed to parse cache file', { file, error: (e as Error).message });
    return { name: objectName, exists: false, required_fields: [], field_count: 0 };
Comment thread docs/mcp.md Outdated
Comment on lines +1718 to +1724
| Output field | Description |
| -------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- |
| `workspace_path` | Absolute resolved path to the discovered workspace, or `null` when none of the three candidate directories exists. |
| `cache_age_ms` | `mtime` delta in milliseconds of the connection cache directory, or `null` when the cache is missing. |
| `objects[]` | Array of `{ name, exists, required_fields, field_count }`. `exists` is `true` (cached), `false` (requested but not cached), or `null` (cache miss). |
| `details.suggestion` | Present **only** on cache miss. Tells the agent how to populate the cache (open Provar IDE) or how to proceed without it (inline hints). |

Comment on lines +130 to +167
/**
 * Parse a legacy .object XML file (CustomObject metadata) into the canonical shape.
 * Only extracts the bare minimum the tool needs: field name, type, nillable.
 */
function readXmlCacheFile(filePath: string): CachedObject {
  const raw = fs.readFileSync(filePath, 'utf-8');
  const parsed = XML_PARSER.parse(raw) as Record<string, unknown>;
  const root = (parsed['CustomObject'] ?? parsed['toolingObjectInfo'] ?? {}) as Record<string, unknown>;
  const fieldsRaw = root['fields'];
  if (!Array.isArray(fieldsRaw)) return { name: path.basename(filePath, path.extname(filePath)), fields: [] };

  const fields: CachedField[] = [];
  for (const f of fieldsRaw as Array<Record<string, unknown>>) {
    const name = (f['fullName'] ?? f['name']) as string | undefined;
    if (!name) continue;
    fields.push({
      name,
      type: (f['type'] as string | undefined) ?? 'unknown',
      defaultValue: (f['defaultValue'] as string | undefined) ?? null,
      // XML defaults: required = !nillable. In the .object format, "required" is rare,
      // so we default to nillable=true (optional) unless explicitly required.
      nillable: f['required'] === 'true' ? false : true,
    });
  }
  return { name: path.basename(filePath, path.extname(filePath)), fields };
}

/** Look up the cache file for one object, trying .json then .xml. */
function findObjectCacheFile(connectionDir: string, objectName: string): string | null {
  const jsonPath = path.join(connectionDir, `${objectName}.json`);
  if (fs.existsSync(jsonPath)) return jsonPath;
  const xmlPath = path.join(connectionDir, `${objectName}.xml`);
  if (fs.existsSync(xmlPath)) return xmlPath;
  // Legacy CustomObject layout (.object extension)
  const objPath = path.join(connectionDir, `${objectName}.object`);
  if (fs.existsSync(objPath)) return objPath;
  return null;
}
Comment thread src/mcp/tools/orgDescribeTools.ts Outdated
  if (connectionName.includes('/') || connectionName.includes('\\') || connectionName.split(/[/\\]+/).includes('..')) {
    throw new PathPolicyError(
      'PATH_TRAVERSAL',
      `Invalid connection_name (contains path separators): ${connectionName}`
…lag bug, exists-true on parse error, docs/tests

RCA: Copilot review of PR #188 flagged six issues across correctness, security, and contract:

  1. The .xml/.object fallback compared required==='true' as a string, but fast-xml-parser with parseTagValue=true (its default) coerces the value to boolean true, silently misclassifying required fields as nillable.
  2. discoverWorkspace probed fs.existsSync/statSync against candidate dirs (including the ~/Provar home fallback) BEFORE any path-policy check, contradicting the project's --allowed-paths contract and potentially touching paths outside the policy.
  3. When a cache file existed but failed to parse, readObject returned exists=false, which is indistinguishable from "object not cached", so callers could not detect corrupt/unsupported cache files.
  4. docs/examples omitted the requestId field that the tool actually returns, making the documented shape drift from runtime.
  5. Unit tests covered only the .json cache path, leaving the legacy .xml and .object parsers (where the required-flag bug lived) untested.
  6. The PATH_TRAVERSAL message read "contains path separators" but the validator also rejects bare ".." with no separators, so the message was inaccurate for that branch.
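
The first finding comes down to a strict comparison against the wrong type. A minimal sketch of a check that tolerates both representations is shown below; isRequiredFlag and nillableFromRequired are hypothetical helper names, and the sketch deliberately avoids calling fast-xml-parser so that it stays self-contained.

```typescript
// A parsed <required> value may arrive as boolean true (value coercion on)
// or as the string "true" (coercion off). Accept both, reject everything else.
function isRequiredFlag(value: unknown): boolean {
  return value === true || value === "true";
}

// In the legacy XML layout a field is optional unless explicitly required.
function nillableFromRequired(value: unknown): boolean {
  return !isRequiredFlag(value);
}

// The original comparison, kept for contrast: it misses boolean true.
function nillableFromRequiredBuggy(value: unknown): boolean {
  return value === "true" ? false : true;
}
```

With a coerced boolean true, the buggy variant reports the field as nillable while the tolerant variant reports it as required.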

Fix:

  1. readXmlCacheFile now treats both boolean true and string "true" as required, so nillable is computed correctly regardless of parser config.
  2. discoverWorkspace accepts allowedPaths and runs assertPathAllowed per candidate BEFORE fs.existsSync/statSync; a candidate outside policy is silently skipped (not a hard error) so discovery falls through to the next candidate naturally.
  3. readObject parse failures now return exists=true with field_count=0 and a per-object error_message describing the parse failure, letting callers distinguish corrupt from missing.
  4. docs/mcp.md adds requestId to the output table, adds it to both example responses, documents the new error_message field shape, and adds a third example showing the parse-error response.
  5. Added (h.1) .xml format test (regression guard for the required-flag bug), (h.2) .object format test, (i) parse-error test asserting exists=true + error_message, and (j) bare ".." connection_name test asserting the broadened message.
  6. The PATH_TRAVERSAL message now reads "must not contain path separators or directory-traversal segments ('..')", covering both rejection conditions.

19/19 orgDescribe tests pass, full mocha 1174/1174, yarn lint clean, yarn compile clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
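
The three per-object outcomes after this commit (not cached, cached and parsed, cached but unparseable) can be sketched as below. This is an illustration under stated assumptions: readObjectFromText is a hypothetical helper that takes the file text directly instead of a path, it handles only the JSON form, and the exact wording of error_message in the real tool is not shown in this PR.

```typescript
interface FieldInfo {
  name: string;
  type: string;
  nillable: boolean;
}

interface ObjectResult {
  name: string;
  exists: boolean;
  required_fields: FieldInfo[];
  field_count: number;
  error_message?: string; // present only when a cache file exists but cannot be parsed
}

// raw === null models "no cache file found for this object".
function readObjectFromText(objectName: string, raw: string | null): ObjectResult {
  if (raw === null) {
    return { name: objectName, exists: false, required_fields: [], field_count: 0 };
  }
  try {
    const parsed = JSON.parse(raw) as { fields?: FieldInfo[] };
    const fields = Array.isArray(parsed.fields) ? parsed.fields : [];
    return {
      name: objectName,
      exists: true,
      required_fields: fields.filter((f) => !f.nillable),
      field_count: fields.length,
    };
  } catch (e) {
    // The file is there, so exists stays true; the failure is reported per object.
    return {
      name: objectName,
      exists: true,
      required_fields: [],
      field_count: 0,
      error_message: e instanceof Error ? e.message : String(e),
    };
  }
}
```

A caller can then treat exists: false as "load the connection in the IDE" and exists: true with error_message as "the cache file needs attention".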
Collaborator Author

@mrdailey99 mrdailey99 left a comment


Addressed all 6 review comments in f04eeff:

  1. readXmlCacheFile required-flag: now treats both boolean true and string "true" as required, fixing the silent misclassification under fast-xml-parser's default parseTagValue=true.
  2. discoverWorkspace: takes allowedPaths and runs assertPathAllowed per candidate BEFORE fs.existsSync/statSync. Out-of-policy candidates (including the ~/Provar fallback when home is outside --allowed-paths) are silently skipped, so discovery never touches denied filesystem paths.
  3. Parse failure on a present cache file now returns exists: true with field_count: 0 and a per-object error_message, distinguishing corrupt/unsupported from "object not cached".
  4. docs/mcp.md adds requestId to the output table and to both example responses, documents the new error_message field, and adds a third parse-error response example.
  5. New unit tests added for .xml (regression guard for the required-flag bug), .object, parse-error (asserts exists: true + error_message), and bare .. connection_name (broadened message). 19/19 orgDescribe tests pass; full mocha 1174/1174; yarn lint + yarn compile clean.
  6. PATH_TRAVERSAL message now reads "must not contain path separators or directory-traversal segments ('..')", covering both rejection conditions.
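
The policy-before-probe ordering from item 2 can be sketched as follows. The real discoverWorkspace takes a project path and calls the project's assertPathAllowed, neither of which is fully shown here, so this version takes a pre-computed candidate list and uses isPathAllowed as a hypothetical stand-in for the policy check.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Stand-in for assertPathAllowed (assumption): a candidate is allowed when it
// resolves to one of the allowed roots or to a location beneath one.
function isPathAllowed(candidate: string, allowedPaths: string[]): boolean {
  const resolved = path.resolve(candidate);
  return allowedPaths.some((root) => {
    const rel = path.relative(path.resolve(root), resolved);
    return (
      rel === "" ||
      (rel !== ".." && !rel.startsWith(`..${path.sep}`) && !path.isAbsolute(rel))
    );
  });
}

// Returns the first allowed candidate that is an existing directory, or null.
function discoverWorkspace(candidates: string[], allowedPaths: string[]): string | null {
  for (const candidate of candidates) {
    // Policy check comes first, so denied paths are never touched on disk.
    if (!isPathAllowed(candidate, allowedPaths)) continue;
    try {
      if (fs.statSync(candidate).isDirectory()) return candidate;
    } catch {
      // Missing or unreadable: fall through to the next candidate.
    }
  }
  return null;
}
```

Skipping instead of throwing keeps the fallback chain intact: a home directory outside --allowed-paths simply drops out of the candidate list.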

@mrdailey99 mrdailey99 merged commit 39c3b92 into develop May 20, 2026
4 checks passed
2 participants