feat(056): output-schema validation for proxied tool calls (Spec 054 Track A)#525

Open
Dumbris wants to merge 3 commits into main from 056-output-schema-validation
@Dumbris Dumbris commented May 25, 2026

Summary

Implements Track A of the Spec 054 MCP-security-gateway umbrella, carved into its own feature (Spec 056). When an upstream tool declares an outputSchema, mcpproxy now validates the tool's structured response against that schema at the proxy boundary before it reaches the agent — closing the only completely-empty axis of the security story ("validated data out"). A buggy or compromised server can no longer inject malformed/oversized/unexpected structured data into the agent's context.

Spec/plan/tasks: specs/056-output-schema-validation/.

Behaviour (backward-compatible by default)

| mode | violating structuredContent | conforming | no schema / text-only / error result |
| --- | --- | --- | --- |
| off | forward | forward | forward |
| warn (default) | forward + policy_decision audit | forward unchanged | forward |
| strict | block (error result) + audit | forward unchanged | forward (text-only governed by missing_structured_content) |

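
The matrix above reduces to a small pure function. What follows is a minimal sketch of that mapping, not the PR's evaluateOutputValidation: the names Mode, Outcome, Decision and decide are hypothetical, and the missing_structured_content setting that governs text-only responses under strict is deliberately left out.

```go
package main

import "fmt"

// Mode mirrors the three output-validation modes named in the table.
type Mode string

const (
	ModeOff    Mode = "off"
	ModeWarn   Mode = "warn"
	ModeStrict Mode = "strict"
)

// Outcome is a hypothetical classification of a single tool response.
type Outcome int

const (
	NotApplicable Outcome = iota // no schema, text-only, or error result
	Conforming                   // structuredContent matches the schema
	Violating                    // structuredContent breaks the schema
)

// Decision is a hypothetical result: block the response and/or record an audit entry.
type Decision struct {
	Block bool
	Audit bool
}

// decide follows the table row by row. Only a violating response in warn or
// strict mode triggers any action; everything else is forwarded untouched.
// Assumption: any mode other than off/strict is treated as warn (the default).
func decide(m Mode, o Outcome) Decision {
	if m == ModeOff || o != Violating {
		return Decision{}
	}
	if m == ModeStrict {
		return Decision{Block: true, Audit: true}
	}
	return Decision{Audit: true}
}

func main() {
	names := []string{"not-applicable", "conforming", "violating"}
	for _, m := range []Mode{ModeOff, ModeWarn, ModeStrict} {
		for o, name := range names {
			d := decide(m, Outcome(o))
			fmt.Printf("%-6s %-14s block=%t audit=%t\n", m, name, d.Block, d.Audit)
		}
	}
}
```

Keeping the decision free of I/O is what makes it cheap to cover every branch in unit tests; the caller is then responsible for building the error result and emitting the audit record.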
  • Lossless on success — conforming structuredContent is forwarded byte-for-byte (validate a read-only view, never strip-then-validate).
  • Cheap byte-size + nesting-depth guards run before schema compilation (DoS bound).
  • An uncompilable tool schema degrades to a no-op (logged once) — never blocks traffic.
  • Honors the ContextForge #4042 trap: declared schema + text-only response does not hard-fail in warn.

Key changes

  • internal/outputvalidation — new pure package: Validator with a per-tool compiled-schema sync.Map cache (santhosh-tekuri/jsonschema/v6), guards, Verdict. No server/storage deps.
  • internal/config: OutputValidationConfig (mode/max_bytes/max_depth/missing_structured_content, nil-safe helpers, default warn) + ToolMetadata.OutputSchemaJSON.
  • internal/upstream/core/client.go — capture RawOutputSchema/OutputSchema at discovery (FR-A1).
  • internal/runtime/{stateview,supervisor} — propagate OutputSchemaJSON onto the in-memory snapshot for cheap call-time lookup (no per-call index query).
  • internal/server: applyOutputValidation wired into both handleCallToolVariant forward sites; pure evaluateOutputValidation decision core; reuses emitActivityPolicyDecision. Identical in personal + server editions.
  • OpenAPI spec regenerated for the new config model.

Design note: validation runs in mcp.go on forwardContentResult's output (its StructuredContent is untouched by truncation) rather than inside forwardContentResult, which keeps that function pure. The tests therefore live in internal/server/output_validation_test.go rather than in a content_forward_test.go.

Testing

  • Unit: internal/outputvalidation (19, validator + guards, -race); internal/config (8); internal/server output-validation (11 — every decision branch incl. #4042 trap, guard breach, uncompilable schema). golangci-lint clean on new files; both editions build.
  • E2E (curl + CLI, fresh data-dir, committed stub MCP server e2e/stubs/outputschema):
    • strict → bad_output blocked: output schema validation failed: ... at '/id': got string, want integer, with a blocked policy_decision record (verified via GET /api/v1/activity?type=policy_decision)
    • strict → conforming passes; text_only (no structured, allow) passes
    • warn → bad_output forwarded unchanged ({"id":"not-an-int"}, isError:false) + a warning policy_decision record

Out of scope (other Spec 054 tracks)

Output sanitisation (B), per-tool ACLs (C), TOFU pinning of schemas/annotations (D), audit hash chain (E).

Related #521

Dumbris added 3 commits May 25, 2026 21:58
Related #521

Carve Track A of the Spec 054 security-gateway umbrella into its own
feature: validate a tool's structuredContent against its declared
outputSchema at the proxy boundary before it reaches the agent.

## Changes
- spec.md: FR-A1..A12, 3 prioritized user stories, edge cases, success criteria
- plan.md: pure internal/outputvalidation pkg + forwardContentResult hook design
- research.md: santhosh-tekuri/jsonschema/v6, capture point, modes, cache decisions
- data-model.md, contracts/validator.md, quickstart.md
- tasks.md: 24 TDD-first tasks organized by user story
…Track A)

Related #521

Validate a tool's structured response against its declared outputSchema at
the proxy boundary before it reaches the agent, so a buggy or compromised
upstream cannot inject malformed/oversized/unexpected data into the agent's
context. Track A of the Spec 054 security-gateway umbrella.

## Changes
- internal/outputvalidation: new pure package — Validator with a per-tool
  compiled-schema sync.Map cache (santhosh-tekuri/jsonschema/v6), byte-size
  and nesting-depth guards run before validation, uncompilable schemas degrade
  to a no-op. Never mutates the payload.
- internal/config: OutputValidationConfig (mode off/warn/strict, default warn;
  max_bytes; max_depth; missing_structured_content) + ToolMetadata.OutputSchemaJSON.
- internal/upstream/core/client.go: capture tool.RawOutputSchema/OutputSchema
  at discovery into ToolMetadata.OutputSchemaJSON (FR-A1).
- internal/runtime/{stateview,supervisor}: propagate OutputSchemaJSON onto the
  in-memory ToolInfo snapshot for cheap call-time lookup.
- internal/server: applyOutputValidation wired into both handleCallToolVariant
  forward sites; pure evaluateOutputValidation decision core; strict blocks
  with an error result, warn forwards + records a policy_decision audit entry
  (reuses emitActivityPolicyDecision). No build-tag-specific behaviour.
- promote santhosh-tekuri/jsonschema/v6 to a direct dependency.
- docs/features/output-schema-validation.md; e2e stub MCP server.

Design note: validation runs in mcp.go on forwardContentResult's output
(StructuredContent is unaffected by truncation) rather than inside
forwardContentResult, keeping that function pure.

## Testing
- Unit: internal/outputvalidation (19 tests, validator + guards, -race);
  internal/config (8 tests); internal/server output_validation (11 tests
  covering every decision branch incl. ContextForge #4042 trap, guard breach).
- E2E (curl + CLI, fresh data-dir, stub MCP server declaring an outputSchema):
  strict blocks a violating structuredContent with "at '/id': got string,
  want integer" + a blocked policy_decision; conforming passes; text-only
  (no structured) passes under strict+allow; warn mode forwards the violation
  unchanged + a warning policy_decision. Both editions build.
Related #521

make swagger-verify regenerates oas/ from struct annotations; the new
config.OutputValidationConfig model and the output_validation field on the
Config schema are now documented.
@cloudflare-workers-and-pages

Deploying mcpproxy-docs with Cloudflare Pages

Latest commit: 45e0f47
Status: ✅  Deploy successful!
Preview URL: https://10c0f4c3.mcpproxy-docs.pages.dev
Branch Preview URL: https://056-output-schema-validation.mcpproxy-docs.pages.dev


@codecov-commenter

codecov-commenter commented May 25, 2026

Codecov Report

❌ Patch coverage is 61.53846% with 95 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/server/mcp.go | 37.25% | 26 Missing and 6 partials ⚠️ |
| e2e/stubs/outputschema/main.go | 0.00% | 26 Missing ⚠️ |
| internal/outputvalidation/validator.go | 79.76% | 13 Missing and 4 partials ⚠️ |
| internal/upstream/core/client.go | 0.00% | 11 Missing ⚠️ |
| internal/runtime/supervisor/supervisor.go | 50.00% | 5 Missing ⚠️ |
| internal/outputvalidation/guards.go | 80.00% | 2 Missing and 2 partials ⚠️ |

@github-actions

📦 Build Artifacts

Workflow Run: View Run
Branch: 056-output-schema-validation

Available Artifacts

  • archive-darwin-amd64 (27 MB)
  • archive-darwin-arm64 (25 MB)
  • archive-linux-amd64 (16 MB)
  • archive-linux-arm64 (14 MB)
  • archive-windows-amd64 (27 MB)
  • archive-windows-arm64 (24 MB)
  • frontend-dist-pr (0 MB)
  • installer-dmg-darwin-amd64 (21 MB)
  • installer-dmg-darwin-arm64 (18 MB)

How to Download

Option 1: GitHub Web UI (easiest)

  1. Go to the workflow run page linked above
  2. Scroll to the bottom "Artifacts" section
  3. Click on the artifact you want to download

Option 2: GitHub CLI

gh run download 26416222217 --repo smart-mcp-proxy/mcpproxy-go

Note: Artifacts expire in 14 days.
