Replace discovery-into-context with direct VMCP calls#5491
Conversation
On the Serve path the core VMCP is the single authoritative, admission- filtered source of truth, so capability aggregation must no longer flow through the discovery-into-context middleware (which applies no admission and would bypass authz now that it lives in the core seam). - Store the injected core on *Server and guard the discovery middleware to the legacy path (s.core == nil); it is left intact, not deleted. - Source session capabilities from core.ListTools/ListResources once at OnRegisterSession and route tool/resource handlers through core.CallTool/ReadResource with explicit identity (serve_handlers.go). - Enforce identity binding on the core call path via a new session-layer ValidateCaller seam (the audited check lives in internal/security). - Resolve audit backend enrichment from Tool.BackendID via core.LookupTool on the Serve path; the legacy context-based path is unchanged. The literal "feed ListTools into ProcessPreQueriedCapabilities" wiring is infeasible (type mismatch; the flat tool list drops the routing table's OriginalCapabilityName), so the core's output is used directly as the session's fixed-at-initialize set and the factory does not re-aggregate. Implements #5442 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Resolve the Serve path's audit BackendName from the backend registry (BackendID -> backend.Name) so it matches the legacy path's WorkloadName instead of recording the raw backend ID; fall back to the ID when the backend is absent from the registry. - Reword enforceSessionBinding to accurately state that the SDK Validate only gates existence/termination and this is the sole binding-enforcement point on the Serve call path. - Read the identity binding via GetMetadataValue to avoid copying the whole metadata map on every Serve-path tool call/resource read. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Document why Serve audit enrichment resolves via the core (correct under conflict resolution) not the session routing table (raw names), and that the per-request re-aggregation is a deferred Serve-path optimization. - Add ValidateCaller fail-closed cases for empty/unparsable bindings. - Log resource_count on Serve session registration; note the OutputSchema marshal failure is non-fatal by design. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5491 +/- ##
==========================================
+ Coverage 69.48% 69.50% +0.02%
==========================================
Files 638 640 +2
Lines 65042 65193 +151
==========================================
+ Hits 45192 45311 +119
- Misses 16529 16551 +22
- Partials 3321 3331 +10 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
tgrunnagle
left a comment
There was a problem hiding this comment.
Multi-agent review — Replace discovery-into-context with direct VMCP calls
A refactor (P2.4 of the core-extraction epic) that adds a guarded Serve path beside the unchanged legacy server.New path. The approach is sound and well-executed: the legacy path is genuinely untouched, dependency direction (server→core) is clean, identity is read once at the transport boundary and passed explicitly (not ctx-coupling), and the security-critical behaviors — fail-closed binding via the reused validateCallerBinding, generic authz-denial messages, audit name-resolution only for caller-reachable capabilities — are correct. The C1 deviation from the issue's literal AC2 is justified and documented (the literal wiring is infeasible because core.ListTools returns a flat aggregated slice that has dropped the per-backend data the factory needs).
No blocking correctness or security regressions. The substantive items are test-coverage gaps on the new handlers and one latent architectural contract; neither is a runtime regression today (the Serve path is test-only, with no production composition root).
Consensus summary
| # | Finding | Severity | Action |
|---|---|---|---|
| F1 | Serve path double-aggregates: factory CreateSession aggregates in addition to core.ListTools; AC2 "factory does not aggregate" is an unenforced/untested contract |
MEDIUM | Fix (doc + test) |
| F2 | coreResourceHandler has zero test coverage |
MEDIUM | Fix |
| F3 | coreToolHandler error/denial paths untested; fakeCore.callErr is a dead field |
MEDIUM | Fix |
| F4 | terminate-on-binding-failure side effect never asserted | MEDIUM | Fix |
| F5 | fakeCore doesn't model admission filtering; "denied" subtest only exercises not-found |
MEDIUM | Fix |
| F6 | No-arg tools/call fails the arguments assertion — parity-preserving, pre-existing in all 4 adapters (not a regression) |
MEDIUM | Follow-up |
| F7 | Resource handler forwards raw core error to the client (info-disclosure asymmetry); parity with legacy | MEDIUM | Follow-up |
| F8 | Cross-pod re-injection lists tools under request identity, not bound identity (specialist severity conflict; adjudicated) | LOW | Adjudicate |
| F9 | Audit LookupTool re-aggregates per audited request (documented deferral, anti-pattern #9) |
LOW | Track |
| F10 | Prompts omitted — confirm SDK doesn't advertise a prompts capability; lock omission with a test | LOW | Fix (cheap) |
Recommendation: COMMENT. 0 HIGH after verification — both agent-flagged HIGHs are non-blocking on inspection (F6 is parity-preserving pre-existing behavior; F8 is bounded by the call-time binding check). The MEDIUM test-coverage gaps (F2–F5) and the double-aggregation contract (F1) are worth addressing before the Serve path is wired into production; F6/F7 belong in separate follow-ups. Inline comments below.
Reviewed against 688e4573. 6 specialist agents (security, concurrency, architecture, MCP/SDK, test, general); codex cross-review skipped (not installed).
Addresses #5491 review comments: - MEDIUM serve_handlers.go (3391826182): cover coreResourceHandler — advertise, route via core.ReadResource, and generic message on ErrAuthorizationFailed - MEDIUM serve_handlers.go (3391826190): table-test coreToolHandler error/denial paths (genericized authz, forwarded error, non-map args); use fakeCore.callErr - MEDIUM serve_handlers.go (3391826195): assert binding failure terminates the session and never reaches the core - MEDIUM backend_enrichment_test.go (3391826200): rename the not-found subtest so it no longer overclaims; note denied collapses to ErrNotFound at the core - LOW serve_handlers.go (3391826230): lock the prompts omission with a test Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addresses #5491 review comments: - MEDIUM server.go (3391826178): document that the Serve-path session factory must be built without an aggregator (AC2 no-double-aggregation contract, on ServerConfig.SessionManagerConfig) and qualify the registration comment to "single core aggregation per session" - MEDIUM serve_handlers.go (3391826209): use errors.New for the static read-denied message (no %w directive) - LOW server.go (3391826221): note the call-time binding check backstops the cross-pod re-injection listing under the request identity Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
Large PR justification has been provided. Thank you!
|
Re: `pkg/vmcp/session/factory.go:147` (
// TODO(#5445): Remove WithAggregator and the aggregator field once server.New is
// routed through Serve. On the Serve path the core is the single aggregator; a
// factory that also aggregates double-aggregates and drifts (AC2). Deleting the
// option makes that contract structurally impossible to violate. |
Summary
P2.4 of the vMCP core-extraction epic (parent story #5431, epic #5419). On the
Servepath the core VMCP is now the single authoritative, admission-filtered source of truth
for the advertised capability set and call routing. Capability aggregation must therefore
stop flowing through the discovery-into-context middleware, which applies no admission and
would bypass authorization now that authz lives in the core admission seam (#5438).
This PR wires the
Servepath through the core's explicit-identity methods and guards thelegacy discovery path to
server.Newonly:*Server; thes.core == nil/s.core != nilvalue is thebranch selector for legacy vs. Serve throughout the server. The discovery middleware is
guarded to the legacy path (
s.core == nil), left intact rather than deleted — physicalremoval is Phase 3 (P3.2 Reduce server.New body to the wrapper #5445).
core.ListTools/core.ListResourcesonce atOnRegisterSession; route tool/resource invocations throughcore.CallTool/core.ReadResourcewith an explicit*auth.Identity(newserve_handlers.go).session.ValidateCallerseam (audited check stays inpkg/vmcp/session/internal/security). Requests reach the core directly, bypassing theBindSessiondecorator, so this is the sole binding-enforcement point on the Serve path.Tool.BackendID→backend.Nameviacore.LookupTool+ the registry, for parity with the legacy path'sWorkloadName. Thelegacy context-based enrichment path is unchanged.
Closes #5442
Type of change
Test plan
task test)task lint-fix)task buildpasses. Changed-package unit tests pass under the race detector:go test -race ./pkg/vmcp/server/... ./pkg/vmcp/session/..., including the new Serve-pathtests: discovery-guard branch coverage,
tools/callrouted through the core, theonce-per-session core call, identity-binding enforcement on the call path, audit-enrichment
parity (
BackendNameresolved fromBackendID), andValidateCallerfail-closed cases.task lint-fixis clean except a pre-existing, unrelated gosec G115 incmd/thv/app/upgrade.go(untouched by this PR).Note on
task test: two unrelated environment failures, neither touched by this PR —Docker-dependent tests ("no available runtime") and the flaky
TestTransparentProxy_*tests in
pkg/transport/proxy/transparent(pass in isolation). The packages this PRchanges pass.
Changes
pkg/vmcp/server/serve_handlers.goinjectCoreSessionCapabilitiessources tools/resources from the core once per session;coreToolHandler/coreResourceHandlerroute throughcore.CallTool/core.ReadResource;enforceSessionBindingis the sole identity-binding check on the Serve call path.pkg/vmcp/server/server.gocoreon*Server; guard the discovery middleware application tos.core == nil; branchhandleSessionRegistrationImplandlazyInjectSessionTools(cross-pod re-injection) onto the core on the Serve path.pkg/vmcp/server/backend_enrichment.gocore.LookupTool/LookupResource/LookupPromptandTool.BackendID, then maps tobackend.Namevia the registry. Legacy context-based resolution unchanged.pkg/vmcp/server/serve.gosrv.core; update theServecontract docs (the "/" MCP route is now serveable because the shared Handler skips discovery whens.core != nil).pkg/vmcp/session/session.goValidateCallerseam delegating to the internal security package.pkg/vmcp/session/internal/security/security.govalidateCallerBindingfree function so the same audited check backs both the decorator and the exportedValidateCaller.Does this introduce a user-facing change?
No. The legacy
server.Newpath is unchanged; the Serve path is not yet routed inproduction (no composition root until a later phase).
Implementation plan
Approved implementation plan
The issue's literal AC2 prescribes feeding
core.ListToolsinto the session factory asthe
ProcessPreQueriedCapabilitiesinput. Planning analysis found this infeasible:ProcessPreQueriedCapabilitiestakes per-backend raw tools, whilecore.ListToolsreturns a flat, already-aggregated slice.
BackendTarget.OriginalCapabilityName, which the factory needs forbackend-name translation.
The approved alternative ("C1"), discussed with and approved by the issue author/owner
before implementation:
fixed-at-initialize capability set; the factory does not re-aggregate on the Serve path.
core.ListTools/core.ListResourcesare called once per session atOnRegisterSession.tools/call/resources/readthrough
core.CallTool/core.ReadResourcewith an explicit*auth.Identityread atthe transport boundary (never from context inside the core).
if s.core == nil,leaving the middleware,
handleSubsequentRequest, and the context seam in place for thelegacy path. Physical removal is deferred to Phase 3 (P3.2 Reduce server.New body to the wrapper #5445).
LookupTool. Derive the backend fromTool.BackendID(via thecore's admission-filtered
Lookup*) →backend.Namevia the registry, matching thelegacy
WorkloadName. This re-aggregates per audited request; documented and deferredper anti-pattern Use an actual logger #9 (no production composition root yet).
ValidateCallerbinding seam. Add an exportedsession.ValidateCallerso the Servepath can enforce identity binding without the
BindSessiondecorator; the audited checkstays in
pkg/vmcp/session/internal/security.Special notes for reviewers
for
core.ListToolsto feed the factory as theProcessPreQueriedCapabilitiesinput.That is infeasible:
ProcessPreQueriedCapabilitiesconsumes per-backend raw tools, butcore.ListToolsreturns a flat aggregated slice, and that flat slice dropsBackendTarget.OriginalCapabilityNameneeded for backend-name translation. So the core'soutput is used directly as the SDK session's fixed-at-initialize set and the factory does
not re-aggregate on the Serve path. This was discussed with and approved by the issue
author/owner before implementation.
LookupToolre-aggregatesper audited request (the core is intentionally stateless). This is documented inline and
deferred per vMCP anti-pattern Use an actual logger #9 (optimize with evidence): the Serve path has no
production composition root yet. The fix when the path goes live is a per-session
advertised-set cache ("Serve caches, core is stateless").
enforceSessionBindingfails closed in two ways — a missing session record and anempty/unparsable binding both reject the caller, and any rejection terminates the session.
This is intentionally stricter than the legacy
GetAdaptedToolshandler.handleSubsequentRequestinjection, and theDiscoveredCapabilitiescontext seam are deliberately left in place for the legacy path;physical removal is Phase 3 (P3.2 Reduce server.New body to the wrapper #5445).
server.New's signature and observable behavior areunchanged.
addressed; the final review pass found 0 critical/high/medium findings.
Large PR Justification
The production change is small and cohesive: a single logical refactor (route the Serve path through the core VMCP), ~255 lines of non-comment code across 6 source files — within the ≤400 LOC / ≤10 file guideline.
The ~1085-line total is dominated by:
Per CONTRIBUTING/CLAUDE.md, large PRs are acceptable for test-heavy changes. This is one atomic refactor that should not be split from its tests.