fix: filter direct-mode tools by agent token scope#520
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds discovery-time filtering for direct-mode tools so scoped agent tokens only see tools for allowed servers and permissions, reducing tool metadata disclosure while keeping call-time auth authoritative.
Changes:
- Track required permissions for each direct-mode tool during tool discovery.
- Add a tool filter hook that removes out-of-scope direct-mode tools from
tools/listfor agent tokens. - Add unit tests covering server scoping, permission scoping, fail-closed behavior, and non-direct tool behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| internal/server/mcp_routing.go | Builds a permissions map for direct tools and registers a tool filter for discovery responses. |
| internal/server/mcp_direct_scope.go | Implements permission derivation/storage and the direct-mode tool filter for agent auth scopes. |
| internal/server/mcp.go | Adds mutex-protected state to store direct tool permission requirements. |
| internal/server/mcp_routing_test.go | Adds tests validating the filtering behavior for agent vs non-agent contexts and missing metadata. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func requiredPermissionForDirectTool(annotations *config.ToolAnnotations) string { | ||
| requiredVariant := contracts.DeriveCallWith(annotations) | ||
| requiredPerm := contracts.ToolVariantToOperationType[requiredVariant] | ||
| if requiredPerm == "" { | ||
| return contracts.OperationTypeRead | ||
| } | ||
| return requiredPerm | ||
| } |
| func (p *MCPProxyServer) setDirectToolPermissions(perms map[string]string) { | ||
| p.directToolPermsMu.Lock() | ||
| defer p.directToolPermsMu.Unlock() | ||
|
|
||
| if len(perms) == 0 { | ||
| p.directToolPerms = nil | ||
| return | ||
| } | ||
|
|
||
| p.directToolPerms = perms | ||
| } |
| filtered := tools[:0] | ||
| for _, tool := range tools { | ||
| serverName, _, ok := ParseDirectToolName(tool.Name) | ||
| if !ok { | ||
| filtered = append(filtered, tool) | ||
| continue | ||
| } | ||
|
|
||
| if !authCtx.CanAccessServer(serverName) { | ||
| continue | ||
| } | ||
|
|
||
| requiredPerm, ok := p.lookupDirectToolPermission(tool.Name) | ||
| if !ok { | ||
| continue | ||
| } | ||
|
|
||
| if requiredPerm != "" && !authCtx.HasPermission(requiredPerm) { | ||
| continue | ||
| } | ||
|
|
||
| filtered = append(filtered, tool) | ||
| } | ||
|
|
||
| return filtered |
| serverTools := make([]mcpserver.ServerTool, 0, len(tools)) | ||
| directToolPerms := make(map[string]string, len(tools)) | ||
| for _, tool := range tools { | ||
| directName := FormatDirectToolName(tool.ServerName, tool.Name) | ||
| directToolPerms[directName] = requiredPermissionForDirectTool(tool.Annotations) |
|
The actual review follow-up in commit b769a55 is:
Verification:
|
|
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
Dumbris
left a comment
There was a problem hiding this comment.
Thanks for this — the agent-token scoping half is genuinely solid work. The directToolPerms map under an RWMutex, the defensive copy, the fail-closed-on-missing-metadata path, and the thorough table tests (server scope / permission scope / non-mutation / admin passthrough) are exactly right. Closing the tools/list info-disclosure leak for scoped tokens is a real security win.
I'd like to talk through one thing before we land it, because it touches our core security model.
The PR actually contains two independent changes
- Agent-token scope filtering — the stated purpose. Read-time only, no persistence, no migration. 👍
- Output schema folded into the tool hash (
internal/hash/hash.go,ToolMetadata.OutputSchemaJSON, index,upstream/core) — not mentioned in the description, and it changes on-disk behavior.
These are orthogonal (the filter never touches a hash), so I'd ask that they be split into two PRs regardless of what we decide below — the migration deserves its own title and review so a future bisect of "why did all my tools re-quarantine after upgrade" lands on the right commit.
The contract model this needs to respect
Our tool-quarantine (Spec 032) encodes a specific policy:
- A human approves a server's tool contract — description + schema — once.
- Agents may use only approved tools. Agent tokens cannot self-approve; they're blocked at call time until a human acts.
- If the upstream server later changes a tool's description or schema, that tool is re-quarantined (rug-pull guard) until a human re-approves.
The change-detection hash is the fingerprint of "the contract the human signed off on."
Why this matters for the output-schema change
Two parts, and they pull in opposite directions:
(a) Output schema should be in the hash. Output schema is part of the contract — it describes what the agent receives back. A server silently swapping its output shape is a contract change a human should re-review. So I agree with the intent: keep output schema in the hash. Dropping it would let output-schema rug-pulls slip past review.
(b) But the upgrade itself must not re-quarantine anything. Here's the subtlety: when a user upgrades to this build, the upstream contract is byte-identical — only mcpproxy's hashing algorithm changed. As written, every already-approved tool that ships an output schema will hash differently and flip to changed, demanding human re-approval for tools that did not actually change upstream.
That's a false positive, and it's the dangerous kind: the first time a user sees "all 200 tools quarantined, click Approve-All," they stop reading the diffs. That trains humans to rubber-stamp — which destroys the exact signal the whole policy depends on.
Proposed solution
-
Keep output schema in the hash — it's part of the approvable contract.
-
Add a one-time versioned backfill so the algorithm change never masquerades as a rug-pull. We already have the hook:
CurrentSchemaVersion/SchemaVersionKeyininternal/storage/models.go. On boot, when stored version < N:- walk every
approvedToolApprovalRecord(and the index /ToolHashBuckethashes), - recompute the hash with the new output-schema-inclusive algorithm using the currently observed output schema,
- rewrite
ApprovedHash/CurrentHashwhile preservingStatus = approved. - leave
pending/changedrecords untouched — they still need human action.
Semantics: "the contract this human already approved hasn't changed; we're only re-baselining the fingerprint to include output schema." From that point on, any real drift in description, input schema, or output schema →
changed→ human-only re-approval. That's the policy, with no exception carved out.One honest caveat we should document: tools approved before the upgrade have no prior output-schema baseline, so we can't retroactively detect drift that happened before the migration — the backfill blesses whatever output schema exists at migration time. Unavoidable (no historical baseline exists), and acceptable since the server was already trusted at the contract the human reviewed. The window exists only once, at the upgrade boundary.
- walk every
-
(Follow-on, optional but it falls right out of this PR) "Agents may use only approved tools" implies the agent
tools/listfilter you added should also hidepending/changedtools, not just out-of-scope ones — discovery should match capability, so an agent sees exactly the set it can invoke (in-scope ∧ permitted ∧ approved). Human/admin/OAuth-user contexts still see pending/changed, since they're the ones who act on them.
Suggested shape
- PR A (this one, trimmed): agent-token scope filter only. Mergeable as soon as the full
go test ./internal/server/...run is confirmed green (the description notes only the targeted-runfilters were run locally). - PR B: output schema in hash + the versioned backfill migration + a test proving an approved output-schema tool stays approved across a simulated version bump.
- PR C (optional): extend the agent filter with the approval-status predicate.
If you're good with this direction, I'd be happy to pair on it — I can take the migration in PR B (or push directly to your branch) while you keep ownership of the scoping work. Let me know what you'd prefer. Thanks again — the hard part is already done well.
Summary
Direct mode already enforces agent-token scope at tool call time, but tools/list still exposed every direct-mode upstream tool definition. This meant scoped agent tokens blocked execution, but still disclosed tool names, descriptions, and schemas for servers outside the token scope.
This PR filters direct-mode tools/list responses for agent tokens using the same server and permission semantics already used by direct-mode execution.
Changes
Verification
Note: broader go test ./internal/server and go test ./internal/... attempts exceeded the IDE tool timeout in this local environment before producing Go failures.