Enforce request body size limits on proxies and vMCP#5492
Conversation
The 1MB request-body cap and 413 handling lived only in the management API server, leaving the MCP proxies and vMCP exposed to memory exhaustion from large/unbounded request bodies (GHSA-grwg-v9p7-76m2). Move the middleware and its helpers into a reusable pkg/bodylimit package exposing both a net/http middleware and a runner middleware factory, so every inbound listener can share one trusted implementation. A non-positive limit falls back to the default (zero never means unlimited). The API server now consumes the package with no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wire the body-limit middleware into the proxy middleware chain and the vMCP server so oversized request bodies are rejected with 413 before any handler buffers them via io.ReadAll, closing the DoS exposure in GHSA-grwg-v9p7-76m2 for default deployments. Register bodylimit in the middleware factory and prepend it as the outermost entry in PopulateMiddlewareConfigs so it runs before auth and the MCP parser. Apply the same middleware in the vMCP handler, inside recovery, preserving SSE streams via the response-writer Unwrap/Flush forwarding. Tests cover 413 rejection, the bounded downstream read that guards chain ordering, and SSE survival. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5492 +/- ##
==========================================
+ Coverage 69.51% 69.63% +0.11%
==========================================
Files 639 642 +3
Lines 65092 65397 +305
==========================================
+ Hits 45248 45536 +288
- Misses 16518 16521 +3
- Partials 3326 3340 +14 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
When an oversized body arrives without a Content-Length (chunked), the early Content-Length check is bypassed and the cap is enforced by http.MaxBytesReader inside the handler's io.ReadAll. The streamable and httpsse handlers reported that as 500; map the MaxBytesError to 413 so the documented contract holds on the chunked path too. Add body-limit smoke tests for all three proxy transports (streamable chunked-413 regression guard, httpsse and transparent Content-Length 413) since each mounts the middleware chain independently. Correct the recovery middleware comment: the proxies reverse-apply the chain, so the last-appended entry is innermost, not outermost. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
The embedded OAuth/auth-server routes (notably the unauthenticated POST /oauth/token) are mounted outside the MCP middleware chain, so the proxy/vMCP body-size cap did not reach them — leaving a memory-exhaustion DoS surface bounded only by Go's ParseForm defaults. Wrap EmbeddedAuthServer.Handler() with the body-limit middleware at a 64KB cap (matching the existing DCR limit; OAuth form posts are small). Routes() and RegisterHandlers() derive from this handler, so every consumer (all three proxies and vMCP) inherits the bound in one place. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The body-size cap was only added inside PopulateMiddlewareConfigs, which runs only when MiddlewareConfigs is empty. thv run and the management API pre-populate the chain via WithMiddlewareFromFlags, so Runner.Run took the else branch and never added body-limit — leaving those paths with no request-body cap (the advisory was effectively unfixed for them). Extract an idempotent addBodyLimitMiddleware helper that prepends the cap at index 0 (outermost), call it from PopulateMiddlewareConfigs, and add it as a backstop in Runner.Run after the if/else so every runner-backed path is covered regardless of how the chain was assembled. Body-limit is now "always present, outermost" — the mirror of recovery's innermost. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The standalone thv proxy command builds its own middleware chain and calls NewTransparentProxy directly, bypassing the runner entirely, so it had no request-body cap at all — this was the exact command in the advisory PoC. Prepend the body-limit middleware as the outermost entry. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
readRequestBody swallowed the http.MaxBytesError raised when an oversized chunked (no Content-Length) body tripped the cap, then forwarded an empty body to the backend — the client saw the backend's response instead of a 413. Surface the error and return a synthetic 413 from RoundTrip, matching the streamable and httpsse handlers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Centralize the *http.MaxBytesError detection behind a single bodylimit.IsRequestTooLarge predicate, replacing the identical errors.As blocks in the streamable, httpsse, and transparent handlers. Collapse the transparent proxy's two synthetic *http.Response literals (413 and the pre-existing 503) into one plainResponse helper, and trim a duplicated comment on the auth-server body-cap constant. No behavior change: 413/503 status codes and bodies are byte-identical. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Merge the Content-Length and chunked oversized-body cases in the streamable and transparent body-limit tests into single table-driven tests that start the proxy once and loop over cases, removing duplicated setup. Both code paths (Content-Length early reject and MaxBytesReader 413 handler) remain exercised; no coverage change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Content-Length oversized->413 path lives entirely in bodylimit.Middleware (early reject) and is unit-tested, so per-transport Content-Length tests only re-exercised shared code. Drop them from the streamable and transparent tests, leaving the chunked case — which also proves the middleware is mounted, since the body must flow through it to be capped. Replace the httpsse Content-Length test with a chunked test that establishes a live SSE session and posts an oversized chunked body to the message endpoint, covering the previously-untested IsRequestTooLarge 413 branch in handlePostRequest. Each transport now guards its own chunked path; net coverage improves. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
jhrozek
left a comment
There was a problem hiding this comment.
Stepping back from the line level: this is a solid, worthwhile change. It closes a real unauthenticated memory-exhaustion gap — before this, only the management API capped request bodies, while the three MCP proxy transports, vMCP, thv proxy, and the auth server's non-DCR endpoints all read bodies via io.ReadAll with no bound. Consolidating the logic into pkg/bodylimit (and deleting the pkg/api copy) is a net simplification, the "always present, outermost" invariant is a clean model, and the chunked/no-Content-Length handling (surfacing IsRequestTooLarge as 413 rather than 500 at each io.ReadAll site) shows the bypass case was thought through. The SSE preservation (Unwrap() for SetWriteDeadline, Flush() forwarding) is a subtle correctness detail that's easy to miss and was handled deliberately. Build + tests pass and the per-transport tests exercise the real 413 path.
The one question that I think actually gates the design is the inline comment on the 1 MB cap — flagged below. Everything else is minor and non-blocking; listing here rather than as separate inline threads:
- Duplicate constant (
pkg/api/server.go:69):maxRequestBodySize = 1 << 20now duplicatesbodylimit.DefaultMaxRequestBodySize. They agree today but can silently drift; consider referencing the package constant directly. (Coupled to the inline question — if the cap becomes configurable, this value should come from the same source.) - Positional idempotency guard (
pkg/runner/middleware.go:323): the guard checks onlymiddlewares[0]. Safe today since every builder prepends via this helper, but a future builder that appends body-limit at a non-zero index would get a second copy. A membership check (slices.ContainsFunc) would make the "single instance, outermost" contract defensive rather than convention-dependent. - MCP parser error path (
pkg/mcp/parser.go:87, outside this diff): on anio.ReadAllerror it forwards tonextinstead of checkingIsRequestTooLarge. End-to-end result is still 413 via the response-writer rewrite, but the path relies on that rewrite rather than handling it explicitly like the proxies now do. Possibly worth a follow-up — not this PR. - Test readiness wait (
pkg/transport/proxy/streamable/body_limit_test.go:27):waitForReadyhand-rolls atime.Sleeppoll loop where the httpsse sibling usesrequire.Eventually. Minor consistency nit (your own earlier commit replaced fixed sleeps with readiness waits).
Nothing above blocks merge in my view — the only thing I'd want an explicit answer on is the limit value/configurability.
1MB was too low for legitimate MCP tools/call payloads that carry large inline content (e.g. base64 images or documents). 8MB accommodates those while still bounding per-request memory. Caps requests only; responses are unaffected. The management API (1MB) and auth-server (64KB) caps are intentionally unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolve conflict in transparent_proxy.go RoundTrip: keep this branch's outbound debug log and adopt upstream's readRequestBody error return with the 413 oversized-body rejection (stacklok#5492). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Summary
ToolHive's 1MB request-body cap (and 413 handling) existed only on the management API server, so the MCP proxies and the vMCP server could be driven to memory exhaustion by large or unbounded request bodies via
io.ReadAll(pkg/mcp/parser.go, the three proxy handlers, vMCP) was unbounded.This PR extracts the existing, trusted body-limit logic into a reusable
pkg/bodylimitpackage and applies it to all inbound listeners, rejecting oversized bodies with413before any handler buffers them. The default cap is 1MB; making it configurable (CLI flag /RunConfig/ operator CRD) is intentionally deferred to a follow-up PR, as is the slow-uploadReadTimeoutmitigation (tracked in #5499).Medium level
pkg/bodylimitpackage: holds the body-limit middleware (Content-Length early-reject +http.MaxBytesReadersafety net + 413) moved out ofpkg/api/server.go. Exposes a plainnet/httpmiddleware (Middleware) for the API server and vMCP, and a runner middleware factory (CreateMiddleware/MiddlewareType) for the proxies. A non-positive limit falls back to the default — zero never means "unlimited".bodylimitis registered inGetSupportedMiddlewareFactoriesand prepended as the outermost entry inPopulateMiddlewareConfigs, so it runs before auth and the MCP parser for all three transports (streamable, httpsse, transparent) via the shared chain.Handler()insiderecovery, before the MCP parser. The response-writer wrapper gainedUnwrap()/Flush()so wrapping does not break long-lived SSE streams (their write-deadline clearing goes throughhttp.ResponseController).pkg/bodylimitwith no behavior change.Low level
pkg/bodylimit/middleware.goMiddleware,CreateMiddleware,MiddlewareType,DefaultMaxRequestBodySize(1MB),MiddlewareParams, and the movedmaxBytesTracker/bodySizeResponseWriterhelpers (now withUnwrap/Flushfor SSE safety).pkg/api/server.gobodylimit.Middleware.pkg/api/request_size_test.gopkg/bodylimit/middleware_test.go.pkg/runner/middleware.gobodylimitfactory; prepend it first (outermost) inPopulateMiddlewareConfigs.pkg/vmcp/server/server.gobodylimit.Middlewareinsiderecovery.pkg/bodylimit/middleware_test.go,pkg/runner/middleware_test.go,pkg/vmcp/server/body_limit_test.go,pkg/transport/proxy/streamable/body_limit_test.goType of change
Test plan
task testpasses (full unit suite, exit 0)task buildpassesCreateMiddlewarefactory.TestHTTPProxy_BodyLimitBoundsDownstreamReadersends a chunked oversized body and asserts a parser-like downstream reader sees<= limitbytes — fails if body-limit is ever moved after the parser.TestPopulateMiddlewareConfigs_BodyLimitIsOutermostasserts body-limit is index 0 (before auth/parser).TestHandler_RejectsOversizedBody) and SSE survival (TestIntegration_SSEGetConnectionSurvivesWriteTimeoutstill passes with the wrapper in place).Does this introduce a user-facing change?
Yes. Requests to the MCP proxies and vMCP with a body larger than 1MB are now rejected with
413 Request Entity Too Large(previously accepted and buffered). The limit is not yet configurable — a follow-up PR adds CLI/CRD configuration. Deployments sending legitimately large MCP messages (e.g. big base64 payloads) should be aware of the 1MB default.Special notes for reviewers
recoverymiddleware claiming last-appended is "outermost" is misleading; it is actually innermost). The bounded-downstream-read test is the ground-truth guard.PopulateMiddlewareConfigschain and reverse-apply loop, so the streamable proxy test plus the runner-level ordering test cover the mechanism for all three. Happy to add per-transport tests if reviewers prefer strict parity.RunConfig/ operator CRD), and the slow-uploadReadTimeoutmitigation (tracked in Add request read timeouts to MCP proxy servers #5499) (deferred pending SSE-streaming verification).Large PR Justification
This PR exceeds 1000 lines mostly because of tests; the production footprint is small and stays within the repo's code-size budget (CONTRIBUTING counts production code only — tests/docs/generated are exempt).
pkg/bodylimitpackage. Splitting would ship a partially-protected fix that leaves some listeners exposed.pkg/bodylimitis extracted frompkg/api/server.go; the move and the API server's switch to the shared package must land together to keep the build green. Most ofpkg/bodylimit/middleware.gois moved code, not net-new.