mcp: add ErrorHandler to ServerOptions#865
mcp: add ErrorHandler to ServerOptions#865ravyg wants to merge 5 commits intomodelcontextprotocol:mainfrom
Conversation
Add an optional ErrorHandler callback to ServerOptions for receiving out-of-band errors that occur during server operation. These include keepalive ping failures, notification delivery errors, and internal JSON-RPC protocol errors. When ErrorHandler is nil, errors continue to be logged at the appropriate level using the configured Logger. Fixes modelcontextprotocol#218
|
The issue this is addressing is waiting for community feedback. We're not planning to merge it before there is interest in it. |
|
Maybe we can just use the logger in all the places we currently drop the error. Then it's still reported, but we don't have to add any API. |
Two errors that occur outside the request/response path were previously
unobservable:
- Keepalive ping failures (mcp/shared.go startKeepalive) silently
closed the session with no record of why.
- jsonrpc2 internal errors (mcp/transport.go connect) were printed
via the global log.Printf, bypassing the configured slog.Logger.
Both sites now report through the existing *slog.Logger that Server
and Client already guarantee non-nil via ensureLogger. No new public
API surface; the logger is threaded into the unexported helpers as a
parameter.
Includes a regression test (TestKeepAliveFailure_Logged) that asserts
the keepalive failure produces a log line on the configured logger,
and was verified to fail when the new log call is removed.
Per @jba's suggestion on modelcontextprotocol#865, this is the smaller no-new-API
alternative to the ErrorHandler callback proposal. modelcontextprotocol#865 remains open
pending community input on whether a structured callback is also
wanted.
Refs modelcontextprotocol#218
|
Thanks @jba and @maciej-kisiel for the feedback. Splitting this into two PRs based on @jba's suggestion above:
|
Two errors that occur outside the request/response path were previously
unobservable:
- Keepalive ping failures (mcp/shared.go startKeepalive) silently
closed the session with no record of why.
- jsonrpc2 internal errors (mcp/transport.go connect) were printed
via the global log.Printf, bypassing the configured slog.Logger.
Both sites now report through the existing *slog.Logger that Server
and Client already guarantee non-nil via ensureLogger. No new public
API surface; the logger is threaded into the unexported helpers as a
parameter.
Includes a regression test (TestKeepAliveFailure_Logged) that asserts
the keepalive failure produces a log line on the configured logger,
and was verified to fail when the new log call is removed.
Per @jba's suggestion on modelcontextprotocol#865, this is the smaller no-new-API
alternative to the ErrorHandler callback proposal. modelcontextprotocol#865 remains open
pending community input on whether a structured callback is also
wanted.
Refs modelcontextprotocol#218
## Summary Two error sites in the MCP runtime were previously unobservable: - **`mcp/shared.go` `startKeepalive`** — when a keepalive ping failed, the session was closed silently with no indication of why. - **`mcp/transport.go` `connect`** — `jsonrpc2.OnInternalError` printed via the package-level `log.Printf`, bypassing the `*slog.Logger` configured on `ServerOptions`/`ClientOptions`. Both sites now report through the existing `*slog.Logger` that `Server` and `Client` already guarantee non-nil via `ensureLogger`. The logger is threaded into the unexported `startKeepalive` and `connect` helpers as a parameter — **no new public API surface**. ## Context Per @jba's [suggestion on #865](#865 (comment)): > Maybe we can just use the logger in all the places we currently drop the error. Then it's still reported, but we don't have to add any API. This PR is that smaller alternative. It addresses the concrete observability gap in #218 without expanding `ServerOptions`. PR #865 (the `ErrorHandler` callback proposal) remains open pending community input on whether a structured callback API is also wanted on top of logging. `notifySessions` already logs at `Warn` level on delivery errors and is intentionally left untouched. ## Test plan - [x] New regression test `TestKeepAliveFailure_Logged` asserts the keepalive failure produces a log line on the configured logger. - [x] Verified the test fails when the new `logger.Error(...)` call is removed (sanity check that it pins the behavior). - [x] `go test ./mcp/... -count=1` passes. - [x] `go vet ./...` clean. - [x] `go build ./...` clean. Refs #218 --------- Co-authored-by: Maciej Kisiel <mkisiel@google.com>
# Conflicts: # mcp/client.go # mcp/server.go # mcp/shared.go # mcp/transport.go
|
Closing this out — #887 landed the logger-based fix for #218 that @jba proposed, which resolves the concrete observability gap (keepalive ping failures and jsonrpc2 internal errors are now reported via the existing As I noted on 2026-04-09, I was keeping this open in case a structured If anyone reading this later has a concrete use case where the logger isn't enough (metrics sink, custom alerting, programmatic test hook, etc.), please comment here or open an issue and I'm happy to revisit with a narrower proposal. The branch stays on my fork. Thanks @jba and @maciej-kisiel for steering this toward the simpler design. |
Add an optional ErrorHandler callback to ServerOptions for receiving
out-of-band errors that occur during server operation but are not
associated with a specific request.
Error sources wired to the handler:
When ErrorHandler is nil, existing behavior is preserved — errors are
logged at the appropriate level.
Thanks to @findleyr and @jba for the design discussion on this issue.
Fixes #218