29 changes: 26 additions & 3 deletions docs/toolhive/guides-k8s/run-mcp-k8s.mdx
@@ -453,6 +453,26 @@
The proxy runner handles authentication, MCP protocol framing, and session
management; it is stateless with respect to tool execution. The backend runs the
actual MCP server and executes tools.

### Session routing for backend replicas

For stateful MCP servers, once a client establishes a session with a specific
backend pod, subsequent requests in that session need to reach the same pod.
When `backendReplicas > 1` and Redis session storage is configured, the proxy
runner uses Redis to store a session-to-pod mapping so every proxy runner
replica knows which backend pod owns each session. Stateless backends can use
non-sticky routing by setting `sessionAffinity` to `None`.

When Redis session storage is configured and a backend pod is restarted or
replaced, its entry in the Redis routing table is invalidated and the next
request reconnects to an available pod — sessions are not automatically migrated
between pods.

Without Redis session storage, the proxy runner relies on Kubernetes `ClientIP`
session affinity on the backend Service. `ClientIP` affinity is unreliable
behind NAT or shared egress IPs, and there is no shared session-to-pod mapping,
so a pod restart or replacement can silently misroute subsequent requests to the
wrong pod.

Common configurations:

- **Scale only the proxy** (`replicas: N`, omit `backendReplicas`): useful when
@@ -486,9 +506,12 @@
replica management to an HPA or other external controller.

:::note

The `SessionStorageWarning` condition fires only when `spec.replicas > 1`.
Scaling only the backend (`backendReplicas > 1`) does not trigger a warning, but
backend client-IP affinity is still unreliable behind NAT or shared egress IPs.
The `SessionStorageWarning` condition fires only when `spec.replicas > 1`
(multiple proxy runner pods). It does not fire when only `backendReplicas > 1`,
but `sessionAffinity: ClientIP` is unreliable behind NAT or shared egress IPs.
For stateful workloads or when per-session routing must remain consistent across
backend pods, Redis session storage is strongly recommended when
`backendReplicas > 1`. Stateless workloads can use `sessionAffinity: None`.

Copilot AI Apr 16, 2026

In this note you refer to `spec.replicas > 1`, but later switch to `backendReplicas > 1` without the `spec.` prefix. For consistency with the rest of the doc’s field-path style (and the earlier `spec.backendReplicas` bullet), consider using `spec.backendReplicas > 1` in both occurrences here.

Suggested change
- `backendReplicas > 1`. Stateless workloads can use `sessionAffinity: None`.
+ `spec.backendReplicas > 1`. Stateless workloads can use `sessionAffinity: None`.

:::
