From b802ede868087d7485f97d13fbfca3af46de6705 Mon Sep 17 00:00:00 2001 From: Yolanda Robla Date: Tue, 14 Apr 2026 15:09:41 +0200 Subject: [PATCH 1/3] Add horizontal scaling Redis session storage guide MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The existing redis-session-storage page only covered the embedded auth server's Redis setup (Sentinel + ACL). Users scaling MCPServer or VirtualMCPServer horizontally were sent to this page but the auth model is completely different — sessionStorage.passwordRef uses a simple password with no ACL username. Changes: - Restructure redis-session-storage.mdx into two named sections: "Embedded auth server session storage" (existing content) and "Horizontal scaling session storage" (new) - New section covers: standalone Redis Deployment/Service/Secret manifests, full MCPServer and VirtualMCPServer sessionStorage examples, verification steps (SessionStorageWarning condition, cross-pod session test), and a "Sharing a Redis instance" guide using keyPrefix to multiplex use cases - Update run-mcp-k8s.mdx and scaling-and-performance.mdx to link directly to the new #horizontal-scaling-session-storage anchor Closes #707 Co-Authored-By: Claude Sonnet 4.6 --- .../guides-k8s/redis-session-storage.mdx | 351 +++++++++++++++++- docs/toolhive/guides-k8s/run-mcp-k8s.mdx | 7 +- .../guides-vmcp/scaling-and-performance.mdx | 3 +- 3 files changed, 343 insertions(+), 18 deletions(-) diff --git a/docs/toolhive/guides-k8s/redis-session-storage.mdx b/docs/toolhive/guides-k8s/redis-session-storage.mdx index 2b4399ec..6216fbf9 100644 --- a/docs/toolhive/guides-k8s/redis-session-storage.mdx +++ b/docs/toolhive/guides-k8s/redis-session-storage.mdx @@ -1,24 +1,44 @@ --- -title: Redis Sentinel session storage +title: Redis session storage description: - How to deploy Redis Sentinel and configure persistent session storage for the - ToolHive embedded authorization server and horizontal scaling. 
+ How to deploy Redis Sentinel with persistent storage for the ToolHive embedded + authorization server, and a standalone Redis instance for horizontal scaling + of MCPServer and VirtualMCPServer. --- -Deploy Redis Sentinel and configure it as the session storage backend for the -ToolHive [embedded authorization server](../concepts/embedded-auth-server.mdx). -By default, sessions are stored in memory, which means upstream tokens are lost +ToolHive uses Redis in several places. Two of them use different configuration +models and are covered on this page: + +- **Embedded authorization server sessions** — stores upstream tokens so users + don't need to re-authenticate after pod restarts. Uses Redis Sentinel with + ACL-based authentication and a fixed `thv:auth:*` key pattern. See + [Embedded auth server session storage](#embedded-auth-server-session-storage). + +- **MCPServer and VirtualMCPServer horizontal scaling** — shares MCP session + state across pod replicas so any pod can handle any request. Uses a standalone + Redis instance with a simple password. Session data is not persisted to disk; + if the Redis pod restarts, active sessions are lost and clients must + reconnect. See + [Horizontal scaling session storage](#horizontal-scaling-session-storage). + +Redis is also required for [rate limiting](./rate-limiting.mdx), which stores +token bucket counters independently of session data. + +You can reuse the same Redis instance for all three purposes by using different +`keyPrefix` values or different databases — see +[Sharing a Redis instance](#sharing-a-redis-instance) for details. + +--- + +## Embedded auth server session storage + +Configure Redis Sentinel as the session storage backend for the ToolHive +[embedded authorization server](../concepts/embedded-auth-server.mdx). By +default, sessions are stored in memory, which means upstream tokens are lost when pods restart and users must re-authenticate. 
Redis Sentinel provides persistent storage with automatic master discovery, ACL-based access control, and optional failover when replicas are configured. -Redis is also required as the backend for [rate limiting](./rate-limiting.mdx), -which stores token bucket counters in Redis independently of session data. It is -also required for horizontal scaling when running multiple -[MCPServer](./run-mcp-k8s.mdx#horizontal-scaling) or -[VirtualMCPServer](../guides-vmcp/scaling-and-performance.mdx#session-storage-for-multi-replica-deployments) -replicas, so that sessions are shared across pods. - :::info[Prerequisites] Before you begin, ensure you have: @@ -604,6 +624,306 @@ session storage is working correctly. +--- + +## Horizontal scaling session storage + +When you run multiple replicas of an `MCPServer` proxy runner or a +`VirtualMCPServer`, MCP sessions must be shared across pods so that any replica +can handle any client request. ToolHive stores this session state in Redis using +a simple password — no ACL user, no Sentinel. + +### Deploy a standalone Redis instance + +A single Redis pod with a password is sufficient for sharing session state +across replicas during normal operation. The manifests below create Redis in the +`toolhive-system` namespace alongside your ToolHive workloads. + +:::note[Session durability] + +This deployment keeps session state in memory only. If the Redis pod restarts or +is rescheduled, all active sessions are lost and MCP clients must reconnect. For +production deployments where session continuity across Redis restarts is +required, replace the `Deployment` with a `StatefulSet` and add a +`volumeClaimTemplates` entry to persist Redis data to a PVC. 
+ +::: + +:::tip[Generate a strong password] + +```bash +openssl rand -base64 32 +``` + +::: + +```yaml title="redis-scaling.yaml" +# --- Redis password Secret +apiVersion: v1 +kind: Secret +metadata: + name: redis-password + namespace: toolhive-system +type: Opaque +stringData: + # highlight-next-line + password: YOUR_REDIS_PASSWORD +--- +# --- Redis Service +apiVersion: v1 +kind: Service +metadata: + name: redis + namespace: toolhive-system +spec: + selector: + app: redis + ports: + - name: redis + port: 6379 + targetPort: 6379 +--- +# --- Redis Deployment +apiVersion: apps/v1 +kind: Deployment +metadata: + name: redis + namespace: toolhive-system +spec: + replicas: 1 + selector: + matchLabels: + app: redis + template: + metadata: + labels: + app: redis + spec: + containers: + - name: redis + image: redis:7-alpine + args: + - redis-server + - --requirepass + - $(REDIS_PASSWORD) + env: + - name: REDIS_PASSWORD + valueFrom: + secretKeyRef: + name: redis-password + key: password + ports: + - containerPort: 6379 + readinessProbe: + exec: + command: ['sh', '-c', 'redis-cli -a "$REDIS_PASSWORD" PING'] + initialDelaySeconds: 5 + periodSeconds: 5 + resources: + requests: + cpu: 100m + memory: 128Mi + limits: + cpu: 500m + memory: 256Mi +``` + +Apply the manifests: + +```bash +kubectl apply -f redis-scaling.yaml +kubectl wait --for=condition=available deployment/redis \ + --namespace toolhive-system --timeout=120s +``` + +### Configure MCPServer session storage + +Reference the Redis Service and Secret in your `MCPServer` spec: + +```yaml title="mcp-server-with-redis.yaml" +apiVersion: toolhive.stacklok.dev/v1alpha1 +kind: MCPServer +metadata: + name: my-server + namespace: toolhive-system +spec: + image: ghcr.io/example/my-mcp-server:latest + # highlight-start + replicas: 2 + sessionStorage: + provider: redis + address: redis.toolhive-system.svc.cluster.local:6379 + db: 0 + keyPrefix: mcp-sessions + passwordRef: + name: redis-password + key: password + # highlight-end 
+``` + +### Configure VirtualMCPServer session storage + +The `sessionStorage` field is identical for `VirtualMCPServer`: + +```yaml title="vmcp-with-redis.yaml" +apiVersion: toolhive.stacklok.dev/v1alpha1 +kind: VirtualMCPServer +metadata: + name: my-vmcp + namespace: toolhive-system +spec: + # highlight-start + replicas: 2 + sessionStorage: + provider: redis + address: redis.toolhive-system.svc.cluster.local:6379 + db: 0 + keyPrefix: vmcp-sessions + passwordRef: + name: redis-password + key: password + # highlight-end + config: + groupRef: my-group + incomingAuth: + type: anonymous +``` + +### Verify session storage is working + +After applying your configuration, check that ToolHive has connected to Redis +successfully. + +**Check the `SessionStorageWarning` condition:** + +```bash +kubectl describe mcpserver my-server -n toolhive-system +``` + +When Redis is properly configured, the `SessionStorageWarning` condition is +absent or set to `False`: + +``` +Conditions: + Type: Ready + Status: True + ... + Type: SessionStorageWarning + Status: False + Reason: SessionStorageConfigured +``` + +If `SessionStorageWarning` is `True`, Redis is not configured or the +configuration is invalid. Check the proxy runner pod logs: + +```bash +kubectl logs -n toolhive-system \ + -l app.kubernetes.io/name=my-server \ + | grep -i "redis\|session" +``` + +**Test cross-pod session reconstruction:** + +Scale down to one replica and connect an MCP client to establish a session. Then +scale back up and delete the original pod. Deleting the pod terminates the TCP +connection, so your client will need to reconnect — but if Redis session storage +is working, the session state is preserved in Redis and the client can resume +making requests without reinitializing. + +:::note + +If the Service has `sessionAffinity: ClientIP` configured, the load balancer may +route your reconnect back to the same pod. Delete the original pod first to +force traffic to the new replica. 
+
+:::
+
+```bash
+# Start with 1 replica
+kubectl scale deployment vmcp-my-vmcp -n toolhive-system --replicas=1
+
+# Connect your MCP client and establish a session, then scale up:
+kubectl scale deployment vmcp-my-vmcp -n toolhive-system --replicas=2
+
+# Delete the oldest pod (the one that owned the session) to force routing
+# to the new replica
+kubectl delete pod -n toolhive-system \
+  "$(kubectl get pods -n toolhive-system -l app.kubernetes.io/name=my-vmcp \
+    --sort-by=.metadata.creationTimestamp -o jsonpath='{.items[0].metadata.name}')"
+
+# Reconnect your MCP client — it should resume the session without reinitializing
+```
+
+### Sharing a Redis instance
+
+You can reuse the same Redis instance for embedded auth server sessions,
+MCPServer scaling, and VirtualMCPServer scaling by using different `keyPrefix`
+values per use case. If you share an instance, use the Redis Sentinel
+StatefulSet from the [embedded auth server section](#deploy-redis-sentinel),
+which has persistent storage. The standalone `Deployment` from the scaling
+section is not suitable as a shared instance because it has no persistent
+storage.
+
+The embedded auth server uses `thv:auth:*` by default; set distinct prefixes for
+your scaling workloads:
+
+| Use case                 | Suggested `keyPrefix`               |
+| ------------------------ | ----------------------------------- |
+| Embedded auth server     | `thv:auth` (fixed, set by ToolHive) |
+| MCPServer scaling        | `mcp-sessions`                      |
+| VirtualMCPServer scaling | `vmcp-sessions`                     |
+
+Alternatively, use separate `db` values (Redis databases 0–15) to provide hard
+namespace isolation without requiring separate Redis instances.
+
+### ACL configuration for a shared instance
+
+The two use cases authenticate differently and require separate ACL entries:
+
+- The **embedded auth server** connects as the `toolhive-auth` ACL user,
+  restricted to the `~thv:auth:*` key pattern.
+- The **scaling use case** (`SessionStorageConfig`) only supports a
+  `passwordRef` — no username field — so it always authenticates as the
+  **default** Redis user.
+
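The key-pattern isolation described above can be sanity-checked offline. Redis ACL key patterns use glob-style matching, and Python's `fnmatch` is a close stand-in for it; this sketch uses hypothetical key names (not ToolHive's actual key layout) to show that the scaling patterns never match auth keys:

```python
from fnmatch import fnmatchcase

# Key patterns granted to each ACL user (from the table above).
# Redis ACL key patterns are glob-style; fnmatchcase is a close stand-in.
SCALING_PATTERNS = ["mcp-sessions:*", "vmcp-sessions:*"]
AUTH_PATTERNS = ["thv:auth:*"]


def allowed(key: str, patterns: list[str]) -> bool:
    """Return True if any granted pattern matches the key."""
    return any(fnmatchcase(key, p) for p in patterns)


# Hypothetical keys each workload might write.
session_key = "mcp-sessions:abc123"
auth_key = "thv:auth:session:user42"

# The default (scaling) user can touch session keys...
assert allowed(session_key, SCALING_PATTERNS)
# ...but never the embedded auth server's token keys.
assert not allowed(auth_key, SCALING_PATTERNS)
# And the toolhive-auth user is confined to its own prefix.
assert allowed(auth_key, AUTH_PATTERNS)
assert not allowed(session_key, AUTH_PATTERNS)
```
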
+ +To satisfy both on one instance, enable the default user with a password and +restrict it to the scaling key patterns. Add this line to the `users.acl` entry +in the `redis-acl` Secret alongside the existing `toolhive-auth` entry: + +``` +user default on >YOUR_SCALING_PASSWORD ~mcp-sessions:* ~vmcp-sessions:* &* +@all -@dangerous +``` + +Replace `YOUR_SCALING_PASSWORD` with the password you put in the +`redis-password` Secret, and adjust the key patterns to match your `keyPrefix` +values. + +:::note + +`SessionStorageConfig` does not support Sentinel — it uses a direct Redis +address. Point `sessionStorage.address` at the Redis master pod directly rather +than the Sentinel service: + +```yaml +sessionStorage: + provider: redis + address: redis-0.redis.redis.svc.cluster.local:6379 + passwordRef: + name: redis-password + key: password +``` + +::: + +:::warning + +Restricting the default user to specific key patterns (as shown above) prevents +scaling workloads from accidentally reading or writing auth session keys. If you +omit the key restriction, the default user has full access to the entire +keyspace, including `thv:auth:*` tokens. + +::: + +--- + ## Next steps - [Configure token exchange](./token-exchange-k8s.mdx) to let MCP servers @@ -614,5 +934,8 @@ session storage is working correctly. 
## Related information - [Set up embedded authorization server authentication](./auth-k8s.mdx#set-up-embedded-authorization-server-authentication) +- [Horizontal scaling for MCPServer](./run-mcp-k8s.mdx#horizontal-scaling) +- [Horizontal scaling for VirtualMCPServer](../guides-vmcp/scaling-and-performance.mdx#session-storage-for-multi-replica-deployments) - [Backend authentication](../concepts/backend-auth.mdx) -- [Kubernetes CRD reference](../reference/crd-spec.md#apiv1alpha1authserverstorageconfig) +- [Kubernetes CRD reference — SessionStorageConfig](../reference/crd-spec.md#apiv1alpha1sessionstorageconfig) +- [Kubernetes CRD reference — auth server storage](../reference/crd-spec.md#apiv1alpha1authserverstorageconfig) diff --git a/docs/toolhive/guides-k8s/run-mcp-k8s.mdx b/docs/toolhive/guides-k8s/run-mcp-k8s.mdx index e0154331..4ec931cf 100644 --- a/docs/toolhive/guides-k8s/run-mcp-k8s.mdx +++ b/docs/toolhive/guides-k8s/run-mcp-k8s.mdx @@ -480,9 +480,10 @@ spec: ``` When running multiple replicas, configure -[Redis session storage](./redis-session-storage.mdx) so that sessions are shared -across pods. If you omit `replicas` or `backendReplicas`, the operator defers -replica management to an HPA or other external controller. +[Redis session storage](./redis-session-storage.mdx#horizontal-scaling-session-storage) +so that sessions are shared across pods. If you omit `replicas` or +`backendReplicas`, the operator defers replica management to an HPA or other +external controller. 
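When deferring replica management to an HPA, a minimal `HorizontalPodAutoscaler` can target the proxy runner's Deployment. This is a sketch, not a tested configuration: the Deployment name `my-server` is an assumption, so confirm the actual name the operator created with `kubectl get deployments -n toolhive-system` before applying.

```yaml title="mcp-server-hpa.yaml"
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-server-hpa
  namespace: toolhive-system
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    # Assumed name; verify with: kubectl get deployments -n toolhive-system
    name: my-server
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Because the HPA can scale above one replica, configure Redis session storage first, as described above.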
:::note diff --git a/docs/toolhive/guides-vmcp/scaling-and-performance.mdx b/docs/toolhive/guides-vmcp/scaling-and-performance.mdx index 4a039a82..12e846b6 100644 --- a/docs/toolhive/guides-vmcp/scaling-and-performance.mdx +++ b/docs/toolhive/guides-vmcp/scaling-and-performance.mdx @@ -84,7 +84,8 @@ spec: key: password ``` -See [Redis Sentinel session storage](../guides-k8s/redis-session-storage.mdx) +See +[Horizontal scaling session storage](../guides-k8s/redis-session-storage.mdx#horizontal-scaling-session-storage) for a complete Redis deployment guide. :::warning From 2f0d1cf7c5d8422e2fbce3392c8414bea13cd9b6 Mon Sep 17 00:00:00 2001 From: Yolanda Robla Date: Wed, 15 Apr 2026 09:17:37 +0200 Subject: [PATCH 2/3] changes from review --- .../guides-k8s/redis-session-storage.mdx | 68 +++++++++++-------- docs/toolhive/guides-k8s/run-mcp-k8s.mdx | 13 ++-- .../guides-vmcp/scaling-and-performance.mdx | 10 +-- 3 files changed, 55 insertions(+), 36 deletions(-) diff --git a/docs/toolhive/guides-k8s/redis-session-storage.mdx b/docs/toolhive/guides-k8s/redis-session-storage.mdx index 6216fbf9..e2ecd8ab 100644 --- a/docs/toolhive/guides-k8s/redis-session-storage.mdx +++ b/docs/toolhive/guides-k8s/redis-session-storage.mdx @@ -1,23 +1,22 @@ --- title: Redis session storage description: - How to deploy Redis Sentinel with persistent storage for the ToolHive embedded - authorization server, and a standalone Redis instance for horizontal scaling - of MCPServer and VirtualMCPServer. + Deploy Redis Sentinel for the ToolHive embedded auth server, or a standalone + Redis instance for MCPServer and VirtualMCPServer horizontal scaling. --- -ToolHive uses Redis in several places. Two of them use different configuration -models and are covered on this page: +ToolHive uses Redis for several purposes. 
This page covers two that require +different configuration: -- **Embedded authorization server sessions** — stores upstream tokens so users +- **Embedded authorization server sessions** - stores upstream tokens so users don't need to re-authenticate after pod restarts. Uses Redis Sentinel with ACL-based authentication and a fixed `thv:auth:*` key pattern. See [Embedded auth server session storage](#embedded-auth-server-session-storage). -- **MCPServer and VirtualMCPServer horizontal scaling** — shares MCP session +- **MCPServer and VirtualMCPServer horizontal scaling** - shares MCP session state across pod replicas so any pod can handle any request. Uses a standalone - Redis instance with a simple password. Session data is not persisted to disk; - if the Redis pod restarts, active sessions are lost and clients must + Redis instance with a simple password. Session data is not persisted to disk. + If the Redis pod restarts, active sessions are lost and clients must reconnect. See [Horizontal scaling session storage](#horizontal-scaling-session-storage). @@ -25,7 +24,7 @@ Redis is also required for [rate limiting](./rate-limiting.mdx), which stores token bucket counters independently of session data. You can reuse the same Redis instance for all three purposes by using different -`keyPrefix` values or different databases — see +`keyPrefix` values or different databases - see [Sharing a Redis instance](#sharing-a-redis-instance) for details. --- @@ -631,7 +630,7 @@ session storage is working correctly. When you run multiple replicas of an `MCPServer` proxy runner or a `VirtualMCPServer`, MCP sessions must be shared across pods so that any replica can handle any client request. ToolHive stores this session state in Redis using -a simple password — no ACL user, no Sentinel. +a simple password. No ACL user configuration or Sentinel is required. ### Deploy a standalone Redis instance @@ -641,9 +640,10 @@ across replicas during normal operation. 
The manifests below create Redis in the :::note[Session durability] -This deployment keeps session state in memory only. If the Redis pod restarts or -is rescheduled, all active sessions are lost and MCP clients must reconnect. For -production deployments where session continuity across Redis restarts is +This deployment uses ephemeral storage with no PVC. If the Redis pod is +rescheduled to another node, all active sessions are lost and MCP clients must +reconnect. Sessions may also be lost on pod restart depending on Redis +persistence settings. For production deployments where session continuity is required, replace the `Deployment` with a `StatefulSet` and add a `volumeClaimTemplates` entry to persist Redis data to a PVC. @@ -800,8 +800,8 @@ successfully. kubectl describe mcpserver my-server -n toolhive-system ``` -When Redis is properly configured, the `SessionStorageWarning` condition is -absent or set to `False`: +When Redis is properly configured, the `SessionStorageWarning` condition is set +to `False`: ``` Conditions: @@ -826,9 +826,9 @@ kubectl logs -n toolhive-system \ Scale down to one replica and connect an MCP client to establish a session. Then scale back up and delete the original pod. Deleting the pod terminates the TCP -connection, so your client will need to reconnect — but if Redis session storage -is working, the session state is preserved in Redis and the client can resume -making requests without reinitializing. +connection, so your client will need to reconnect. If Redis session storage is +working, the session state is preserved and the client can resume making +requests without reinitializing. 
:::note @@ -852,7 +852,9 @@ kubectl delete pod -n toolhive-system \ # Reconnect your MCP client — it should resume the session without reinitializing ``` -### Sharing a Redis instance +--- + +## Sharing a Redis instance You can reuse the same Redis instance for embedded auth server sessions, MCPServer scaling, and VirtualMCPServer scaling by using different `keyPrefix` @@ -871,17 +873,17 @@ your scaling workloads: | MCPServer scaling | `mcp-sessions` | | VirtualMCPServer scaling | `vmcp-sessions` | -Alternatively, use separate `db` values (Redis databases 0–15) to provide hard +Alternatively, use separate `db` values (Redis databases 0-15) to provide hard namespace isolation without requiring separate Redis instances. -### ACL configuration for a shared instance +#### ACL configuration for a shared instance The two use cases authenticate differently and require separate ACL entries: - The **embedded auth server** connects as the `toolhive-auth` ACL user, restricted to the `~thv:auth:*` key pattern. - The **scaling use case** (`SessionStorageConfig`) only supports a - `passwordRef` — no username field — so it always authenticates as the + `passwordRef` with no username field, so it always authenticates as the **default** Redis user. To satisfy both on one instance, enable the default user with a password and @@ -896,11 +898,23 @@ Replace `YOUR_SCALING_PASSWORD` with the password you put in the `redis-password` Secret, and adjust the key patterns to match your `keyPrefix` values. -:::note +:::warning -`SessionStorageConfig` does not support Sentinel — it uses a direct Redis -address. Point `sessionStorage.address` at the Redis master pod directly rather -than the Sentinel service: +`SessionStorageConfig` does not support Sentinel. It requires a direct Redis +address and cannot follow a Sentinel-managed master if failover promotes a new +master. 
If your Sentinel cluster has replicas and failover enabled, hardcoding a +pod DNS (`redis-0.redis...`) will break session storage if that pod is no longer +the master. + +For this reason, the recommended approach is to run a **separate standalone +Redis instance** (the `Deployment` from the +[scaling section](#deploy-a-standalone-redis-instance)) for scaling workloads, +rather than sharing the Sentinel instance. If you do share the instance, disable +Redis replication so there is only ever one master and Sentinel cannot trigger +failover. + +If you share the Sentinel instance with replication disabled, point +`sessionStorage.address` at the master pod directly: ```yaml sessionStorage: diff --git a/docs/toolhive/guides-k8s/run-mcp-k8s.mdx b/docs/toolhive/guides-k8s/run-mcp-k8s.mdx index 4ec931cf..ce68c7cf 100644 --- a/docs/toolhive/guides-k8s/run-mcp-k8s.mdx +++ b/docs/toolhive/guides-k8s/run-mcp-k8s.mdx @@ -471,19 +471,22 @@ spec: backendReplicas: 3 sessionStorage: provider: redis - address: redis-master.toolhive-system.svc.cluster.local:6379 # Update to match your Redis Service location + address: redis.toolhive-system.svc.cluster.local:6379 db: 0 keyPrefix: mcp-sessions passwordRef: - name: redis-secret + name: redis-password key: password ``` When running multiple replicas, configure [Redis session storage](./redis-session-storage.mdx#horizontal-scaling-session-storage) -so that sessions are shared across pods. If you omit `replicas` or -`backendReplicas`, the operator defers replica management to an HPA or other -external controller. +so that sessions are shared across pods. The example above assumes you have +deployed Redis using the manifests in that guide, which create a Service named +`redis` and a Secret named `redis-password` in the `toolhive-system` namespace. +Update `address` and `passwordRef.name` if your Redis deployment uses different +names. 
If you omit `replicas` or `backendReplicas`, the operator defers replica +management to an HPA or other external controller. :::note diff --git a/docs/toolhive/guides-vmcp/scaling-and-performance.mdx b/docs/toolhive/guides-vmcp/scaling-and-performance.mdx index 12e846b6..1e81b3cf 100644 --- a/docs/toolhive/guides-vmcp/scaling-and-performance.mdx +++ b/docs/toolhive/guides-vmcp/scaling-and-performance.mdx @@ -76,17 +76,19 @@ spec: replicas: 3 sessionStorage: provider: redis - address: redis-master.toolhive-system.svc.cluster.local:6379 # Update to match your Redis Service location + address: redis.toolhive-system.svc.cluster.local:6379 db: 0 keyPrefix: vmcp-sessions passwordRef: - name: redis-secret + name: redis-password key: password ``` -See +The example above assumes Redis is deployed using the manifests in the [Horizontal scaling session storage](../guides-k8s/redis-session-storage.mdx#horizontal-scaling-session-storage) -for a complete Redis deployment guide. +guide, which creates a Service named `redis` and a Secret named `redis-password` +in the `toolhive-system` namespace. Update `address` and `passwordRef.name` to +match your Redis Service and Secret if they differ. :::warning From b2dc8e3af40261fc8e4638af49a81e0dc45b87ab Mon Sep 17 00:00:00 2001 From: Yolanda Robla Mota Date: Thu, 16 Apr 2026 09:34:12 +0200 Subject: [PATCH 3/3] Explain proxyrunner session-aware backend pod routing (#710) Users scaling backendReplicas > 1 had no explanation of why Redis is needed or how the proxy runner routes sessions to backend pods. 
- Add a "Session routing for backend replicas" subsection explaining that the proxy runner uses Redis to store session-to-pod mappings, what happens when a backend pod restarts, and why client-IP affinity alone is unreliable behind NAT - Absorb the standalone SessionStorageWarning note into the new subsection so the backendReplicas implication is explicit - Update Redis link to point directly to the horizontal scaling anchor Closes #708 Co-authored-by: Claude Sonnet 4.6 --- docs/toolhive/guides-k8s/run-mcp-k8s.mdx | 41 ++++++++++++++++-------- 1 file changed, 28 insertions(+), 13 deletions(-) diff --git a/docs/toolhive/guides-k8s/run-mcp-k8s.mdx b/docs/toolhive/guides-k8s/run-mcp-k8s.mdx index ce68c7cf..a2869366 100644 --- a/docs/toolhive/guides-k8s/run-mcp-k8s.mdx +++ b/docs/toolhive/guides-k8s/run-mcp-k8s.mdx @@ -453,17 +453,40 @@ The proxy runner handles authentication, MCP protocol framing, and session management; it is stateless with respect to tool execution. The backend runs the actual MCP server and executes tools. +### Session routing for backend replicas + +MCP connections are stateful: once a client establishes a session with a +specific backend pod, all subsequent requests in that session must reach the +same pod. When `backendReplicas > 1`, the proxy runner uses Redis to store a +session-to-pod mapping so every proxy runner replica knows which backend pod +owns each session. + +Without Redis, the proxy runner falls back to Kubernetes client-IP session +affinity on the backend Service, which is unreliable behind NAT or shared egress +IPs. If a backend pod is restarted or replaced, its entry in the Redis routing +table is invalidated and the next request reconnects to an available pod — +sessions are not automatically migrated between pods. + +:::note + +The `SessionStorageWarning` condition only fires when `spec.replicas > 1` +(multiple proxy runner pods). 
It does not fire when only `backendReplicas > 1`, +but Redis session storage is still strongly recommended in that case to ensure +reliable per-session pod routing. + +::: + Common configurations: - **Scale only the proxy** (`replicas: N`, omit `backendReplicas`): useful when auth and connection overhead is the bottleneck with a single backend. - **Scale only the backend** (omit `replicas`, `backendReplicas: M`): useful - when tool execution is CPU/memory-bound and the proxy is not a bottleneck. The - backend Deployment uses client-IP session affinity to route repeated - connections to the same pod - subject to the same NAT limitations as - proxy-level affinity. + when tool execution is CPU/memory-bound and the proxy is not a bottleneck. + Configure Redis session storage so the proxy runner can route requests to the + correct backend pod. - **Scale both** (`replicas: N`, `backendReplicas: M`): full horizontal scale. - Redis session storage is required when `replicas > 1`. + Redis session storage is required for reliable operation when `replicas > 1`, + and strongly recommended when `backendReplicas > 1`. ```yaml title="MCPServer resource" spec: @@ -488,14 +511,6 @@ Update `address` and `passwordRef.name` if your Redis deployment uses different names. If you omit `replicas` or `backendReplicas`, the operator defers replica management to an HPA or other external controller. -:::note - -The `SessionStorageWarning` condition fires only when `spec.replicas > 1`. -Scaling only the backend (`backendReplicas > 1`) does not trigger a warning, but -backend client-IP affinity is still unreliable behind NAT or shared egress IPs. - -::: - :::note[Connection draining on scale-down] When a proxy runner pod is terminated (scale-in, rolling update, or node