diff --git a/docs/toolhive/guides-k8s/run-mcp-k8s.mdx b/docs/toolhive/guides-k8s/run-mcp-k8s.mdx
index e0154331..5254899a 100644
--- a/docs/toolhive/guides-k8s/run-mcp-k8s.mdx
+++ b/docs/toolhive/guides-k8s/run-mcp-k8s.mdx
@@ -453,17 +453,40 @@ The proxy runner handles authentication, MCP protocol framing, and session
 management; it is stateless with respect to tool execution. The backend runs
 the actual MCP server and executes tools.
 
+### Session routing for backend replicas
+
+MCP connections are stateful: once a client establishes a session with a
+specific backend pod, all subsequent requests in that session must reach the
+same pod. When `backendReplicas > 1`, the proxy runner uses Redis to store a
+session-to-pod mapping so every proxy runner replica knows which backend pod
+owns each session.
+
+Without Redis, the proxy runner falls back to Kubernetes client-IP session
+affinity on the backend Service, which is unreliable behind NAT or shared egress
+IPs. If a backend pod is restarted or replaced, its entry in the Redis routing
+table is invalidated and the next request reconnects to an available pod —
+sessions are not automatically migrated between pods.
+
+:::note
+
+The `SessionStorageWarning` condition only fires when `spec.replicas > 1`
+(multiple proxy runner pods). It does not fire when only `backendReplicas > 1`,
+but Redis session storage is still strongly recommended in that case to ensure
+reliable per-session pod routing.
+
+:::
+
 Common configurations:
 
 - **Scale only the proxy** (`replicas: N`, omit `backendReplicas`): useful when
   auth and connection overhead is the bottleneck with a single backend.
 - **Scale only the backend** (omit `replicas`, `backendReplicas: M`): useful
-  when tool execution is CPU/memory-bound and the proxy is not a bottleneck. The
-  backend Deployment uses client-IP session affinity to route repeated
-  connections to the same pod - subject to the same NAT limitations as
-  proxy-level affinity.
+  when tool execution is CPU/memory-bound and the proxy is not a bottleneck.
+  Configure Redis session storage so the proxy runner can route requests to the
+  correct backend pod.
 - **Scale both** (`replicas: N`, `backendReplicas: M`): full horizontal scale.
-  Redis session storage is required when `replicas > 1`.
+  Redis session storage is required for reliable operation when `replicas > 1`,
+  and strongly recommended when `backendReplicas > 1`.
 
 ```yaml title="MCPServer resource"
 spec:
@@ -484,14 +507,6 @@ When running multiple replicas, configure
 across pods. If you omit `replicas` or `backendReplicas`, the operator defers
 replica management to an HPA or other external controller.
 
-:::note
-
-The `SessionStorageWarning` condition fires only when `spec.replicas > 1`.
-Scaling only the backend (`backendReplicas > 1`) does not trigger a warning, but
-backend client-IP affinity is still unreliable behind NAT or shared egress IPs.
-
-:::
-
 :::note[Connection draining on scale-down]
 
 When a proxy runner pod is terminated (scale-in, rolling update, or node