fix(cluster): recover sharded pubsub topology after node reconnects#3223
fix(cluster): recover sharded pubsub topology after node reconnects#3223PavelPashov wants to merge 10 commits intoredis:masterfrom
Conversation
3cb7843 to
8281293
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 6008ca9. Configure here.
|
|
||
| if (!this.#isOpen) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Loop uses continue instead of break when closed
Low Severity
When !this.#isOpen in #discoverWithKnownNodeCandidates, the code uses continue instead of break. This causes the loop to spin through all remaining candidates doing nothing, rather than exiting immediately. More importantly, this is inconsistent with #discoverWithRootNodes which throws 'Cluster closed' on the same check. The continue semantically implies "skip this candidate and try the next," but the actual intent is "stop trying because the cluster is closed."
Reviewed by Cursor Bugbot for commit 6008ca9. Configure here.
There was a problem hiding this comment.
I will pick this change in a separate PR, you can remove it from here


Description
Checklist
npm testpass with this change (including linting)?Note
Medium Risk
Changes core cluster reconnection and rediscovery behavior, which can affect topology refresh frequency and node selection during instability. Guardrails exist (strategy validation, refresh de-duplication), but the behavior is new and exercised mainly via tests.
Overview
Improves cluster resilience by triggering background topology rediscovery after post-ready node reconnection attempts, using a new
ClusterReconnectionTrackerand a configurabletopologyRefreshOnReconnectionAttemptStrategyoption (default 5s;false/0disables; function supported) with refresh de-duplication and exclusion of currently reconnecting node addresses.RedisClusterSlotsnow tracks reconnecting node clients viareconnecting/ready/endevents, clears tracking on destroy/unsubscribe/node removal, and expands rediscovery to try known nodes first (optionally excluding addresses) before falling back to root nodes.Tests are strengthened with a full
ClusterReconnectionTrackerunit suite and a major refactor/expansion of sharded Pub/Sub E2E scenarios to cover multiple failure modes (failover, node/proxy/shard failures) and recovery delivery expectations; small typing/lint tweaks accompany this.Reviewed by Cursor Bugbot for commit 6aae3fd. Bugbot is set up for automated code reviews on this repo. Configure here.