fix(lease-read): only invalidate lease on leadership-loss errors#558
fix(lease-read): only invalidate lease on leadership-loss errors#558
Conversation
Production observation (prod cluster on 192.168.0.210, all-reads slow path): EVALSHA avg 6.3 s/op, redis.call() avg 6.25 s/call, GET avg 1.11 s/op. The lease fast path is never taken because every Dispatch error -- write-conflicts in particular, which are frequent in the Lua retry-loop workload -- invalidates the per-shard lease. The next LeaseRead then falls through to LinearizableRead, which blocks on a heartbeat round with followers; after the read, a subsequent write error invalidates again, ad infinitum. The lease therefore never stays warm long enough for a fast-path read to hit. Root cause: refreshLeaseAfterDispatch (Coordinate.Dispatch) and leaseRefreshingTxn.Commit/Abort treat any err as a leadership-loss signal, but that is too aggressive. Write-conflict / validation / deadline-on-a-non-ReadIndex-path errors are business-logic failures that do NOT mean this node stopped being leader. Fix: introduce isLeadershipLossError(err) and invalidate ONLY when it returns true. Recognized signals are hashicorp raft.ErrNotLeader / raft.ErrLeadershipLost / raft.ErrLeadershipTransferInProgress, plus substring matches against the etcd engine's "not leader" / "leadership transfer" / "leadership lost" sentinels (cockroachdb/errors wraps errors in a way that errors.Is does not always traverse across package boundaries). Real leadership loss is still caught by the engine's RegisterLeaderLossCallback hook; the LeaseRead fast path also guards on engine.State() == StateLeader for defense in depth. Under this combination, a genuine step-down still invalidates the lease promptly, but a storm of write-conflicts no longer carpet- invalidates and lets the fast path actually serve reads.
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 54 minutes and 43 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThe PR introduces shared sentinel errors for leadership-related failures across raft engine implementations, marking backend-specific errors with these shared sentinels via error wrapping. The KV layer now conditionally invalidates leases only on leadership loss, rather than on all dispatch or transaction errors. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refines the lease invalidation logic in the KV coordinator to prevent unnecessary performance degradation. Instead of invalidating the lease on any error during dispatch, commit, or abort operations, the system now only triggers invalidation when a leadership loss is detected via specific sentinel errors or error message substrings. This ensures that business-logic failures like write conflicts do not force subsequent reads into the slow linearizable path. Feedback was provided to improve the robustness of the error message matching by making it case-insensitive.
…sentinels
Replace the substring-matching fallback in isLeadershipLossError with a
strict cockroachdb/errors.Is check against shared raftengine sentinels.
- Add raftengine.Err{NotLeader,LeadershipLost,LeadershipTransferInProgress}
as the single source of truth both engine backends mark their internal
errors against.
- etcd engine: mark errNotLeader and errLeadershipTransferNotLeader.
- hashicorp engine: translate raft.Err{NotLeader,LeadershipLost,
LeadershipTransferInProgress} via a shared helper on all return sites.
- kv/lease_state.go: drop strings.Contains, drop the hashicorp/raft
dependency, rely solely on the raftengine sentinels. Use
cockroachdb/errors.Is since stdlib errors.Is does not traverse
cockroachdb mark chains.
- Add TestIsLeadershipLossError covering the mark-based detection path
and the negative cases (write-conflict-style errors, context cancel).
|
Strengthened leadership-loss detection per review feedback: replaced the substring-matching fallback with strict errors.Is against shared raftengine sentinels. Changes (edfa0fa):
|
|
/gemini review |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
kv/coordinator.go (1)
223-226:⚠️ Potential issue | 🟡 MinorUpdate the stale invalidation contract.
The comment still says every dispatch error invalidates the lease, but Line 236 now correctly gates invalidation on
isLeadershipLossError(err). Keeping this stale contract risks reintroducing the production regression later.Proposed comment update
-// On err != nil the lease is invalidated: a Propose error commonly -// signals leadership loss (non-leader rejection, transfer in -// progress, quorum lost, etc.) and the design doc lists -// "any error from engine.Propose" as an invalidation trigger. +// On err != nil the lease is invalidated only for errors that +// explicitly signal leadership loss. Business-logic failures such as +// write conflicts or validation errors do not imply the lease holder +// lost leadership and must not force subsequent reads onto the slow path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@kv/coordinator.go` around lines 223 - 226, The comment above the lease invalidation logic is stale: it claims "any error from engine.Propose" invalidates the lease, but the code now only invalidates when isLeadershipLossError(err) returns true; update the comment near the lease invalidation block (the text referencing engine.Propose and "any error") to accurately state that only leadership-loss-type errors (as determined by isLeadershipLossError(err)) trigger invalidation, and briefly describe examples (non-leader rejection, transfer in progress, quorum lost) while removing the blanket "any error" language to match the behavior implemented in isLeadershipLossError.internal/raftengine/etcd/engine.go (1)
66-79:⚠️ Potential issue | 🟠 MajorMark proposal errors during leadership transfer with the shared sentinel.
The etcd backend's
handleProposal(lines 1030–1032) returns unmarked errors frome.rawNode.Propose, but when a proposal is rejected/dropped during an active leadership transfer (LeadTransferee != 0), the error should be marked withraftengine.ErrLeadershipTransferInProgressto ensure lease invalidation occurs. The hashicorp backend already does this viatranslateLeadershipErr(), but the etcd backend lacks equivalent translation.Suggested fix
Define the sentinel error at the top of the file:
var ( errNotLeader = errors.Mark(errors.New("etcd raft engine is not leader"), raftengine.ErrNotLeader) + errLeadershipTransferInProgress = errors.Mark( + errors.New("etcd raft leadership transfer in progress"), + raftengine.ErrLeadershipTransferInProgress, + ) errNodeIDRequired = errors.New("etcd raft node id is required")Then in
handleProposal, check the leadership transfer state and mark the proposal error accordingly before returning.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/raftengine/etcd/engine.go` around lines 66 - 79, Add a shared sentinel error for leadership-transfer-in-progress at the top of the file (e.g., errLeadershipTransferInProgress = errors.Mark(errors.New("etcd raft leadership transfer in progress"), raftengine.ErrLeadershipTransferInProgress)) and then update handleProposal to translate errors returned from e.rawNode.Propose: if e.raftStatus().LeadTransferee != 0 (or equivalent check of leadership transfer state) and the propose returned an error, wrap/mark that error with the new errLeadershipTransferInProgress (using errors.Mark) before returning so it matches the shared sentinel used by other backends (mirroring translateLeadershipErr behavior in the hashicorp backend).
🧹 Nitpick comments (1)
kv/lease_state_test.go (1)
30-45: Add test coverage for marked hashicorp transfer-in-progress error.
translateLeadershipErrininternal/raftengine/hashicorp/engine.gomarkshashicorpraft.ErrLeadershipTransferInProgresswithraftengine.ErrLeadershipTransferInProgress, but the test table covers direct sentinels and markedErrNotLeader/ErrLeadershipLostcases without covering the marked transfer-in-progress case. Add this case to complete coverage of the translation branch.Proposed test case
{ "hashicorp ErrLeadershipLost marked with raftengine sentinel", cockroachdberrors.WithStack(cockroachdberrors.Mark(hashicorpraft.ErrLeadershipLost, raftengine.ErrLeadershipLost)), true, }, + { + "hashicorp ErrLeadershipTransferInProgress marked with raftengine sentinel", + cockroachdberrors.WithStack(cockroachdberrors.Mark(hashicorpraft.ErrLeadershipTransferInProgress, raftengine.ErrLeadershipTransferInProgress)), + true, + }, { "bare hashicorp ErrNotLeader (no raftengine mark) is NOT detected", hashicorpraft.ErrNotLeader, false,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@kv/lease_state_test.go` around lines 30 - 45, Add a test table entry that exercises translateLeadershipErr's handling of a marked transfer-in-progress error: create a case using cockroachdberrors.WithStack(cockroachdberrors.Mark(hashicorpraft.ErrLeadershipTransferInProgress, raftengine.ErrLeadershipTransferInProgress)) and expect detection (true). Locate the test table in lease_state_test.go alongside the existing marked ErrNotLeader and ErrLeadershipLost cases and add the new case description "hashicorp ErrLeadershipTransferInProgress marked with raftengine sentinel" so translateLeadershipErr is validated for this branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@internal/raftengine/etcd/engine.go`:
- Around line 66-79: Add a shared sentinel error for
leadership-transfer-in-progress at the top of the file (e.g.,
errLeadershipTransferInProgress = errors.Mark(errors.New("etcd raft leadership
transfer in progress"), raftengine.ErrLeadershipTransferInProgress)) and then
update handleProposal to translate errors returned from e.rawNode.Propose: if
e.raftStatus().LeadTransferee != 0 (or equivalent check of leadership transfer
state) and the propose returned an error, wrap/mark that error with the new
errLeadershipTransferInProgress (using errors.Mark) before returning so it
matches the shared sentinel used by other backends (mirroring
translateLeadershipErr behavior in the hashicorp backend).
In `@kv/coordinator.go`:
- Around line 223-226: The comment above the lease invalidation logic is stale:
it claims "any error from engine.Propose" invalidates the lease, but the code
now only invalidates when isLeadershipLossError(err) returns true; update the
comment near the lease invalidation block (the text referencing engine.Propose
and "any error") to accurately state that only leadership-loss-type errors (as
determined by isLeadershipLossError(err)) trigger invalidation, and briefly
describe examples (non-leader rejection, transfer in progress, quorum lost)
while removing the blanket "any error" language to match the behavior
implemented in isLeadershipLossError.
---
Nitpick comments:
In `@kv/lease_state_test.go`:
- Around line 30-45: Add a test table entry that exercises
translateLeadershipErr's handling of a marked transfer-in-progress error: create
a case using
cockroachdberrors.WithStack(cockroachdberrors.Mark(hashicorpraft.ErrLeadershipTransferInProgress,
raftengine.ErrLeadershipTransferInProgress)) and expect detection (true). Locate
the test table in lease_state_test.go alongside the existing marked ErrNotLeader
and ErrLeadershipLost cases and add the new case description "hashicorp
ErrLeadershipTransferInProgress marked with raftengine sentinel" so
translateLeadershipErr is validated for this branch.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 07827c13-903d-4305-82ca-e69fef917c0d
📒 Files selected for processing (7)
internal/raftengine/engine.gointernal/raftengine/etcd/engine.gointernal/raftengine/hashicorp/engine.gokv/coordinator.gokv/lease_state.gokv/lease_state_test.gokv/sharded_coordinator.go
Add tests that lock in the cross-backend errors.Is contract introduced in edfa0fa. Without these, a future refactor could drop the errors.Mark calls and silently force every lease-read onto the slow LinearizableRead path. - etcd: assert errors.Is(errors.WithStack(errNotLeader), raftengine.ErrNotLeader) and the same for errLeadershipTransferNotLeader. - hashicorp: cover translateLeadershipErr for all three raft sentinels, the pass-through case, and nil.
There was a problem hiding this comment.
Code Review
This pull request introduces shared sentinel errors for Raft leadership states and a utility function, isLeadershipLossError, to refine lease invalidation logic. By using errors.Mark and errors.Is, the system now distinguishes between business-logic failures and actual leadership loss, preventing unnecessary performance degradation. Feedback suggests extending this logic to the read paths in Coordinate.LeaseRead and ShardedCoordinator.groupLeaseRead to ensure consistency across the codebase.
| func isLeadershipLossError(err error) bool { | ||
| if err == nil { | ||
| return false | ||
| } | ||
| return errors.Is(err, raftengine.ErrNotLeader) || | ||
| errors.Is(err, raftengine.ErrLeadershipLost) || | ||
| errors.Is(err, raftengine.ErrLeadershipTransferInProgress) | ||
| } |
There was a problem hiding this comment.
この isLeadershipLossError の導入により、Dispatch や Commit/Abort における不要な lease invalidation を防げるようになりますが、Coordinate.LeaseRead や ShardedCoordinator.groupLeaseRead においても同様の対応が必要ではないでしょうか。現状、それらのメソッド内での LinearizableRead 失敗時は無条件で invalidate() が呼ばれています。タイムアウト等のリーダー交代を伴わない一時的なエラーで lease が破棄されると、後続の読み取りが不必要に slow path へ誘導されるため、本 PR の目的である「擬似的な invalidation による悪循環」を完全に解消するためには、読み取りパスの修正も検討すべきと考えます。
…add transfer-in-progress sentinel Addresses CodeRabbit and gemini feedback on PR #558. - kv/coordinator.go LeaseRead + kv/sharded_coordinator.go groupLeaseRead: mirror the dispatch-path fix by invalidating the lease only when isLeadershipLossError reports a real leadership signal. Previously a LinearizableRead deadline or transient transport error would force the remainder of the lease window onto the slow path, reproducing the same regression the write-path fix addressed. - internal/raftengine/etcd/engine.go: add errLeadershipTransferInProgress marked with raftengine.ErrLeadershipTransferInProgress, and fail Propose fast when BasicStatus().LeadTransferee != 0 so callers see an errors.Is- matchable error instead of hanging on a proposal etcd/raft silently drops during transfer. - Refresh the stale comment on refreshLeaseAfterDispatch to reflect the filtered invalidation contract. - Tests: add the hashicorp ErrLeadershipTransferInProgress marked case to kv.TestIsLeadershipLossError and pin errLeadershipTransferInProgress in the etcd sentinel test.
…add transfer-in-progress sentinel (#559) ## Summary Follow-up to #558 addressing remaining CodeRabbit + gemini feedback. - **kv/coordinator.go `LeaseRead` + kv/sharded_coordinator.go `groupLeaseRead`**: mirror the dispatch-path fix — invalidate the lease only when `isLeadershipLossError` reports a real leadership signal. Previously a `LinearizableRead` deadline or transient transport error would force the remainder of the lease window onto the slow path, reproducing the same regression the write-path fix in #558 addressed. - **internal/raftengine/etcd/engine.go**: add `errLeadershipTransferInProgress` marked with `raftengine.ErrLeadershipTransferInProgress`, and fail Propose fast when `BasicStatus().LeadTransferee != 0` so callers see an `errors.Is`-matchable error instead of hanging on a proposal that etcd/raft silently drops during transfer. - Refresh the stale comment on `refreshLeaseAfterDispatch` to reflect the filtered invalidation contract. - Tests: add the hashicorp `ErrLeadershipTransferInProgress` marked case to `TestIsLeadershipLossError` and pin `errLeadershipTransferInProgress` in the etcd sentinel test. ## Test plan - [x] `go test -race ./kv/... ./internal/raftengine/...` - [x] `golangci-lint` clean
背景: 本番で lease fast path が効いていない
PR #549 をマージ後、本番クラスタのメトリクス:
EVALSHAavg 6.3 s/opredis.call()avg 6.25 s/callGETavg 1.11 s/opすべて LinearizableRead (heartbeat 往復) を経由しており、lease fast path に到達していない。
原因
refreshLeaseAfterDispatchとleaseRefreshingTxn.Commit/Abortが 任意の err を leadership-loss とみなして lease を invalidate していた。本番の Lua retry ループで write-conflict が頻発しており、その都度 lease が invalidate される。次の LeaseRead は slow path に落ち、heartbeat 往復で 1 秒以上、その後の write で再び invalidate、の悪循環。Fix
isLeadershipLossError(err)ヘルパーを追加し、真の leadership loss のみ invalidate:raft.ErrNotLeader/raft.ErrLeadershipLost/raft.ErrLeadershipTransferInProgress真の leadership loss は
RegisterLeaderLossCallbackでカバー済、lease fast path もengine.State() == StateLeaderでガード済なので、この変更で安全性は低下せず、write-conflict 嵐での擬似 invalidation だけ防げる。Test plan
go test -race ./kv/...パスelastickv_redis_request_duration_secondsの GET avg が ms オーダーに下がることを確認elastickv_lua_redis_call_duration_secondsの avg が script あたり少数回の LinearizableRead 相当まで下がることを確認Summary by CodeRabbit
Bug Fixes
Tests