fix(lease-read): only invalidate lease on leadership-loss errors #558

Merged
bootjp merged 3 commits into main from fix/lease-invalidate-leadership-only
Apr 20, 2026

bootjp (Owner) commented Apr 20, 2026

Background: the lease fast path is not taking effect in production

Metrics from the production cluster after merging PR #549:

  • EVALSHA avg 6.3 s/op
  • redis.call() avg 6.25 s/call
  • GET avg 1.11 s/op

Every request goes through LinearizableRead (a heartbeat round trip); none reaches the lease fast path.

Cause

refreshLeaseAfterDispatch and leaseRefreshingTxn.Commit/Abort treated any err as leadership loss and invalidated the lease. Write conflicts are frequent in the production Lua retry loop, and each one invalidates the lease. The next LeaseRead then falls to the slow path and spends more than a second on a heartbeat round trip, after which the following write invalidates the lease again: a vicious cycle.

Fix

Add an isLeadershipLossError(err) helper and invalidate only on genuine leadership loss:

  • hashicorp raft.ErrNotLeader / raft.ErrLeadershipLost / raft.ErrLeadershipTransferInProgress
  • the etcd engine's "not leader" / "leadership transfer" / "leadership lost" sentinels (matched by substring, because errors.Is does not always traverse cockroachdb/errors wrappers)

Genuine leadership loss is already covered by RegisterLeaderLossCallback, and the lease fast path is also guarded by engine.State() == StateLeader, so this change does not reduce safety; it only prevents spurious invalidation during write-conflict storms.
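
The filtered invalidation contract described above can be illustrated with a minimal sketch. It is not the project's code: the `lease` type, `refreshAfterDispatch`, and both error values are hypothetical stand-ins, and the one-second lease duration is an arbitrary assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical stand-ins for the errors a dispatch can return.
var (
	errNotLeader     = errors.New("not leader")
	errWriteConflict = errors.New("write conflict")
)

// lease is a toy per-shard lease that is valid until expiry.
type lease struct {
	expiry time.Time // zero value means "invalidated"
}

func (l *lease) extend(now time.Time, d time.Duration) { l.expiry = now.Add(d) }
func (l *lease) invalidate()                           { l.expiry = time.Time{} }
func (l *lease) valid(now time.Time) bool              { return now.Before(l.expiry) }

// refreshAfterDispatch extends the lease on success and invalidates it only
// when the error signals leadership loss; every other error leaves it as is.
func refreshAfterDispatch(l *lease, now time.Time, err error) {
	switch {
	case err == nil:
		l.extend(now, time.Second)
	case errors.Is(err, errNotLeader):
		l.invalidate()
	}
}

// demo reports lease validity after a write conflict and after a step-down.
func demo() (afterConflict, afterStepDown bool) {
	now := time.Unix(100, 0)
	l := &lease{}
	refreshAfterDispatch(l, now, nil)
	refreshAfterDispatch(l, now, fmt.Errorf("txn commit: %w", errWriteConflict))
	afterConflict = l.valid(now)
	refreshAfterDispatch(l, now, fmt.Errorf("propose: %w", errNotLeader))
	afterStepDown = l.valid(now)
	return afterConflict, afterStepDown
}

func main() {
	afterConflict, afterStepDown := demo()
	fmt.Println(afterConflict, afterStepDown) // prints "true false"
}
```

With the pre-fix behaviour (invalidate on any non-nil err), the first value would be false as well, which is the write-conflict cycle described under Cause.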

Test plan

  • go test -race ./kv/... passes
  • After deployment, confirm that the GET avg of elastickv_redis_request_duration_seconds drops to the millisecond range
  • Confirm that the avg of elastickv_lua_redis_call_duration_seconds drops to the equivalent of a few LinearizableRead calls per script

Summary by CodeRabbit

  • Bug Fixes

    • Refined lease invalidation behavior to only trigger on actual leadership loss errors, rather than invalidating on all dispatch failures. This improves system reliability when handling transient errors.
  • Tests

    • Added comprehensive test coverage for leadership error detection.

Production observation (prod cluster on 192.168.0.210, all-reads
slow path):
  EVALSHA avg 6.3 s/op, redis.call() avg 6.25 s/call,
  GET      avg 1.11 s/op.

The lease fast path is never taken because every Dispatch error --
write-conflicts in particular, which are frequent in the Lua
retry-loop workload -- invalidates the per-shard lease. The next
LeaseRead then falls through to LinearizableRead, which blocks on a
heartbeat round with followers; after the read, a subsequent write
error invalidates again, ad infinitum. The lease therefore never
stays warm long enough for a fast-path read to hit.

Root cause: refreshLeaseAfterDispatch (Coordinate.Dispatch) and
leaseRefreshingTxn.Commit/Abort treat any err as a leadership-loss
signal, but that is too aggressive. Write-conflict / validation /
deadline-on-a-non-ReadIndex-path errors are business-logic failures
that do NOT mean this node stopped being leader.

Fix: introduce isLeadershipLossError(err) and invalidate ONLY when
it returns true. Recognized signals are hashicorp raft.ErrNotLeader
/ raft.ErrLeadershipLost / raft.ErrLeadershipTransferInProgress,
plus substring matches against the etcd engine's "not leader" /
"leadership transfer" / "leadership lost" sentinels
(cockroachdb/errors wraps errors in a way that errors.Is does not
always traverse across package boundaries).

Real leadership loss is still caught by the engine's
RegisterLeaderLossCallback hook; the LeaseRead fast path also
guards on engine.State() == StateLeader for defense in depth.
Under this combination, a genuine step-down still invalidates the
lease promptly, but a storm of write-conflicts no longer carpet-
invalidates and lets the fast path actually serve reads.
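
The two safety nets named in this commit message (the leader-loss callback and the State() guard on the fast path) can be sketched as follows. Everything here is a simplified assumption for illustration: the `engine` and `shard` types, the `stepDown` helper, and the boolean lease flag are hypothetical, and the real `RegisterLeaderLossCallback` signature may differ.

```go
package main

import "fmt"

type raftState int

const (
	stateFollower raftState = iota
	stateLeader
)

// engine is a hypothetical stand-in for the raft engine.
type engine struct {
	state  raftState
	onLoss []func()
}

func (e *engine) State() raftState                     { return e.state }
func (e *engine) RegisterLeaderLossCallback(fn func()) { e.onLoss = append(e.onLoss, fn) }

// stepDown simulates a genuine leadership loss and fires the callbacks.
func (e *engine) stepDown() {
	e.state = stateFollower
	for _, fn := range e.onLoss {
		fn()
	}
}

// shard holds a toy lease flag instead of a real time-bounded lease.
type shard struct {
	eng        *engine
	leaseValid bool
}

func newShard(eng *engine) *shard {
	s := &shard{eng: eng}
	eng.RegisterLeaderLossCallback(func() { s.leaseValid = false })
	return s
}

// leaseRead serves from the fast path only when the lease is warm AND the
// engine still reports leadership; otherwise it takes the slow path.
func (s *shard) leaseRead() string {
	if s.leaseValid && s.eng.State() == stateLeader {
		return "fast"
	}
	// Stand-in for LinearizableRead: assume it succeeds on a leader and
	// warms the lease for later reads.
	if s.eng.State() == stateLeader {
		s.leaseValid = true
	}
	return "slow"
}

func demo() []string {
	eng := &engine{state: stateLeader}
	s := newShard(eng)
	out := []string{s.leaseRead(), s.leaseRead()}
	eng.stepDown()
	return append(out, s.leaseRead())
}

func main() {
	fmt.Println(demo()) // prints "[slow fast slow]"
}
```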

coderabbitai Bot commented Apr 20, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 92bd4fb1-5e66-4602-8354-852302699352

📥 Commits

Reviewing files that changed from the base of the PR and between edfa0fa and c414218.

📒 Files selected for processing (2)
  • internal/raftengine/etcd/engine_test.go
  • internal/raftengine/hashicorp/leadership_err_test.go
📝 Walkthrough

The PR introduces shared sentinel errors for leadership-related failures across raft engine implementations, marking backend-specific errors with these shared sentinels via error wrapping. The KV layer now conditionally invalidates leases only on leadership loss, rather than on all dispatch or transaction errors.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Raft Engine Infrastructure: internal/raftengine/engine.go | Added three exported sentinel errors (ErrNotLeader, ErrLeadershipLost, ErrLeadershipTransferInProgress) to enable cross-backend error classification. |
| Etcd Raft Backend: internal/raftengine/etcd/engine.go | Marked etcd raft errors (errNotLeader, errLeadershipTransferNotLeader) with the shared raftengine.ErrNotLeader sentinel via error marking. |
| Hashicorp Raft Backend: internal/raftengine/hashicorp/engine.go | Added translateLeadershipErr to map hashicorp raft leadership sentinels onto shared raftengine sentinels; updated error returns in Propose, VerifyLeader, CheckServing, LinearizableRead, and executeBarrier to use the translation function. |
| KV Lease State: kv/lease_state.go, kv/lease_state_test.go | Added isLeadershipLossError helper to detect leadership-related errors using cockroachdb/errors.Is semantics; includes comprehensive test covering direct sentinels, marked hashicorp raft errors, and false-case scenarios. |
| KV Coordinator & Transaction Layer: kv/coordinator.go, kv/sharded_coordinator.go | Modified refreshLeaseAfterDispatch and leaseRefreshingTxn to conditionally invalidate leases only when isLeadershipLossError returns true, rather than invalidating on all errors. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Sentinels stand, now shared across the fens,
Hashicorp and etcd, together again,
Leases hold steady when leaders remain true,
But spring free when leadership passes on through!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: limiting lease invalidation to only leadership-loss errors rather than all errors. This directly reflects the core fix across multiple files (coordinator.go, sharded_coordinator.go, lease_state.go). |

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request refines the lease invalidation logic in the KV coordinator to prevent unnecessary performance degradation. Instead of invalidating the lease on any error during dispatch, commit, or abort operations, the system now only triggers invalidation when a leadership loss is detected via specific sentinel errors or error message substrings. This ensures that business-logic failures like write conflicts do not force subsequent reads into the slow linearizable path. Feedback was provided to improve the robustness of the error message matching by making it case-insensitive.

Comment thread kv/lease_state.go Outdated
…sentinels

Replace the substring-matching fallback in isLeadershipLossError with a
strict cockroachdb/errors.Is check against shared raftengine sentinels.

- Add raftengine.Err{NotLeader,LeadershipLost,LeadershipTransferInProgress}
  as the single source of truth both engine backends mark their internal
  errors against.
- etcd engine: mark errNotLeader and errLeadershipTransferNotLeader.
- hashicorp engine: translate raft.Err{NotLeader,LeadershipLost,
  LeadershipTransferInProgress} via a shared helper on all return sites.
- kv/lease_state.go: drop strings.Contains, drop the hashicorp/raft
  dependency, rely solely on the raftengine sentinels. Use
  cockroachdb/errors.Is since stdlib errors.Is does not traverse
  cockroachdb mark chains.
- Add TestIsLeadershipLossError covering the mark-based detection path
  and the negative cases (write-conflict-style errors, context cancel).
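
The mark-based classification listed in this commit message can be approximated with the standard library alone. The sketch below is an analogue under stated assumptions, not the PR's code: the PR uses cockroachdb/errors.Mark and cockroachdb/errors.Is, whereas the `marked` type here is a hypothetical stdlib stand-in, and the `backend*` errors are invented placeholders for a raft library's own sentinels.

```go
package main

import (
	"errors"
	"fmt"
)

// Shared sentinels; the names follow the PR, the messages are assumed.
var (
	ErrNotLeader                    = errors.New("raftengine: not leader")
	ErrLeadershipLost               = errors.New("raftengine: leadership lost")
	ErrLeadershipTransferInProgress = errors.New("raftengine: leadership transfer in progress")
)

// Hypothetical backend-specific errors standing in for a raft library's own.
var (
	backendNotLeader = errors.New("node is not the leader")
	backendLost      = errors.New("leadership lost while committing log")
	backendTransfer  = errors.New("leadership transfer in progress")
)

// marked keeps the backend error as the cause and answers errors.Is for the
// shared sentinel. It is a stdlib-only stand-in for cockroachdb/errors.Mark.
type marked struct {
	cause error
	mark  error
}

func (m *marked) Error() string        { return m.cause.Error() }
func (m *marked) Unwrap() error        { return m.cause }
func (m *marked) Is(target error) bool { return target == m.mark }

// translateLeadershipErr maps backend errors onto the shared sentinels and
// passes everything else (including nil) through unchanged.
func translateLeadershipErr(err error) error {
	switch {
	case err == nil:
		return nil
	case errors.Is(err, backendNotLeader):
		return &marked{cause: err, mark: ErrNotLeader}
	case errors.Is(err, backendLost):
		return &marked{cause: err, mark: ErrLeadershipLost}
	case errors.Is(err, backendTransfer):
		return &marked{cause: err, mark: ErrLeadershipTransferInProgress}
	default:
		return err
	}
}

// isLeadershipLossError relies solely on the shared sentinels.
func isLeadershipLossError(err error) bool {
	return errors.Is(err, ErrNotLeader) ||
		errors.Is(err, ErrLeadershipLost) ||
		errors.Is(err, ErrLeadershipTransferInProgress)
}

func demo() (translated, bare, conflict bool) {
	wrapped := fmt.Errorf("propose: %w", translateLeadershipErr(backendLost))
	translated = isLeadershipLossError(wrapped)
	bare = isLeadershipLossError(backendLost) // no mark, so not detected
	conflict = isLeadershipLossError(errors.New("write conflict"))
	return translated, bare, conflict
}

func main() {
	translated, bare, conflict := demo()
	fmt.Println(translated, bare, conflict) // prints "true false false"
}
```

The design point is that the KV layer no longer needs to import any backend package: a backend error counts as leadership loss only if the engine layer marked it.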

bootjp (Owner, Author) commented Apr 20, 2026

Strengthened leadership-loss detection per review feedback: replaced the substring-matching fallback with strict errors.Is against shared raftengine sentinels.

Changes (edfa0fa):

  • Added raftengine.Err{NotLeader,LeadershipLost,LeadershipTransferInProgress} as cross-backend sentinels.
  • etcd engine marks errNotLeader and errLeadershipTransferNotLeader against raftengine.ErrNotLeader.
  • hashicorp engine translates raft.Err{NotLeader,LeadershipLost,LeadershipTransferInProgress} via a shared helper on Apply/VerifyLeader/Barrier/CheckServing/LinearizableRead return sites.
  • kv.isLeadershipLossError now uses cockroachdb/errors.Is (stdlib errors.Is does not traverse cockroachdb marks — verified and documented in a comment) and drops the strings/hashicorp/raft imports entirely.
  • Added TestIsLeadershipLossError covering the mark-based detection path, direct-sentinel hits, and negative cases (write-conflict-style errors, context cancel, bare hashicorp errors without the engine-level mark).

go test -race ./kv/... ./internal/raftengine/... passes; golangci-lint clean.

bootjp (Owner, Author) commented Apr 20, 2026

/gemini review

coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
kv/coordinator.go (1)

223-226: ⚠️ Potential issue | 🟡 Minor

Update the stale invalidation contract.

The comment still says every dispatch error invalidates the lease, but Line 236 now correctly gates invalidation on isLeadershipLossError(err). Keeping this stale contract risks reintroducing the production regression later.

Proposed comment update
-// On err != nil the lease is invalidated: a Propose error commonly
-// signals leadership loss (non-leader rejection, transfer in
-// progress, quorum lost, etc.) and the design doc lists
-// "any error from engine.Propose" as an invalidation trigger.
+// On err != nil the lease is invalidated only for errors that
+// explicitly signal leadership loss. Business-logic failures such as
+// write conflicts or validation errors do not imply the lease holder
+// lost leadership and must not force subsequent reads onto the slow path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@kv/coordinator.go` around lines 223 - 226, The comment above the lease
invalidation logic is stale: it claims "any error from engine.Propose"
invalidates the lease, but the code now only invalidates when
isLeadershipLossError(err) returns true; update the comment near the lease
invalidation block (the text referencing engine.Propose and "any error") to
accurately state that only leadership-loss-type errors (as determined by
isLeadershipLossError(err)) trigger invalidation, and briefly describe examples
(non-leader rejection, transfer in progress, quorum lost) while removing the
blanket "any error" language to match the behavior implemented in
isLeadershipLossError.
internal/raftengine/etcd/engine.go (1)

66-79: ⚠️ Potential issue | 🟠 Major

Mark proposal errors during leadership transfer with the shared sentinel.

The etcd backend's handleProposal (lines 1030–1032) returns unmarked errors from e.rawNode.Propose, but when a proposal is rejected/dropped during an active leadership transfer (LeadTransferee != 0), the error should be marked with raftengine.ErrLeadershipTransferInProgress to ensure lease invalidation occurs. The hashicorp backend already does this via translateLeadershipErr(), but the etcd backend lacks equivalent translation.

Suggested fix

Define the sentinel error at the top of the file:

var (
	errNotLeader                   = errors.Mark(errors.New("etcd raft engine is not leader"), raftengine.ErrNotLeader)
+	errLeadershipTransferInProgress = errors.Mark(
+		errors.New("etcd raft leadership transfer in progress"),
+		raftengine.ErrLeadershipTransferInProgress,
+	)
	errNodeIDRequired              = errors.New("etcd raft node id is required")

Then in handleProposal, check the leadership transfer state and mark the proposal error accordingly before returning.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/raftengine/etcd/engine.go` around lines 66 - 79, Add a shared
sentinel error for leadership-transfer-in-progress at the top of the file (e.g.,
errLeadershipTransferInProgress = errors.Mark(errors.New("etcd raft leadership
transfer in progress"), raftengine.ErrLeadershipTransferInProgress)) and then
update handleProposal to translate errors returned from e.rawNode.Propose: if
e.raftStatus().LeadTransferee != 0 (or equivalent check of leadership transfer
state) and the propose returned an error, wrap/mark that error with the new
errLeadershipTransferInProgress (using errors.Mark) before returning so it
matches the shared sentinel used by other backends (mirroring
translateLeadershipErr behavior in the hashicorp backend).
🧹 Nitpick comments (1)
kv/lease_state_test.go (1)

30-45: Add test coverage for marked hashicorp transfer-in-progress error.

translateLeadershipErr in internal/raftengine/hashicorp/engine.go marks hashicorpraft.ErrLeadershipTransferInProgress with raftengine.ErrLeadershipTransferInProgress, but the test table covers direct sentinels and marked ErrNotLeader/ErrLeadershipLost cases without covering the marked transfer-in-progress case. Add this case to complete coverage of the translation branch.

Proposed test case
 		{
 			"hashicorp ErrLeadershipLost marked with raftengine sentinel",
 			cockroachdberrors.WithStack(cockroachdberrors.Mark(hashicorpraft.ErrLeadershipLost, raftengine.ErrLeadershipLost)),
 			true,
 		},
+		{
+			"hashicorp ErrLeadershipTransferInProgress marked with raftengine sentinel",
+			cockroachdberrors.WithStack(cockroachdberrors.Mark(hashicorpraft.ErrLeadershipTransferInProgress, raftengine.ErrLeadershipTransferInProgress)),
+			true,
+		},
 		{
 			"bare hashicorp ErrNotLeader (no raftengine mark) is NOT detected",
 			hashicorpraft.ErrNotLeader,
 			false,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@kv/lease_state_test.go` around lines 30 - 45, Add a test table entry that
exercises translateLeadershipErr's handling of a marked transfer-in-progress
error: create a case using
cockroachdberrors.WithStack(cockroachdberrors.Mark(hashicorpraft.ErrLeadershipTransferInProgress,
raftengine.ErrLeadershipTransferInProgress)) and expect detection (true). Locate
the test table in lease_state_test.go alongside the existing marked ErrNotLeader
and ErrLeadershipLost cases and add the new case description "hashicorp
ErrLeadershipTransferInProgress marked with raftengine sentinel" so
translateLeadershipErr is validated for this branch.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 07827c13-903d-4305-82ca-e69fef917c0d

📥 Commits

Reviewing files that changed from the base of the PR and between 76894ae and edfa0fa.

📒 Files selected for processing (7)
  • internal/raftengine/engine.go
  • internal/raftengine/etcd/engine.go
  • internal/raftengine/hashicorp/engine.go
  • kv/coordinator.go
  • kv/lease_state.go
  • kv/lease_state_test.go
  • kv/sharded_coordinator.go

Add tests that lock in the cross-backend errors.Is contract introduced
in edfa0fa. Without these, a future refactor could drop the
errors.Mark calls and silently force every lease-read onto the slow
LinearizableRead path.

- etcd: assert errors.Is(errors.WithStack(errNotLeader), raftengine.ErrNotLeader) and the same for errLeadershipTransferNotLeader.
- hashicorp: cover translateLeadershipErr for all three raft sentinels, the pass-through case, and nil.

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request introduces shared sentinel errors for Raft leadership states and a utility function, isLeadershipLossError, to refine lease invalidation logic. By using errors.Mark and errors.Is, the system now distinguishes between business-logic failures and actual leadership loss, preventing unnecessary performance degradation. Feedback suggests extending this logic to the read paths in Coordinate.LeaseRead and ShardedCoordinator.groupLeaseRead to ensure consistency across the codebase.

Comment thread kv/lease_state.go
Comment on lines +24 to +31
func isLeadershipLossError(err error) bool {
	if err == nil {
		return false
	}
	return errors.Is(err, raftengine.ErrNotLeader) ||
		errors.Is(err, raftengine.ErrLeadershipLost) ||
		errors.Is(err, raftengine.ErrLeadershipTransferInProgress)
}

medium

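Introducing isLeadershipLossError prevents unnecessary lease invalidation in Dispatch and Commit/Abort, but shouldn't Coordinate.LeaseRead and ShardedCoordinator.groupLeaseRead get the same treatment? Currently, invalidate() is called unconditionally when LinearizableRead fails inside those methods. If the lease is discarded on a transient error that involves no leader change, such as a timeout, subsequent reads are needlessly steered to the slow path. To fully eliminate the "vicious cycle of spurious invalidation" that this PR targets, the read path should be fixed as well.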
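
The read-path concern raised in this comment can be sketched as below. This is a toy model under assumptions: the `reader` type, its injected `linearizableRead` function, and the single-sentinel `isLeadershipLossError` are hypothetical simplifications, not the code in kv/coordinator.go.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errNotLeader is a hypothetical stand-in for the shared leadership sentinel.
var errNotLeader = errors.New("not leader")

func isLeadershipLossError(err error) bool { return errors.Is(err, errNotLeader) }

// reader models the slow-path half of a lease read.
type reader struct {
	leaseValid bool
	// linearizableRead is injected so the sketch can simulate failures.
	linearizableRead func(ctx context.Context) error
}

// slowPathRead runs the linearizable read and drops the lease only when the
// failure signals leadership loss; timeouts and the like leave it alone.
func (r *reader) slowPathRead(ctx context.Context) error {
	err := r.linearizableRead(ctx)
	if err != nil && isLeadershipLossError(err) {
		r.leaseValid = false
	}
	return err
}

func demo() (afterTimeout, afterNotLeader bool) {
	r := &reader{leaseValid: true}

	// A deadline error says nothing about who the leader is.
	r.linearizableRead = func(context.Context) error { return context.DeadlineExceeded }
	_ = r.slowPathRead(context.Background())
	afterTimeout = r.leaseValid

	// A wrapped leadership error must still invalidate.
	r.linearizableRead = func(context.Context) error {
		return fmt.Errorf("read index: %w", errNotLeader)
	}
	_ = r.slowPathRead(context.Background())
	afterNotLeader = r.leaseValid
	return afterTimeout, afterNotLeader
}

func main() {
	afterTimeout, afterNotLeader := demo()
	fmt.Println(afterTimeout, afterNotLeader) // prints "true false"
}
```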

@bootjp bootjp enabled auto-merge April 20, 2026 13:52
@bootjp bootjp merged commit c67b6a0 into main Apr 20, 2026
5 checks passed
@bootjp bootjp deleted the fix/lease-invalidate-leadership-only branch April 20, 2026 13:52
bootjp added a commit that referenced this pull request Apr 20, 2026
…add transfer-in-progress sentinel

Addresses CodeRabbit and gemini feedback on PR #558.

- kv/coordinator.go LeaseRead + kv/sharded_coordinator.go groupLeaseRead:
  mirror the dispatch-path fix by invalidating the lease only when
  isLeadershipLossError reports a real leadership signal. Previously a
  LinearizableRead deadline or transient transport error would force the
  remainder of the lease window onto the slow path, reproducing the same
  regression the write-path fix addressed.
- internal/raftengine/etcd/engine.go: add errLeadershipTransferInProgress
  marked with raftengine.ErrLeadershipTransferInProgress, and fail Propose
  fast when BasicStatus().LeadTransferee != 0 so callers see an errors.Is-
  matchable error instead of hanging on a proposal etcd/raft silently
  drops during transfer.
- Refresh the stale comment on refreshLeaseAfterDispatch to reflect the
  filtered invalidation contract.
- Tests: add the hashicorp ErrLeadershipTransferInProgress marked case
  to kv.TestIsLeadershipLossError and pin errLeadershipTransferInProgress
  in the etcd sentinel test.
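
The fail-fast Propose check from this commit message might look roughly like the following sketch. The `engine` and `basicStatus` types are hypothetical local stand-ins that mirror only the `LeadTransferee` field, and the sentinel is attached with stdlib `%w` wrapping here as an assumption; the commit itself describes marking with raftengine.ErrLeadershipTransferInProgress.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel name follows the PR; the message text is assumed.
var ErrLeadershipTransferInProgress = errors.New("raftengine: leadership transfer in progress")

// basicStatus mirrors only the one field this sketch needs.
type basicStatus struct {
	LeadTransferee uint64 // non-zero while a transfer is in flight
}

// engine is a hypothetical stand-in that records accepted proposals.
type engine struct {
	status   basicStatus
	proposed [][]byte
}

// propose rejects immediately during a leadership transfer so the caller
// gets a classifiable error instead of waiting on a dropped proposal.
func (e *engine) propose(data []byte) error {
	if e.status.LeadTransferee != 0 {
		return fmt.Errorf("propose rejected: %w", ErrLeadershipTransferInProgress)
	}
	e.proposed = append(e.proposed, data)
	return nil
}

func demo() (accepted, rejected bool, queued int) {
	e := &engine{}
	accepted = e.propose([]byte("a")) == nil

	e.status.LeadTransferee = 2 // pretend a transfer to node 2 has started
	err := e.propose([]byte("b"))
	rejected = errors.Is(err, ErrLeadershipTransferInProgress)
	return accepted, rejected, len(e.proposed)
}

func main() {
	accepted, rejected, queued := demo()
	fmt.Println(accepted, rejected, queued) // prints "true true 1"
}
```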
bootjp added a commit that referenced this pull request Apr 20, 2026
…add transfer-in-progress sentinel (#559)

## Summary
Follow-up to #558 addressing remaining CodeRabbit + gemini feedback.

- **kv/coordinator.go `LeaseRead` + kv/sharded_coordinator.go
`groupLeaseRead`**: mirror the dispatch-path fix — invalidate the lease
only when `isLeadershipLossError` reports a real leadership signal.
Previously a `LinearizableRead` deadline or transient transport error
would force the remainder of the lease window onto the slow path,
reproducing the same regression the write-path fix in #558 addressed.
- **internal/raftengine/etcd/engine.go**: add
`errLeadershipTransferInProgress` marked with
`raftengine.ErrLeadershipTransferInProgress`, and fail Propose fast when
`BasicStatus().LeadTransferee != 0` so callers see an
`errors.Is`-matchable error instead of hanging on a proposal that
etcd/raft silently drops during transfer.
- Refresh the stale comment on `refreshLeaseAfterDispatch` to reflect
the filtered invalidation contract.
- Tests: add the hashicorp `ErrLeadershipTransferInProgress` marked case
to `TestIsLeadershipLossError` and pin `errLeadershipTransferInProgress`
in the etcd sentinel test.

## Test plan
- [x] `go test -race ./kv/... ./internal/raftengine/...`
- [x] `golangci-lint` clean