redo destroy pg by JacksonYao287 · Pull Request #435 · eBay/HomeObject

JacksonYao287 · 2026-06-16T09:26:42Z

No description provided.

codecov-commenter · 2026-06-16T10:10:56Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 17.24138% with 24 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (stable/v4.x@e1c23e1). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
src/lib/homestore_backend/hs_pg_manager.cpp	0.00%	19 Missing ⚠️
...ib/homestore_backend/replication_state_machine.cpp	55.55%	4 Missing ⚠️
src/lib/homestore_backend/gc_manager.cpp	0.00%	0 Missing and 1 partial ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@              Coverage Diff               @@
##             stable/v4.x     #435   +/-   ##
==============================================
  Coverage               ?   45.60%           
==============================================
  Files                  ?       36           
  Lines                  ?     5436           
  Branches               ?      685           
==============================================
  Hits                   ?     2479           
  Misses                 ?     2683           
  Partials               ?      274

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

xiaoxichen · 2026-06-16T12:12:40Z

+    // when reaching here, we will not receive any new log, both for BR or leave_group case. we wait for all logs to be
+    // committed and trigger a cp, so that when restart, no log will be replayed since cp_lsn is equal to
+    // last_append_lsn.
+    while (true) {


I feel this while loop is probably wrong.
if this member receive a leave_group, who will tell it to commit all appened log?
similar in BR, if there are some appended but not yet committed log in logstore, but leader decide to send us a BR, probably leader wont tell us to commit

what is the intent here?
avoid the ongoing commit thread racing with destroying?
crash recovery?

what is the intent here?
avoid the ongoing commit thread racing with destroying?
crash recovery?

both.
1 currently , we pause statemachine here to avoid the ongoing commit thread racing with destroying pg. here, I try to achieve this by waiting for all the appended log to be committed, which means no more log will not be commit anymore . as a result, there is no racing between committing log and destroying pg.

2 crash recovery. after all the appended log are committed and a cp is triggered, when recovery, no log replay will happen(cp_lsn == last_append_lsn), so that we can avoid log replay to access any stale pg resource. for example, we have a put blob log, whose lsn is larger than cp_lsn but smaller than dc_lsn. if destroy_pg is called and the pg_index_Table is destroyed but before pg_super_blk is destoryed, crash occures, Then, when recovery and replay this put blob log, it will try to access the destroyed pg_index_table. other pg resources also has this potential issue(shard or pg chunk)

if this member receive a leave_group, who will tell it to commit all appened log?
similar in BR, if there are some appended but not yet committed log in logstore, but leader decide to send us a BR, probably leader wont tell us to commit

let`s say, leader starts from lsn 30(lsn 1 - lsn 29 are truncated), and the last append log in follower is lsn 20 and last commit log in follower is 10. there is an important fact that any truncated log must be already committed before it is truncated. so the commit_lsn of the whole raft group is at least lsn 29.

leader will start sending snapshot to follower only if it sends an append_entry req to follower and the next_lsn (in follower resp) follower wants is smaller than the smallest lsn leader currently has. note that, in the append_entry req, follower can see the latest commit_lsn in the raft group(at least lsn 29), which must be greater than the follower`s last_append_lsn（lsn 20）.

to answer your question, the append_entry req , which will trigger install_snapshot, will tell the laggy follower what is the latest commit_lsn currently. and the latest commit_lsn must be greater than the last_append_lsn in follower , which can make sure all the appended logs to be committed.

let`s say, leader starts from lsn 30(lsn 1 - lsn 29 are truncated), and the last append log in follower is lsn 20 and last commit log in follower is 10. there is an important fact that any truncated log must be already committed before it is truncated. so the commit_lsn of the whole raft group is at least lsn 29.

This is not necessary true, we keep maximum raft_logstore_reserve_threshold logs,

repl_lsn_t raft_logstore_reserve_threshold = HS_DYNAMIC_CONFIG(resource_limits.raft_logstore_reserve_threshold); repl_lsn_t truncation_upper_limit = std::max(leader_commit_idx - raft_logstore_reserve_threshold, minimum_repl_idx);

to answer your question, the append_entry req , which will trigger install_snapshot, will tell the laggy follower what is the latest commit_lsn currently. and the latest commit_lsn must be greater than the last_append_lsn in follower , which can make sure all the appended logs to be committed.

This part is wrong as well, say if a member has log up to 150, commited to 100, however it gets a install_snapshot_req, it wont know the commit lsn of leader(say 50000) , and it wont commit locally to 150.

raft_logstore_reserve_threshold does not matter , only last_append_lsn and last_commit_lsn matters.

however it gets a install_snapshot_req

in what case it will get a install_snapshot_req? as I said , the only case is that leader sends an append_entry req to follower and the next_lsn in follower resp, which indicates the next log follower wants, is smaller than the smallest lsn leader currently has. does this make sense?

the append_entry will have a commit_index , which indicates the lsn that the follower can commit to. commit lsn must be larger than followers last_append_lsn.

if install_snapshot is triggered, then
leader_commit_lsn >= leader_smallest_lsn > follower_last_append_lsn >= follower_commit_lsn

IIRC , a rejected append_log_request would not goes to the bottom and update the target_commit_index of the follower.

xiaoxichen · 2026-06-16T12:40:19Z

@@ -499,7 +503,7 @@ void ReplicationStateMachine::write_snapshot_obj(std::shared_ptr< homestore::sna
        if (home_object_->pg_exists(pg_data->pg_id())) {
            LOGI("pg already exists, clean pg resources before snapshot, pg={} {}", pg_data->pg_id(), log_suffix);
            // Need to pause state machine before destroying the PG, if fail, let raft retry.


comments out of date, as well as we dont have a branch that returns false as of now.

let`s remove this out-of-date comments after addressing other comments for this PR

xiaoxichen · 2026-06-16T12:43:20Z

+        // clean up the stale pg resources since no message will be received to trigger pg_destroy again.
+
+        // so, we call it here for all the above cases to make sure the stale pg resources are cleaned up.
+        home_object_->destroy_pg_resource(pg_id);


do we need cleaning GC resources? like any pending gc metablk? gc index table? no_space_left_marker? latch_lsn?

gc index table is pdev related , not pg. no_space_left_marker and latch_lsn has already done in this PR, see reset_latch_lsn, reset_no_space_left_inf.

the gc metablk is a nice catch. there is indeed a case that a gc metablk left when crash. let`s do it in handle_recovered_gc_task. if we find the pg state is destroyed, skip this task and destroy this gc meta blk.

after checking , no need to handle gc metablk.

if there is any gc metablk, it means before crash, drain_pending_gc_task is not returned yet(and thus no pg resource is destroyed). when recovery , these gc metablk will be handle by handle_recovered_gc_task before starting log replay.

JacksonYao287 requested review from Besroy, xiaoxichen and yuwmao June 16, 2026 10:14

redo destroy pg

c686ce2

JacksonYao287 force-pushed the fix-issues-of-destroy-pg branch from 0b5e4a4 to c686ce2 Compare June 16, 2026 11:59

xiaoxichen reviewed Jun 16, 2026

View reviewed changes

Conversation

JacksonYao287 commented Jun 16, 2026

Uh oh!

codecov-commenter commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

JacksonYao287 Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

xiaoxichen Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov-commenter commented Jun 16, 2026 •

edited

Loading

JacksonYao287 Jun 16, 2026 •

edited

Loading

xiaoxichen Jun 16, 2026 •

edited

Loading