[improvement](recycler) Avoid single-point read/write during sequentially reading key by wyxxxcat · Pull Request #62476 · apache/doris

wyxxxcat · 2026-04-14T03:52:52Z

What problem does this PR solve?

fix: #58459

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

hello-stephen · 2026-04-14T03:53:00Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

wyxxxcat · 2026-04-14T03:58:32Z

/review

github-actions

Blocking issue found.

Goal of this PR
Reduce per-rowset point reads/writes while scanning recycler keys. The batching direction is reasonable, but the current implementation changes recycler semantics on a correctness-critical path and is not safe as written.
Critical checkpoint conclusions

Goal accomplished: Partially. Point operations are reduced, but the deferred abort path introduces a race that can delete rowset data for a transaction/job that commits before loop_done() runs.
Change size/focus: Focused to recycler batching, but it changes behavior in both recycle_rowsets() and recycle_tmp_rowsets().
Concurrency: Involved and currently unsafe. The recycler scan thread now queues abort work and only executes it at batch end, while concurrent commit_rowset / commit_txn / finish_tablet_job RPCs can still succeed during that window when enable_mark_delete_rowset_before_recycle=false.
Lifecycle/static init: No special lifecycle or static initialization concerns found in this PR.
Config changes: No new configs added, but an existing supported config combination regresses: enable_abort_txn_and_job_for_delete_rowset_before_recycle=true with enable_mark_delete_rowset_before_recycle=false.
Compatibility: No protocol/storage compatibility change observed.
Parallel paths: The same regression exists in both formal recycle-rowset and tmp-rowset paths.
Special conditions: The existing end_version() != 1 gate remains; no new special-condition issues beyond the race above.
Test coverage: Existing tests mainly exercise the mark-before-delete flow and do not cover the interleaving where a commit/job-finish wins the race before deferred abort execution. I did not find new coverage for this regression.
Observability: Logging is adequate for tracing the new path.
Transaction/persistence: No new persistence format issue, but transaction/job state handling is where the correctness regression is introduced.
Data writes/modifications: Not safe on the affected path because object deletion can proceed after a successful commit that happened before the deferred abort ran.
FE/BE variable passing: Not applicable.
Performance: The batching optimization is valid in principle.
Other issues: No second independent blocker found beyond the deferred-abort race.
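
For reference, the affected configuration combination named in the checkpoints above can be written as a small predicate. This is only an illustrative sketch: the `RecyclerFlags` struct and `on_affected_path` helper are hypothetical, and only the two flag names come from this review.

```cpp
#include <cassert>

// Hypothetical container for the two existing config flags discussed above.
struct RecyclerFlags {
    bool enable_abort_txn_and_job_for_delete_rowset_before_recycle;
    bool enable_mark_delete_rowset_before_recycle;
};

// True when the recycler relies on abort-before-delete alone, i.e. abort is
// enabled but the mark-before-delete two-phase flow is disabled.
bool on_affected_path(const RecyclerFlags& f) {
    return f.enable_abort_txn_and_job_for_delete_rowset_before_recycle &&
           !f.enable_mark_delete_rowset_before_recycle;
}
```

With both flags enabled the predicate is false, which matches the conclusion that the mark-before-delete flow is preserved.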

Recycler-specific checkpoints

Mark-before-delete two-phase flow: Preserved when mark-delete is enabled.
Abort-before-delete aligned with origin: The mapping is still correct (load -> txn, compaction/schema-change -> job), but timing is no longer safe because abort is deferred.
Packed files: Not affected by this PR.
Conflict/retry/idempotency: The new deferred abort flow is not restart-safe enough on the affected config path because a concurrent commit can invalidate the recycler's stale decision before deletion submission.

Because this is a data-correctness issue on a supported code path, this should not be approved yet.

github-actions · 2026-04-14T04:20:05Z

cloud/src/recycler/recycler.cpp

+            rowset_meta->end_version() != 1) {
+            if (auto abort_task = make_deferred_abort_task(rowset); abort_task.has_value()) {
+                LOG(INFO) << "rowset queued to abort related txn or job after current scan batch, "
+                             "instance_id="


Deferring this abort until loop_done() introduces a correctness race when enable_mark_delete_rowset_before_recycle=false. Before this PR, abort_txn_or_job_for_recycle() ran immediately after we saw the expired recycle-rowset key, so a concurrent commit_rowset / finish_tablet_job could not make the rowset live before deletion was scheduled. Now the scanner only queues the abort and keeps walking the batch. A concurrent commit can succeed in that window, and then loop_done() will still submit deletion based on the stale scan result. If the later abort sees an already-committed txn/job and returns 0, we end up deleting committed rowset data. The batching optimization needs to keep the abort-before-delete invariant for this supported config path.
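
The interleaving described above can be reproduced with a small deterministic model. This is a sketch under stated assumptions, not recycler code: `FakeMetaService`, `OwnerState`, and `deletes_committed_data` are hypothetical names, the concurrent commit is injected at a fixed point instead of running on another thread, and the abort is modelled as returning 0 for an already-committed owner, as the comment above describes.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the state of the txn/job that owns a rowset.
enum class OwnerState { kPrepared, kCommitted, kAborted };

struct FakeMetaService {
    std::unordered_map<std::string, OwnerState> owners;

    // Assumption taken from the review text: aborting an owner that has
    // already committed changes nothing and still returns 0.
    int abort_owner(const std::string& rowset_id) {
        OwnerState& st = owners.at(rowset_id);
        if (st == OwnerState::kPrepared) st = OwnerState::kAborted;
        return 0;
    }

    // A commit succeeds only while the owner is still prepared.
    bool commit_owner(const std::string& rowset_id) {
        OwnerState& st = owners.at(rowset_id);
        if (st != OwnerState::kPrepared) return false;
        st = OwnerState::kCommitted;
        return true;
    }
};

// Returns true if a rowset whose owner committed ends up in the delete list.
// defer_abort=true models the ordering in this PR, false the previous one.
bool deletes_committed_data(bool defer_abort) {
    FakeMetaService ms;
    ms.owners["rs1"] = OwnerState::kPrepared;

    std::vector<std::string> pending_aborts;
    std::vector<std::string> to_delete;

    // Scan phase: the recycler sees the expired recycle-rowset key.
    if (defer_abort) {
        pending_aborts.push_back("rs1");  // only queued, not executed
    } else if (ms.abort_owner("rs1") != 0) {
        return false;  // abort failed, nothing is deleted
    }
    to_delete.push_back("rs1");

    // Window between the scan and loop_done(): a concurrent commit arrives.
    ms.commit_owner("rs1");

    // loop_done(): run the deferred aborts, then submit deletions.
    for (const std::string& id : pending_aborts) {
        if (ms.abort_owner(id) != 0) return false;
    }
    for (const std::string& id : to_delete) {
        if (ms.owners.at(id) == OwnerState::kCommitted) return true;
    }
    return false;
}
```

In this model, the deferred ordering deletes a committed rowset, while the immediate ordering makes the concurrent commit fail before deletion is scheduled.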

github-actions · 2026-04-14T04:20:05Z

cloud/src/recycler/recycler.cpp

-                return ret;
+            if (auto abort_task = make_deferred_abort_task(rowset); abort_task.has_value()) {
+                LOG(INFO) << "rowset queued to abort related txn or job after current scan batch, "
+                             "instance_id="


Same race in the tmp-rowset path. With mark-before-delete disabled, moving the abort from the scan loop to loop_done() widens the window where commit_rowset / commit_txn / finish_tablet_job can succeed before the recycler aborts the owner txn/job. After that, the worker still deletes the tmp rowset based on a stale snapshot from the scan. This regresses the old abort-before-delete behavior and can delete rowset data that has just become visible.
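
One possible way to keep batching while preserving abort-before-delete is to let the batched abort report which owners had already committed, and to drop those rowsets from the delete list. The sketch below is an assumption for illustration only and is not the fix adopted in this PR: `AbortResult`, `FakeMetaService`, and `filter_deletable` are hypothetical, and a real implementation would issue the aborts through the meta-service transaction layer.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

enum class OwnerState { kPrepared, kCommitted, kAborted };

// Hypothetical abort outcome that distinguishes "already committed" from a
// successful abort, instead of collapsing both into a return value of 0.
enum class AbortResult { kAborted, kAlreadyCommitted };

struct FakeMetaService {
    std::unordered_map<std::string, OwnerState> owners;

    AbortResult abort_owner(const std::string& rowset_id) {
        OwnerState& st = owners.at(rowset_id);
        if (st == OwnerState::kCommitted) return AbortResult::kAlreadyCommitted;
        st = OwnerState::kAborted;
        return AbortResult::kAborted;
    }
};

// Batch-end step: abort every owner queued during the scan and keep for
// deletion only the rowsets whose owner is confirmed aborted.
std::vector<std::string> filter_deletable(FakeMetaService& ms,
                                          const std::vector<std::string>& scanned) {
    std::vector<std::string> deletable;
    for (const std::string& id : scanned) {
        if (ms.abort_owner(id) == AbortResult::kAborted) {
            deletable.push_back(id);
        }
        // Owners that committed in the meantime are skipped: the scan result
        // for them is stale and their data is now live.
    }
    return deletable;
}
```

The design choice here is that deletion is driven by the abort outcome and not by the scan snapshot, so a commit that wins the race simply removes the rowset from the batch.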

wyxxxcat requested review from dataroaring, gavinchou and w41ter as code owners April 14, 2026 03:52

wyxxxcat added dev/3.1.x cloud dev/4.0.x labels Apr 14, 2026

github-actions bot requested changes Apr 14, 2026

wyxxxcat marked this pull request as draft April 14, 2026 07:23

wyxxxcat force-pushed the reduce_point_read_at_recycle branch from 342cec4 to 23e5acb Compare April 14, 2026 07:32

wyxxxcat wants to merge 1 commit into apache:master from wyxxxcat:reduce_point_read_at_recycle
