[server] Optimize RemoteLogFetcher with async prefetch for recovery by Kaixuan-Duan · Pull Request #3132 · apache/fluss

Kaixuan-Duan · 2026-04-19T17:31:50Z

Purpose

Linked issue: close #3091
This PR improves KV recovery performance by reducing wait time between remote log segments in RemoteLogFetcher.

Brief change log

Add a dedicated single-thread download executor for async prefetch.
Prefetch the next fetchable remote segment while consuming the current one.
Reuse the prefetched file when available; fall back to synchronous download if the prefetch fails.
Ensure cleanup on close: cancel in-flight prefetch, close active iterator/resources, shutdown executor.
Add a regression test for repeated fetch() to ensure previous iterator cleanup.
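
For readers who want the shape of the change without opening the diff, here is a minimal sketch of "prefetch the next segment while consuming the current one, fall back to a synchronous download on failure". It is not the PR's code: `PrefetchingIterator` and the `download` function are hypothetical stand-ins for the fetcher's inner iterator and its remote download.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Hypothetical sketch: hands out downloaded segments while prefetching the next one. */
final class PrefetchingIterator<S, F> implements Iterator<F>, AutoCloseable {
    private final List<S> segments;
    private final Function<S, F> download; // stand-in for the real remote download
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private CompletableFuture<F> prefetched; // download of segments.get(index), or null
    private int index;

    PrefetchingIterator(List<S> segments, Function<S, F> download) {
        this.segments = new ArrayList<>(segments);
        this.download = download;
    }

    @Override
    public boolean hasNext() {
        return index < segments.size();
    }

    @Override
    public F next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        S current = segments.get(index++);
        F file = takePrefetchedOrDownload(current);
        if (hasNext()) {
            // Capture the segment in a local so the task does not read the mutable index.
            S upcoming = segments.get(index);
            prefetched = CompletableFuture.supplyAsync(() -> download.apply(upcoming), executor);
        }
        return file;
    }

    private F takePrefetchedOrDownload(S current) {
        CompletableFuture<F> future = prefetched;
        prefetched = null;
        if (future != null) {
            try {
                return future.join();
            } catch (CompletionException | CancellationException e) {
                // Prefetch failed or was cancelled: fall through to a synchronous download.
            }
        }
        return download.apply(current);
    }

    @Override
    public void close() {
        if (prefetched != null) {
            prefetched.cancel(true);
        }
        executor.shutdownNow();
    }
}
```

The sketch leaves out temp-file cleanup and the closed-state checks that the review comments further down discuss.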

Tests

./mvnw -pl fluss-server -Dtest=RemoteLogFetcherTest -DfailIfNoTests=false -Dspotless.check.skip=true test

API and Format
No API change. No storage/log format change.

Documentation
No user-facing feature. No documentation update required.

fresh-borzoni

@Kaixuan-Duan Thanks for the contribution. I will help review this PR.

One process point: this issue was already assigned and I was actively working on it. In that situation, please coordinate on the issue before opening an overlapping PR. Assignment is not exclusive ownership, but it is an important coordination signal, and skipping it usually leads to duplicated effort and fragmented review.

We can evaluate this PR on its merits, but for future cases please check on the issue first.

fresh-borzoni

Ty, direction is right. I left some comments, PTAL.

fresh-borzoni · 2026-04-23T04:07:49Z

+
+        private void cancelPrefetch() {
+            if (nextDownloadedSegmentFuture != null) {
+                nextDownloadedSegmentFuture.cancel(true);


cancel(true) on an already-completed future is a no-op, so cancelPrefetch() just drops the reference to the downloaded File, which then lives in tempDir until fetcher-level close().
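
One possible shape for the fix, as a sketch only: check whether `cancel` actually cancelled anything, and if the future had already completed normally, delete the file it produced. `PrefetchCleanup` and `cancelAndCleanup` are hypothetical names, not part of the PR.

```java
import java.io.File;
import java.util.concurrent.CompletableFuture;

/** Hypothetical helper: cancel a prefetch without orphaning an already-downloaded file. */
final class PrefetchCleanup {
    private PrefetchCleanup() {}

    static void cancelAndCleanup(CompletableFuture<File> future) {
        if (future == null) {
            return;
        }
        // cancel() returns false when the future had already completed,
        // either normally or exceptionally.
        if (!future.cancel(true) && !future.isCompletedExceptionally()) {
            File downloaded = future.getNow(null);
            if (downloaded != null) {
                // Best-effort delete; a real implementation would likely log a failure.
                downloaded.delete();
            }
        }
    }
}
```

This only covers the already-completed case. Per the `CompletableFuture` Javadoc, the `mayInterruptIfRunning` argument has no effect, so a download that is still running when `cancel` is called keeps going and needs its own cleanup.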

fresh-borzoni · 2026-04-23T04:09:09Z

+                activeIterator = null;
+            }
+        } finally {
+            downloadExecutor.shutdownNow();


shutdownNow() doesn't wait: if a prefetch is mid-flush, it can write to tempDir after deleteDirectoryQuietly runs. Either call downloadExecutor.awaitTermination() with a short timeout before deletion, or make downloadSegment interruption-aware (most S3 SDKs don't check the thread's interrupt flag during socket reads, so the interrupt from shutdownNow is effectively decorative).
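
The first option could look roughly like this. `ExecutorShutdown` and `shutdownAndAwait` are hypothetical names; the caller would delete tempDir only after this returns, and would decide what to do when it returns false.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: stop the download executor and wait briefly for running tasks. */
final class ExecutorShutdown {
    private ExecutorShutdown() {}

    /** Returns true if every task had terminated before the timeout elapsed. */
    static boolean shutdownAndAwait(ExecutorService executor, long timeout, TimeUnit unit) {
        executor.shutdownNow();
        try {
            return executor.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and report "not terminated".
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A false return means a download may still be writing, so deleting the directory at that point can still race with it.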

fresh-borzoni · 2026-04-23T04:18:35Z

        }

        @Override
        public boolean hasNext() {


If fetch() is called twice, the first Iterable still wraps the now-closed iterator. Iterating it re-enters advance() on a closed instance, which downloads into the shared tempDir and races with the new iterator.
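
A closed-state guard is one way to stop a stale iterator from doing work. The sketch below throws on use after close; returning false from hasNext() would be another reasonable choice, and the review does not say which is preferred. `GuardedIterator` is a hypothetical wrapper, not a class from the PR.

```java
import java.util.Iterator;

/** Hypothetical wrapper: refuses to advance once it has been closed. */
final class GuardedIterator<T> implements Iterator<T>, AutoCloseable {
    private final Iterator<T> delegate;
    private volatile boolean closed;

    GuardedIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("Iterator is closed");
        }
    }

    @Override
    public boolean hasNext() {
        ensureOpen();
        return delegate.hasNext();
    }

    @Override
    public T next() {
        ensureOpen();
        return delegate.next();
    }

    @Override
    public void close() {
        closed = true;
    }
}
```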

fresh-borzoni · 2026-04-23T04:19:29Z

        }

+        private File fetchSegmentFile(RemoteLogSegment segment) throws IOException {
+            if (segment.equals(prefetchedSegment) && nextDownloadedSegmentFuture != null) {


This depends on RemoteLogSegment having value-based equals(), or on both references coming from the same segments list (reference equality). Works today, but safer to compare by segment id tbh.
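
A sketch of the id-based comparison. `Segment` below is a hypothetical stand-in for RemoteLogSegment; the diff shows a `remoteLogSegmentId()` accessor, but treating the id as a `UUID` is an assumption made for this example.

```java
import java.util.Objects;
import java.util.UUID;

/** Hypothetical sketch: match the prefetched segment by id rather than by equals(). */
final class SegmentIdMatch {
    private SegmentIdMatch() {}

    /** Stand-in for RemoteLogSegment; only the id accessor matters here. */
    static final class Segment {
        private final UUID id; // assumed id type

        Segment(UUID id) {
            this.id = id;
        }

        UUID remoteLogSegmentId() {
            return id;
        }
    }

    static boolean isPrefetched(Segment requested, Segment prefetched) {
        return prefetched != null
                && Objects.equals(requested.remoteLogSegmentId(), prefetched.remoteLogSegmentId());
    }
}
```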

fresh-borzoni · 2026-04-23T04:19:32Z

+            if (segment.equals(prefetchedSegment) && nextDownloadedSegmentFuture != null) {
+                try {
+                    return nextDownloadedSegmentFuture.get();
+                } catch (InterruptedException e) {


Also catch CancellationException - it's unchecked (extends RuntimeException) and CompletableFuture.get() throws it on a cancelled future. Not a live bug in the current state machine (every cancelPrefetch nulls the field) but cheap defense-in-depth, especially given closed is volatile.
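
A sketch of what the extra catch could look like. `PrefetchResult.getOrFallback` and the `syncDownload` supplier are hypothetical; in the PR the fallback is `downloadSegment(segment)`.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.function.Supplier;

/** Hypothetical helper: wait for a prefetch, falling back to a synchronous download. */
final class PrefetchResult {
    private PrefetchResult() {}

    static <T> T getOrFallback(CompletableFuture<T> prefetch, Supplier<T> syncDownload) {
        try {
            return prefetch.get();
        } catch (InterruptedException e) {
            // Restore the flag so callers can still observe the interrupt.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for prefetch", e);
        } catch (ExecutionException | CancellationException e) {
            // Failed or cancelled prefetch: both are handled by downloading synchronously.
            return syncDownload.get();
        }
    }
}
```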

fresh-borzoni · 2026-04-23T04:28:20Z

@@ -28,10 +28,13 @@
 import org.junit.jupiter.api.Test;


non-blocking: Two of the three new tests inject state via reflection (setPrivateField) instead of exercising a real async prefetch - they cover the branches in fetchSegmentFile, but not close-during-real-in-flight-download or the orphan-file cleanup.

Consider one integration-style test with a real slow/failing download source.

fresh-borzoni · 2026-04-23T04:30:28Z

+                            "Prefetched segment {} failed, fallback to sync download.",
+                            segment.remoteLogSegmentId(),
+                            e.getCause());
+                    return downloadSegment(segment);


Non-blocking: No retry on transient S3 failure - one flaky segment fails the entire recovery. In fluss-rust we added exponential backoff (100ms -> 5s with jitter) for this.
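
For illustration, a capped exponential backoff with jitter might look like the sketch below. It is not the fluss-rust implementation: `RetryWithBackoff` is a hypothetical name, and the jitter strategy (a random delay between half the cap and the cap) is an assumption, since the comment only gives the 100ms to 5s range.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

/** Hypothetical sketch: retry an operation with capped exponential backoff and jitter. */
final class RetryWithBackoff {
    private RetryWithBackoff() {}

    /** baseMs * 2^attempt, capped at maxMs (attempt is zero-based). */
    static long cappedDelayMillis(int attempt, long baseMs, long maxMs) {
        long exponential = baseMs * (1L << Math.min(attempt, 20)); // bounded shift avoids overflow
        return Math.min(exponential, maxMs);
    }

    /** Random delay in [capMs / 2, capMs]; the jitter strategy is an assumption. */
    static long jittered(long capMs) {
        return ThreadLocalRandom.current().nextLong(capMs / 2, capMs + 1);
    }

    static <T> T retry(Supplier<T> operation, int maxAttempts, long baseMs, long maxMs) {
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e;
            }
            if (attempt + 1 < maxAttempts) {
                try {
                    Thread.sleep(jittered(cappedDelayMillis(attempt, baseMs, maxMs)));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted during backoff", ie);
                }
            }
        }
        throw last != null ? last : new IllegalArgumentException("maxAttempts must be >= 1");
    }
}
```

With `baseMs = 100` and `maxMs = 5000`, the uncapped delays are 100, 200, 400, ... and reach the 5s cap from the seventh attempt on. A real version would probably retry only on errors known to be transient.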

fresh-borzoni · 2026-04-23T04:35:38Z

+            return downloadSegment(segment);
+        }
+
+        private void prefetchNextSegment() {


Prefetch depth is hardcoded to 1. If S3 p99 download time > consume time for a segment, the consumer still waits on downloads and the optimization is only half-realized. On the Rust side (fluss-rust #187) we landed on configurable depth with default 4 for exactly this reason. Since this is KV recovery, depth = 1 might be fine, but it's still better to make it configurable and reason about the value properly.
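
A configurable depth can be sketched as a bounded queue of in-flight downloads that is topped up after each segment is consumed. This is an illustration only, not the fluss-rust design: `DepthPrefetcher`, the `download` function, and the one-thread-per-slot executor are all assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Hypothetical sketch: keep up to `depth` segment downloads in flight, in order. */
final class DepthPrefetcher<S, F> implements AutoCloseable {
    private final Iterator<S> segments;
    private final Function<S, F> download; // stand-in for the real remote download
    private final int depth;
    private final ExecutorService executor;
    private final Deque<CompletableFuture<F>> inFlight = new ArrayDeque<>();

    DepthPrefetcher(List<S> segments, Function<S, F> download, int depth) {
        if (depth < 1) {
            throw new IllegalArgumentException("depth must be >= 1");
        }
        this.segments = new ArrayList<>(segments).iterator();
        this.download = download;
        this.depth = depth;
        this.executor = Executors.newFixedThreadPool(depth);
        fill();
    }

    /** Starts downloads until `depth` are queued or no segments remain. */
    private void fill() {
        while (inFlight.size() < depth && segments.hasNext()) {
            S segment = segments.next();
            inFlight.add(CompletableFuture.supplyAsync(() -> download.apply(segment), executor));
        }
    }

    boolean hasNext() {
        return !inFlight.isEmpty();
    }

    /** Blocks on the oldest download, then refills the queue. */
    F next() {
        CompletableFuture<F> head = inFlight.poll();
        if (head == null) {
            throw new NoSuchElementException();
        }
        F file = head.join();
        fill();
        return file;
    }

    @Override
    public void close() {
        inFlight.forEach(f -> f.cancel(true));
        inFlight.clear();
        executor.shutdownNow();
    }
}
```

The trade-off is that up to `depth` segments sit in the temp directory at once, so a larger depth costs local disk as well as threads.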

Kaixuan-Duan added 2 commits April 20, 2026 00:44

[server] Optimize RemoteLogFetcher with async prefetch for recovery

e8f1a05

Add tests for RemoteLogFetcher prefetch fallback and cancellation

c16f10a

fresh-borzoni reviewed Apr 23, 2026

Kaixuan-Duan wants to merge 2 commits into apache:main from Kaixuan-Duan:remote-log-fetcher-prefetch
