[FLINK-39931] Fix flaky AdaptiveBatchSchedulerTest due to unexpected ComponentMainThreadExecutor behavior in async execution #28481
Open
ocb3916 wants to merge 4 commits into
f99a293 to c14c61c
6023c28 to dd03175
… to unexpected ComponentMainThreadExecutor behavior in async execution Co-authored-by: Yuepeng Pan <hipanyuepeng@gmail.com>
Author
Hi @RocMarshal! I have completed the PR and it's now ready for your review.
spuru9 reviewed Jun 21, 2026
spuru9 (Contributor) left a comment:
The description is out of sync with the actual diff; please update the change log to match what's actually here.
Co-authored-by: Purushottam Sinha <sinhapurushottam911@gmail.com>
Co-authored-by: Purushottam Sinha <sinhapurushottam911@gmail.com>
spuru9 reviewed Jun 21, 2026
    // CompletableFuture callbacks are dispatched from background IO executor threads.
    // The synchronous execution is preserved while eliminating the race condition.
    mainThreadExecutor = new SynchronousComponentMainThreadExecutor();
    // No-op thread check avoids flakiness from FLINK-38970: scheduler callbacks
spuru9 (Contributor):
@ocb3916 I think the suggestion didn't work as intended; you might have to apply it locally.
Co-authored-by: Purushottam Sinha <sinhapurushottam911@gmail.com>
What is the purpose of the change

This PR fixes flaky failures in AdaptiveBatchSchedulerTest (FLINK-38970) caused by main-thread constraint violations when CompletableFuture callbacks are dispatched from background IO executor threads.

The test previously used SynchronousComponentMainThreadExecutorServiceAdapter, which enforces strict thread-identity checks via assertRunningInMainThread(). Because some CompletableFuture callbacks in the code under test can be completed from IO threads rather than the main thread, this strict check failed nondeterministically.

The fix replaces it with NoMainThreadCheckComponentMainThreadExecutor, which preserves synchronous execution semantics (tasks still run immediately on the calling thread) while skipping the thread-identity assertion. This eliminates the race condition without weakening the actual scheduling behavior under test.

Brief change log

- AdaptiveBatchSchedulerTest.java
  - Replaced the import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutorServiceAdapter with org.apache.flink.runtime.concurrent.NoMainThreadCheckComponentMainThreadExecutor
  - Removed the imports org.apache.flink.runtime.testutils.DirectScheduledExecutorService, java.util.concurrent.Callable, java.util.concurrent.ScheduledFuture, java.util.concurrent.TimeUnit
  - setUp(): replaced mainThreadExecutor = new SynchronousComponentMainThreadExecutor(); with mainThreadExecutor = new NoMainThreadCheckComponentMainThreadExecutor();, with a comment explaining the FLINK-38970 rationale
  - Removed SynchronousComponentMainThreadExecutor: this logic is now provided by the shared NoMainThreadCheckComponentMainThreadExecutor utility instead of being duplicated locally

Verifying this change

Run the following test cases and confirm all repetitions pass without failure:

Expected result: all 10,000 repetitions of testVertexInitializationFailureIsLabeled and all 1,000 repetitions of testUnbalancedInput complete successfully, with no main-thread constraint violation failures.

Does this pull request potentially affect one of the following parts?

- @Public/@PublicEvolving

Was generative AI tooling used to co-author this PR?
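
The failure mode described under "What is the purpose of the change" can be illustrated with a small standalone sketch. It does not use Flink's classes: MainThreadExecutor, StrictDirectExecutor, NoCheckDirectExecutor and runCallbackFromIoThread are hypothetical stand-ins written for this example, assuming only that the real executors run tasks directly on the calling thread and differ in whether assertRunningInMainThread() checks thread identity.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class MainThreadCheckDemo {

    /** Hypothetical, simplified stand-in for a component main-thread executor. */
    interface MainThreadExecutor {
        void execute(Runnable task);

        void assertRunningInMainThread();
    }

    /** Runs tasks on the calling thread and insists the caller is the creating thread. */
    static final class StrictDirectExecutor implements MainThreadExecutor {
        private final Thread mainThread = Thread.currentThread();

        @Override
        public void execute(Runnable task) {
            task.run();
        }

        @Override
        public void assertRunningInMainThread() {
            if (Thread.currentThread() != mainThread) {
                throw new IllegalStateException(
                        "Not in main thread: " + Thread.currentThread().getName());
            }
        }
    }

    /** Same direct execution, but the thread-identity check is a no-op. */
    static final class NoCheckDirectExecutor implements MainThreadExecutor {
        @Override
        public void execute(Runnable task) {
            task.run();
        }

        @Override
        public void assertRunningInMainThread() {
            // Intentionally empty: any calling thread is accepted.
        }
    }

    /**
     * Completes a future from a background "IO" thread. The callback was registered
     * before completion, so it runs on the completing thread and performs the
     * main-thread check there. Returns the check failure, or null if it passed.
     */
    static Throwable runCallbackFromIoThread(MainThreadExecutor executor) {
        ExecutorService io = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<String> result = new CompletableFuture<>();
            AtomicReference<Throwable> failure = new AtomicReference<>();
            CompletableFuture<Void> callback =
                    result.thenRun(
                            () ->
                                    executor.execute(
                                            () -> {
                                                try {
                                                    executor.assertRunningInMainThread();
                                                } catch (Throwable t) {
                                                    failure.set(t);
                                                }
                                            }));
            io.execute(() -> result.complete("done"));
            callback.join(); // wait until the callback has finished
            return failure.get();
        } finally {
            io.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("strict:   " + runCallbackFromIoThread(new StrictDirectExecutor()));
        System.out.println("no-check: " + runCallbackFromIoThread(new NoCheckDirectExecutor()));
    }
}
```

In this sketch the future is always completed from the background thread, so the strict variant fails every time. In the real test the completing thread varies from run to run, which is why the failure appeared only intermittently. The no-check variant runs the same task on the same thread and only drops the assertion.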