[fix](be) Fix TopN runtime filter activation#63969
Conversation
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: TopN runtime filters record their source node ids on the target scan node thrift, but the coordinator only marked OlapScan-related source SortNodes as runtime-predicate sources before serializing the plan. For non-Olap scans such as external file scans, the source SortNode could keep a non-runtime-predicate sort algorithm for larger limits, so BE received the ids but filtered them out because the runtime predicate source was never detected. This change marks every scan node's TopN filter source SortNode consistently in both coordinator thrift paths.
### Release note
Fix TopN runtime filter activation for non-Olap scan targets.
### Check List (For Author)
- Test: Unit Test
- ./run-fe-ut.sh --run org.apache.doris.qe.runtime.ThriftPlansBuilderTest,org.apache.doris.qe.CoordinatorTest#testTopnFilterDescsSharedAmongInstances
- ./build.sh --fe
- Behavior changed: Yes. TopN runtime filters on non-Olap scan targets now activate consistently with their serialized source node ids.
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve? Issue Number: N/A Related PR: apache#59088 Problem Summary: apache#59088 changed TopN runtime predicate target initialization to rely on a storage column id. For targets that cannot create a storage column predicate, such as non-pushdown TopN predicates or unsupported storage columns, init_target returned before marking the target as detected. That left RuntimePredicate disabled, so the scan side ignored the TopN source even though FE had sent the source id. This change keeps the target detected when no storage predicate is created, removes obsolete compatibility skips for missing runtime predicate descs, and adds FE/BE coverage for the source marking and no-column target paths. ### Release note None ### Check List (For Author) - Test: Unit Test - ./run-fe-ut.sh --run org.apache.doris.qe.runtime.ThriftPlansBuilderTest,org.apache.doris.qe.CoordinatorTest - ./run-be-ut.sh --run --filter=RuntimePredicateTest.* - cd fe && mvn checkstyle:check -pl fe-core -q - build-support/check-format.sh - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
There was a problem hiding this comment.
Pull request overview
This PR fixes TopN runtime predicate activation when the target cannot produce a storage-column predicate (e.g. non-pushdown TopN predicates or unsupported storage columns), ensuring the scan side still honors the TopN source id sent by FE. It also broadens FE-side “has runtime predicate” marking beyond OlapScanNode and adds FE/BE unit tests to cover the relevant paths.
Changes:
- BE: mark TopN runtime predicate target as detected even when
column_id < 0(no storage predicate created), soRuntimePredicatecan be enabled. - FE: mark TopN runtime predicate sources for all scan nodes (not just
OlapScanNode) during plan-to-thrift. - Tests: add FE unit tests for source marking and BE unit tests for “no-column target” enablement.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| fe/fe-core/src/test/java/org/apache/doris/qe/runtime/ThriftPlansBuilderTest.java | Adds a unit test ensuring runtime predicate marking occurs for non-Olap scan nodes. |
| fe/fe-core/src/test/java/org/apache/doris/qe/CoordinatorTest.java | Adds a unit test ensuring fragment thrift conversion marks TopN filter sources for non-Olap scans. |
| fe/fe-core/src/main/java/org/apache/doris/qe/runtime/ThriftPlansBuilder.java | Removes OlapScanNode restriction and exposes helper for test coverage. |
| fe/fe-core/src/main/java/org/apache/doris/qe/Coordinator.java | Removes OlapScanNode restriction when marking TopN runtime predicate presence. |
| be/test/runtime/runtime_predicate_test.cpp | Adds BE unit tests for init-target behavior with/without a valid column id. |
| be/src/runtime/runtime_predicate.cpp | Ensures targets are detected even when no column predicate can be created. |
| be/src/exec/operator/scan_operator.h | Removes legacy skip for missing runtime predicate descs when reading TopN sources. |
| be/src/exec/operator/scan_operator.cpp | Removes legacy skip for missing runtime predicate descs when initializing targets. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Status RuntimePredicate::init_target( | ||
| int32_t target_node_id, phmap::flat_hash_map<int, SlotDescriptor*> slot_id_to_slot_desc, | ||
| const int column_id) { | ||
| if (column_id < 0) { | ||
| _detected_target = true; | ||
| return Status::OK(); | ||
| } |
| std::vector<int> get_topn_filter_source_node_ids(RuntimeState* state, bool push_down) { | ||
| std::vector<int> result; | ||
| for (int id : _parent->cast<typename Derived::Parent>()._topn_filter_source_node_ids) { | ||
| if (!state->get_query_ctx()->has_runtime_predicate(id)) { | ||
| // compatible with older versions fe | ||
| continue; | ||
| } | ||
|
|
||
| const auto& pred = state->get_query_ctx()->get_runtime_predicate(id); | ||
| if (!pred.enable()) { |
| for (auto id : _topn_filter_source_node_ids) { | ||
| if (!state->get_query_ctx()->has_runtime_predicate(id)) { | ||
| // compatible with older versions fe | ||
| continue; | ||
| } | ||
|
|
||
| int cid = -1; | ||
| if (state->get_query_ctx()->get_runtime_predicate(id).target_is_slot(node_id())) { | ||
| auto s = _slot_id_to_slot_desc[state->get_query_ctx() |
| Status RuntimePredicate::init_target( | ||
| int32_t target_node_id, phmap::flat_hash_map<int, SlotDescriptor*> slot_id_to_slot_desc, | ||
| const int column_id) { | ||
| if (column_id < 0) { | ||
| _detected_target = true; | ||
| return Status::OK(); | ||
| } |
| std::vector<int> get_topn_filter_source_node_ids(RuntimeState* state, bool push_down) { | ||
| std::vector<int> result; | ||
| for (int id : _parent->cast<typename Derived::Parent>()._topn_filter_source_node_ids) { | ||
| if (!state->get_query_ctx()->has_runtime_predicate(id)) { | ||
| // compatible with older versions fe | ||
| continue; | ||
| } | ||
|
|
||
| const auto& pred = state->get_query_ctx()->get_runtime_predicate(id); | ||
| if (!pred.enable()) { |
| for (auto id : _topn_filter_source_node_ids) { | ||
| if (!state->get_query_ctx()->has_runtime_predicate(id)) { | ||
| // compatible with older versions fe | ||
| continue; | ||
| } | ||
|
|
||
| int cid = -1; | ||
| if (state->get_query_ctx()->get_runtime_predicate(id).target_is_slot(node_id())) { | ||
| auto s = _slot_id_to_slot_desc[state->get_query_ctx() |
|
run buildall |
|
/review |
There was a problem hiding this comment.
Reviewed the full PR against the Doris code-review checklist. I found one additional distinct issue and also confirmed the existing review threads are still blocking.
Critical checkpoint conclusions:
- Goal and proof: The PR aims to activate TopN runtime predicates for non-Olap/no-column-predicate cases. It partially achieves this, with FE and BE unit coverage, but still misses a slot-target no-storage-predicate case for VARBINARY.
- Scope and clarity: The FE changes are small and focused; the BE change is small but incomplete around unsupported storage-predicate columns.
- Concurrency: RuntimePredicate is shared between sort/source update and scan/target reads. Existing inline threads already cover the unlocked
_detected_targetwrite ininit_target(), which remains a blocker. - Lifecycle/static initialization: No new static/global lifecycle hazards found.
- Configuration/compatibility: No new configs. Existing inline threads already cover removal of mixed-version/missing-desc compatibility guards before
get_runtime_predicate(), which remains a blocker. - Parallel paths: Both Coordinator and ThriftPlansBuilder paths are updated for non-Olap source marking. The BE fallback path is not consistently applied to VARBINARY slot targets.
- Tests: Unit tests were added, but they do not cover the VARBINARY slot path or missing-desc compatibility.
- Observability/performance/transactions/persistence: No new concerns found for this PR.
User focus: No additional user-provided review focus was specified.
Because the PR still has blocking correctness/concurrency concerns, including the existing already-raised threads plus the additional VARBINARY fallback issue below, I am requesting changes.
| continue; | ||
| } | ||
|
|
||
| int cid = -1; |
There was a problem hiding this comment.
The new column_id < 0 fallback does not cover slot targets whose storage predicate is skipped for TYPE_VARBINARY. In this loop cid starts at -1, but the later VARBINARY branch still continues before calling init_target(), so _detected_target remains false and get_topn_filter_source_node_ids() filters the runtime predicate out. RuntimePredicate::_init() and VTopNPred both support varbinary comparisons, so this should fall back to init_target(node_id(), ..., -1) instead of leaving the TopN runtime filter disabled for ORDER BY varbinary_col LIMIT ....
TPC-H: Total hot run time: 29433 ms |
TPC-DS: Total hot run time: 171009 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
FE Regression Coverage ReportIncrement line coverage |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
What problem does this PR solve?
Problem Summary: #59088 changed TopN runtime predicate target initialization to rely on a storage column id. For targets that cannot create a storage column predicate, such as non-pushdown TopN predicates or unsupported storage columns, init_target returned before marking the target as detected. That left RuntimePredicate disabled, so the scan side ignored the TopN source even though FE had sent the source id. This PR keeps the target detected when no storage predicate is created, removes obsolete compatibility skips for missing runtime predicate descs, and adds FE/BE coverage for the source marking and no-column target paths.
Release note
None
Check List (For Author)