[feature](cloud) Add table-level event-driven warm up#63832
65920e0 to
b67c9f7
Compare
|
run buildall |
TPC-H: Total hot run time: 31875 ms |
TPC-DS: Total hot run time: 172324 ms |
FE Regression Coverage ReportIncrement line coverage |
### What problem does this PR solve?
Issue Number: None
Related PR: apache#63832
Problem Summary: The table-level warm-up change adds a table_id argument before sync_wait_timeout_ms in CloudWarmUpManager::warm_up_rowset. After rebasing onto the latest master, the existing CloudWarmUpManagerTest calls still used the old two-argument form, so the positive-timeout test passed 1000 as table_id and left sync_wait_timeout_ms at its default -1. That made the test take the async non-positive-timeout branch, so the before-wait sync point was never reached and the spurious notify assertion failed. Update the test calls to pass table_id and sync_wait_timeout_ms explicitly.
### Release note
None
### Check List (For Author)
- Test:
- Unit Test: ./run-be-ut.sh --run --filter=CloudWarmUpManagerTest.* -j100
- Behavior changed: No.
- Does this need documentation: No.
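The silent rebinding described in this commit message can be reproduced with a small sketch. The two functions below are hypothetical stand-ins, not the real `CloudWarmUpManager::warm_up_rowset` signatures; the parameter types and the branch enum are assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical marker for which branch a call ends up taking.
enum class WarmUpBranch { kAsync, kSyncWait };

// Assumed old shape: the second argument is the timeout, defaulting to -1.
inline WarmUpBranch warm_up_rowset_old(int64_t /*rowset_id*/,
                                       int64_t sync_wait_timeout_ms = -1) {
    return sync_wait_timeout_ms > 0 ? WarmUpBranch::kSyncWait : WarmUpBranch::kAsync;
}

// Assumed new shape: table_id is inserted before the defaulted timeout.
// Both parameters are int64_t, so an old two-argument call still compiles.
inline WarmUpBranch warm_up_rowset_new(int64_t /*rowset_id*/, int64_t table_id,
                                       int64_t sync_wait_timeout_ms = -1) {
    (void)table_id;  // the filter itself is irrelevant to this sketch
    return sync_wait_timeout_ms > 0 ? WarmUpBranch::kSyncWait : WarmUpBranch::kAsync;
}
```

With these stand-ins, `warm_up_rowset_new(rowset, 1000)` binds 1000 to `table_id` and takes the async branch, while `warm_up_rowset_new(rowset, table_id, 1000)` takes the sync-wait branch, which is why the tests now pass both arguments explicitly.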
|
run buildall |
TPC-H: Total hot run time: 31958 ms |
TPC-DS: Total hot run time: 172417 ms |
### What problem does this PR solve?
Issue Number: None
Related PR: apache#63832
Problem Summary: The table-level warm-up table filter performance tests used tight wall-clock thresholds for the 200K and 500K wildcard match-all cases. CI machines can run these scale tests slightly slower than local runs even though the matching implementation remains efficient. Relax the 200K threshold from 1s to 1.5s and the 500K threshold from 2s to 3s while keeping the existing functional assertions and smaller or more selective performance checks.
### Release note
None
### Check List (For Author)
- Test:
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.cloud.CacheHotspotManagerTableFilterTest
- Behavior changed: No.
- Does this need documentation: No.
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
FE UT Coverage ReportIncrement line coverage |
### What problem does this PR solve?
Issue Number: None
Related PR: apache#63832
Problem Summary: The table-level warm-up table filter performance test for 200K tables with 15 include/exclude rules still used a tight 2s wall-clock threshold. CI can exceed that threshold under load while the matcher remains functionally correct. Relax the threshold to 3s and keep the matched-table assertion unchanged.
### Release note
None
### Check List (For Author)
- Test:
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.cloud.CacheHotspotManagerTableFilterTest
- Behavior changed: No.
- Does this need documentation: No.
|
run buildall |
Issue Number: None
Related PR: None
Problem Summary: Add table-level event-driven warm-up support for cloud warm-up jobs. The change extends WARM UP ... ON TABLES parsing and validation, persists normalized include and exclude table filters, resolves matching table ids dynamically, prevents conflicting cluster-level and table-level load-event jobs, propagates table ids through BE warm-up requests, records per-job source and target warm-up progress metrics, and exposes compact and detailed SyncStats through SHOW WARM UP JOB and FE metrics. Virtual compute group rebuilds cancel existing table-level load-event jobs before recreating managed cluster-level jobs.
Support table-level event-driven cloud warm-up with ON TABLES filters and warm-up sync statistics.
- Test:
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.cloud.OnTablesFilterTest,org.apache.doris.cloud.CloudWarmUpJobTableFilterTest,org.apache.doris.cloud.CacheHotspotManagerTableFilterTest,org.apache.doris.cloud.WarmUpStatsTest,org.apache.doris.cloud.WarmUpClusterOnTablesParseTest,org.apache.doris.cloud.catalog.CloudInstanceStatusCheckerTest,org.apache.doris.metric.MetricsTest#testCloudWarmUpSyncJobMetricsReadStatsDirectlyFromJob+testEventDrivenCloudWarmUpSyncJobTriggerGapMetric
- Unit Test: ./run-be-ut.sh --run --filter=CloudWarmUpManagerFilterTest.*:MBvarWindowedAdderTest.* -j100
- Manual test: build-support/check-format.sh
- Manual test: ./build.sh --be --fe --cloud -j100
- Manual test: docker build -f docker/runtime/doris-compose/Dockerfile -t bh-cluster-2 .
- Manual test: ./run-regression-test.sh --clean --compile
- Regression test: env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy -u ALL_PROXY -u all_proxy ./run-regression-test.sh --run -d regression-test/suites/cloud_p0/cache/multi_cluster/warm_up/on_tables -runMode=cloud -image bh-cluster-2 -dockerSuiteParallel 1 (18/19 passed; test_warm_up_event_on_tables_overlap_and_mv failed because of a duplicate MV column name in the test SQL, before the test was fixed)
- Regression test: env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy -u ALL_PROXY -u all_proxy ./run-regression-test.sh --run -d regression-test/suites/cloud_p0/cache/multi_cluster/warm_up/on_tables -s test_warm_up_event_on_tables_overlap_and_mv -runMode=cloud -image bh-cluster-2 -dockerSuiteParallel 1
- Behavior changed: Yes. WARM UP supports ON TABLES filters for event-driven load warm-up and SHOW WARM UP JOB exposes table filter, matched tables, and sync stats.
- Does this need documentation: Yes. Documentation for the new ON TABLES syntax and metrics should be added separately.
a67fe97 to
44f6b85
Compare
| static constexpr int WINDOW_30M = 1800; | ||
| static constexpr int WINDOW_1H = 3600; | ||
|
|
||
| MBvarWindowedAdder g_warmup_ed_finish_segment_num("warmup_ed_finish_segment_num", {"job_id"}, |
Are there any memory issues if there are many jobs?
How does bvar implement "windows"? Does it record every sample of the adder every second?
I checked the bvar implementation again.
bvar::Window does not record every update written to the Adder. For bvar::Adder, the underlying sampler samples the cumulative adder value roughly once per second, and the window value is calculated from the difference between the latest sampled cumulative value and the oldest sampled cumulative value in the requested window.
The 5m/30m/1h windows created for the same Adder also share the same underlying sampler. The sampler queue is sized by the largest window, so here it keeps about 3600 + 1 samples, not 300 + 1800 + 3600 samples and not one sample per warm-up event.
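The sampling scheme described above can be modelled roughly as follows. This is a simplified sketch, not the bvar implementation: `CumulativeSampler` and its methods are hypothetical names, and the once-per-second tick is simulated by calling `take_sample` directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Simplified model of one shared sampler queue for a cumulative adder.
class CumulativeSampler {
public:
    // The queue is sized by the largest window: max_window_s + 1 samples.
    explicit CumulativeSampler(size_t max_window_s) : _capacity(max_window_s + 1) {}

    // Called about once per second with the adder's current cumulative value.
    void take_sample(int64_t cumulative) {
        if (_samples.size() == _capacity) _samples.pop_front();
        _samples.push_back(cumulative);
    }

    // Window value = latest sample minus the sample window_s ticks earlier
    // (or the oldest retained sample if fewer ticks have been recorded).
    int64_t window_value(size_t window_s) const {
        if (_samples.empty()) return 0;
        const size_t last = _samples.size() - 1;
        const size_t first = last > window_s ? last - window_s : 0;
        return _samples[last] - _samples[first];
    }

    size_t sample_count() const { return _samples.size(); }

private:
    size_t _capacity;
    std::deque<int64_t> _samples;
};
```

In this model the 5m, 30m, and 1h windows are three calls to `window_value` on the same queue, so the retained sample count depends only on the largest window and on elapsed seconds, not on how many updates the adder received.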
Rough estimate:
- One `Sample<int64_t>` stores `data` and `time_us`, so it is about 16 bytes.
- The largest window is 1h, so one sampler queue is about `(3600 + 1) * 16 ~= 56KB`.
- Source-side stats have 4 windowed adders, about `4 * 56KB ~= 224KB/job` for sampler queues.
- Target-side stats have 8 windowed adders, about `8 * 56KB ~= 448KB/job` for sampler queues.
- If the same BE process observes both sides, the sampler queue storage is roughly `(4 + 8) * 56KB ~= 672KB/job`, plus small object/map/string overhead.
So this is proportional to the number of job_id dimensions seen by a BE process, not proportional to the number of rowsets/segments/events. The overall memory usage should be small for the expected number of warm-up jobs. This state is also BE-process-local memory only; it is not persisted and will be released after BE restart.
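The arithmetic in the estimate above can be written out as a small sketch. `SampleModel` is an assumed two-field layout standing in for `Sample<int64_t>`, and the helper names are hypothetical; only sampler-queue storage is counted, not object, map, or string overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout of one sample: cumulative value plus timestamp.
struct SampleModel {
    int64_t data;
    int64_t time_us;
};
static_assert(sizeof(SampleModel) == 16, "sketch assumes a 16-byte sample");

// One queue holds (largest window in seconds + 1) samples.
constexpr size_t kLargestWindowSeconds = 3600;
constexpr size_t kQueueBytes = (kLargestWindowSeconds + 1) * sizeof(SampleModel);

// Sampler-queue bytes for one job with the given number of windowed adders.
constexpr size_t sampler_bytes_per_job(size_t windowed_adders) {
    return windowed_adders * kQueueBytes;
}

// Sampler-queue bytes for many jobs observed by the same BE process.
constexpr size_t sampler_bytes_total(size_t jobs, size_t windowed_adders) {
    return jobs * sampler_bytes_per_job(windowed_adders);
}
```

Under these assumptions, `sampler_bytes_total(n, 4 + 8)` grows linearly with the number of `job_id` values a BE process has seen, which matches the conclusion that the cost scales with jobs rather than with rowsets, segments, or events.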
TPC-H: Total hot run time: 31398 ms |
TPC-H: Total hot run time: 31974 ms |
TPC-DS: Total hot run time: 172895 ms |
TPC-DS: Total hot run time: 171939 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
### What problem does this PR solve?
Issue Number: None
Related PR: apache#63832
Problem Summary: The aggregated warm-up rowset failure message included the tablet id and rowset id but omitted the table id, making table-level event-driven warm-up failures harder to diagnose. Pass table_id into the aggregated failure builder and include it in the error text. Extend the helper unit tests to assert the table id is reported.
### Release note
None
### Check List (For Author)
- Test:
- Unit Test: ./run-be-ut.sh --run --filter=CloudWarmUpManagerFilterTest.* -j100
- Behavior changed: No.
- Does this need documentation: No.
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
### What problem does this PR solve?
Issue Number: None
Related PR: apache#63832
Problem Summary: The 500K table-filter performance unit test can exceed the previous 3s threshold under CI load even though the matcher behavior remains correct. Relax the assertion to 4s to avoid treating small runtime variance as a test failure.
### Release note
None
### Check List (For Author)
- Test:
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.cloud.CacheHotspotManagerTableFilterTest
- Behavior changed: No.
- Does this need documentation: No.
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 32235 ms |
TPC-DS: Total hot run time: 172315 ms |
|
run cloud_p0 |
|
/review |
I found two correctness issues in the table-level event-driven warm-up flow. The new ON TABLES semantics can be bypassed during rolling upgrades because old source BEs ignore the optional table_ids thrift field and treat the job as cluster-level, and recycle-cache events currently pass table_id=0 so they bypass the same filter even on upgraded BEs.
Critical checkpoint conclusions:
- Goal/tests: The PR implements table-filtered event-driven cloud warm-up with stats and broad unit/regression coverage, but the two paths below still violate the table filter contract.
- Scope: The implementation is focused on warm-up filtering/stats, but it touches FE parser/job lifecycle, BE warm-up dispatch, thrift/proto, metrics, and tests.
- Concurrency/lifecycle: Existing job daemons, BE warm-up thread pool, and BE metric globals are involved. I did not find a new lock-order deadlock, but the recycle path intentionally disables filtering.
- Configs: New refresh/display configs are mutable and observed by the relevant daemons/display logic.
- Compatibility: The new optional FE-to-BE table_ids field needs explicit mixed-version handling before table-level jobs can be safely created in rolling upgrades.
- Parallel paths: warm_up_rowset applies table_id filtering, but recycle_cache does not.
- Tests: Coverage is extensive, but it does not cover mixed-version behavior or recycle-cache filtering for unmatched tables.
- Observability: New stats/logging are generally sufficient; existing prior comments already cover bvar memory/naming and table_id logging.
User focus: No additional user-provided review focus was specified.
| @@ -802,18 +929,18 @@ void CloudWarmUpManager::_recycle_cache(int64_t tablet_id, | |||
| auto dns_cache = ExecEnv::GetInstance()->dns_cache(); | |||
Passing table_id=0 disables the new table-level filter for recycle-cache events. When rowsets from an unmatched table are GC'd or compacted on the source, every table-level warm-up job can still receive PRecycleCacheRequest and evict target cache for tables outside its ON TABLES filter. The callers are CloudTablet methods and can use the tablet's table id, so please propagate it through recycle_cache/_recycle_cache and apply the same filtering as warm_up_rowset.
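A minimal sketch of the concern raised in this comment, under stated assumptions: `WarmUpJobFilter`, `job_accepts_table`, and `job_accepts_table_with_bypass` are hypothetical names, and the rule that an empty id set means a cluster-level job is an assumption, not something the PR text confirms.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical per-job filter. An empty set is assumed to mean "cluster-level".
struct WarmUpJobFilter {
    std::unordered_set<int64_t> table_ids;
};

// Strict check: a table-level job only accepts tables it matched.
inline bool job_accepts_table(const WarmUpJobFilter& job, int64_t table_id) {
    if (job.table_ids.empty()) return true;  // cluster-level job accepts everything
    return job.table_ids.count(table_id) > 0;
}

// Behaviour the review flags: a non-positive table_id skips the filter,
// so a caller passing 0 reaches every table-level job.
inline bool job_accepts_table_with_bypass(const WarmUpJobFilter& job, int64_t table_id) {
    if (table_id <= 0) return true;
    return job_accepts_table(job, table_id);
}
```

If the recycle path forwarded the tablet's real table id and both paths used the strict check, an unmatched table would be rejected for recycle-cache events in the same way as for warm-up events.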
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 32281 ms |
TPC-DS: Total hot run time: 171370 ms |
What problem does this PR solve?
Issue Number: None
Problem Summary:
This PR adds table-level event-driven cloud warm-up support and improves active incremental warm-up progress observability.
Before this change, event-driven warm-up was only controlled at compute-group granularity. Once a load-event warm-up job was enabled for a source and target compute group pair, all source-side table writes could trigger warm-up to the target compute group. That is inefficient for workloads where only selected core tables, high-frequency query tables, or selected async materialized views need to stay warm.
This PR lets users define the warm-up scope with `ON TABLES` when creating an event-driven load warm-up job. FE persists the normalized table filter in the warm-up job, resolves matched table ids dynamically, sends the table ids to BE, and lets BE filter warm-up rowsets by table id.
User-visible behavior:
- `WARM UP ... ON TABLES` supports table-level event-driven warm-up.
- `INCLUDE` and `EXCLUDE` rules.
- `*` and `?` wildcards, for example `db.table`, `db.*`, `*.orders_*`, and `log_db.log_?`.
- `INCLUDE` defines the candidate warm-up scope, and `EXCLUDE` removes tables from that included scope.
- `SHOW WARM UP JOB` exposes the table-level job type, table filter, matched tables, and SyncStats.
- `SHOW WARM UP JOB` list output keeps compact SyncStats, while single-job lookup keeps detailed windowed SyncStats.
Example:
Conflict and virtual compute group behavior:
Warm-up progress observation:
- `/api/warmup_event_driven_stats`.
- `/metrics` exposes per-job active warm-up metadata, synchronized size, and trigger gap metrics for cloud event-driven warm-up jobs.
Release note
Support table-level event-driven cloud warm-up with `ON TABLES` filters and per-job warm-up sync statistics.
Check List (For Author)
Test
Behavior changed:
`WARM UP` supports table-level `ON TABLES` filters for event-driven load warm-up, and warm-up job output/metrics expose table filter, matched tables, SyncStats, and trigger-gap information.
Does this need documentation?
Check List (For Reviewer who merge this PR)
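The `INCLUDE`/`EXCLUDE` wildcard semantics listed under user-visible behavior can be sketched as follows. This is an illustrative C++ model, not the FE matcher from the PR: `glob_match`, `TableFilter`, and `filter_matches` are hypothetical names, and matching against the full `db.table` string (so `*` may span the dot) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Glob match: '*' matches any run of characters, '?' matches exactly one.
inline bool glob_match(const std::string& pattern, const std::string& text) {
    size_t p = 0, t = 0, star = std::string::npos, mark = 0;
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;  // remember the star and try matching zero characters
            mark = t;
        } else if (p < pattern.size() && (pattern[p] == '?' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (star != std::string::npos) {
            p = star + 1;  // backtrack: let the last star absorb one more character
            t = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') ++p;  // trailing stars match empty
    return p == pattern.size();
}

// Hypothetical filter: include rules define candidates, exclude rules remove them.
struct TableFilter {
    std::vector<std::string> include;
    std::vector<std::string> exclude;
};

inline bool filter_matches(const TableFilter& f, const std::string& qualified_name) {
    bool included = false;
    for (const auto& rule : f.include) {
        if (glob_match(rule, qualified_name)) {
            included = true;
            break;
        }
    }
    if (!included) return false;
    for (const auto& rule : f.exclude) {
        if (glob_match(rule, qualified_name)) return false;
    }
    return true;
}
```

Checking include rules first and exclude rules second mirrors the statement that `EXCLUDE` only removes tables from the already included scope, so a table matched by no include rule is never warmed up regardless of the exclude list.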