[WIP] columnar: Producer-Consumer Pipeline Read Model by JaySon-Huang · Pull Request #10904 · pingcap/tiflash

JaySon-Huang · 2026-06-16T14:20:26Z

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

What is changed and how it works?

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

None

Summary by CodeRabbit

New Features
- Implemented next-generation columnar read pipeline with improved IO/CPU separation and task scheduling.
- Added IO seek operation tracking for performance monitoring.
Documentation
- Added design documentation for columnar pipeline producer-consumer architecture.
Chores
- Updated build configuration to include new columnar storage components.
- Updated Docker image version tags in test utilities.

Signed-off-by: JaySon-Huang <tshent@qq.com>

ti-chi-bot · 2026-06-16T14:20:35Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign breezewish for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-06-16T14:21:02Z

📝 Walkthrough

Walkthrough

Introduces a producer-consumer pipeline for disaggregated columnar reads under ENABLE_NEXT_GEN_COLUMNAR. A new ColumnarReadSourceOp IO producer materializes columnar readers and serializes blocks into a SharedQueue; RNColumnarSourceOp consumes that queue on the CPU side. Reader prefetch switches from a detached background thread to PrefetchColumnarReaderTask submitted to the pipeline IO pool. StorageDisaggregatedColumnar wiring, shared notification types, IO seek counters, tests, and a design document are also added.

Changes

Columnar Producer-Consumer Pipeline

Layer / File(s)	Summary
Shared data contracts and pipeline notification types `dbms/src/Storages/StorageDisaggregatedColumnar.h`, `dbms/src/Storages/Columnar/ColumnarReadSourceOp.h`	Adds `RNColumnarReaderNotifyFuture` adapter, extends `RNColumnarReaderWork` with `notify_future`, adds `tryAcquireReaderWork(bool)`, `startAsyncMaterializeReader`, `tryGetReadyReader`, and `setPipelineExecutorContext` to `RNColumnarReadTask`, adds `createWithReader` factory to `RNColumnarInputStream`, and defines `ColumnarReadSourceState` enum with `ColumnarReadSourceOp` class declaration and private members.
PrefetchColumnarReaderTask: IO-pool async reader materialization `dbms/src/Storages/Columnar/PrefetchColumnarReaderTask.h`, `dbms/src/Storages/Columnar/PrefetchColumnarReaderTask.cpp`	Constructor stores `read_task` and `reader_work`; `executeImpl` throws `LOGICAL_ERROR` to enforce IO-only routing; `executeIOImpl` calls `createColumnarReaderWithBackoff`, transitions `reader_work` state to `Ready` or `Failed` under mutex, then notifies both `cv` and `notify_future` waiters; `finalizeImpl` is a no-op.
ColumnarReadSourceOp: IO producer state machine `dbms/src/Storages/Columnar/ColumnarReadSourceOp.cpp`	Implements prefix/suffix lifecycle methods; `readImpl` dispatches on state (`DONE`, `READY_BLOCK`, else `awaitImpl`); `consumeReadyReader` wraps a `ColumnarReaderPtr` into `RNColumnarInputStream`; `awaitImpl` handles `NEED_READER`/`WAIT_READER` transitions with mutex-guarded `RNColumnarReaderMaterializeState` checks; `executeIOImpl` materializes readers and reads blocks from `current_input_stream`.
RNColumnarSourceOp: SharedQueue CPU consumer `dbms/src/Storages/Columnar/ColumnarSourceOp.h`, `dbms/src/Storages/Columnar/ColumnarSourceOp.cpp`	Declares `RNColumnarSourceOp` with `Options` struct (exec context, req id, header, shared queue); `readImpl` maps `tryPop` outcomes to `WAIT_FOR_NOTIFY`, `HAS_OUTPUT`, or `CANCELLED` operator statuses.
StorageDisaggregatedColumnar pipeline wiring and reader-work management `dbms/src/Storages/StorageDisaggregatedColumnar.cpp`	Rewrites `readThroughColumnar` to build an IO producer group (`ColumnarReadSourceOp` × N + `SharedQueueSinkOp`) and a CPU consumer group (`RNColumnarSourceOp`), or injects `NullSourceOp` for empty ranges; replaces detached thread prefetch with `PrefetchColumnarReaderTask` submission; adds `tryGetReadyReader`, `startAsyncMaterializeReader`, `tryAcquireReaderWork(bool)`; adds `RNColumnarInputStream::createWithReader`; removes old `RNColumnarSourceOp` implementation.
dm_io_seek_count instrumentation `dbms/src/Storages/DeltaMerge/ScanContext.h`, `dbms/src/Storages/DeltaMerge/ScanContext.cpp`, `dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp`	Adds `std::atomic<uint64_t> dm_io_seek_count` to `ScanContext`, increments it at each substream seek in `DMFileReader::readFromDisk`, adds a debug log in `readImpl`, and exposes the counter in `toJson()`.
Unit tests, design document, and auxiliary updates `dbms/src/Storages/tests/gtest_storage_disaggregated_columnar.cpp`, `docs/design/2026-06-13-columnar-pipeline-producer-consumer-model.md`, `dbms/CMakeLists.txt`, `tests/docker/next-gen-utils/Makefile`, `.gitignore`	Adds gtests covering `RNColumnarReaderWork` init state, null-source profile recording, `RNColumnarSourceOp` queue reads, and `WAIT_FOR_NOTIFY`/`CANCELLED` transitions; adds a design document for the producer-consumer model; registers `dbms/src/Storages/Columnar` in CMake; updates Docker image tags from `next-gen` to `nextgen`; adds `.gitignore` entry.

Sequence Diagram(s)

sequenceDiagram
  participant PipelineBuilder as StorageDisaggregated
  participant ColumnarReadSourceOp as ColumnarReadSourceOp (IO)
  participant PrefetchColumnarReaderTask as PrefetchTask (IO pool)
  participant RNColumnarReaderWork as ReaderWork
  participant SharedQueue as SharedQueue
  participant RNColumnarSourceOp as RNColumnarSourceOp (CPU)

  rect rgba(100, 150, 220, 0.5)
    note over PipelineBuilder: Pipeline construction
    PipelineBuilder->>ColumnarReadSourceOp: add N producers
    PipelineBuilder->>SharedQueue: create bounded queue
    PipelineBuilder->>RNColumnarSourceOp: add consumer
  end

  rect rgba(220, 140, 60, 0.5)
    note over ColumnarReadSourceOp: executeIOImpl — reader materialization
    ColumnarReadSourceOp->>PrefetchColumnarReaderTask: submit to IO pool (startAsyncMaterializeReader)
    PrefetchColumnarReaderTask->>RNColumnarReaderWork: createColumnarReaderWithBackoff → state=Ready
    PrefetchColumnarReaderTask->>RNColumnarReaderWork: notify_future.notifyAll()
    ColumnarReadSourceOp->>RNColumnarReaderWork: awaitImpl sees Ready, consumeReadyReader
    ColumnarReadSourceOp->>SharedQueue: push blocks via SharedQueueSinkOp
  end

  rect rgba(60, 180, 100, 0.5)
    note over RNColumnarSourceOp: readImpl — queue consumption
    RNColumnarSourceOp->>SharedQueue: tryPop(block)
    alt READY
      SharedQueue-->>RNColumnarSourceOp: HAS_OUTPUT
    else EMPTY
      SharedQueue-->>RNColumnarSourceOp: WAIT_FOR_NOTIFY
    else FINISHED
      SharedQueue-->>RNColumnarSourceOp: HAS_OUTPUT (empty block = EOF)
    end
  end
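
The consumer branch of the diagram above can be sketched as a small status mapping. This is only an illustration under stated assumptions: `ToyQueue`, `PopResult`, `Status`, and `readOnce` are hypothetical stand-ins, not the TiFlash `SharedQueue` or `RNColumnarSourceOp` types, and a block is modelled as a `std::string`, so an empty string returned with `HasOutput` plays the role of the empty block that signals EOF.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; none of these are TiFlash types.
enum class PopResult { Ready, Empty, Finished, Cancelled };
enum class Status { HasOutput, WaitForNotify, Cancelled };

class ToyQueue
{
public:
    explicit ToyQueue(std::size_t capacity_) : capacity(capacity_) {}

    // Returns false when the queue is full (backpressure) or already closed.
    bool tryPush(std::string block)
    {
        std::lock_guard<std::mutex> lock(mu);
        if (finished || cancelled || items.size() >= capacity)
            return false;
        items.push_back(std::move(block));
        return true;
    }

    // Non-blocking pop: reports why nothing was returned instead of waiting.
    PopResult tryPop(std::string & out)
    {
        std::lock_guard<std::mutex> lock(mu);
        if (cancelled)
            return PopResult::Cancelled;
        if (!items.empty())
        {
            out = std::move(items.front());
            items.pop_front();
            return PopResult::Ready;
        }
        return finished ? PopResult::Finished : PopResult::Empty;
    }

    void finish() { std::lock_guard<std::mutex> lock(mu); finished = true; }
    void cancel() { std::lock_guard<std::mutex> lock(mu); cancelled = true; }

private:
    std::mutex mu;
    std::deque<std::string> items;
    std::size_t capacity;
    bool finished = false;
    bool cancelled = false;
};

// Consumer-side mapping from pop outcome to operator status.
// HasOutput with an empty `out` stands for EOF in this toy model.
Status readOnce(ToyQueue & q, std::string & out)
{
    out.clear();
    switch (q.tryPop(out))
    {
    case PopResult::Ready: return Status::HasOutput;
    case PopResult::Empty: return Status::WaitForNotify;
    case PopResult::Finished: return Status::HasOutput; // out stays empty
    case PopResult::Cancelled: return Status::Cancelled;
    }
    return Status::Cancelled; // unreachable, keeps compilers quiet
}
```

In this sketch the producer sees backpressure as a failed `tryPush`, and the consumer never blocks: an empty queue yields `WaitForNotify` so the scheduler can park the operator until a producer pushes or finishes.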

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

JinheLin
Lloyd-Pottiger
yongman

Poem

🐇 Hop, hop — no more detached threads to chase,
The IO pool now sets the reading pace.
A queue sits snug between producer and consumer's paw,
Backpressure balanced, no lost-wakeup flaw.
WAIT_FOR_NOTIFY hops safely to HAS_OUTPUT's grace —
This rabbit approves the columnar pipeline's new face! 🎉

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is essentially a blank template with no substantive content: issue/problem statement missing, implementation details empty, test approach unspecified, and side effects/documentation unmarked.	Complete all required sections: provide issue number, explain the problem being solved, describe what changed and why, specify test coverage, document side effects and impacts, and provide a meaningful release note.
Docstring Coverage	⚠️ Warning	Docstring coverage is 11.29% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	Title clearly describes the main change: implementing a producer-consumer pipeline read model for columnar functionality, which matches the substantial code additions across multiple files.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


ti-chi-bot · 2026-06-16T14:24:42Z

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-linked-issue label, please provide the linked issue number on one line in the PR body, for example: Issue Number: close #123 or Issue Number: ref #456.

📖 For more info, you can check the "Contribute Code" section in the development guide.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

dbms/src/Storages/StorageDisaggregatedColumnar.cpp (1)

1231-1247: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Do not transition to Creating when no pipeline executor is available.

Line 1235 sets reader_work->state = Creating before Line 1246 checks exec_context. In stream mode (exec_context == nullptr), this leaves the work stuck in Creating with no scheduled task, and subsequent getOrCreateReader() can block forever.

Proposed fix

 void RNColumnarReadTask::prefetchReaderWork(const RNColumnarReaderWorkPtr & reader_work)
 {
     RUNTIME_CHECK(reader_work != nullptr);

+    // Stream path has no pipeline scheduler; keep work in NotStarted so inline creation can proceed.
+    if (exec_context == nullptr)
+        return;
+
     {
         auto guard = std::lock_guard(reader_work->mutex);
         if (reader_work->state != RNColumnarReaderMaterializeState::NotStarted)
             return;
         reader_work->state = RNColumnarReaderMaterializeState::Creating;
     }

     const auto region_id = reader_work->plan.region_id;

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp` around lines 1231 - 1247,
The state transition to Creating occurs before checking whether exec_context is
available. In stream mode where exec_context is nullptr, this leaves the work
stuck in Creating state with no scheduled task to complete it. Move the
exec_context nullptr check to occur before the state transition to Creating
(before line 1235 where reader_work->state =
RNColumnarReaderMaterializeState::Creating is set), so that the function returns
early without changing state when there is no pipeline executor available.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dbms/src/Storages/Columnar/ColumnarReadSourceOp.cpp`:
- Around line 176-190: The issue is that both NotStarted and Creating states are
triggering inline materialization, causing concurrent reader creation in
prefetch tasks and source IO paths. Fix this by only materializing when the
state is NotStarted, not when it's already Creating (which indicates
materialization is already in progress). In the switch statement around line 177
in ColumnarReadSourceOp.cpp, remove the Creating case from triggering
should_materialize or only set should_materialize = true for the NotStarted
case. Apply the same logic fix at the sibling location around lines 227-236 to
ensure Creating state is not treated as a trigger for inline materialization at
any point in the code.

In `@dbms/src/Storages/DeltaMerge/ScanContext.h`:
- Line 51: The dm_io_seek_count atomic counter member variable is defined in the
ScanContext class but is not integrated into the serialization and aggregation
operations, causing loss of I/O seek instrumentation data in distributed
queries. Add handling for dm_io_seek_count in the deserialize() method (around
line 180) to read the value from the tipb::TiFlashScanContext protobuf message,
in the serialize() method (around line 269) to write the value to the protobuf
message, in the merge(const ScanContext&) overload (around line 359) to
aggregate counters from another ScanContext instance, and in the merge(const
tipb::TiFlashScanContext&) overload (around line 454) to aggregate from a
protobuf message. Additionally, verify that the protobuf definition for
tipb::TiFlashScanContext includes a dm_io_seek_count field; if it does not
exist, update the .proto file to add this field before implementing the C++
changes.

---

Outside diff comments:
In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp`:
- Around line 1231-1247: The state transition to Creating occurs before checking
whether exec_context is available. In stream mode where exec_context is nullptr,
this leaves the work stuck in Creating state with no scheduled task to complete
it. Move the exec_context nullptr check to occur before the state transition to
Creating (before line 1235 where reader_work->state =
RNColumnarReaderMaterializeState::Creating is set), so that the function returns
early without changing state when there is no pipeline executor available.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cdb4af6f-e15d-48cc-8056-f6c6a72161f6

📥 Commits

Reviewing files that changed from the base of the PR and between 044990a and bb2b137.

📒 Files selected for processing (16)

.gitignore
dbms/CMakeLists.txt
dbms/src/Storages/Columnar/ColumnarReadSourceOp.cpp
dbms/src/Storages/Columnar/ColumnarReadSourceOp.h
dbms/src/Storages/Columnar/ColumnarSourceOp.cpp
dbms/src/Storages/Columnar/ColumnarSourceOp.h
dbms/src/Storages/Columnar/PrefetchColumnarReaderTask.cpp
dbms/src/Storages/Columnar/PrefetchColumnarReaderTask.h
dbms/src/Storages/DeltaMerge/File/DMFileReader.cpp
dbms/src/Storages/DeltaMerge/ScanContext.cpp
dbms/src/Storages/DeltaMerge/ScanContext.h
dbms/src/Storages/StorageDisaggregatedColumnar.cpp
dbms/src/Storages/StorageDisaggregatedColumnar.h
dbms/src/Storages/tests/gtest_storage_disaggregated_columnar.cpp
docs/design/2026-06-13-columnar-pipeline-producer-consumer-model.md
tests/docker/next-gen-utils/Makefile

coderabbitai · 2026-06-16T14:29:42Z

+            case RNColumnarReaderMaterializeState::NotStarted:
+            case RNColumnarReaderMaterializeState::Creating:
+                current_reader_work->state = RNColumnarReaderMaterializeState::Creating;
+                should_materialize = true;
+                break;
+            }
+        }
+
+        if (taken_reader.has_value())
+        {
+            consumeReadyReader(std::move(taken_reader.value()));
+            return OperatorStatus::IO_IN;
+        }
+        if (should_materialize)
+            return OperatorStatus::IO_IN;


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Avoid double materialization when a work item is already Creating.

At lines 177 and 227, Creating is handled as “materialize inline”, so a prefetched work item can be materialized twice concurrently (once by the prefetch task and once by the source IO path). This duplicates remote reader creation, and the result of whichever creation loses the race can be dropped.

Suggested direction

// In awaitImpl(), NEED_READER branch:
-    case RNColumnarReaderMaterializeState::NotStarted:
-    case RNColumnarReaderMaterializeState::Creating:
-        current_reader_work->state = RNColumnarReaderMaterializeState::Creating;
-        should_materialize = true;
-        break;
+    case RNColumnarReaderMaterializeState::NotStarted:
+        current_reader_work->state = RNColumnarReaderMaterializeState::Creating;
+        should_materialize = true;
+        break;
+    case RNColumnarReaderMaterializeState::Creating:
+        state = ColumnarReadSourceState::WAIT_READER;
+        setNotifyFuture(&current_reader_work->notify_future);
+        return OperatorStatus::WAIT_FOR_NOTIFY;

// In executeIOImpl(), NEED_READER/WAIT_READER branch:
// only inline-create when this operator owns the NotStarted -> Creating transition.

Also applies to: 227-236
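
The single-owner rule behind this suggestion can be illustrated in isolation. The sketch below is a hypothetical model, not the PR's code: `ToyReaderWork`, `Claim`, and `tryClaim` are invented names, and only the four state names mirror `RNColumnarReaderMaterializeState` as described in this review. The idea is that whoever performs the `NotStarted -> Creating` transition under the mutex owns materialization, and every later caller that observes `Creating` waits instead of creating a second reader.

```cpp
#include <cassert>
#include <mutex>

// State names follow the review text; the types themselves are hypothetical.
enum class MaterializeState { NotStarted, Creating, Ready, Failed };

// What a caller should do after inspecting the work item.
enum class Claim { Owner, WaitForOther, AlreadyReady, AlreadyFailed };

struct ToyReaderWork
{
    std::mutex mutex;
    MaterializeState state = MaterializeState::NotStarted;
};

// Only the caller that moves NotStarted -> Creating becomes the owner.
// A caller that sees Creating must wait for the owner's notification.
Claim tryClaim(ToyReaderWork & work)
{
    std::lock_guard<std::mutex> guard(work.mutex);
    switch (work.state)
    {
    case MaterializeState::NotStarted:
        work.state = MaterializeState::Creating;
        return Claim::Owner;
    case MaterializeState::Creating:
        return Claim::WaitForOther;
    case MaterializeState::Ready:
        return Claim::AlreadyReady;
    case MaterializeState::Failed:
        return Claim::AlreadyFailed;
    }
    return Claim::AlreadyFailed; // unreachable, keeps compilers quiet
}
```

With this shape, a prefetch task and a source operator racing on the same work item cannot both receive `Claim::Owner`, so at most one reader is created per work item.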

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dbms/src/Storages/Columnar/ColumnarReadSourceOp.cpp` around lines 176 - 190, The issue is that both NotStarted and Creating states are triggering inline materialization, causing concurrent reader creation in prefetch tasks and source IO paths. Fix this by only materializing when the state is NotStarted, not when it's already Creating (which indicates materialization is already in progress). In the switch statement around line 177 in ColumnarReadSourceOp.cpp, remove the Creating case from triggering should_materialize or only set should_materialize = true for the NotStarted case. Apply the same logic fix at the sibling location around lines 227-236 to ensure Creating state is not treated as a trigger for inline materialization at any point in the code.

coderabbitai · 2026-06-16T14:29:42Z

    std::atomic<uint64_t> dmfile_lm_filter_scanned_rows{0};
    std::atomic<uint64_t> dmfile_lm_filter_skipped_rows{0};
    std::atomic<uint64_t> total_dmfile_read_time_ns{0};
+    std::atomic<uint64_t> dm_io_seek_count{0};


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify protobuf contract for dm_io_seek_count
# Look for the .proto definition file and check if dm_io_seek_count field exists

# Search for .proto files that define TiFlashScanContext
find . -name "*.proto" -type f | xargs grep -l "TiFlashScanContext" 2>/dev/null

# Then check if dm_io_seek_count is defined in those files
fd -e proto | xargs grep -n "dm_io_seek_count" 2>/dev/null

Repository: pingcap/tiflash

Length of output: 43

🏁 Script executed:

#!/bin/bash
# First, locate the ScanContext files
find . -name "ScanContext.*" -type f

Repository: pingcap/tiflash

Length of output: 151

🏁 Script executed:

#!/bin/bash
# Find .proto files that might define TiFlashScanContext
find . -name "*.proto" -type f | head -20

Repository: pingcap/tiflash

Length of output: 1045

🏁 Script executed:

#!/bin/bash
# Search for TiFlashScanContext in the entire codebase
rg "TiFlashScanContext" --type cpp --type h -l

Repository: pingcap/tiflash

Length of output: 147

🏁 Script executed:

#!/bin/bash
# Search for TiFlashScanContext in .proto files
rg "TiFlashScanContext" --type proto

Repository: pingcap/tiflash

Length of output: 89

🏁 Script executed:

#!/bin/bash
# Read the ScanContext.h file to check line 51 and the serialize/deserialize/merge methods
wc -l ./dbms/src/Storages/DeltaMerge/ScanContext.h

Repository: pingcap/tiflash

Length of output: 108

🏁 Script executed:

#!/bin/bash
# Look for proto files that might contain ScanContext
find . -name "*.proto" -type f -exec grep -l "ScanContext\|scan_context" {} \;

Repository: pingcap/tiflash

Length of output: 41

🏁 Script executed:

#!/bin/bash
# Read the ScanContext.h file - focus on the key sections
cat -n ./dbms/src/Storages/DeltaMerge/ScanContext.h | head -100

Repository: pingcap/tiflash

Length of output: 4890

🏁 Script executed:

#!/bin/bash
# Check around line 51 where the new member is added
cat -n ./dbms/src/Storages/DeltaMerge/ScanContext.h | sed -n '40,70p'

Repository: pingcap/tiflash

Length of output: 1742

🏁 Script executed:

#!/bin/bash
# Check the serialize/deserialize methods around the mentioned lines
cat -n ./dbms/src/Storages/DeltaMerge/ScanContext.h | sed -n '170,190p'

Repository: pingcap/tiflash

Length of output: 1404

🏁 Script executed:

#!/bin/bash
# Check the serialize method around line 269
cat -n ./dbms/src/Storages/DeltaMerge/ScanContext.h | sed -n '260,290p'

Repository: pingcap/tiflash

Length of output: 2579

🏁 Script executed:

#!/bin/bash
# Check the merge methods around lines 359 and 454
cat -n ./dbms/src/Storages/DeltaMerge/ScanContext.h | sed -n '350,370p'

Repository: pingcap/tiflash

Length of output: 1460

🏁 Script executed:

#!/bin/bash
# Check the second merge method around line 454
cat -n ./dbms/src/Storages/DeltaMerge/ScanContext.h | sed -n '445,465p'

Repository: pingcap/tiflash

Length of output: 1547

🏁 Script executed:

#!/bin/bash
# Search for protobuf definitions in tipb directory
find . -name "*.proto" -path "*/tipb/*" -type f

Repository: pingcap/tiflash

Length of output: 41

🏁 Script executed:

#!/bin/bash
# Search for tipb proto files more broadly
find . -name "executor.proto" -o -name "tipb.proto" -o -name "*tipb*.proto" 2>/dev/null | head -20

Repository: pingcap/tiflash

Length of output: 41

🏁 Script executed:

#!/bin/bash
# Check if tipb is an external dependency - search for it in includes
grep -r "tipb/executor" --include="*.h" --include="*.hpp" | head -5

Repository: pingcap/tiflash

Length of output: 442

🏁 Script executed:

#!/bin/bash
# Check the entire deserialize method to see all fields being deserialized
cat -n ./dbms/src/Storages/DeltaMerge/ScanContext.h | sed -n '180,268p'

Repository: pingcap/tiflash

Length of output: 7267

🏁 Script executed:

#!/bin/bash
# Check the entire serialize method to see all fields being serialized
cat -n ./dbms/src/Storages/DeltaMerge/ScanContext.h | sed -n '269,357p'

Repository: pingcap/tiflash

Length of output: 7396

🏁 Script executed:

#!/bin/bash
# Get the complete merge(const ScanContext&) method
cat -n ./dbms/src/Storages/DeltaMerge/ScanContext.h | sed -n '359,453p'

Repository: pingcap/tiflash

Length of output: 5988

🏁 Script executed:

#!/bin/bash
# Get the complete merge(const tipb::TiFlashScanContext&) method
cat -n ./dbms/src/Storages/DeltaMerge/ScanContext.h | sed -n '454,584p'

Repository: pingcap/tiflash

Length of output: 8409

dm_io_seek_count member not wired into serialize/deserialize/merge operations.

The new dm_io_seek_count counter added to the header is missing from deserialize(), serialize(), and both merge() methods. This prevents the counter from being transmitted between instances, aggregated in distributed queries, or restored from remote snapshots, resulting in loss of I/O seek instrumentation data.

Add dm_io_seek_count handling to:

deserialize() (line 180): deserialize from tipb::TiFlashScanContext

serialize() (line 269): serialize to tipb::TiFlashScanContext

merge(const ScanContext&) (line 359): aggregate the counter

merge(const tipb::TiFlashScanContext&) (line 454): aggregate from proto

Also verify that the protobuf definition (tipb::TiFlashScanContext) includes a dm_io_seek_count field. If not, the .proto file must be updated.
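
As a rough illustration of what wiring the counter through all four paths could look like, here is a self-contained sketch. Everything in it is an assumption made for the example: `ToyScanContextPB` is a plain struct standing in for `tipb::TiFlashScanContext` (whose real field name and accessors are not confirmed by this review), and `ToyScanContext` is a hypothetical one-field model of `ScanContext`, including the method signatures.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the protobuf message; the real message would
// expose generated getters/setters, and its field may be named differently.
struct ToyScanContextPB
{
    uint64_t dm_io_seek_count = 0;
};

// Hypothetical one-field model of ScanContext.
struct ToyScanContext
{
    std::atomic<uint64_t> dm_io_seek_count{0};

    // Restore the counter from a received message (overwrites local value).
    void deserialize(const ToyScanContextPB & pb)
    {
        dm_io_seek_count = pb.dm_io_seek_count;
    }

    // Copy the counter into an outgoing message.
    ToyScanContextPB serialize() const
    {
        ToyScanContextPB pb;
        pb.dm_io_seek_count = dm_io_seek_count.load();
        return pb;
    }

    // Aggregate from another in-process context.
    void merge(const ToyScanContext & other)
    {
        dm_io_seek_count += other.dm_io_seek_count.load();
    }

    // Aggregate from a message received from a remote node.
    void merge(const ToyScanContextPB & pb)
    {
        dm_io_seek_count += pb.dm_io_seek_count;
    }
};
```

The point of the sketch is only that a counter omitted from any one of these four paths silently reads as zero on the receiving or aggregating side.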

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dbms/src/Storages/DeltaMerge/ScanContext.h` at line 51, The dm_io_seek_count atomic counter member variable is defined in the ScanContext class but is not integrated into the serialization and aggregation operations, causing loss of I/O seek instrumentation data in distributed queries. Add handling for dm_io_seek_count in the deserialize() method (around line 180) to read the value from the tipb::TiFlashScanContext protobuf message, in the serialize() method (around line 269) to write the value to the protobuf message, in the merge(const ScanContext&) overload (around line 359) to aggregate counters from another ScanContext instance, and in the merge(const tipb::TiFlashScanContext&) overload (around line 454) to aggregate from a protobuf message. Additionally, verify that the protobuf definition for tipb::TiFlashScanContext includes a dm_io_seek_count field; if it does not exist, update the .proto file to add this field before implementing the C++ changes.

ti-chi-bot · 2026-06-16T15:35:58Z

@JaySon-Huang: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-unit-next-gen	`bb2b137`	link	true	`/test pull-unit-next-gen`
pull-integration-next-gen-columnar	`bb2b137`	link	true	`/test pull-integration-next-gen-columnar`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

columnar: Producer-Consumer Pipeline Read Model

bb2b137

Signed-off-by: JaySon-Huang <tshent@qq.com>

ti-chi-bot Bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jun 16, 2026

ti-chi-bot Bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jun 16, 2026

coderabbitai Bot reviewed Jun 16, 2026

JaySon-Huang wants to merge 1 commit into pingcap:master from JaySon-Huang:pipeline_col
