feat: clustered segments pt.2 (write side support) by clintropolis · Pull Request #19579 · apache/druid

clintropolis · 2026-06-13T18:50:21Z

Description

Follow-up to #19460, this PR introduces the writer side stuff so that the segments can actually be created.

As part of this I'm experimenting with a new V10 oriented way to express DataSchema, where the ingest spec looks a lot more like how the V10 segment metadata is organized, as 1 or more projections. Using the wikipedia example, a clustered segment ingest spec looks something like this:

    "dataSchema": {
      "dataSource": "wikipedia_clustered",
      "segmentGranularitySpec": {
        "segmentGranularity": "day"
      },
      "timestampSpec": {
        "column": "timestamp",
        "format": "iso"
      },
      "baseTable": {
        "type": "clusteredValueGroups",
        "virtualColumns": [
          {
            "type": "expression",
            "name": "__virtualGranularity",
            "expression": "timestamp_floor(__time,'PT1H')",
            "outputType": "LONG"
          }
        ],
        "clusteringColumns": [
          "channel"
        ],
        "dimensions": [
          "page",
          "comment"
...
        ]
      }
    }

Like projections, it requires expressing the query granularity as a virtual column transformation of __time. This is a bit more fragile than expressions, since right now it requires the column to be named __virtualGranularity; I'm thinking about changing this to 'resolve' it like we do for projections, or something else, not sure yet. To support this, when using baseTable there is also a segmentGranularity field which splits out the segment granularity parts of the existing GranularitySpec. To be less disruptive for now, these can be computed into a GranularitySpec, but over time I'd like to migrate to this model of expressing the schema.
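For a concrete sense of what the __virtualGranularity expression in the spec above computes per row, here is a minimal Java sketch that floors a millisecond timestamp to the start of its UTC hour. It uses plain java.time rather than Druid's expression engine, and HourFloorSketch and floorToHour are hypothetical names, not part of this PR.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class HourFloorSketch {
    // Floors an epoch-millis timestamp to the start of its hour in UTC,
    // which is the effect of an hourly (PT1H) floor on __time in UTC.
    static long floorToHour(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis)
                      .truncatedTo(ChronoUnit.HOURS)
                      .toEpochMilli();
    }

    public static void main(String[] args) {
        // One hour plus 1234 ms after the epoch floors to exactly one hour.
        System.out.println(floorToHour(3_600_000L + 1_234L)); // prints 3600000
    }
}
```

Rows whose __time values fall in the same hour get the same floored value, which is what makes it usable as a query granularity.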

Includes the fix part of #19578 since this is the branch where I ran into that problem.

changes:

adds BaseTableProjectionSpec interface to capture the operator facing shape of V10 base table schemas
adds ClusteredValueGroupsBaseTableSchema implementation for ingesting clustered segments
adds DataSchema.baseTable, a BaseTableProjectionSpec which when set puts the DataSchema into a new mode where the majority of the schema is defined via the baseTable, rejecting other top level fields
adds DataSchema.segmentGranularity to use when baseTable is set, which captures the segment granularity and intervals (query granularity is defined by the baseTable)
adds AdaptedBaseTableProjectionSpec implementation for converting classic DataSchema fields to a BaseTableProjectionSpec
adds OnHeapClusteredBaseTable, OnHeapClusterGroup used by OnheapIncrementalIndex to build clustered segments
adds IndexMergerV10.makeClusteredIndexFiles which merges and builds clustered v10 segments
Sink/BatchAppenderator/StreamAppenderator wiring for clustered segments so that the cluster group tuples appear on the DataSegment
known issues: unbounded, no aggregate projections, no compaction support, no time ordered cursor support
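The mode switch described in the list above (when baseTable is set, other top level fields are rejected) could look roughly like the following. This is only a sketch of the idea: the class, the method names, and the particular set of rejected fields (dimensionsSpec, metricsSpec, granularitySpec) are assumptions for illustration, not the actual DataSchema code in this PR.

```java
import java.util.ArrayList;
import java.util.List;

public class BaseTableModeSketch {
    // Returns the names of classic top level fields that were set together
    // with baseTable. The field list here is an assumption, not Druid's.
    static List<String> conflictingFields(
        Object baseTable,
        Object dimensionsSpec,
        Object metricsSpec,
        Object granularitySpec
    ) {
        final List<String> conflicts = new ArrayList<>();
        if (baseTable == null) {
            return conflicts; // classic mode, nothing to reject
        }
        if (dimensionsSpec != null) {
            conflicts.add("dimensionsSpec");
        }
        if (metricsSpec != null) {
            conflicts.add("metricsSpec");
        }
        if (granularitySpec != null) {
            conflicts.add("granularitySpec");
        }
        return conflicts;
    }

    // Fails fast when baseTable is mixed with classic fields.
    static void validate(Object baseTable, Object dimensionsSpec, Object metricsSpec, Object granularitySpec) {
        final List<String> conflicts = conflictingFields(baseTable, dimensionsSpec, metricsSpec, granularitySpec);
        if (!conflicts.isEmpty()) {
            throw new IllegalArgumentException("Cannot set " + conflicts + " when baseTable is set");
        }
    }

    public static void main(String[] args) {
        System.out.println(conflictingFields(new Object(), new Object(), null, null)); // prints [dimensionsSpec]
    }
}
```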


FrankChen021

Severity	Findings
P0	0
P1	0
P2	1
P3	0
Total	1

Reviewed 53 of 53 changed files.

This is an automated review by Codex GPT-5.5

FrankChen021 · 2026-06-15T12:22:12Z

+    this.metrics = metrics == null ? new AggregatorFactory[0] : metrics;
+    validateNoOverlap(this.clusteringColumns, this.nonClusteringDimensions);
+    this.dimensionsSpec = computeDimensionsSpec(this.clusteringColumns, this.nonClusteringDimensions);
+    this.ordering = declaredOrdering != null


[P2] Declared ordering can be advertised without being honored

The spec accepts any declaredOrdering here, but the clustered write path does not use it to build the actual per-group row comparator: OnHeapClusterGroup derives ordering from timePosition/DimensionsSpec, while IndexMergerV10 emits groups in ascending tuple order and then stores firstSchema.getOrdering() in metadata. A spec such as [tenant ASC, page ASC, __time ASC], or any DESC order, can therefore be persisted with advertised ordering that the data does not actually satisfy. Query engines may trust CursorHolder/DataSegment ordering and skip sorting, producing misordered results. Please reject unsupported ordering or wire the declared order through ingestion and merge.

clintropolis · Jun 15, 2026

ah this is fair, I should probably make it purely computed for now like AggregateProjectionSpec, since it's a bit of work to actually wire this up to honor it
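To make the risk in this thread concrete, here is a small self-contained Java sketch that checks whether a list of rows satisfies a given ordering. The Row class, both comparators, and the sample data are hypothetical; in particular, the "written" order (tenant, then time, then page) is just an assumed stand-in for whatever the write path produces, not the actual OnHeapClusterGroup or IndexMergerV10 behavior.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DeclaredOrderingSketch {
    // Hypothetical row shape for illustration, not a Druid class.
    static final class Row {
        final String tenant;
        final String page;
        final long time;

        Row(String tenant, String page, long time) {
            this.tenant = tenant;
            this.page = page;
            this.time = time;
        }
    }

    // Assumed order the data is actually written in: tenant, time, page.
    static final Comparator<Row> WRITTEN =
        Comparator.comparing((Row r) -> r.tenant)
                  .thenComparingLong(r -> r.time)
                  .thenComparing(r -> r.page);

    // Order a spec might declare: tenant, page, time.
    static final Comparator<Row> DECLARED =
        Comparator.comparing((Row r) -> r.tenant)
                  .thenComparing(r -> r.page)
                  .thenComparingLong(r -> r.time);

    // True when every adjacent pair of rows respects the ordering.
    static boolean isSortedBy(List<Row> rows, Comparator<Row> ordering) {
        for (int i = 1; i < rows.size(); i++) {
            if (ordering.compare(rows.get(i - 1), rows.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    static List<Row> sample() {
        return Arrays.asList(new Row("a", "z", 1L), new Row("a", "b", 2L));
    }

    public static void main(String[] args) {
        System.out.println(isSortedBy(sample(), WRITTEN));  // prints true
        System.out.println(isSortedBy(sample(), DECLARED)); // prints false
    }
}
```

A reader that trusts the declared ordering and skips its own sort would return these two rows in the wrong order, which is why either rejecting the declared ordering or computing it from what the writer actually does closes the gap.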


github-actions Bot added Area - Segment Format and Ser/De Area - Ingestion labels Jun 13, 2026

github-advanced-security AI found potential problems Jun 13, 2026


Comment thread processing/src/main/java/org/apache/druid/segment/incremental/OnHeapClusterGroup.java Fixed

clintropolis added 3 commits June 13, 2026 12:46

cleanup tests

6093a5d

fix test

24a46e8

tidy

d45fbd6

FrankChen021 reviewed Jun 15, 2026


clintropolis added 2 commits June 15, 2026 10:29

Merge remote-tracking branch 'upstream/master' into clustered-segment-writer-stuff

c47229b

cleanup spec, drop metrics, tidy up

7a5397b

github-advanced-security AI found potential problems Jun 16, 2026


Comment thread processing/src/main/java/org/apache/druid/segment/incremental/OnHeapClusteredBaseTable.java

    {
      final int numClusteringColumns = clusteringColumns.size();
      return OnheapIncrementalIndex.ROUGH_OVERHEAD_PER_MAP_ENTRY
             + (long) numClusteringColumns * (Long.BYTES * 2 + Long.BYTES)
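As a worked example of the visible part of this estimate only (the snippet is cut off, so any further terms are not covered), the following Java sketch computes the map entry overhead plus 24 bytes per clustering column. The class and method names are hypothetical, and mapEntryOverhead is a caller-supplied stand-in for OnheapIncrementalIndex.ROUGH_OVERHEAD_PER_MAP_ENTRY, whose value is not shown on this page.

```java
public class ClusterGroupOverheadSketch {
    // Bytes counted per clustering column in the visible expression:
    // Long.BYTES * 2 + Long.BYTES = 24.
    static final long BYTES_PER_CLUSTERING_COLUMN = Long.BYTES * 2 + Long.BYTES;

    // mapEntryOverhead stands in for ROUGH_OVERHEAD_PER_MAP_ENTRY (assumed input).
    // The int count is widened to long before multiplying, as in the snippet,
    // so the product cannot overflow int arithmetic.
    static long visibleEstimate(long mapEntryOverhead, int numClusteringColumns) {
        return mapEntryOverhead + (long) numClusteringColumns * BYTES_PER_CLUSTERING_COLUMN;
    }

    public static void main(String[] args) {
        // Two clustering columns with a zero placeholder overhead: 2 * 24 bytes.
        System.out.println(visibleEstimate(0L, 2)); // prints 48
    }
}
```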

clintropolis wants to merge 6 commits into apache:master from clintropolis:clustered-segment-writer-stuff


3 participants
