
Parquet add no convert marker #7625

Open
siddarth2810 wants to merge 3 commits into cortexproject:master from siddarth2810:parquet-add-no-convert-marker



@siddarth2810 siddarth2810 commented Jun 16, 2026

Contributor

What this PR does:
If a TSDB block has more distinct label names than a configurable threshold, the converter writes a parquet-no-convert-mark.json marker and skips the block instead of converting it.

  • Added no-convert marker with read/write logic
  • Added parquet-converter.max-block-label-names limit
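As a rough illustration of the behavior described above, here is a minimal sketch of a marker payload and the threshold check. The type `NoConvertMark`, its JSON field names, the helpers `exceedsLimit` and `newNoConvertMark`, and the wording of the reason string are all hypothetical and are not taken from the PR's code. The sketch assumes a limit of 0 means the check is disabled, and it leaves out the buffer for system columns and generated data columns.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NoConvertMark is a hypothetical shape for parquet-no-convert-mark.json.
// The field names are assumptions for illustration, not the PR's schema.
type NoConvertMark struct {
	Version int    `json:"version"`
	Reason  string `json:"reason"`
}

// exceedsLimit reports whether a block has more label names than the limit.
// A limit of 0 is treated as "disabled" (an assumption of this sketch).
func exceedsLimit(labelNames, limit int) bool {
	return limit > 0 && labelNames > limit
}

// newNoConvertMark embeds the label-name count and the threshold in Reason,
// rather than storing them as separate fields.
func newNoConvertMark(labelNames, limit int) NoConvertMark {
	return NoConvertMark{
		Version: 1,
		Reason:  fmt.Sprintf("block has %d label names, above the limit of %d", labelNames, limit),
	}
}

func main() {
	const limit = 500
	labelNames := 512

	if exceedsLimit(labelNames, limit) {
		// Serialize the marker the way it might be written next to the block.
		data, err := json.Marshal(newNoConvertMark(labelNames, limit))
		if err != nil {
			panic(err)
		}
		fmt.Println(string(data))

		// Read it back to show the round trip.
		var back NoConvertMark
		if err := json.Unmarshal(data, &back); err != nil {
			panic(err)
		}
		fmt.Println(back.Reason)
	}
}
```

Keeping the count and threshold inside `Reason` keeps the marker schema small, at the cost of having to parse the string if a tool later needs the numbers.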

Which issue(s) this PR fixes:
Fixes #7195

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
  • docs/configuration/v1-guarantees.md updated if this PR introduces experimental flags

Notes from the Previous PR review

  • Removed the incorrect log under cortex_parquet.ValidConverterMarkVersion
  • Embedded LabelNamesCount and Threshold into Reason field
  • Added buffer for system columns and generated data columns
  • Cleaned up skipped-block metrics when deleting tenant metrics

Changes:
- Add parquet no-convert marker and read/write logic
- Add max-block-label-names limit; blocks exceeding it get a no-convert marker instead of being converted.
- Add parquet_converter_max_block_label_names to exporter test
- Add integration test for parquet no-convert marker

Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
The converter only read no-convert markers when the label-name limit was
enabled, so manually marked blocks were still converted when the limit was 0.
Read the marker unconditionally before conversion so these blocks stay skipped.

Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
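
The ordering fix in this commit can be sketched as a small decision function: check for an existing marker first, and only then apply the label-name limit. The names `decision`, `decide`, and the three constants are hypothetical and exist only for this illustration. The converter's real control flow is not shown in this PR description.

```go
package main

import "fmt"

// decision names what the converter does with a single block.
type decision string

const (
	skipMarked  decision = "skip: no-convert marker already present"
	skipAndMark decision = "skip: write no-convert marker"
	convert     decision = "convert"
)

// decide is a hypothetical helper. hasMark says whether a no-convert marker
// already exists for the block; limit == 0 means the limit is disabled.
func decide(hasMark bool, labelNames, limit int) decision {
	// Honor an existing marker unconditionally, even when the limit is 0.
	// Before the fix, this check was only reached when limit > 0.
	if hasMark {
		return skipMarked
	}
	if limit > 0 && labelNames > limit {
		return skipAndMark
	}
	return convert
}

func main() {
	fmt.Println(decide(true, 10, 0))     // manually marked block, limit disabled
	fmt.Println(decide(false, 600, 500)) // too many label names
	fmt.Println(decide(false, 600, 0))   // limit disabled, no marker
}
```

With the marker check placed first, an operator can mark a block by hand and have it stay skipped regardless of how the limit is configured.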
@siddarth2810 siddarth2810 marked this pull request as ready for review June 16, 2026 12:19

Development

Successfully merging this pull request may close these issues.

[Parquet] Stop converting TSDB block to parquet if it has too many labels
