s3: document format=parquet option and page-level compression#2591

Open
rituparnakhaund wants to merge 1 commit into fluent:master from rituparnakhaund:docs/s3-parquet-format-compression
rituparnakhaund commented May 29, 2026

Update S3 output plugin documentation to reflect the new format=parquet option that separates output format selection from byte-level compression.

Documents:

  • New parquet value for the format option
  • Page-level compression codec control via compression when format is parquet
  • Migration path from deprecated compression=parquet syntax
  • Configuration examples with and without page-level compression
  • Updated existing parquet examples to use new syntax

Related code PR: fluent/fluent-bit#11885
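
As a rough illustration of the split described above, a YAML configuration might look like the sketch below. The `format` and `compression` keys come from this description and `use_put_object` from the review walkthrough; the bucket name, region, and match pattern are placeholder assumptions, and the accepted values should be checked against the updated `pipeline/outputs/s3.md`.

```yaml
# Sketch only: bucket, region, and match are hypothetical placeholders.
pipeline:
  outputs:
    - name: s3
      match: '*'
      bucket: my-example-bucket
      region: us-east-1
      use_put_object: true    # the walkthrough notes Parquet output requires this
      format: parquet         # selects the output format
      compression: snappy     # with format: parquet, sets the page-level codec
```

Keeping the two keys separate means the output format no longer has to be expressed through `compression`.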

Summary by CodeRabbit

  • Documentation
    • Updated S3 output documentation with new compression options including zstd, arrow, and parquet.
    • Documented new format: parquet parameter for Apache Parquet output with codec mappings and requirements.
    • Clarified Arrow format guidance and deprecated compression=parquet syntax in favor of format: parquet.
    • Updated configuration examples to reflect new format and compression options.

Signed-off-by: Rituparna Khaund <ritukhau@amazon.co.uk>
coderabbitai Bot commented May 29, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 95c3f588-d384-4667-815b-0e6ce4d04cb0

📥 Commits

Reviewing files that changed from the base of the PR and between 82ab6af and efda012.

📒 Files selected for processing (1)
  • pipeline/outputs/s3.md

📝 Walkthrough

This PR updates the S3 output documentation to comprehensively describe Parquet format capabilities. The compression parameter table is expanded to list new values, followed by a detailed Parquet format section with codec mappings, configuration examples, migration guidance, and build requirements. Arrow format description is clarified, and test examples are updated to use the new format parameter syntax.

Changes

S3 Output Parquet Format Documentation

| Layer / File(s) | Summary |
| --- | --- |
| Compression and format parameter documentation (pipeline/outputs/s3.md) | Parameter table updated to document compression support for zstd, arrow, and parquet, and format parameter now includes parquet with note that page-level codec is controlled by compression. |
| Parquet format section, examples, and format guidance (pipeline/outputs/s3.md) | Introduces comprehensive Parquet format section documenting format parquet behavior, Parquet page codec mappings for compression values, use_put_object On requirement, YAML and fluent-bit.conf examples, migration from deprecated compression=parquet, and build requirements. Arrow description clarified to specify compression: arrow produces Apache Arrow (Feather) output. Test configuration examples updated to use format: parquet and compression: snappy. |
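
The migration mentioned in the summary might amount to something like the following sketch. Only the deprecated `compression: parquet` form and the new `format: parquet` key are taken from this PR. The codec that the old syntax mapped to is not stated here, so `snappy` is an illustrative choice borrowed from the updated test examples, not a claimed equivalent. The bucket, region, and match values are hypothetical.

```yaml
# Deprecated syntax (per this PR): output format chosen through the compression key.
#   - name: s3
#     compression: parquet

# New syntax: output format and page-level codec are set separately.
pipeline:
  outputs:
    - name: s3
      match: '*'                  # placeholder
      bucket: my-example-bucket   # hypothetical
      region: us-east-1           # hypothetical
      use_put_object: true        # listed as a requirement in the summary
      format: parquet
      compression: snappy         # illustrative codec, see note above
```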

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • fluent/fluent-bit-docs#2111: Updates pipeline/outputs/s3.md with Parquet codec mapping, build requirements, and example configurations—directly overlapping with this PR's Parquet documentation.
  • fluent/fluent-bit-docs#2570: Modifies the same S3 output documentation around the format parameter, including Parquet and related format options.
  • fluent/fluent-bit-docs#2359: Updates pipeline/outputs/s3.md compression option documentation with Parquet/Arrow prerequisites.

Suggested labels

5.0

Suggested reviewers

  • eschabell
  • patrick-stephens

Poem

🐰 New formats bloom where data flows,
From Arrow wings to Parquet rows,
With compression maps and codec guides,
Our S3 docs now open wide! 📦✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: documenting the new format=parquet option and its page-level compression control, which aligns with the core objective of the pull request. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
