From 91420dbb0519399564729f590f314c4111ae9930 Mon Sep 17 00:00:00 2001
From: Andy Grove <agrove@apache.org>
Date: Thu, 28 May 2026 08:12:31 -0600
Subject: [PATCH 1/3] docs: lead README with the Arrow-native framing

Rewrite the top two paragraphs of README.md so the value proposition
leads with the Arrow-native pipeline (operators, expressions, shuffle,
and broadcast all stay in Apache Arrow columnar format) rather than
'native Rust implementations'. The accelerator list gains one entry
for the experimental Scala/Java UDF support, and the shuffle bullet and
'What Comet Accelerates' intro are reworded to match.

No other docs are touched in this PR. Contributor-guide and user-guide
prose updates for the same vocabulary clean-up (#4419) will follow
separately.
---
 README.md | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index cba865d96a..4827879671 100644
--- a/README.md
+++ b/README.md
@@ -35,10 +35,12 @@ under the License.
 
 <img src="docs/source/_static/images/DataFusionComet-Logo-Light.png" width="512" alt="logo"/>
 
-Apache DataFusion Comet is a high-performance accelerator for Apache Spark, built on top of the powerful
-[Apache DataFusion] query engine. Comet is designed to significantly enhance the
-performance of Apache Spark workloads while leveraging commodity hardware and seamlessly integrating with the
-Spark ecosystem without requiring any code changes.
+Apache DataFusion Comet is a high-performance accelerator for Apache Spark. Comet keeps Spark queries
+**Arrow-native end-to-end**: operators, expressions, shuffle, and broadcast all stay in Apache Arrow
+columnar format, avoiding the per-row overhead of Spark's row-based engine. Within the Arrow-native
+pipeline, operators and expressions execute as Rust code (via the [Apache DataFusion] query engine)
+or as JVM code that operates directly on Arrow batches. Comet integrates with the Spark ecosystem
+without requiring any code changes.
 
 **Comet provides a ~2x speedup for TPC-DS @ SF 1000 (1TB), resulting in ~50% cost savings.**
 
@@ -58,17 +60,22 @@ See the [Comet Benchmarking Guide](https://datafusion.apache.org/comet/contribut
 
 ## What Comet Accelerates
 
-Comet replaces Spark operators and expressions with native Rust implementations that run on Apache DataFusion.
-It uses Apache Arrow for zero-copy data transfer between the JVM and native code.
+Comet replaces Spark operators and expressions with implementations that consume and produce Apache Arrow
+batches. Most run as native Rust code on top of Apache DataFusion; some run as JVM code over Arrow batches.
+Either way the work stays in the Comet pipeline without falling back to Spark's row-based engine.
 
 - **Parquet scans**: native Parquet reader integrated with Spark's query planner
 - **Apache Iceberg**: accelerated Parquet scans when reading Iceberg tables from Spark
   (see the [Iceberg guide](https://datafusion.apache.org/comet/user-guide/iceberg.html))
-- **Shuffle**: native columnar shuffle with support for hash and range partitioning
+- **Shuffle**: Arrow-IPC columnar shuffle with support for hash and range partitioning, in a native Rust
+  implementation paired with a JVM fallback for unsupported partition key types
 - **Expressions**: hundreds of supported Spark expressions across math, string, datetime, array,
   map, JSON, hash, and predicate categories
 - **Aggregations**: hash aggregate with support for `FILTER (WHERE ...)` clauses
 - **Joins**: hash join, sort-merge join, and broadcast join
+- **Scala/Java UDFs**: experimental support for keeping Scala/Java scalar UDFs in the Comet pipeline
+  via Spark's whole-stage codegen (see the
+  [Scala UDF guide](https://datafusion.apache.org/comet/user-guide/scala_java_udfs.html))
 
 For the authoritative lists, see the [supported expressions](https://datafusion.apache.org/comet/user-guide/expressions.html)
 and [supported operators](https://datafusion.apache.org/comet/user-guide/operators.html) pages.

From 235f4a53d002c802c12fdb88c4520c2c12841148 Mon Sep 17 00:00:00 2001
From: Andy Grove <agrove@apache.org>
Date: Mon, 1 Jun 2026 08:05:47 -0600
Subject: [PATCH 2/3] docs: reword 'Either way' sentence in README

Co-authored-by: Matt Butrovich <mbutrovich@users.noreply.github.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 4827879671..a61e1332b7 100644
--- a/README.md
+++ b/README.md
@@ -62,7 +62,7 @@ See the [Comet Benchmarking Guide](https://datafusion.apache.org/comet/contribut
 
 Comet replaces Spark operators and expressions with implementations that consume and produce Apache Arrow
 batches. Most run as native Rust code on top of Apache DataFusion; some run as JVM code over Arrow batches.
-Either way the work stays in the Comet pipeline without falling back to Spark's row-based engine.
+Either way, query execution stays in the Comet pipeline without falling back to Spark's row-based engine.
 
 - **Parquet scans**: native Parquet reader integrated with Spark's query planner
 - **Apache Iceberg**: accelerated Parquet scans when reading Iceberg tables from Spark

From 7f03711c9d6c3223e4f3ba50272b943a83542d56 Mon Sep 17 00:00:00 2001
From: Andy Grove <agrove@apache.org>
Date: Mon, 1 Jun 2026 10:33:45 -0600
Subject: [PATCH 3/3] docs: drop 'experimental' from Scala/Java UDF bullet

Now that #4514 (JVM Scala UDF codegen dispatch enabled by default) has
merged, Scala/Java UDF support is no longer experimental.
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a61e1332b7..c6b5bef01b 100644
--- a/README.md
+++ b/README.md
@@ -73,7 +73,7 @@ Either way, query execution stays in the Comet pipeline without falling back to
   map, JSON, hash, and predicate categories
 - **Aggregations**: hash aggregate with support for `FILTER (WHERE ...)` clauses
 - **Joins**: hash join, sort-merge join, and broadcast join
-- **Scala/Java UDFs**: experimental support for keeping Scala/Java scalar UDFs in the Comet pipeline
+- **Scala/Java UDFs**: support for keeping Scala/Java scalar UDFs in the Comet pipeline
   via Spark's whole-stage codegen (see the
   [Scala UDF guide](https://datafusion.apache.org/comet/user-guide/scala_java_udfs.html))