chore(audit): audit predicate expressions across Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 #4480

Merged
andygrove merged 1 commit into apache:main from andygrove:worktree-audit-predicate-funcs
May 28, 2026

@andygrove
Member

Which issue does this PR close?

Closes #.

Rationale for this change

Continuation of the per-category expression audit. Same pattern as #4479 (bitwise), #4478 (map), #4476 (hash), #4475 (conditional), #4474 (misc), #4473 (collection), #4470 (json), #4469 (struct), using the updated audit-comet-expression skill in #4468.

What changes are included in this PR?

Support-doc audit notes

Add per-version audit sub-bullets to all 19 supported predicate SQL function names (!, <, <=, <=>, =, ==, >, >=, and, between, ilike, in, isnan, isnotnull, isnull, like, not, or, rlike).

The Spark expression classes are functionally identical across the four versions; the only source difference is the NullIntolerant -> nullIntolerant trait refactor that lands in Spark 4.0, which has no runtime effect. Highlights:

  • ! and == are registry aliases for Not and EqualTo.
  • between is rewritten by the parser to expr >= low AND expr <= high.
  • ilike is RuntimeReplaceable and rewrites to Like(Lower(left), Lower(right)).
  • like and rlike cross-reference the existing string-expressions audit (#4461).
  • CometNot already optimizes a few special cases (Not(EqualTo), Not(EqualNullSafe), Not(In)).
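
The between and ilike rewrites above can be pictured with a small toy model. This is only a sketch under stated assumptions: the node classes borrow the Spark class names mentioned in the bullets, but `rewrite_between`, `rewrite_ilike`, `like_to_regex`, and `evaluate` are hypothetical helpers written for illustration, not Spark or Comet code. `None` stands in for SQL NULL, and LIKE escape characters are ignored.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Lit:
    value: Optional[object]  # None models SQL NULL


@dataclass(frozen=True)
class Lower:
    child: object


@dataclass(frozen=True)
class Like:
    left: object
    right: object


@dataclass(frozen=True)
class GreaterThanOrEqual:
    left: object
    right: object


@dataclass(frozen=True)
class LessThanOrEqual:
    left: object
    right: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


def rewrite_between(expr, low, high):
    # between: expr >= low AND expr <= high
    return And(GreaterThanOrEqual(expr, low), LessThanOrEqual(expr, high))


def rewrite_ilike(left, right):
    # ilike: Like(Lower(left), Lower(right))
    return Like(Lower(left), Lower(right))


def like_to_regex(pattern):
    # Simplified LIKE translation: % matches any run, _ matches one character.
    # Escape characters are deliberately not handled in this sketch.
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def evaluate(e):
    # Evaluate the toy tree with SQL three-valued logic (True, False, None).
    if isinstance(e, Lit):
        return e.value
    if isinstance(e, Lower):
        v = evaluate(e.child)
        return None if v is None else v.lower()
    if isinstance(e, And):
        l, r = evaluate(e.left), evaluate(e.right)
        if l is False or r is False:
            return False  # FALSE dominates NULL
        if l is None or r is None:
            return None
        return True
    l, r = evaluate(e.left), evaluate(e.right)
    if l is None or r is None:
        return None  # comparisons and LIKE return NULL on NULL input
    if isinstance(e, GreaterThanOrEqual):
        return l >= r
    if isinstance(e, LessThanOrEqual):
        return l <= r
    if isinstance(e, Like):
        return re.fullmatch(like_to_regex(r), l, re.DOTALL) is not None
    raise TypeError(f"unsupported node: {e!r}")
```

For example, `evaluate(rewrite_between(Lit(5), Lit(1), Lit(10)))` evaluates the rewritten conjunction rather than a dedicated between node. A NULL bound produces NULL unless the other comparison is already false.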

Support-level consistency fixes

None. The 12 backing serdes were already clean.

Tracking issues filed for follow-up

None.

Audit process

Audited directly using the audit-comet-expression skill (4 Spark versions per #4468). With only twelve serdes to cover, no parallel subagents were needed.

How are these changes tested?

  • make core succeeds (no code changes; doc only).
  • Existing predicate test coverage in CometExpressionSuite and the various SQL-file suites remains unchanged.

…4.0.1, 4.1.1

Add per-version audit sub-bullets to all 19 supported predicate SQL
function names (`!`, `<`, `<=`, `<=>`, `=`, `==`, `>`, `>=`,
`and`, `between`, `ilike`, `in`, `isnan`, `isnotnull`,
`isnull`, `like`, `not`, `or`, `rlike`) in
`docs/source/contributor-guide/spark_expressions_support.md`.

The Spark expression classes are byte-for-byte identical across the
four versions; only the `NullIntolerant` -> `nullIntolerant` trait
refactor lands in Spark 4.0, with no runtime change. `!` and `==` are
registry aliases for `Not` and `EqualTo`. `between` is rewritten by
the parser to `expr >= low AND expr <= high`. `ilike` is
`RuntimeReplaceable` and rewrites to `Like(Lower(left), Lower(right))`.
`like` and `rlike` cross-reference the existing string-expressions
audit (apache#4461).

No support-level consistency issues were found in the predicate serdes.
`CometNot` already optimizes a few special cases (`Not(EqualTo)`,
`Not(EqualNullSafe)`, `Not(In)`). No new tracking issues are filed.
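
The `CometNot` special cases named in this commit message can be pictured as a small pattern match over the child of `Not`. The source does not say what the optimized forms are, so this sketch assumes each pair collapses into a single negated node; `NotEqual`, `NotEqualNullSafe`, `NotIn`, and `fuse_not` are hypothetical names for illustration, not Comet's actual serde code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class EqualTo:
    left: object
    right: object


@dataclass(frozen=True)
class EqualNullSafe:
    left: object
    right: object


@dataclass(frozen=True)
class In:
    value: object
    items: Tuple[object, ...]


@dataclass(frozen=True)
class Not:
    child: object


# Hypothetical fused nodes (assumed for this sketch, not taken from Comet).
@dataclass(frozen=True)
class NotEqual:
    left: object
    right: object


@dataclass(frozen=True)
class NotEqualNullSafe:
    left: object
    right: object


@dataclass(frozen=True)
class NotIn:
    value: object
    items: Tuple[object, ...]


def fuse_not(expr):
    # Collapse Not(child) into one node for the three special cases;
    # every other expression is returned unchanged.
    if isinstance(expr, Not):
        child = expr.child
        if isinstance(child, EqualTo):
            return NotEqual(child.left, child.right)
        if isinstance(child, EqualNullSafe):
            return NotEqualNullSafe(child.left, child.right)
        if isinstance(child, In):
            return NotIn(child.value, child.items)
    return expr
```

Under these assumptions, `fuse_not(Not(EqualTo(Col("a"), Col("b"))))` yields a single `NotEqual` node, while `Not(Col("a"))` passes through untouched.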
@andygrove
Member Author

No deferred follow-up work from this audit. The 12 backing serdes in spark/src/main/scala/org/apache/comet/serde/predicates.scala are simple one-line proto wrappers with no shape-restriction withInfo + return None patterns, no Spark 4.0+ collation gaps, no unreachable serde mappings, and no Incompatible markers needing an issue link. like and rlike follow-ups are owned by the string_funcs audit (#4461).

@andygrove andygrove merged commit dfde3dd into apache:main May 28, 2026
6 checks passed
@andygrove andygrove deleted the worktree-audit-predicate-funcs branch May 28, 2026 23:03