
feat: add informational message channel distinct from fallback reasons #4509

Draft: andygrove wants to merge 4 commits into apache:main from andygrove:info-message-channel
@andygrove (Member)

Which issue does this PR close?

Closes #4006.

Depends on and is stacked on #4508 (the withInfo -> withFallbackReason rename). Because both branches live on a fork, this PR targets main and therefore currently includes #4508's rename commit in its diff. Please review #4508 first; once it merges, a rebase will reduce this PR to just the feature commits below.

Rationale for this change

Comet only had one way to tag a plan node with a message, and that message always meant "this node falls back to Spark". There was no way to attach a purely informational note that does not trigger fallback. This is increasingly useful with codegen dispatch: when Comet runs a JVM implementation of an expression even though a faster native implementation exists behind a config, we want to tell the user about the faster path without that note being treated as a fallback.

What changes are included in this PR?

  • A new informational channel, parallel to the fallback channel, reusing the withInfo name freed up by #4508 (refactor: rename withInfo to withFallbackReason for clarity):
    • CometSparkSessionExtensions.withInfo(node, message) records a message on a new CometExplainInfo.EXTENSION_INFO tag. It does not cause fallback: no planning rule reads this tag.
    • Verbose extended explain renders these as a distinct [COMET-INFO: ...] segment, in addition to any [COMET: ...] fallback segment on the same node. The fallback explain list format is unchanged and still excludes info messages.
  • Expression-level info messages are lifted onto the converted operator node in CometExecRule.convertToComet (a single central rollup, applied to all native operators), because verbose explain only traverses plan nodes, not expressions.
  • First consumer: CometDateFormat emits a [COMET-INFO: ...] hint when a natively-supported format is requested but native execution is gated off (non-UTC session timezone with allowIncompatible disabled), so Comet runs the JVM codegen path. The hint names the exact config key to enable the faster native path.
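
To make the separation between the two channels concrete, here is a minimal, Spark-free sketch. It is an illustration only, not the code in this PR: `Node`, `fallsBack`, and `renderVerbose` are hypothetical stand-ins for plan nodes, the planning rules, and verbose extended explain, and the comma separator and sorted ordering inside each segment are assumptions.

```scala
// Hypothetical, Spark-free model of the two message channels. Names mirror
// the PR description, but this is not Comet's actual implementation.
object InfoChannelSketch {
  // Stand-in for a plan node carrying two independent tag sets.
  final case class Node(
      name: String,
      fallbackReasons: Set[String] = Set.empty,
      infoMessages: Set[String] = Set.empty)

  // Fallback channel: a non-empty set means the node falls back to Spark.
  def withFallbackReason(node: Node, reason: String): Node =
    node.copy(fallbackReasons = node.fallbackReasons + reason)

  // Informational channel: messages accumulate and never imply fallback.
  def withInfo(node: Node, message: String): Node =
    node.copy(infoMessages = node.infoMessages + message)

  // Only the fallback channel is consulted when deciding on fallback.
  def fallsBack(node: Node): Boolean = node.fallbackReasons.nonEmpty

  // Verbose rendering: each channel gets its own bracketed segment.
  // The ", " separator and sorted order are assumptions of this sketch.
  def renderVerbose(node: Node): String = {
    val fallback =
      if (node.fallbackReasons.isEmpty) None
      else Some(node.fallbackReasons.toSeq.sorted.mkString("[COMET: ", ", ", "]"))
    val info =
      if (node.infoMessages.isEmpty) None
      else Some(node.infoMessages.toSeq.sorted.mkString("[COMET-INFO: ", ", ", "]"))
    (Seq(node.name) ++ fallback ++ info).mkString(" ")
  }
}
```

In this model, calling `withInfo` twice leaves `fallsBack` false and yields a single `[COMET-INFO: ...]` segment holding both messages, which is the behaviour the description above asks of the real tags.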

Known limitation for future work: the Spark 4.x CometExprShim node reconstruction copies FALLBACK_REASONS but not EXTENSION_INFO onto the wrapping Invoke. No current code path routes withInfo through those shims, so this is latent. It can be addressed if a future serde tags one of those reconstructed nodes.

How are these changes tested?

New tests in CometExpressionSuite:

  • withInfo does not set a fallback reason and renders as [COMET-INFO: ...] in verbose explain, and a second message accumulates rather than overwriting.
  • date_format takes the JVM codegen path under a non-UTC timezone and surfaces the [COMET-INFO: ...] hint naming the DateFormatClass.allowIncompatible config key.
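
The condition exercised by the second test can be sketched as a small pure function. This is an assumption-laden illustration rather than the CometDateFormat code: `infoHint` and its parameters are hypothetical, the hint wording is invented, the config key is passed in by the caller because its full name is not spelled out here, and the UTC check is simplified to a string comparison.

```scala
// Hypothetical sketch of when the date_format hint is emitted. Not Comet's
// real serde logic; the message text and the UTC check are assumptions.
object DateFormatHintSketch {
  // Returns Some(hint) only when a native implementation exists for the
  // format but is gated off by a non-UTC session timezone with
  // allowIncompatible disabled.
  def infoHint(
      formatSupportedNatively: Boolean,
      sessionTimeZone: String,
      allowIncompatible: Boolean,
      configKey: String): Option[String] = {
    val isUtc = sessionTimeZone == "UTC" // simplification: ignores UTC aliases
    val gatedOff = formatSupportedNatively && !isUtc && !allowIncompatible
    if (gatedOff)
      Some(s"native date_format is available; set $configKey=true to enable it")
    else None
  }
}
```

Keeping the decision in one function makes the three quiet cases easy to see: a UTC session, allowIncompatible already enabled, or a format with no native implementation all produce no hint.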

The full CometExpressionSuite passes (125 succeeded), confirming the central convertToComet rollup does not regress operator conversion. scalastyle:check passes.
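
For reviewers who want a mental model of the rollup, the sketch below shows the idea on a toy tree. `Expr`, `Operator`, `collectInfo`, and `liftInfo` are hypothetical names for illustration; the real rollup lives in CometExecRule.convertToComet and works on Spark plan nodes and tree node tags.

```scala
// Hypothetical model of lifting expression-level info messages onto the
// operator that owns the expressions. Not the code in CometExecRule.
object InfoRollupSketch {
  final case class Expr(
      name: String,
      info: Set[String] = Set.empty,
      children: Seq[Expr] = Nil)

  final case class Operator(
      name: String,
      expressions: Seq[Expr],
      info: Set[String] = Set.empty)

  // Gather info messages from an expression and all of its descendants.
  def collectInfo(e: Expr): Set[String] =
    e.info ++ e.children.flatMap(collectInfo)

  // Copy every expression-level message onto the operator, keeping any
  // messages the operator already carries.
  def liftInfo(op: Operator): Operator =
    op.copy(info = op.info ++ op.expressions.flatMap(collectInfo))
}
```

Because explain output is produced per plan node, a message left only on a nested expression would never be printed; lifting it once at conversion time is what makes it visible.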

andygrove added 3 commits May 28, 2026 18:22
Rename withInfo/withInfos/hasExplainInfo and EXTENSION_INFO to
withFallbackReason/withFallbackReasons/hasFallbackReason and
FALLBACK_REASONS to match their actual semantics (fallback reasons,
not generic info). Also rename the private extensionInfo helper in
ExtendedExplainInfo to fallbackReasons, and update the TreeNodeTag
string from "CometExtensionInfo" to "CometFallbackReasons" so a
future PR can reuse the old string for a distinct tag.
…skip ci]

When date_format gets a natively-supported format string but the session
timezone is non-UTC and allowIncompatible is off, Comet takes the JVM
codegen path. Emit a COMET-INFO hint on the expression and lift
expression-level info messages onto the converted operator centrally in
CometExecRule, so verbose extended explain shows the faster native option
and how to enable it.
@andygrove marked this pull request as draft May 29, 2026 03:33
# Conflicts:
#	spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala
#	spark/src/main/scala/org/apache/comet/ExtendedExplainInfo.scala
#	spark/src/main/scala/org/apache/comet/serde/contraintExpressions.scala
#	spark/src/main/scala/org/apache/comet/serde/datetime.scala
#	spark/src/main/scala/org/apache/comet/serde/math.scala
#	spark/src/main/scala/org/apache/comet/serde/statics.scala
#	spark/src/main/scala/org/apache/comet/serde/strings.scala
#	spark/src/main/scala/org/apache/comet/serde/structs.scala
#	spark/src/main/scala/org/apache/comet/serde/unixtime.scala
Linked issue (may be closed when this pull request merges): Add distinction between "info" and "fallback" messages
