What is the problem the feature request solves?
Comet has no support for Spark's interval data types:
CalendarIntervalType (months + days + microseconds)
YearMonthIntervalType (ANSI INTERVAL YEAR TO MONTH)
DayTimeIntervalType (ANSI INTERVAL DAY TO SECOND)
Because the types are unsupported, every expression that produces or consumes an interval falls back to Spark, and any query carrying an interval column through a Comet operator falls back as well. CometBatchKernelCodegen.isSupportedDataType also rejects these types, so they cannot even be routed through the JVM codegen dispatcher (see #4506 / #4538): the interval expressions are a genuine arrow-native gap with no stopgap.
This issue tracks the foundational type support plus the dependent expression family. It is the prerequisite for the already-filed per-expression requests below.
Describe the potential solution
Type support (prerequisite)
YearMonthIntervalType -> Arrow Interval(YearMonth)
DayTimeIntervalType -> Arrow Interval(MonthDayNano) / Duration (decide representation that round-trips with Spark's microsecond storage)
CalendarIntervalType -> Arrow Interval(MonthDayNano) (Spark stores months/days/micros)
Wire the types through the CometVector hierarchy, FFI import/export (NativeUtil / scan.rs), and serializeDataType in QueryPlanSerde.
Allow these types in CometBatchKernelCodegen.isSupportedDataType once the FFI path is correct, so codegen dispatch can also cover interval expressions.
Expressions (depend on the type work)
Constructors and arithmetic already tracked individually:
make_interval (#3099), make_dt_interval (#3098), make_ym_interval (#3100), try_make_interval (#3103)
multiply_ym_interval (#3102), multiply_dt_interval (#3101), divide_dt_interval (#3096)
date_add_interval (#3086), timestamp_add_interval (#3114), timestamp_add_ym_interval (#3115), time_add_interval (#3121)
subtract_timestamps (#3112), subtract_dates (#3094), subtract_times (#3139), datetime_sub (#3134), datetime_add
extract / date_part of interval fields
(The list of per-expression issues is derived from the // datetime functions section of FunctionRegistry; this umbrella should be linked from each.)
Additional context
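To make the round-trip question for DayTimeIntervalType concrete, here is a minimal sketch of the two obvious ways to place Spark's i64 microsecond value into the Interval(MonthDayNano) layout (i32 months, i32 days, i64 nanoseconds). It uses only the Rust standard library; the struct and function names are hypothetical stand-ins and are not arrow-rs or Comet APIs. It is meant to illustrate the trade-off, not to propose the final implementation.

```rust
// Sketch only: a plain struct stands in for Arrow's Interval(MonthDayNano) layout.
// All names below are hypothetical; nothing here is an arrow-rs or Comet API.

pub const MICROS_PER_DAY: i64 = 86_400_000_000;
pub const NANOS_PER_MICRO: i64 = 1_000;

/// Same field widths as Arrow's Interval(MonthDayNano).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MonthDayNano {
    pub months: i32,
    pub days: i32,
    pub nanos: i64,
}

/// Option A: store the whole value in the nanosecond field.
/// Returns None when `micros * 1000` does not fit in an i64,
/// so large Spark values cannot be represented this way.
pub fn day_time_to_mdn_nanos_only(micros: i64) -> Option<MonthDayNano> {
    let nanos = micros.checked_mul(NANOS_PER_MICRO)?;
    Some(MonthDayNano { months: 0, days: 0, nanos })
}

/// Option B: split into whole days plus a sub-day remainder.
/// The day count is at most about 1.07e8 in magnitude, so it fits in i32,
/// and the remainder is below one day, so scaling it to nanoseconds cannot overflow.
pub fn day_time_to_mdn_split(micros: i64) -> MonthDayNano {
    let days = (micros / MICROS_PER_DAY) as i32;
    let rem_micros = micros % MICROS_PER_DAY;
    MonthDayNano { months: 0, days, nanos: rem_micros * NANOS_PER_MICRO }
}

/// Inverse of either option. Returns None if the value has a month component,
/// sub-microsecond precision, or does not fit in Spark's i64 microseconds.
pub fn mdn_to_day_time(v: MonthDayNano) -> Option<i64> {
    if v.months != 0 || v.nanos % NANOS_PER_MICRO != 0 {
        return None;
    }
    let day_micros = (v.days as i64).checked_mul(MICROS_PER_DAY)?;
    day_micros.checked_add(v.nanos / NANOS_PER_MICRO)
}

/// CalendarIntervalType carries months, days, and microseconds separately,
/// so only the microsecond field needs scaling (and an overflow check).
pub fn calendar_to_mdn(months: i32, days: i32, micros: i64) -> Option<MonthDayNano> {
    let nanos = micros.checked_mul(NANOS_PER_MICRO)?;
    Some(MonthDayNano { months, days, nanos })
}
```

Under these assumptions, option A fails for extreme values such as i64::MAX microseconds, while option B round-trips the full i64 range at the cost of a fixed 24-hour day normalization on every conversion. Duration(Microsecond) is a single i64 and would match Spark's storage without any conversion, which is the main argument for considering it.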