Commit ea25686
refactor(spark): rename function params to match pyspark
Align positional parameter names in `functions.spark` with pyspark.sql.functions:
- aggregate first positional → `col` (avg, try_sum, collect_list, collect_set)
- unary `arg` → `col` across math/string/byte/datetime helpers
- multi-arg renames: array_contains (col, value), array (*cols), shuffle (col),
  array_repeat (col, count), slice (x, start, length),
  shiftleft/shiftright/shiftrightunsigned (col, numBits),
  add_months (start, months), date_add/date_sub (start, days),
  date_diff (end, start), date_trunc (format, timestamp), time_trunc (unit, time),
  trunc (date, format), next_day (date, dayOfWeek),
  from_utc_timestamp/to_utc_timestamp (timestamp, tz), sha2 (col, numBits),
  xxhash64 (*cols), map_from_arrays (col1, col2),
  width_bucket (v, min, max, numBucket), substring (str, pos, len),
  concat (*cols), elt (*inputs), is_valid_utf8/make_valid_utf8 (str)
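
One way the renames listed above could be spot-checked is by comparing declared parameter names against an expected table. This is a minimal sketch, not part of the commit: the stub functions below are hypothetical stand-ins for the real helpers in `functions.spark`, and `EXPECTED`, `FUNCS`, `param_names`, and `mismatches` are names invented for illustration.

```python
import inspect

# Hypothetical stand-ins: the real helpers build expressions, these only carry signatures.
def array_contains(col, value): ...
def date_diff(end, start): ...
def width_bucket(v, min, max, numBucket): ...
def concat(*cols): ...

FUNCS = {
    "array_contains": array_contains,
    "date_diff": date_diff,
    "width_bucket": width_bucket,
    "concat": concat,
}

# Parameter names taken from the list in this commit message.
EXPECTED = {
    "array_contains": ["col", "value"],
    "date_diff": ["end", "start"],
    "width_bucket": ["v", "min", "max", "numBucket"],
    "concat": ["cols"],  # a *cols parameter is reported under the name "cols"
}

def param_names(fn):
    """Return the declared parameter names of fn, in order."""
    return list(inspect.signature(fn).parameters)

def mismatches(funcs, expected):
    """Map each function name to its actual names when they differ from the expected ones."""
    return {
        name: param_names(funcs[name])
        for name, want in expected.items()
        if param_names(funcs[name]) != want
    }
```

Under these assumptions, `mismatches(FUNCS, EXPECTED)` returns an empty dict when every stub matches the table.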
Bodies updated to reference the new names; positional callers unaffected.
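
The note about positional callers can be illustrated with a small sketch. `upper_before` and `upper_after` are hypothetical stand-ins for a unary helper before and after the `arg` → `col` rename; the string-returning bodies are an assumption made only so the example runs.

```python
# Hypothetical unary helper before the rename (parameter named `arg`).
def upper_before(arg):
    return f"upper({arg})"

# The same helper after the rename (parameter named `col`, body updated to match).
def upper_after(col):
    return f"upper({col})"

# Positional calls do not depend on the parameter name, so they behave identically.
positional_same = upper_before("name") == upper_after("name")

# The new keyword name is accepted after the rename.
keyword_new = upper_after(col="name")

# A caller that passed the old keyword name would now fail with TypeError.
try:
    upper_after(arg="name")
    old_keyword_rejected = False
except TypeError:
    old_keyword_rejected = True
```

In this sketch only keyword callers that used the old name are affected, which is consistent with the statement that positional callers are unaffected.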
This finishes Category 1 / Category 4 (spark-side BOTH-bucket) renames from
PYSPARK_ALIGNMENT_PLAN.md PR 1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent f8f9d7a commit ea25686
1 file changed
Lines changed: 137 additions & 136 deletions