Skip to content

refactor(context): deduplicate register/read option-building logic#1479

Open
mesejo wants to merge 2 commits intoapache:mainfrom
mesejo:refactor/context-register-read
Open

refactor(context): deduplicate register/read option-building logic#1479
mesejo wants to merge 2 commits intoapache:mainfrom
mesejo:refactor/context-register-read

Conversation

@mesejo
Copy link
Copy Markdown
Contributor

@mesejo mesejo commented Apr 5, 2026

Which issue does this PR close?

N/A

Rationale for this change

Reduce the number of execution paths for register and read functions

What changes are included in this PR?

Extract shared helpers (convert_partition_cols, convert_file_sort_order, build_parquet/json/avro_options, convert_csv_options), standardize path types to &str, and remove redundant intermediate variables.

Are there any user-facing changes?

No

LLM-generated code disclosure

This PR includes code generated with assistance from an LLM. All LLM-generated content has been manually reviewed and tested.

@mesejo mesejo force-pushed the refactor/context-register-read branch from fe84781 to 39e190f Compare April 5, 2026 16:17
@mesejo mesejo marked this pull request as ready for review April 5, 2026 16:38
@timsaucer timsaucer requested a review from Copilot April 8, 2026 17:41
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors PySessionContext read/register APIs to reduce duplicated option-building logic across Parquet/CSV/JSON/Avro operations.

Changes:

  • Extracted shared helper functions to convert partition columns / sort order and to build Parquet/JSON/Avro/Csv read options.
  • Simplified read/register implementations by removing intermediate variables and consolidating async wait patterns.
  • Standardized some path parameters from PathBuf to &str for JSON/Avro APIs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Member

@timsaucer timsaucer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind merging main and giving register_arrow and read_arrow the similar treatment?

@timsaucer
Copy link
Copy Markdown
Member

Nice cleanup!

Extract shared helpers (convert_partition_cols, convert_file_sort_order,
build_parquet/json/avro_options, convert_csv_options), standardize path
types to &str, and remove redundant intermediate variables.
@mesejo mesejo force-pushed the refactor/context-register-read branch 2 times, most recently from 74c6fe7 to 9eddc10 Compare April 11, 2026 08:29
… methods

Change path parameters from &str to PathBuf in all register/read methods
(register_listing_table, register_parquet, register_json, register_avro,
register_arrow, read_json, read_parquet, read_avro, read_arrow) so callers
can pass either a Python str or a pathlib.Path object. For register_csv and
read_csv, which take &Bound<PyAny> to handle lists, extract path elements as
PathBuf rather than String for the same reason.

Add a path_to_str helper that converts PathBuf to &str, returning an explicit
error for non-UTF-8 paths rather than silently corrupting them.

Add build_arrow_options helper to deduplicate register_arrow/read_arrow
option-building logic, consistent with the existing parquet/json/avro helpers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@mesejo mesejo force-pushed the refactor/context-register-read branch from 9eddc10 to f536873 Compare April 11, 2026 08:34
@mesejo
Copy link
Copy Markdown
Contributor Author

mesejo commented Apr 11, 2026

Would you mind merging main and giving register_arrow and read_arrow the similar treatment?

Done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants