[python] Support query auth (row filter & column masking) for REST catalog #8136
Open
MgjLLL wants to merge 1 commit into
Adds query-auth support to the Python client so it honors the row-level
filter and column masking rules returned by a REST catalog, matching the
existing JVM client behavior.
When the new option `query-auth.enabled` is set to true, the client
calls `POST /v1/.../databases/{db}/tables/{tb}/auth` before producing a
plan, receives `{ filter, columnMasking }`, and applies them on the
read path:
* `predicate_json_parser` parses Paimon predicate JSON into a
PyArrow compute filter (EQ/NEQ/LT/LTEQ/GT/GTEQ/IS_NULL/IS_NOT_NULL/
IN/NOT_IN/STARTS_WITH/ENDS_WITH/CONTAINS/AND/OR/NOT).
* `AuthFilterReader` / `AuthMaskingReader` / `ColumnProjectReader`
perform row filtering, column masking transforms (NULL, FIELD_REF,
CAST, UPPER, LOWER, CONCAT, CONCAT_WS) and final projection back to
the user's requested columns.
* `TableQueryAuth` / `TableQueryAuthResult` wrap the result and
convert each split to a `QueryAuthSplit`.
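As a rough illustration of the predicate kinds listed above, the sketch below evaluates a predicate tree against plain Python dict rows. It is not the `predicate_json_parser` from this PR, which targets PyArrow compute filters; plain Python is swapped in to keep the example dependency-free. The JSON shape (`kind`, `field`, `literals`, `children`) and the helper names `evaluate` and `referenced_fields` are assumptions made for illustration and may not match the actual Paimon predicate JSON.

```python
from typing import Any, Callable, Dict, List, Set

Row = Dict[str, Any]

# Leaf predicate kinds mapped to (value, literals) -> bool.
# None models SQL NULL; comparisons against NULL are treated as False here.
_LEAF_OPS: Dict[str, Callable[[Any, List[Any]], bool]] = {
    "EQ": lambda v, lits: v is not None and v == lits[0],
    "NEQ": lambda v, lits: v is not None and v != lits[0],
    "LT": lambda v, lits: v is not None and v < lits[0],
    "LTEQ": lambda v, lits: v is not None and v <= lits[0],
    "GT": lambda v, lits: v is not None and v > lits[0],
    "GTEQ": lambda v, lits: v is not None and v >= lits[0],
    "IS_NULL": lambda v, lits: v is None,
    "IS_NOT_NULL": lambda v, lits: v is not None,
    "IN": lambda v, lits: v is not None and v in lits,
    "NOT_IN": lambda v, lits: v is not None and v not in lits,
    "STARTS_WITH": lambda v, lits: isinstance(v, str) and v.startswith(lits[0]),
    "ENDS_WITH": lambda v, lits: isinstance(v, str) and v.endswith(lits[0]),
    "CONTAINS": lambda v, lits: isinstance(v, str) and lits[0] in v,
}


def evaluate(pred: Dict[str, Any], row: Row) -> bool:
    """Evaluate a (hypothetical) predicate tree against one row.

    Uses two-valued logic for simplicity, so NOT over a NULL comparison
    yields True rather than SQL's UNKNOWN.
    """
    kind = pred["kind"]
    if kind == "AND":
        return all(evaluate(child, row) for child in pred["children"])
    if kind == "OR":
        return any(evaluate(child, row) for child in pred["children"])
    if kind == "NOT":
        return not evaluate(pred["children"][0], row)
    if kind not in _LEAF_OPS:
        raise ValueError(f"unsupported predicate kind: {kind}")
    return _LEAF_OPS[kind](row.get(pred["field"]), pred.get("literals", []))


def referenced_fields(pred: Dict[str, Any]) -> Set[str]:
    """Collect the field names a predicate reads (extra columns the scan must fetch)."""
    if "children" in pred:
        return set().union(*(referenced_fields(c) for c in pred["children"]))
    return {pred["field"]}


rows = [
    {"id": 1, "region": "eu-west", "age": 17},
    {"id": 2, "region": "eu-north", "age": 42},
    {"id": 3, "region": "us-east", "age": 35},
    {"id": 4, "region": None, "age": 50},
]
pred = {
    "kind": "AND",
    "children": [
        {"kind": "STARTS_WITH", "field": "region", "literals": ["eu-"]},
        {"kind": "GTEQ", "field": "age", "literals": [18]},
    ],
}
visible = [row for row in rows if evaluate(pred, row)]
```

Collecting the referenced fields separately matters because the auth filter may read columns the user did not select; those columns have to be fetched for filtering and dropped again afterwards.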
Behavior is gated by `CoreOptions.QUERY_AUTH_ENABLED` (default false),
so existing users see no change.
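The masking and projection steps described above can be sketched in the same dependency-free style. This is not the `AuthMaskingReader` / `ColumnProjectReader` code from the PR: the transform JSON shape (`name`, `field`, `inputs`, `to`), the NULL-handling rules for `CONCAT` and `CONCAT_WS`, and the function names are all assumptions chosen for illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical cast targets; the real set of supported types is not shown in this PR text.
_CASTS = {"STRING": str, "INT": int, "DOUBLE": float}


def _arg(arg: Any, row: Row) -> Any:
    """Inputs are either nested transforms (dicts) or plain literals."""
    return apply_transform(arg, row) if isinstance(arg, dict) else arg


def apply_transform(transform: Dict[str, Any], row: Row) -> Any:
    """Evaluate one (hypothetical) masking transform against an unmasked row."""
    name = transform["name"]
    if name == "NULL":
        return None
    if name == "FIELD_REF":
        field = transform["field"]
        if field not in row:
            raise KeyError(f"masking references missing field: {field}")
        return row[field]
    args = [_arg(a, row) for a in transform.get("inputs", [])]
    if name == "UPPER":
        return None if args[0] is None else str(args[0]).upper()
    if name == "LOWER":
        return None if args[0] is None else str(args[0]).lower()
    if name == "CONCAT":
        # Assumed SQL-like semantics: any NULL input makes the result NULL.
        return None if any(a is None for a in args) else "".join(map(str, args))
    if name == "CONCAT_WS":
        # Assumed semantics: first input is the separator, NULL inputs are skipped.
        sep, rest = args[0], args[1:]
        return None if sep is None else sep.join(str(a) for a in rest if a is not None)
    if name == "CAST":
        return None if args[0] is None else _CASTS[transform["to"]](args[0])
    raise ValueError(f"unsupported transform: {name}")


def mask_and_project(
    rows: List[Row], column_masking: Dict[str, Dict[str, Any]], requested: List[str]
) -> List[Row]:
    """Apply masks, then keep only the columns the user asked for."""
    out = []
    for row in rows:
        masked = dict(row)
        for column, transform in column_masking.items():
            # Always read from the unmasked row so masks do not see each other's output.
            masked[column] = apply_transform(transform, row)
        out.append({column: masked[column] for column in requested})
    return out


rows = [{"id": 2, "name": "Ada", "email": "Ada@Example.org", "region": "eu-north"}]
masking = {
    "email": {"name": "NULL"},
    "name": {
        "name": "CONCAT_WS",
        "inputs": [
            "-",
            {"name": "UPPER", "inputs": [{"name": "FIELD_REF", "field": "name"}]},
            {"name": "FIELD_REF", "field": "region"},
        ],
    },
}
result = mask_and_project(rows, masking, ["id", "name", "email"])
```

Here `region` is read only to build the masked `name` value and is dropped by the final projection, which mirrors the idea of pulling extra fields for auth and then projecting back to the requested columns.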
Purpose
Adds query-auth support to the Python client so it honors the row-level filter and column masking rules returned by a REST catalog, matching the existing JVM client behavior.
When the new option `query-auth.enabled` is set to `true`, before producing a `Plan` the client calls `POST /v1/.../databases/{db}/tables/{tb}/auth` with the projected fields, receives `{ filter, columnMasking }`, and applies them on the read path:
* `RESTApi.auth_table_query` issues the call (new request/response models `AuthTableQueryRequest` / `AuthTableQueryResponse`, new path in `ResourcePaths.auth_table`).
* `TableQueryAuth` / `TableQueryAuthResult` (`catalog/table_query_auth.py`) wrap the result and convert each split to a `QueryAuthSplit`.
* `predicate_json_parser` (`common/predicate_json_parser.py`) parses Paimon predicate JSON into a PyArrow compute filter (EQ/NEQ/LT/LTEQ/GT/GTEQ/IS_NULL/IS_NOT_NULL/IN/NOT_IN/STARTS_WITH/ENDS_WITH/CONTAINS/AND/OR/NOT).
* `AuthFilterReader` / `AuthMaskingReader` / `ColumnProjectReader` (`read/reader/auth_masking_reader.py`) implement row filtering, column masking transforms (`NULL`, `FIELD_REF`, `CAST`, `UPPER`, `LOWER`, `CONCAT`, `CONCAT_WS`) and final projection back to the user's requested columns.
* `read_builder` / `stream_read_builder` / `table_read` / `table_scan` / `file_store_table` / `catalog_environment` / `rest_catalog` are wired to invoke the auth call and pull extra fields required only by the auth filter.
Behavior is gated by the new `CoreOptions.QUERY_AUTH_ENABLED` (`query-auth.enabled`, default `false`), so existing users see no change.
Tests
Three new test files (994+ lines, all passing locally under `pytest`):
* `paimon-python/pypaimon/tests/predicate_json_parser_test.py`: covers each predicate kind, nested AND/OR/NOT, type coercion, null handling, and `extract_referenced_fields`.
* `paimon-python/pypaimon/tests/auth_masking_reader_test.py`: covers each masking transform, missing-field validation, and projection back to the user-requested columns.
* `paimon-python/pypaimon/tests/table_query_auth_test.py`: end-to-end coverage. The REST catalog calls `auth_table_query`, the result is plumbed into the plan, splits become `QueryAuthSplit`, and reads return filtered and masked rows.
Local check:
API and Format
* New option `query-auth.enabled` (boolean, default `false`).
* New endpoint `POST /v1/{prefix}/databases/{db}/tables/{tb}/auth`. Request `{ "select": [...] }`, response `{ "filter": [<predicate-json>...], "columnMasking": { <col>: <transform-json>, ... } }`. The contract follows the existing Java client; no server-side change is required for catalogs that already implement query auth.
* New classes (`AuthTableQueryRequest`, `AuthTableQueryResponse`, `TableQueryAuth`, `TableQueryAuthResult`, `QueryAuthSplit`, `AuthFilterReader`, `AuthMaskingReader`, `ColumnProjectReader`) are additive and live under existing modules.
Documentation
The new option `query-auth.enabled` should be reflected in the Python configuration reference. Happy to add the docs entry in this PR or in a follow-up; please advise.
This closes #8135