Skip to content

fix: add iceberg_type column and filter all SqlCatalog table operations#3525

Open
GayathriSrividya wants to merge 1 commit into
apache:mainfrom
GayathriSrividya:fix/issue-3337-sql-catalog-iceberg-type-filter
Open

fix: add iceberg_type column and filter all SqlCatalog table operations#3525
GayathriSrividya wants to merge 1 commit into
apache:mainfrom
GayathriSrividya:fix/issue-3337-sql-catalog-iceberg-type-filter

Conversation

@GayathriSrividya

Copy link
Copy Markdown
Contributor

closes #3337

Summary

Adds the iceberg_type column to the IcebergTables ORM model (also superseding #3263) and extends the type filter to all table operations, matching Java's JdbcCatalog behavior.

Root cause

PR #3263 proposed adding iceberg_type to list_tables, but was not yet merged. Meanwhile #3337 identified that even after that fix, load_table, drop_table, rename_table, and commit_table would still lack the filter. View rows written by iceberg-java or iceberg-rust can therefore bleed into Python's table operations.

Java's JdbcCatalog applies WHERE (iceberg_type = 'TABLE' OR iceberg_type IS NULL) to every table lookup (JdbcUtil.java#L168).

Changes

pyiceberg/catalog/sql.py

  • Add iceberg_type: Mapped[str | None] column to IcebergTables model.
  • Add ICEBERG_TABLE_TYPE = "TABLE" constant.
  • Add idempotent ALTER TABLE iceberg_tables ADD COLUMN iceberg_type migration in _ensure_tables_exist (backward-compatible; catches OperationalError/ProgrammingError if column already exists).
  • Set iceberg_type=ICEBERG_TABLE_TYPE on all IcebergTables inserts: create_table, register_table, commit_table (new table path).
  • Apply WHERE (iceberg_type = 'TABLE' OR iceberg_type IS NULL) filter to: load_table, drop_table, rename_table, commit_table, list_tables.

tests/catalog/test_sql.py

  • Add TestIcebergTypeFilter class with 5 regression tests:
    • test_iceberg_type_set_on_create — verifies new tables get iceberg_type='TABLE'
    • test_list_tables_excludes_view_rows
    • test_load_table_ignores_view_rows
    • test_drop_table_ignores_view_rows
    • test_rename_table_ignores_view_rows

Validation

  • make lint
  • pytest tests/catalog/test_sql.py → 17 passed ✓

Comment thread pyiceberg/catalog/sql.py
# the error if the column is already present.
with Session(self.engine) as session:
try:
session.execute(text("ALTER TABLE iceberg_tables ADD COLUMN iceberg_type VARCHAR(255)"))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should add a new column in _ensure_tables_exist method. It is counterintuitive. The initialization logic in #3263 looks better than this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SqlCatalog table operations should filter on iceberg_type

2 participants