HIVE-29628: Incorrect objectName in PARTITION HivePrivilegeObject for view queries on partitioned tables #6508
rtrivedi12 wants to merge 2 commits into
for (Entity entity : allEntities) {
  if (!(entity instanceof ReadEntity) || entity.getTyp() != Type.TABLE) {
    continue;
  }
  ReadEntity tableEntity = (ReadEntity) entity;
  if (tableEntity.isDirect() || tableEntity.getTable() == null) {
    continue;
  }
  Table table = tableEntity.getTable();
  if (!partTable.getDbName().equals(table.getDbName())
      || !partTable.getTableName().equals(table.getTableName())) {
    continue;
  }
  if (hasDeferredViewParent(tableEntity)) {
    return false;
  }
  if (hasRegularViewParent(tableEntity)) {
    return true;
  }
}
I didn't quite understand this logic. Why do we need to check all the entities for a given partition object? This could lead to O(N^2) behaviour for a huge partitioned table, creating a bottleneck during the compile phase (because authorization happens here).
Also this whole method needs a refactor to simplify it as there are too many continues and ifs.
Thanks @saihemanth-cloudera for the review! I considered using the view name alias instead of the table name for the Partition ReadEntity to fix this issue, but that does not seem semantically correct, as a PARTITION HivePrivilegeObject is built from a physical Partition on the base table. So I think the proper fix is to skip the partition entity sent for authorization when access goes through a regular view.
'type':PARTITION, 'dbName':datadb, 'objectType':PARTITION, 'objectName':t1, 'columns':[], 'partKeys':[], 'commandParams':[], 'actionType':OTHER, 'owner':hive}]
In this case, the Partition entity often has isDirect=true and empty parents, while the sibling indirect TABLE entity correctly has {viewdb.v1} as its parent; hence the sibling scan was needed. But I agree on the O(N^2) concern. I have updated it to a one-pass pre-scan that builds a Set of base tables accessed via a regular view, which removes the per-partition scan over allEntities.
PARTITION  datadb.t1@dept=a  isDirect=true   parents=[]           ← empty, not t1
TABLE      datadb.t1         isDirect=false  parents=[viewdb.v1]
TABLE      viewdb.v1         isDirect=true   parents=[]
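For illustration, the one-pass pre-scan described above might look roughly like the sketch below. It is not the code in this PR: Ent is a hypothetical stand-in for ReadEntity, views are identified by plain qualified names, and the rule that a deferred-auth view parent always wins over a regular view parent is an assumption made to keep the example independent of entity order.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ViewPrescanSketch {

  /** Hypothetical stand-in for a Hive read entity; not the real ReadEntity class. */
  static final class Ent {
    final String type;          // "TABLE" or "PARTITION"
    final String table;         // qualified table or view name, e.g. "datadb.t1"
    final boolean direct;       // mirrors isDirect
    final List<String> parents; // qualified names of parent views

    Ent(String type, String table, boolean direct, String... parents) {
      this.type = type;
      this.table = table;
      this.direct = direct;
      this.parents = Arrays.asList(parents);
    }
  }

  /**
   * Single pass over all entities: collects tables reached through a regular
   * view and never through a deferred-auth view (assumed precedence).
   */
  static Set<String> tablesViaRegularView(List<Ent> all, Set<String> deferredViews) {
    Set<String> viaRegular = new HashSet<>();
    Set<String> viaDeferred = new HashSet<>();
    for (Ent e : all) {
      boolean indirectTable = "TABLE".equals(e.type) && !e.direct;
      if (indirectTable) {
        for (String parent : e.parents) {
          if (deferredViews.contains(parent)) {
            viaDeferred.add(e.table);
          } else {
            viaRegular.add(e.table);
          }
        }
      }
    }
    viaRegular.removeAll(viaDeferred);
    return viaRegular;
  }

  /** The per-partition check becomes a set lookup instead of a scan over all entities. */
  static boolean skipPartition(Ent partition, Set<String> viaRegular) {
    return "PARTITION".equals(partition.type) && viaRegular.contains(partition.table);
  }

  public static void main(String[] args) {
    // The three entities from the listing above.
    List<Ent> all = Arrays.asList(
        new Ent("PARTITION", "datadb.t1", true),
        new Ent("TABLE", "datadb.t1", false, "viewdb.v1"),
        new Ent("TABLE", "viewdb.v1", true));
    Set<String> viaRegular = tablesViaRegularView(all, new HashSet<>());
    System.out.println(skipPartition(all.get(0), viaRegular)); // prints true
  }
}
```

With the set built once per query, the cost is one pass over allEntities plus a constant-time lookup per partition, rather than a full scan for every partition.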
Thanks @soumyakanti3578! I will update this.
for (ReadEntity parent : parents) {
  if (parent.getTyp() == Type.TABLE && parent.getTable() != null
      && isDeferredAuthView(parent.getTable())) {
    return true;
  }
}
This can also lead to O(N^2) behaviour. The same applies to hasRegularViewParent().
        && BASE_TABLE.equalsIgnoreCase(h.getObjectName())
        && DATA_DB.equalsIgnoreCase(h.getDbname())));
  }
I think we need tests for the sibling logic (Skip logic #2 from the PR description).
This scenario is covered by the added test testViewSelectNoBaseTablePartitionPrivObj().
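As a rough illustration of what such a check asserts, the sketch below tests that no PARTITION privilege object on the base table reaches the authorizer. It is not the test from this PR: PrivObj is a hypothetical stand-in for HivePrivilegeObject reduced to three fields, and the datadb/t1 and viewdb/v1 values are taken from the example in this thread.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionPrivObjCheckSketch {
  static final String DATA_DB = "datadb";
  static final String BASE_TABLE = "t1";

  /** Hypothetical stand-in for HivePrivilegeObject; only the fields the check needs. */
  static final class PrivObj {
    final String type;
    final String dbName;
    final String objectName;

    PrivObj(String type, String dbName, String objectName) {
      this.type = type;
      this.dbName = dbName;
      this.objectName = objectName;
    }
  }

  /** True when the authorizer input contains no PARTITION object on the base table. */
  static boolean noBaseTablePartition(List<PrivObj> inputs) {
    return inputs.stream().noneMatch(h -> "PARTITION".equals(h.type)
        && BASE_TABLE.equalsIgnoreCase(h.objectName)
        && DATA_DB.equalsIgnoreCase(h.dbName));
  }

  public static void main(String[] args) {
    // Expected input for a regular-view query after the fix: the view only.
    List<PrivObj> fixed = Arrays.asList(new PrivObj("TABLE_OR_VIEW", "viewdb", "v1"));
    // Input before the fix: an extra PARTITION object on the base table.
    List<PrivObj> buggy = Arrays.asList(
        new PrivObj("TABLE_OR_VIEW", "viewdb", "v1"),
        new PrivObj("PARTITION", "datadb", "t1"));
    System.out.println(noBaseTablePartition(fixed)); // prints true
    System.out.println(noBaseTablePartition(buggy)); // prints false
  }
}
```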
What changes were proposed in this pull request?
Fixed incorrect PARTITION HivePrivilegeObject generation in CommandAuthorizerV2 for queries that access a regular (non-deferred) view over a partitioned base table.
In CommandAuthorizerV2.getHivePrivObjects(), skip emitting a HivePrivilegeObject for PARTITION/DUMMYPARTITION entities when access is through a regular view. The view's TABLE_OR_VIEW object already covers authorization.
Skip logic (isPartitionAccessedViaRegularView):
- TABLE entity for the same base table has a regular view parent → skip.
- … Authorized=false) → emit PARTITION on the base table (existing behaviour).
- … PARTITION on the base table
Why are the changes needed?
HIVE-27892 added handling for PARTITION/DUMMYPARTITION entities in addHivePrivObject, always using the physical base-table name (datadb/t1). For view queries, this causes authorizers to receive an extra PARTITION object on the base table in addition to the view's TABLE_OR_VIEW object. As a result, users with view-only policies get permission-denied errors.
Does this PR introduce any user-facing change?
Yes.
Queries through regular views now send only the view's TABLE_OR_VIEW privilege object. Direct queries on partitioned tables and queries through deferred-auth views still emit base-table PARTITION objects as before.
How was this patch tested?
mvn test -Dtest=TestViewPartitionPrivilegeObjects