replace fan-out JOINs with EXISTS subquery for MapAttribute RSQL filtering #3088
Open
vasilchev wants to merge 1 commit into
Problem
RSQL queries that filter targets by map attributes (controller attributes, software module metadata) with multiple AND conditions join the attribute table at least once per condition. The intermediate result set grows multiplicatively with each condition, and such queries hang indefinitely in production.
Reported query (attribute.sw_1_version=out=(...) and attribute.sw_2_version=out=(...) and attribute.sw_3_version=out=(...)) generated this SQL:
SELECT DISTINCT COUNT(DISTINCT(t0.id))
FROM sp_target t0
LEFT OUTER JOIN sp_target_attributes t2 ON (t2.target = t0.id)
LEFT OUTER JOIN sp_target_attributes t4 ON (t4.target = t0.id)
LEFT OUTER JOIN sp_target_attributes t6 ON (t6.target = t0.id)
, sp_target_attributes t5
, sp_target_attributes t3
, sp_target_attributes t1
WHERE (t5.attribute_value IS NULL OR t5.attribute_value NOT IN (...))
AND (t3.attribute_value IS NULL OR t3.attribute_value NOT IN (...))
AND (t1.attribute_value IS NULL OR t1.attribute_value NOT IN (...))
AND t5.target = t0.id AND t5.attribute_key = 'sw_1_version'
AND t3.target = t0.id AND t3.attribute_key = 'sw_2_version'
AND t1.target = t0.id AND t1.attribute_key = 'sw_3_version'
This affects any map-typed RSQL field: attribute.*, metadata.*, or any custom @ElementCollection Map<String,String> field.
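The fan-out can be reproduced on a toy schema. The sketch below is not hawkBit code: it assumes a hypothetical two-table schema (`target`, `attrs`) in an in-memory SQLite database, with column names borrowed from the generated SQL above, and counts the rows a join-based query materialises for a single target when every condition contributes one unconstrained LEFT JOIN plus one key-constrained JOIN.

```python
import sqlite3


def fan_out_rows(attrs_per_target: int, conditions: int = 3) -> int:
    """Count joined rows for ONE target in the join-based query shape.

    Assumes attrs_per_target >= conditions so that keys k0..k{conditions-1}
    all exist. Schema and key names are illustrative only.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE target (id INTEGER PRIMARY KEY)")
    con.execute(
        "CREATE TABLE attrs (target INTEGER, attribute_key TEXT, "
        "attribute_value TEXT, PRIMARY KEY (target, attribute_key))"
    )
    con.execute("INSERT INTO target VALUES (1)")
    con.executemany(
        "INSERT INTO attrs VALUES (1, ?, ?)",
        [(f"k{i}", f"v{i}") for i in range(attrs_per_target)],
    )

    joins = []
    for i in range(conditions):
        # Unconstrained join: matches every attribute row of the target.
        joins.append(f"LEFT JOIN attrs g{i} ON g{i}.target = t.id")
        # Key-constrained join: matches exactly one attribute row.
        joins.append(
            f"JOIN attrs a{i} ON a{i}.target = t.id "
            f"AND a{i}.attribute_key = 'k{i}'"
        )

    sql = "SELECT COUNT(*) FROM target t " + " ".join(joins)
    (rows,) = con.execute(sql).fetchone()
    con.close()
    return rows


if __name__ == "__main__":
    for n in (5, 10):
        print(n, fan_out_rows(n))
```

Under these assumptions the row count grows as `attrs_per_target ** conditions`, which is the growth DISTINCT has to collapse afterwards.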
Solution
Replace all MapAttribute non-null operator handling with a correlated EXISTS subquery.
The compare() method handles all operators:
- EQ / IN / LIKE / GT / GTE / LT / LTE → positive predicate inside EXISTS
- NE / NOT_IN / NOT_LIKE → IS NULL OR <negated predicate> inside EXISTS
Semantics preserved: the subquery reaches the map through an INNER JOIN, so targets without the attribute key are excluded, identical to the previous behaviour.
Unaffected: =is=null / =not=null checks (handled by a separate code path using getJoinOn() with LEFT JOIN), SetAttribute (tag, etc.), singular fields, entity references.
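The semantics described above can be checked on a small example. This is a minimal sketch against a hypothetical SQLite schema (`target`, `attrs`), not the JPA code from this PR: it runs the negated EXISTS form for a single `!=` condition and shows which targets survive. The key name `sw` and the values are made up for illustration.

```python
import sqlite3


def targets_not_equal(con: sqlite3.Connection, key: str, value: str) -> list:
    """IDs of targets that HAVE `key` and whose value is NULL or != value."""
    sql = """
        SELECT t.id FROM target t
        WHERE EXISTS (
            SELECT 1 FROM attrs a
            WHERE a.target = t.id AND a.attribute_key = ?
              AND (a.attribute_value IS NULL OR a.attribute_value <> ?)
        )
        ORDER BY t.id
    """
    return [row[0] for row in con.execute(sql, (key, value))]


def demo() -> list:
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE target (id INTEGER PRIMARY KEY);
        CREATE TABLE attrs (
            target INTEGER, attribute_key TEXT, attribute_value TEXT,
            PRIMARY KEY (target, attribute_key)
        );
        INSERT INTO target VALUES (1), (2), (3), (4);
        INSERT INTO attrs VALUES
            (1, 'sw', '1.0'),     -- equal value: filtered out
            (2, 'sw', '2.0'),     -- different value: kept
            (3, 'sw', NULL),      -- NULL value: kept via IS NULL
            (4, 'other', '1.0');  -- key missing: EXISTS is false
    """)
    return targets_not_equal(con, "sw", "1.0")


if __name__ == "__main__":
    print(demo())  # [2, 3]
```

Target 4 is the case the semantics note is about: it has no `sw` row at all, so no row satisfies the subquery and the target is excluded, just as an inner join on the key would exclude it.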
---
SQL comparison
Single condition: attribute.key!=value
-- BEFORE
SELECT DISTINCT t.* FROM target t
LEFT JOIN attrs ghost ON ghost.target = t.id
JOIN attrs a ON a.target = t.id AND a.key = ?
WHERE a.value IS NULL OR a.value <> ?
-- AFTER
SELECT DISTINCT t.* FROM target t
WHERE EXISTS (
SELECT 1 FROM attrs a
WHERE a.target = t.id AND a.key = ?
AND (a.value IS NULL OR a.value <> ?)
)
Three AND conditions: attribute.k1=out=(...) and attribute.k2=out=(...) and attribute.k3=out=(...)
-- BEFORE (Hibernate; even worse on EclipseLink which produces comma cross-joins)
SELECT DISTINCT t.* FROM target t
LEFT JOIN attrs g1 ON g1.target = t.id
JOIN attrs a1 ON a1.target = t.id AND a1.key = ?
LEFT JOIN attrs g2 ON g2.target = t.id
JOIN attrs a2 ON a2.target = t.id AND a2.key = ?
LEFT JOIN attrs g3 ON g3.target = t.id
JOIN attrs a3 ON a3.target = t.id AND a3.key = ?
WHERE (a1.value IS NULL OR a1.value NOT IN (...))
AND (a2.value IS NULL OR a2.value NOT IN (...))
AND (a3.value IS NULL OR a3.value NOT IN (...))
-- Intermediate rows per target, assuming 100 attributes per target: 100 (g1) × 100 (g2) × 100 (g3) = 1,000,000
-- AFTER
SELECT DISTINCT t.* FROM target t
WHERE EXISTS (SELECT 1 FROM attrs WHERE target=t.id AND key=? AND (value IS NULL OR value NOT IN (...)))
AND EXISTS (SELECT 1 FROM attrs WHERE target=t.id AND key=? AND (value IS NULL OR value NOT IN (...)))
AND EXISTS (SELECT 1 FROM attrs WHERE target=t.id AND key=? AND (value IS NULL OR value NOT IN (...)))
-- Each EXISTS: 1 PK index lookup. Total: O(N_targets × 3)
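To check that the two query shapes select the same targets, the following sketch runs both against an exhaustive toy data set in SQLite. Everything here is an assumption for illustration: the schema, the keys `k1`..`k3`, and the tuple `EXCLUDED`, which stands in for the elided `=out=` value lists. Each key is either absent, set to an excluded value, set to an allowed value, or NULL, giving 4^3 = 64 targets.

```python
import itertools
import sqlite3

KEYS = ("k1", "k2", "k3")
EXCLUDED = ("v0", "v1")            # stand-in for the elided =out= list
ABSENT = object()                  # marker: target has no row for the key
STATES = (ABSENT, "v0", "v2", None)
IN_LIST = ", ".join(f"'{v}'" for v in EXCLUDED)


def build_db() -> sqlite3.Connection:
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE target (id INTEGER PRIMARY KEY)")
    con.execute(
        "CREATE TABLE attrs (target INTEGER, attribute_key TEXT, "
        "attribute_value TEXT, PRIMARY KEY (target, attribute_key))"
    )
    combos = itertools.product(STATES, repeat=len(KEYS))
    for tid, combo in enumerate(combos, start=1):
        con.execute("INSERT INTO target VALUES (?)", (tid,))
        # An unrelated attribute, so the unconstrained joins have rows.
        con.execute("INSERT INTO attrs VALUES (?, 'noise', 'x')", (tid,))
        for key, state in zip(KEYS, combo):
            if state is not ABSENT:
                con.execute(
                    "INSERT INTO attrs VALUES (?, ?, ?)", (tid, key, state)
                )
    return con


def join_form(con: sqlite3.Connection) -> set:
    """BEFORE shape: LEFT JOIN + key-constrained JOIN per condition."""
    joins, where = [], []
    for i, key in enumerate(KEYS):
        joins.append(f"LEFT JOIN attrs g{i} ON g{i}.target = t.id")
        joins.append(
            f"JOIN attrs a{i} ON a{i}.target = t.id "
            f"AND a{i}.attribute_key = '{key}'"
        )
        where.append(
            f"(a{i}.attribute_value IS NULL "
            f"OR a{i}.attribute_value NOT IN ({IN_LIST}))"
        )
    sql = (
        "SELECT DISTINCT t.id FROM target t "
        + " ".join(joins) + " WHERE " + " AND ".join(where)
    )
    return {row[0] for row in con.execute(sql)}


def exists_form(con: sqlite3.Connection) -> set:
    """AFTER shape: one correlated EXISTS per condition."""
    clauses = [
        "EXISTS (SELECT 1 FROM attrs a WHERE a.target = t.id "
        f"AND a.attribute_key = '{key}' "
        f"AND (a.attribute_value IS NULL "
        f"OR a.attribute_value NOT IN ({IN_LIST})))"
        for key in KEYS
    ]
    sql = "SELECT t.id FROM target t WHERE " + " AND ".join(clauses)
    return {row[0] for row in con.execute(sql)}


if __name__ == "__main__":
    con = build_db()
    print(join_form(con) == exists_form(con), len(exists_form(con)))
```

Only targets where every key holds an allowed value or NULL pass, which is 2^3 = 8 of the 64 combinations in this data set. The check says nothing about how a JPA provider renders the Criteria query; it only confirms that the simplified SQL shapes agree.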
Three AND conditions (positive operators): attribute.k1==v1 and attribute.k2==v2 and attribute.k3==v3
-- BEFORE
SELECT DISTINCT t.* FROM target t
LEFT JOIN attrs a1 ON a1.target = t.id
LEFT JOIN attrs a2 ON a2.target = t.id
LEFT JOIN attrs a3 ON a3.target = t.id
WHERE a1.key=? AND a1.value=?
AND a2.key=? AND a2.value=?
AND a3.key=? AND a3.value=?
-- Intermediate rows per target before WHERE filtering, assuming 100 attributes per target: 100^3 = 1,000,000
-- AFTER
SELECT DISTINCT t.* FROM target t
WHERE EXISTS (SELECT 1 FROM attrs WHERE target=t.id AND key=? AND value=?)
AND EXISTS (SELECT 1 FROM attrs WHERE target=t.id AND key=? AND value=?)
AND EXISTS (SELECT 1 FROM attrs WHERE target=t.id AND key=? AND value=?)
-- 3 PK index lookups per target