feat(search): cache never-match queries via the predicate cache by PSeitz · Pull Request #6556 · quickwit-oss/quickwit

PSeitz · 2026-06-25T16:11:47Z

Repeated super-selective or non-existent term queries over many splits re-ran warmup on every request.

When warmup proves a split empty (a required term has an empty posting list), record a fake empty entry in the existing predicate cache, and consult it before warmup so later requests with the same predicate short-circuit with no storage reads.
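The flow described above can be sketched in a few lines. This is a minimal illustration, not the PR's code: `EmptySplitCache`, `SplitOutcome`, and `search_split` are hypothetical names, the real predicate cache has its own types and key format, and warmup is simulated by a closure that returns `true` when it finds an empty posting list for a required term.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the predicate cache: remembers
/// (split_id, predicate_key) pairs that warmup has proven empty.
#[derive(Default)]
pub struct EmptySplitCache {
    known_empty: HashSet<(String, String)>,
}

/// Outcome of a simulated search over one split.
#[derive(Debug, PartialEq)]
pub enum SplitOutcome {
    /// Cache hit before warmup: the split is skipped with no storage reads.
    SkippedFromCache,
    /// Warmup ran and proved the split empty; the result is now cached.
    ProvenEmpty,
    /// Warmup ran and the split may contain hits.
    Searched,
}

impl EmptySplitCache {
    /// `warmup_proves_empty` stands in for the real warmup (an assumption):
    /// it returns true when a required term has an empty posting list.
    pub fn search_split(
        &mut self,
        split_id: &str,
        predicate_key: &str,
        warmup_proves_empty: impl FnOnce() -> bool,
    ) -> SplitOutcome {
        let key = (split_id.to_string(), predicate_key.to_string());
        // Consult the cache first so a known-empty split never reaches warmup.
        if self.known_empty.contains(&key) {
            return SplitOutcome::SkippedFromCache;
        }
        if warmup_proves_empty() {
            // Record the "fake" empty entry for later requests.
            self.known_empty.insert(key);
            SplitOutcome::ProvenEmpty
        } else {
            SplitOutcome::Searched
        }
    }
}
```

In this sketch the second request with the same split and predicate returns `SkippedFromCache` without invoking the warmup closure at all.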

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

trinity-1686a · 2026-06-26T09:35:45Z

+fn negative_cache_key(query_ast: &QueryAst) -> Option<String> {
+    let inner = match query_ast {
+        QueryAst::Cache(cache_node) => cache_node.inner.as_ref(),
+        other => other,
+    };
+    serde_json::to_string(inner).ok()
+}


I think it would be interesting for this to also extract the Must/Filter arms of a BoolQuery (and possibly a lone Should): if any "must" part of the query doesn't match, the whole query necessarily doesn't match.

In its current form it doesn't work the way I want yet. It doesn't match non-existent terms when they are part of a longer chain, e.g. "fielda:doesnotexist" followed by "fielda:doesnotexist fieldb:asdf".
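One way to picture the suggested extraction is a recursive walk that collects every term the query strictly requires. The sketch below uses a simplified, hypothetical `Query` enum rather than quickwit's `QueryAst`, and it treats a single Should clause with no Must/Filter siblings as required, which is an assumption about the intended "lonely Should" handling.

```rust
/// Simplified, hypothetical query AST (not quickwit's `QueryAst`).
#[derive(Debug, Clone)]
pub enum Query {
    Term { field: String, value: String },
    Bool {
        must: Vec<Query>,
        filter: Vec<Query>,
        should: Vec<Query>,
        must_not: Vec<Query>,
    },
    /// Wrapper node, unwrapped transparently like `QueryAst::Cache` in the diff.
    Cache(Box<Query>),
}

/// Returns the (field, value) terms that every matching document must contain.
/// If any one of them is absent from a split, the whole query cannot match there.
pub fn required_terms(query: &Query) -> Vec<(String, String)> {
    let mut out = Vec::new();
    collect(query, &mut out);
    out
}

fn collect(query: &Query, out: &mut Vec<(String, String)>) {
    match query {
        Query::Term { field, value } => out.push((field.clone(), value.clone())),
        Query::Cache(inner) => collect(inner, out),
        Query::Bool { must, filter, should, .. } => {
            // Must and Filter arms are both required for a match.
            for clause in must.iter().chain(filter.iter()) {
                collect(clause, out);
            }
            // Assumption: a lone Should with no Must/Filter is effectively required.
            if must.is_empty() && filter.is_empty() && should.len() == 1 {
                collect(&should[0], out);
            }
            // MustNot and multi-clause Should arms never make a term required.
        }
    }
}
```

With this shape, "fielda:doesnotexist" and "fielda:doesnotexist fieldb:asdf" (as a conjunction) both yield the required term `fielda:doesnotexist`, so an absence recorded for the first query could also prune the second.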

The negative cache keyed each provably-empty split on the whole query AST, so any added/removed filter or different time window produced a new key. In production this gave near-zero reuse: every filter permutation re-opened and re-probed all splits. A required term's absence in a split is immutable and independent of the rest of the query and of the time window. Key the cache on (split, term) instead: warmup reports each required term it proves absent via an `on_absent` callback, and a query short-circuits before warmup when any of its required terms is already known absent. Adding required terms can only make a query emptier, so cached absences keep pruning as filters change.
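A rough sketch of the (split, term) keying follows. All names except `on_absent` are hypothetical (`AbsentTermCache`, `record_absent`, `is_provably_empty`, and the simulated `warmup`), and the split's term dictionary is modelled as a plain `HashSet`; the actual PR wires this through the existing predicate cache and the real warmup path.

```rust
use std::collections::HashSet;

/// A (field, value) pair; a stand-in for a real term type.
pub type Term = (String, String);

/// Hypothetical cache of terms proven absent, keyed on (split_id, term).
#[derive(Default)]
pub struct AbsentTermCache {
    absent: HashSet<(String, Term)>,
}

impl AbsentTermCache {
    pub fn record_absent(&mut self, split_id: &str, term: &Term) {
        self.absent.insert((split_id.to_string(), term.clone()));
    }

    /// A query is provably empty on a split as soon as ANY of its
    /// required terms is already known to be absent there.
    pub fn is_provably_empty(&self, split_id: &str, required_terms: &[Term]) -> bool {
        required_terms
            .iter()
            .any(|term| self.absent.contains(&(split_id.to_string(), term.clone())))
    }
}

/// Simulated warmup: looks up each required term in the split's term set and
/// reports every missing one through `on_absent`. Returns true if the split
/// was proven empty for this query.
pub fn warmup(
    split_terms: &HashSet<Term>,
    required_terms: &[Term],
    mut on_absent: impl FnMut(&Term),
) -> bool {
    let mut proven_empty = false;
    for term in required_terms {
        if !split_terms.contains(term) {
            on_absent(term);
            proven_empty = true;
        }
    }
    proven_empty
}
```

Because the key no longer includes the rest of the query or the time window, an absence recorded while serving one query keeps short-circuiting any later query on that split that requires the same term.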

PSeitz requested a review from a team as a code owner June 25, 2026 16:11

trinity-1686a reviewed Jun 26, 2026

PSeitz-dd force-pushed the cache_terms branch from c91551d to 4c8945f Compare June 26, 2026 18:55

PSeitz-dd force-pushed the cache_terms branch from 4c8945f to 55c442a Compare June 26, 2026 19:25

PSeitz wants to merge 2 commits into main from cache_terms