feat(search): cache never-match queries via the predicate cache#6556

Open
PSeitz wants to merge 2 commits into main from cache_terms

PSeitz commented Jun 25, 2026

Collaborator

Repeated super-selective or non-existent term queries over many splits re-ran warmup on every request.

When warmup proves a split empty (a required term has an empty posting list), record a fake empty entry in the existing predicate cache, and consult it before warmup so later requests with the same predicate short-circuit with no storage reads.
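The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `EmptySplitCache`, `LeafOutcome`, `search_split`, and the string predicate key are hypothetical stand-ins, and the warmup step is reduced to a closure that reports whether a required term had an empty posting list.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the predicate cache. It only remembers
/// (split_id, predicate_key) pairs that warmup has proven empty.
#[derive(Default)]
struct EmptySplitCache {
    known_empty: HashSet<(String, String)>,
}

impl EmptySplitCache {
    fn is_known_empty(&self, split_id: &str, predicate_key: &str) -> bool {
        self.known_empty
            .contains(&(split_id.to_string(), predicate_key.to_string()))
    }

    fn record_empty(&mut self, split_id: &str, predicate_key: &str) {
        self.known_empty
            .insert((split_id.to_string(), predicate_key.to_string()));
    }
}

/// What happened when one split was searched.
#[derive(Debug, PartialEq)]
enum LeafOutcome {
    SkippedFromCache,
    ProvedEmptyByWarmup,
    Searched,
}

/// `warmup_finds_empty_posting_list` stands in for the real warmup,
/// which would read from storage.
fn search_split(
    cache: &mut EmptySplitCache,
    split_id: &str,
    predicate_key: &str,
    warmup_finds_empty_posting_list: impl FnOnce() -> bool,
) -> LeafOutcome {
    // Consult the cache first: a hit means no storage reads at all.
    if cache.is_known_empty(split_id, predicate_key) {
        return LeafOutcome::SkippedFromCache;
    }
    // Otherwise run warmup and remember the split if it is proven empty.
    if warmup_finds_empty_posting_list() {
        cache.record_empty(split_id, predicate_key);
        return LeafOutcome::ProvedEmptyByWarmup;
    }
    LeafOutcome::Searched
}
```

In this sketch the second request for the same split and predicate returns `SkippedFromCache` without ever invoking the warmup closure.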

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PSeitz requested a review from a team as a code owner June 25, 2026 16:11
Comment thread quickwit/quickwit-search/src/leaf.rs Outdated
Comment on lines +634 to +640
fn negative_cache_key(query_ast: &QueryAst) -> Option<String> {
    let inner = match query_ast {
        QueryAst::Cache(cache_node) => cache_node.inner.as_ref(),
        other => other,
    };
    serde_json::to_string(inner).ok()
}

Contributor

I think it would be interesting for this to also extract the Must/Filter arms of a BoolQuery (and possibly a lone Should): if any "must" part of the query doesn't match, the whole query necessarily doesn't match.
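A rough sketch of what that extraction could look like. The `Query` enum below is a simplified, hypothetical AST, not Quickwit's `QueryAst`, and treating a lone `should` clause as required assumes the usual semantics where a bool query without must/filter clauses needs at least one `should` to match.

```rust
/// Simplified, hypothetical query AST (not Quickwit's `QueryAst`).
#[derive(Debug, Clone)]
enum Query {
    Term { field: String, value: String },
    Bool {
        must: Vec<Query>,
        filter: Vec<Query>,
        should: Vec<Query>,
        must_not: Vec<Query>,
    },
}

/// Convenience constructor for a term query.
fn term(field: &str, value: &str) -> Query {
    Query::Term {
        field: field.to_string(),
        value: value.to_string(),
    }
}

/// Collects every (field, value) term that a document must contain to match.
/// If any one of them is absent from a split, the whole query matches nothing there.
fn required_terms(query: &Query, out: &mut Vec<(String, String)>) {
    match query {
        Query::Term { field, value } => out.push((field.clone(), value.clone())),
        Query::Bool { must, filter, should, .. } => {
            // Must and Filter clauses are both required.
            for clause in must.iter().chain(filter.iter()) {
                required_terms(clause, out);
            }
            // Assumption: a single Should with no Must/Filter is effectively required.
            if must.is_empty() && filter.is_empty() && should.len() == 1 {
                required_terms(&should[0], out);
            }
            // MustNot clauses never prove a query empty, so they are ignored.
        }
    }
}
```

With this shape, `fielda:doesnotexist` is reported as required whether it appears alone or as one Must arm among several clauses.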

Collaborator Author

In its current form it doesn't work the way I want yet. It doesn't match non-existent terms if they are part of a longer chain, e.g. an entry cached for "fielda:doesnotexist" is not reused for "fielda:doesnotexist fieldb:asdf".

The negative cache keyed each provably-empty split on the whole query AST,
so any added/removed filter or different time window produced a new key. In
production this gave near-zero reuse: every filter permutation re-opened and
re-probed all splits.

A required term's absence in a split is immutable and independent of the
rest of the query and of the time window. Key the cache on (split, term)
instead: warmup reports each required term it proves absent via an
`on_absent` callback, and a query short-circuits before warmup when any of
its required terms is already known absent. Adding required terms can only
make a query emptier, so cached absences keep pruning as filters change.
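A minimal sketch of the (split, term) keying described in this commit message. All names here (`AbsentTermCache`, `warmup`, `search_split_by_term`) are hypothetical, only the `on_absent` callback name comes from the text above, and the split's term dictionary is modeled as a plain `HashSet`.

```rust
use std::collections::HashSet;

/// A (field, value) pair; hypothetical stand-in for an indexed term.
type Term = (String, String);

/// Remembers terms proven absent from a split, keyed on (split_id, term).
#[derive(Default)]
struct AbsentTermCache {
    absent: HashSet<(String, Term)>,
}

impl AbsentTermCache {
    fn any_known_absent(&self, split_id: &str, required: &[Term]) -> bool {
        required
            .iter()
            .any(|t| self.absent.contains(&(split_id.to_string(), t.clone())))
    }

    fn record_absent(&mut self, split_id: &str, term: &Term) {
        self.absent.insert((split_id.to_string(), term.clone()));
    }
}

/// Stand-in for warmup: reports each required term missing from the split
/// through the `on_absent` callback.
fn warmup(split_terms: &HashSet<Term>, required: &[Term], mut on_absent: impl FnMut(&Term)) {
    for term in required {
        if !split_terms.contains(term) {
            on_absent(term);
        }
    }
}

/// Returns `true` if warmup ran, `false` if the cache short-circuited it.
fn search_split_by_term(
    cache: &mut AbsentTermCache,
    split_id: &str,
    split_terms: &HashSet<Term>,
    required: &[Term],
) -> bool {
    // One known-absent required term is enough to skip the split.
    if cache.any_known_absent(split_id, required) {
        return false;
    }
    warmup(split_terms, required, |term| cache.record_absent(split_id, term));
    true
}
```

Because the key no longer includes the rest of the query, an absence recorded while serving `fielda:doesnotexist` also prunes `fielda:doesnotexist fieldb:asdf` on the same split, which is the reuse the whole-query key could not provide.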
