3 changes: 2 additions & 1 deletion README.md
@@ -66,7 +66,8 @@ Pass `decode_responses=True` to the `Redis` client if you want string keys inste
- [x] `LIMIT` and `OFFSET` pagination
- [x] Computed fields: `price * 0.9 AS discounted`
- [x] Vector KNN search: `vector_distance(field, :param)`
- [x] Hybrid search (filters + vector)
- [x] Pre-filter hybrid search (filters + vector KNN)
- [x] Hybrid fusion search: `hybrid_vector_search(...)` translates to `FT.HYBRID` (text + vector fused via RRF/LINEAR, requires Redis 8.4+)
- [x] Full-text search: exact phrase, fuzzy, proximity, OR/union, LIKE patterns, BM25 scoring
- [x] GEO field queries with full operator support
- [x] Date functions: `YEAR()`, `MONTH()`, `DAY()`, `DATE_FORMAT()`, etc.
83 changes: 81 additions & 2 deletions docs/user_guide/how_to_guides/vector-search.md
@@ -33,9 +33,12 @@ result = executor.execute(

`vector_distance(field, :param)` is the function that triggers a KNN search. The `LIMIT` becomes the K value.

## Hybrid: filter then KNN
## Pre-filter hybrid search (filter then KNN)

Combine a `WHERE` clause with `vector_distance`:
Combine a `WHERE` clause with `vector_distance`. Here text and tags act only as a
hard filter and the ranking comes from the vector leg alone. For true text plus
vector fusion, where both legs are ranked independently and combined, see
[Hybrid fusion (FT.HYBRID)](#hybrid-fusion-fthybrid) below.

```python
result = executor.execute(
@@ -51,6 +54,82 @@ result = executor.execute(

The filter narrows the candidate set; the KNN runs over what survives.
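
To make the order of operations concrete, here is a minimal pure-Python sketch of filter-then-KNN over an in-memory list. It only illustrates the semantics described above and is not how Redis or this library implements the search. The `products` records and the `cosine_distance` and `prefilter_knn` helpers are hypothetical names invented for the example.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - cosine similarity (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def prefilter_knn(rows, predicate, query_vec, k):
    """Apply the hard filter first, then rank only the survivors by distance."""
    survivors = [r for r in rows if predicate(r)]
    survivors.sort(key=lambda r: cosine_distance(r["embedding"], query_vec))
    return survivors[:k]


# Hypothetical toy data; real embeddings would have many more dimensions.
products = [
    {"title": "gaming laptop", "category": "electronics", "embedding": [1.0, 0.0]},
    {"title": "office laptop", "category": "electronics", "embedding": [0.6, 0.8]},
    {"title": "laptop sleeve", "category": "accessories", "embedding": [1.0, 0.1]},
]

top = prefilter_knn(
    products,
    lambda r: r["category"] == "electronics",  # plays the role of the WHERE clause
    [1.0, 0.0],
    k=2,  # plays the role of LIMIT
)
titles = [r["title"] for r in top]
```

In this toy data the `laptop sleeve` row is closer to the query vector than `office laptop`, but it never reaches the ranking step because the filter has already removed it.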

## Hybrid fusion (FT.HYBRID)

`hybrid_vector_search()` fuses a full-text query and a vector query into a single
ranking server-side using Redis `FT.HYBRID` (Redis 8.4+, redis-py >= 7.1.0). Unlike
pre-filter hybrid search above, both legs are ranked independently and combined with
reciprocal rank fusion (RRF) or a linear weighting, so strong text matches and strong
vector matches both surface.

It composes the vector function (`cosine_distance` or `vector_distance`) and the text
function (`fulltext`), with `rrf()` or `linear()` selecting the fusion method:

```python
result = executor.execute(
    """
    SELECT title,
           hybrid_vector_search(
               cosine_distance(embedding, :vec),
               fulltext(title, 'gaming laptop'),
               rrf()
           ) AS hybrid_score
    FROM products
    WHERE category = 'electronics'
    ORDER BY hybrid_score DESC
    LIMIT 5
    """,
    params={"vec": query_vec},
)
```

- The vector leg (`cosine_distance(field, :vec)`) and the text leg
(`fulltext(field, 'query')`) are ranked separately and then fused.
- A `WHERE` clause is applied to both legs as a filter.
- `AS hybrid_score` returns the fused score as a column; `ORDER BY hybrid_score DESC`
sorts by it.

### Fusion methods and knobs

`rrf()` (the default) uses reciprocal rank fusion. `linear()` uses a weighted sum in
which `alpha` weights the text leg and `beta`, the vector-leg weight, is derived as `1 - alpha`:

```python
# RRF with explicit knobs
hybrid_vector_search(
    cosine_distance(embedding, :vec),
    fulltext(title, 'laptop'),
    rrf(constant => 60, window => 20)
)

# LINEAR weighting
hybrid_vector_search(
    cosine_distance(embedding, :vec),
    fulltext(title, 'laptop'),
    linear(alpha => 0.3)
)
```
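
For intuition about what the two fusion methods compute, the following is a rough pure-Python sketch of the textbook formulas. It is an illustration under stated assumptions, not the server-side `FT.HYBRID` implementation: it assumes `window` caps how many top results from each leg take part in the fusion, and that the linear scores of both legs are already on comparable scales. The function names `rrf_fuse` and `linear_fuse` and the toy document ids are hypothetical.

```python
def rrf_fuse(text_ranked, vector_ranked, constant=60, window=20):
    """Reciprocal rank fusion: each leg contributes 1 / (constant + rank)."""
    scores = {}
    for ranked in (text_ranked, vector_ranked):
        # Assumption: only the top `window` results of each leg are considered.
        for rank, doc_id in enumerate(ranked[:window], start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (constant + rank)
    order = sorted(scores, key=lambda d: scores[d], reverse=True)
    return order, scores


def linear_fuse(text_scores, vector_scores, alpha=0.3):
    """Weighted sum: alpha * text score + (1 - alpha) * vector score."""
    beta = 1.0 - alpha
    doc_ids = set(text_scores) | set(vector_scores)
    fused = {
        d: alpha * text_scores.get(d, 0.0) + beta * vector_scores.get(d, 0.0)
        for d in doc_ids
    }
    order = sorted(fused, key=lambda d: fused[d], reverse=True)
    return order, fused


# Toy rankings: best match first in each leg.
rrf_order, rrf_scores = rrf_fuse(["a", "b", "c"], ["c", "a", "d"])

# Toy per-leg scores; a document missing from a leg contributes 0 there.
linear_order, linear_scores = linear_fuse(
    {"a": 0.9, "b": 0.5},
    {"a": 0.2, "c": 0.8},
    alpha=0.3,
)
```

RRF looks only at rank positions, so it needs no score normalization. The linear method depends on the raw scores of each leg, which is why the choice of `alpha` matters more when the two legs score on different scales.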

A custom text scorer can be set on the text leg
(`fulltext(title, 'laptop', scorer => 'BM25STD')`). Vector-leg tuning options are passed
through the `vector_distance` and `vector_range` forms, not `cosine_distance`:

```python
# KNN exploration factor
hybrid_vector_search(
    vector_distance(embedding, :vec, ef_runtime => 20),
    fulltext(title, 'laptop'),
    rrf()
)

# Vector range instead of KNN
hybrid_vector_search(
    vector_range(embedding, :vec, radius => 0.2),
    fulltext(title, 'laptop'),
    rrf()
)
```
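
The difference between the KNN and range forms of the vector leg can be sketched in plain Python. This is a conceptual illustration only, assuming cosine distance as the metric. The `vectors` data and the `knn` and `within_radius` helpers are hypothetical, and the sketch does not model the approximate index that `ef_runtime` tunes.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def knn(vectors, query_vec, k):
    """Return the k nearest ids, however far away they are."""
    ranked = sorted(vectors, key=lambda d: cosine_distance(vectors[d], query_vec))
    return ranked[:k]


def within_radius(vectors, query_vec, radius):
    """Return every id whose distance is at most `radius`, nearest first."""
    scored = sorted((cosine_distance(v, query_vec), d) for d, v in vectors.items())
    return [d for dist, d in scored if dist <= radius]


# Hypothetical toy vectors keyed by document id.
vectors = {"a": [1.0, 0.0], "b": [0.6, 0.8], "c": [0.0, 1.0]}

nearest_one = knn(vectors, [1.0, 0.0], k=1)
in_range = within_radius(vectors, [1.0, 0.0], radius=0.5)
```

KNN always returns a fixed number of candidates, while a range query returns a variable number that depends on how many vectors fall inside the radius.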

## Returning the score

`vector_distance(...) AS alias` is required for the score to come back as a column. The result rows include the alias as a key.