vec_distance_cosine returns NaN for zero-magnitude vectors #8

zero-magnitude vectors are obviously bad input, but yielding NaN for them causes NaN poisoning that spreads to downstream results (e.g. corrupted KNN ordering, because NaN has no well-defined position in a comparison-based sort).
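to illustrate why a single NaN distance is enough to make a ranking unreliable, here is a minimal sketch in plain Python. it does not call sqlite-vec; the `rows` list and its distance values are hypothetical stand-ins for the three rows in the reproducer, and the NaN is injected by hand.

```python
import math

# Cosine distance is 1 - dot(a, b) / (|a| * |b|). With a zero vector the
# quotient is 0/0, which IEEE 754 defines as NaN. Plain Python raises
# ZeroDivisionError for 0.0 / 0.0, so the NaN is written out directly here.
nan_distance = math.nan

# Hypothetical (rowid, distance) pairs; row 2 stands in for the zero vector,
# and the distance for row 3 is only an approximate illustrative value.
rows = [(1, 0.0), (2, nan_distance), (3, 0.006)]

# Every ordered comparison against NaN is False, and NaN is not equal to
# itself, so a comparison-based sort has no consistent place to put it.
nan_comparisons = [
    nan_distance < 0.5,
    nan_distance > 0.5,
    nan_distance == nan_distance,
]

# Dropping undefined distances before ranking keeps the order well defined.
ranked = sorted(
    (row for row in rows if not math.isnan(row[1])),
    key=lambda row: row[1],
)
```

filtering like this is only a caller-side workaround; it does not address what the extension itself should return.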

not sure whether this is an intentional design decision, so thought i would ask.

some simple reproducers:

```sql
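-- table with a cosine distance metric; row 2 is a zero-magnitude vector
CREATE VIRTUAL TABLE t USING vec0(v float[3] distance_metric=cosine);

INSERT INTO t(rowid, v) VALUES
(1, '[1.0, 0.0, 0.0]'),
(2, '[0.0, 0.0, 0.0]'),
(3, '[0.9, 0.1, 0.0]');

-- reproducer 1: scalar distance with a zero vector as one argument
SELECT vec_distance_cosine('[0,0,0]', '[1,2,3]');

-- reproducer 2: KNN query over a table that contains a zero vector
SELECT rowid, distance FROM t
WHERE v MATCH '[1.0, 0.0, 0.0]'
ORDER BY distance
LIMIT 2;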
```

i'd suggest returning NULL instead of dividing by zero (ditto for vec_normalize), and/or rejecting zero vectors at insert time when the table uses distance_metric=cosine.
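the guard suggested above could look roughly like the following sketch. it is written in Python purely to show the logic and is not sqlite-vec's implementation; `cosine_distance` and `normalize` are hypothetical helper names, and `None` stands in for SQL NULL.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(a, b), or None if either vector has zero magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # The distance is undefined here; report that explicitly
        # (None standing in for NULL) instead of computing 0/0.
        return None
    return 1.0 - dot / (norm_a * norm_b)


def normalize(v):
    """Return v scaled to unit length, or None for a zero-magnitude vector."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return None
    return [x / norm for x in v]


# The scalar reproducer from the issue, run against the guarded sketch.
zero_case = cosine_distance([0.0, 0.0, 0.0], [1.0, 2.0, 3.0])
same_direction = cosine_distance([1.0, 0.0, 0.0], [1.0, 0.0, 0.0])
```

the same magnitude check could back an insert-time rejection instead: validate once when the row is written rather than on every distance computation.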

