feat(binding): use VectorViewClause zero-copy for Python dense vector query #517
Open
egolearner wants to merge 2 commits into
bc74617 to 9c787e7
… query

Python SDK: replace serialize_vector memcpy with VectorViewClause that points directly at the numpy buffer. py::keep_alive<1,3> on set_vector ensures the numpy array outlives the _SearchQuery object. get_vector and pickle __getstate__ now use get_vector_view() to handle both VectorClause (sparse / unpickled) and VectorViewClause (dense zero-copy).

C SDK: add comments explaining why zero-copy is not used: the C API does not require the caller's data buffer to stay alive after the set_query_vector call returns.
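To make the difference between the copy path and the view path concrete, here is a minimal sketch using only the Python standard library. It is not the SDK's code: `array.array` stands in for a float32 numpy array, `tobytes()` stands in for the old serialize-and-memcpy step, and `memoryview` stands in for a non-owning view such as VectorViewClause.

```python
from array import array

# Stand-in for a float32 numpy array holding the dense query vector.
query = array("f", [0.1, 0.2, 0.3])

# Copy path (analogous to serializing with memcpy): a new, independent bytes object.
copied = query.tobytes()

# View path (analogous to a view clause): no bytes are copied; the view
# refers to the memory owned by `query`.
view = memoryview(query)

shares_buffer = view.obj is query          # the view references the original object
same_content = view.tobytes() == copied    # both paths expose the same bytes
view_nbytes = view.nbytes                  # 3 floats * 4 bytes each
```

The view costs nothing to create, but it is only valid while the owning object stays alive and unchanged, which is what the review comments below are about.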
9c787e7 to 84901f5
Cuiyus
reviewed
Jul 3, 2026
[](const SearchQuery &sq) {
  SubQuery sub;
  sub.num_candidates_ = sq.topk_;
  sub.target_ = sq.target_;
Collaborator
Here sq holds a view_clause. After sub.target_ = sq.target_, the keep-alive between the view and the numpy buffer should no longer apply, right? If the Python caller accidentally releases the search_query at this point, won't the view be left dangling?
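The concern can be sketched in pure Python. This is an illustration of the question being asked, not a confirmed bug in the PR: all class names below are hypothetical stand-ins, a weak reference plays the role of the raw pointer inside a view clause, and the prompt release of the buffer relies on CPython reference counting. In C++ the failing read would be undefined behavior rather than an exception.

```python
import gc
import weakref


class Buffer:
    """Hypothetical stand-in for the numpy array that owns the vector data."""
    def __init__(self, values):
        self.values = list(values)


class ViewClause:
    """Hypothetical non-owning view: it holds only a weak reference."""
    def __init__(self, buf):
        self._ref = weakref.ref(buf)

    def read(self):
        buf = self._ref()
        if buf is None:
            raise ReferenceError("view is dangling: backing buffer was released")
        return list(buf.values)


class SearchQuery:
    """Owner side: keeps the buffer alive, in the spirit of keep_alive<1, 3>."""
    def set_vector(self, buf):
        self._keep_alive = buf          # lifetime is tied to this object only
        self.target = ViewClause(buf)


class SubQuery:
    """Receives a copy of the view, but no reference to the buffer."""
    target = None


def make_sub_query():
    sq = SearchQuery()
    sq.set_vector(Buffer([1.0, 2.0, 3.0]))
    sub = SubQuery()
    sub.target = sq.target              # copies the view, not the keep-alive
    assert sub.target.read() == [1.0, 2.0, 3.0]  # fine while sq is alive
    return sub                          # sq is released when this returns


sub = make_sub_query()
gc.collect()
try:
    sub.target.read()
    dangling = False
except ReferenceError:
    dangling = True
```

Two ways to avoid this in a design like the sketch are to make the sub-query hold its own reference to the buffer, or to copy the data when the view leaves the object that carries the keep-alive.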
      throw py::type_error("Unsupported vector field type for field: " +
                           field_schema.name());
    },
    py::keep_alive<1, 3>())
Collaborator
Should the docstring state that the numpy array must not be modified while the query is in progress?
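A small standard-library sketch of why such a note matters: with a zero-copy view, a write to the caller's array is visible to whoever reads through the view, whereas a copy taken up front is unaffected. `array.array` and `memoryview` are stand-ins here for the numpy array and the view clause; no SDK API is used.

```python
from array import array

query = array("f", [1.0, 2.0, 3.0])    # caller-owned vector data

snapshot = array("f", query)           # copy path: independent data
view = memoryview(query)               # zero-copy path: shares the caller's memory

# The caller mutates the array after handing it over (for example, by
# reusing the buffer for the next query).
query[0] = 9.0

view_first = view[0]                   # reads the mutated value
snapshot_first = snapshot[0]           # still the original value
```

With the copy path the old behavior was safe by construction; with the view path the caller has to treat the array as read-only until the query completes.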
const auto buf = arr.request();
switch (data_type) {
  case DataType::VECTOR_FP32: {
    self.target_.set_vector(serialize_vector<float>(