feat(binding): use VectorViewClause zero-copy for Python dense vector query #517

Open
egolearner wants to merge 2 commits into alibaba:main from egolearner:feat/sdk-zero-copy-vector

Conversation

@egolearner

Collaborator

Python SDK: replace serialize_vector memcpy with VectorViewClause that points directly at the numpy buffer. py::keep_alive<1,3> on set_vector ensures the numpy array outlives the _SearchQuery object. get_vector and pickle __getstate__ now use get_vector_view() to handle both VectorClause (sparse / unpickled) and VectorViewClause (dense zero-copy).

C SDK: add comments explaining why zero-copy is not used — the C API does not require the caller's data buffer to stay alive after the set_query_vector call returns.

@egolearner egolearner requested a review from Cuiyus as a code owner June 23, 2026 07:30
@egolearner egolearner force-pushed the feat/sdk-zero-copy-vector branch from bc74617 to 9c787e7 Compare June 23, 2026 07:31
[](const SearchQuery &sq) {
SubQuery sub;
sub.num_candidates_ = sq.topk_;
sub.target_ = sq.target_;

Collaborator


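Here sq stores a view_clause. After `sub.target_ = sq.target_`, the keep-alive link between the view and the numpy buffer should no longer apply, right? If the Python caller accidentally releases search_query at this point, will the view be left dangling?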
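
The lifetime concern raised here can be modelled in plain Python. This is only a rough sketch, not the binding's actual behaviour: a weak reference stands in for the non-owning view, and a strong reference held by the query object stands in for `py::keep_alive<1, 3>`. `Buffer`, `SearchQuery`, and `SubQuery` below are hypothetical stand-ins, not the SDK's classes.

```python
import gc
import weakref


class Buffer:
    """Hypothetical stand-in for the numpy array that owns the vector data."""

    def __init__(self, values):
        self.values = list(values)


class SearchQuery:
    """Hypothetical stand-in for _SearchQuery."""

    def __init__(self, buf):
        self.view = weakref.ref(buf)  # non-owning, like a raw pointer view
        self._keep_alive = buf        # owning, models py::keep_alive<1, 3>


class SubQuery:
    """Hypothetical stand-in for the C++ SubQuery: copies the view only."""

    def __init__(self, sq):
        self.view = sq.view           # the keep-alive link is not copied


def build_sub_query():
    buf = Buffer([0.1, 0.2, 0.3])
    sq = SearchQuery(buf)
    sub = SubQuery(sq)
    alive_while_query_exists = sub.view() is not None
    del buf, sq                       # drop the array and the search query
    gc.collect()
    return sub, alive_while_query_exists


sub, alive_before = build_sub_query()
dangling_after = sub.view() is None   # the copied view no longer has a target
print(alive_before, dangling_after)
```

In this model the view stays valid only while something that owns the buffer is still reachable. Whether the real code path is affected depends on whether the `SubQuery` can outlive the `_SearchQuery` it was built from, which the snippet above does not show.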

throw py::type_error("Unsupported vector field type for field: " +
field_schema.name());
},
py::keep_alive<1, 3>())

Collaborator


Shouldn't the docstring state that the numpy array must not be modified while the query is in use?
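
A small standard-library illustration of why such a note matters, assuming the view aliases the caller's memory the way a `memoryview` does. Here `array.array` stands in for the numpy array, and the variable names are made up for the example:

```python
from array import array

# Caller-owned float32 buffer (stands in for the numpy array).
query_vec = array("f", [1.0, 2.0, 3.0])

copied = array("f", query_vec)   # snapshot, like the old memcpy-based path
viewed = memoryview(query_vec)   # zero-copy view over the same memory

# The caller mutates the buffer after handing it to the "query".
query_vec[0] = 9.0

print(copied[0])                 # still the value at the time of the copy
print(viewed[0])                 # reflects the caller's later write
print(viewed.obj is query_vec)   # the view refers back to the original buffer
```

With the copy, the query keeps the values it was given. With the view, any write to the array between `set_vector` and query execution changes what is searched.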

const auto buf = arr.request();
switch (data_type) {
case DataType::VECTOR_FP32: {
self.target_.set_vector(serialize_vector<float>(

Collaborator


serialize_vector is no longer used and can be removed.


2 participants