Add leanvec_primary_only build option to C++ runtime #323
Adds a new `leanvec_primary_only` parameter (default false) to the
LeanVec build entry points in the C++ runtime API:
- VamanaIndexLeanVec::build (both overloads)
- DynamicVamanaIndexLeanVec::build (DynamicIndexParams overloads)
The flag is stored in {Vamana,DynamicVamana}IndexLeanVecImpl and
forwarded through init_impl -> build_impl -> StorageFactory<LeanVec>::init
-> LeanDataset::reduce, where it skips secondary-dataset allocation
and disables reranking. ABI back-compat overloads are unchanged.
Compiles standalone with SVS_RUNTIME_HAVE_LVQ_LEANVEC=OFF (stub returns
NOT_IMPLEMENTED). When enabled, requires the matching LeanDataset::reduce
overload that accepts the primary_only argument (added in the private
repository alongside the LeanDataset save/load support).
Is there a corresponding FAISS update?
Yes, I will publish it as well once this PR is merged.
I'd recommend drafting a FAISS update and pointing the FAISS install in this PR to your FAISS branch; otherwise our CI will fail.
Makes sense, I will add the corresponding FAISS update.
ethanglaser
left a comment
These changes to test-cpp-runtime-bindings.sh should not be needed. Please restore the original lines as suggested, and let's see if CI runs smoothly.
| # Install libsvs-runtime from the public release tarball (includes the | ||
| # leanvec_primary_only API). Once a conda-forge package with this API is | ||
| # published, this can revert to: | ||
| # conda install -y /runtime_conda/libsvs-runtime-*.conda | ||
| SVS_RUNTIME_URL="${SVS_RUNTIME_URL:-https://github.com/intel/ScalableVectorSearch/releases/download/nightly/svs_runtime-0.3.0-linux-x86_64-leanvec-primary-only-glibc228.tar.gz}" | ||
| SVS_RUNTIME_PREFIX="${SVS_RUNTIME_PREFIX:-$HOME/svs_runtime}" | ||
| mkdir -p "$SVS_RUNTIME_PREFIX" | ||
| curl -fsSL "$SVS_RUNTIME_URL" | tar -xz -C "$SVS_RUNTIME_PREFIX" | ||
| export CMAKE_PREFIX_PATH="$SVS_RUNTIME_PREFIX${CMAKE_PREFIX_PATH:+:$CMAKE_PREFIX_PATH}" | ||
| export LD_LIBRARY_PATH="$SVS_RUNTIME_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" |
Suggested change:
| # Install libsvs-runtime from the public release tarball (includes the | |
| # leanvec_primary_only API). Once a conda-forge package with this API is | |
| # published, this can revert to: | |
| # conda install -y /runtime_conda/libsvs-runtime-*.conda | |
| SVS_RUNTIME_URL="${SVS_RUNTIME_URL:-https://github.com/intel/ScalableVectorSearch/releases/download/nightly/svs_runtime-0.3.0-linux-x86_64-leanvec-primary-only-glibc228.tar.gz}" | |
| SVS_RUNTIME_PREFIX="${SVS_RUNTIME_PREFIX:-$HOME/svs_runtime}" | |
| mkdir -p "$SVS_RUNTIME_PREFIX" | |
| curl -fsSL "$SVS_RUNTIME_URL" | tar -xz -C "$SVS_RUNTIME_PREFIX" | |
| export CMAKE_PREFIX_PATH="$SVS_RUNTIME_PREFIX${CMAKE_PREFIX_PATH:+:$CMAKE_PREFIX_PATH}" | |
| export LD_LIBRARY_PATH="$SVS_RUNTIME_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" | |
| # Install libsvs-runtime from local conda package | |
| conda install -y /runtime_conda/libsvs-runtime-*.conda |
| conda activate svsenv | ||
| conda config --set solver libmamba | ||
| conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools | ||
| conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools curl |
Suggested change:
| conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools curl | |
| conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools |
rfsaliev
left a comment
This API change does not seem to fit the existing API style.
I suggest adding new values to the StorageKind enum rather than adding extra arguments to the index-building routines.
| const VamanaIndex::SearchParams& default_search_params, | ||
| const VamanaIndex::DynamicIndexParams& dynamic_index_params | ||
| const VamanaIndex::DynamicIndexParams& dynamic_index_params, | ||
| bool leanvec_primary_only = false |
Why not just extend the StorageKind enum with new values such as LeanVec4 and LeanVec8?
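To make the trade-off in this comment concrete, here is a minimal sketch of encoding "primary-only" in an enum value instead of a trailing `bool`. It is not the runtime's code: `Kind`, `Traits`, and `needs_secondary` are hypothetical names, and the enumerators only mirror the naming proposed above.

```cpp
#include <cstddef>

// Hypothetical stand-in for a storage-kind enum. The enumerator names follow
// the review suggestion; this is not the runtime's real StorageKind.
enum class Kind { LeanVec4x8, LeanVec4 };

// Compile-time traits selected purely by the enum value.
template <Kind K> struct Traits;

template <> struct Traits<Kind::LeanVec4x8> {
    static constexpr std::size_t primary_bits = 4;
    static constexpr bool has_secondary = true;
};

template <> struct Traits<Kind::LeanVec4> {
    static constexpr std::size_t primary_bits = 4;
    static constexpr bool has_secondary = false; // primary-only variant
};

// Runtime dispatch from the enum to the compile-time traits. No extra
// build argument is needed: the kind alone decides whether a secondary
// dataset exists.
inline bool needs_secondary(Kind kind) {
    switch (kind) {
        case Kind::LeanVec4x8:
            return Traits<Kind::LeanVec4x8>::has_secondary;
        case Kind::LeanVec4:
            return Traits<Kind::LeanVec4>::has_secondary;
    }
    return true; // unreachable for valid enumerators
}
```

With this shape, the build signatures stay untouched and callers opt in by choosing a different kind; the cost is one extra enumerator per primary-only variant.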
| using allocator_type = rebind_extracted_allocator_t<std::byte, Alloc>; | ||
| using type = LeanDatasetType<8, 8, allocator_type>; | ||
| }; | ||
|
|
If:
- the `StorageKind` enum had `LeanVec4` and `LeanVec8` values, and
- `svs::leanvec::LeanDataset` supported `void` for the `T2` param,

then all changes in the C++ runtime would consist of just the following new lines:
// LeanVec Primary-Only Storage support
template <
size_t I1,
typename Alloc,
size_t LeanVecDims = svs::Dynamic,
size_t Extent = svs::Dynamic>
using LeanPrimaryOnly = svs::leanvec::LeanDataset<
svs::leanvec::UsingLVQ<I1>,
void,
LeanVecDims,
Extent,
Alloc>;
template <typename Alloc> struct StorageType<StorageKind::LeanVec4, Alloc> {
using allocator_type = rebind_extracted_allocator_t<std::byte, Alloc>;
using type = LeanPrimaryOnly<4, allocator_type>;
};
template <typename Alloc> struct StorageType<StorageKind::LeanVec8, Alloc> {
using allocator_type = rebind_extracted_allocator_t<std::byte, Alloc>;
using type = LeanPrimaryOnly<8, allocator_type>;
};
And no other changes would be needed.
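The proposal above depends on the dataset template accepting `void` as "no secondary". One way a template could do that is a partial specialization, sketched below. This assumes nothing about the real `svs::leanvec::LeanDataset`; `TwoLevelDataset`, `has_secondary`, and `bytes` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical two-level dataset: T1 is the primary element type and T2 the
// secondary element type. The general case stores both levels.
template <typename T1, typename T2> class TwoLevelDataset {
  public:
    static constexpr bool has_secondary = true;

    explicit TwoLevelDataset(std::size_t n) : primary_(n), secondary_(n) {}

    // Total payload size in bytes across both levels.
    std::size_t bytes() const {
        return primary_.size() * sizeof(T1) + secondary_.size() * sizeof(T2);
    }

  private:
    std::vector<T1> primary_;
    std::vector<T2> secondary_;
};

// Partial specialization for T2 = void: the secondary level is never
// declared, so it cannot be allocated by accident.
template <typename T1> class TwoLevelDataset<T1, void> {
  public:
    static constexpr bool has_secondary = false;

    explicit TwoLevelDataset(std::size_t n) : primary_(n) {}

    std::size_t bytes() const { return primary_.size() * sizeof(T1); }

  private:
    std::vector<T1> primary_;
};
```

Code that reranks can then branch on `has_secondary` at compile time, so the primary-only path carries no runtime flag.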
Summary
Adds a new `leanvec_primary_only` parameter (default `false`) to the LeanVec
build entry points in the C++ runtime API. When enabled, the LeanVec
secondary (full-precision) dataset is not allocated, roughly halving the
LeanVec memory footprint for workloads that don't need re-ranking.
API changes
A trailing `bool leanvec_primary_only = false` argument is added to:
- `VamanaIndexLeanVec::build` (both overloads)
- `DynamicVamanaIndexLeanVec::build` (the `DynamicIndexParams` overloads)

Existing call sites are source- and ABI-compatible: the parameter is
defaulted, and the legacy back-compat overloads are unchanged.
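The compatibility pattern described here can be sketched as follows. This is an illustration under stated assumptions, not the runtime's API: `Params`, `BuildResult`, and the free function `build` are hypothetical, and the legacy overload is assumed to differ by lacking the parameters struct, as the `DynamicIndexParams` wording suggests.

```cpp
#include <cstddef>

// Hypothetical parameter struct standing in for something like DynamicIndexParams.
struct Params {
    std::size_t blocksize = 1024;
};

// Hypothetical result type that records what the build was asked to do.
struct BuildResult {
    std::size_t dims;
    bool primary_only;
};

// Newer entry point: the flag is a trailing, defaulted argument, so existing
// calls that pass (dims, params) keep compiling unchanged.
inline BuildResult
build(std::size_t dims, const Params& params, bool primary_only = false) {
    (void)params; // unused in this sketch
    return BuildResult{dims, primary_only};
}

// Legacy overload with its signature left untouched; it forwards to the
// newer entry point with the flag disabled.
inline BuildResult build(std::size_t dims) {
    return build(dims, Params{}, false);
}
```

A default argument is filled in at the call site, so adding one changes the function's signature as seen by the linker. In a sketch like this, binary compatibility for already-compiled callers therefore comes from the overloads whose signatures did not change, while the default keeps newer call sites source-compatible.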
Plumbing
The flag is stored in `{Vamana,DynamicVamana}IndexLeanVecImpl` and forwarded
through `init_impl` → `build_impl` → `StorageFactory<LeanVec>::init` →
`LeanDataset::reduce`, where it skips secondary-dataset allocation and
disables re-ranking at search time.
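The forwarding chain above can be pictured with a much smaller model. The sketch below collapses it into two layers and uses hypothetical names throughout (`Dataset`, `reduce`, `IndexImpl`); it only illustrates storing the flag once and letting the innermost step decide whether a secondary dataset exists.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dataset with an optional secondary level.
struct Dataset {
    std::vector<float> primary;
    std::vector<float> secondary;
    bool has_secondary = false;

    // Re-ranking is only possible when a secondary level was built.
    bool can_rerank() const { return has_secondary; }
};

// Innermost step: the only place that acts on the flag.
inline Dataset reduce(const std::vector<float>& data, bool primary_only) {
    Dataset out;
    out.primary = data; // stands in for the reduced primary representation
    if (!primary_only) {
        out.secondary = data; // allocated only when re-ranking is wanted
        out.has_secondary = true;
    }
    return out;
}

// Outer layer: stores the flag at construction and forwards it on build.
class IndexImpl {
  public:
    explicit IndexImpl(bool primary_only) : primary_only_(primary_only) {}

    Dataset build(const std::vector<float>& data) const {
        return reduce(data, primary_only_);
    }

  private:
    bool primary_only_;
};
```

Keeping the decision in the innermost function means the intermediate layers only pass the value along and need no knowledge of what it controls.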