Index for ANN Search
Overview of Deep Lake's Index implementation for ANN search.
Deep Lake implements the Hierarchical Navigable Small World (HNSW) index for Approximate Nearest Neighbor (ANN) search. The index is based on the open-source Hnswlib package with added optimizations. The implementation enables users to run queries on more than 35M embeddings in less than 1 second.
Deep Lake's HNSW implementation adds rapid index creation, using multi-threading optimized for Deep Lake, and efficient memory management that reduces RAM usage.
RAM Cost >> On-disk Cost >> Object Storage Cost
Minimizing RAM usage and maximizing the use of object storage significantly reduces the cost of running a Vector Database. Deep Lake has a unique implementation of memory allocation that minimizes RAM requirements without any performance penalty.
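To give a sense of why RAM is the resource worth minimizing, the following back-of-the-envelope sketch estimates the memory needed to hold raw embeddings at the scale mentioned above. The embedding dimension (1536) and the float32 storage format are assumptions chosen for illustration; neither is stated in this document, and the estimate ignores any index overhead. It does not describe Deep Lake's actual memory allocation.

```python
def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed to hold the raw embedding vectors alone (no index overhead)."""
    return num_vectors * dim * bytes_per_value


NUM_VECTORS = 35_000_000  # scale quoted in this document
DIM = 1536                # assumption: dimension is not stated in this document

total_bytes = raw_vector_bytes(NUM_VECTORS, DIM)  # float32 assumed (4 bytes per value)
print(f"{total_bytes / 1e9:.2f} GB")  # prints "215.04 GB"
```

Under these assumptions, keeping every vector resident in RAM would take on the order of hundreds of gigabytes, which is why shifting data toward cheaper tiers matters for cost.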
Using the index requires installing:
pip install "deeplake[enterprise]"
By default, the index is turned off in Deep Lake. To enable it, specify a length threshold when initializing or loading the Vector Store. The index is applied once the Vector Store's length exceeds that threshold.
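The threshold rule can be sketched in plain Python as follows. This is an illustration of the behavior described here, not Deep Lake code: the function name `index_enabled` is hypothetical, `None` is used to model the default "off" state, and the strict "greater than" comparison is an assumed reading of "above which". The actual Deep Lake argument that carries the threshold is not shown in this section, so none is used here.

```python
from typing import Optional


def index_enabled(num_rows: int, threshold: Optional[int]) -> bool:
    """Return True when the ANN index would be applied under the threshold rule.

    threshold=None models the default, where the index is turned off.
    """
    if threshold is None:
        return False
    # Assumption: "above the threshold" is read as strictly greater than.
    return num_rows > threshold


print(index_enabled(500_000, None))         # default: index off
print(index_enabled(500_000, 1_000_000))    # below threshold
print(index_enabled(2_000_000, 1_000_000))  # above threshold
```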
The index currently has the following limitations, which are being addressed in upcoming releases:
The index does not support incremental updates. If any update is made to the dataset, the index is re-created.
If the search is performed using a combination of attribute and vector search, the index is not used and linear search is applied instead.
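To make the linear-search fallback concrete, here is a minimal sketch of a filtered brute-force search: rows are first filtered by an attribute predicate, then every remaining embedding is scored against the query. The function names, the row layout, and the choice of cosine similarity are assumptions made for illustration and are not Deep Lake's API or internals.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filtered_linear_search(
    query: Sequence[float],
    rows: List[Dict],
    predicate: Callable[[Dict], bool],
    k: int = 2,
) -> List[Tuple[int, float]]:
    """Score every row that passes the attribute filter and return the top k."""
    scored = [
        (row["id"], cosine_similarity(query, row["embedding"]))
        for row in rows
        if predicate(row)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data: each row has an id, an embedding, and a "year" attribute.
rows = [
    {"id": 0, "embedding": [1.0, 0.0], "year": 2021},
    {"id": 1, "embedding": [0.9, 0.1], "year": 2023},
    {"id": 2, "embedding": [0.0, 1.0], "year": 2023},
    {"id": 3, "embedding": [0.7, 0.7], "year": 2023},
]

hits = filtered_linear_search([1.0, 0.0], rows, lambda r: r["year"] == 2023, k=2)
print([row_id for row_id, _ in hits])  # prints [1, 3]
```

Because every row that passes the filter is scored, the cost grows linearly with the number of matching rows, which is the trade-off this limitation implies for combined attribute and vector queries.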