    Vector index types explained

    A practical guide to the five main ways to organize vector data for fast search, and how to choose the right one based on your dataset size, memory budget, and accuracy needs.

    Flat index: exact search, no shortcuts

    A flat index is the simplest approach: store every vector as-is and compare the query to every stored vector at search time. Because nothing is approximated, the search is exact: the results returned are the true nearest neighbors under the chosen distance metric.

    The downside is speed. Comparing a query against every vector takes time proportional to the number of vectors (and to their dimensionality). This is acceptable for small collections (under about 500,000 items) but typically becomes too slow for production systems at larger scale. Flat indexes are most useful as a reference baseline for measuring the accuracy of approximate indexes, and in test environments where correctness matters more than speed.
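
    A flat search can be sketched in a few lines. The version below is a minimal illustration in plain Python, assuming Euclidean distance and tiny made-up vectors; the function names are hypothetical and not taken from any particular library.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flat_search(vectors, query, k):
    """Return the k nearest (distance, index) pairs by scanning every vector."""
    scored = ((l2_distance(v, query), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(k, scored)


# Illustrative data only: four 2-dimensional vectors.
vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
nearest = flat_search(vectors, [1.0, 1.0], k=2)
print([index for _, index in nearest])  # [1, 3]
```

    Every query touches every stored vector, which is why the cost grows linearly with the size of the collection.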

    HNSW and DiskANN: graph-based indexes

    HNSW (Hierarchical Navigable Small World) is one of the most widely used indexes for collections that fit in memory. It builds a multi-layer graph: sparse upper layers provide long-range links, and the dense bottom layer connects each vector to its near neighbors. A query starts at the top and descends layer by layer, which lets it reach the right region of the data quickly and achieve high accuracy with low latency. It is a strong default for most production deployments up to a few hundred million vectors.
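
    The core idea behind graph-based search is greedy navigation: start somewhere, and keep moving to whichever neighbor is closer to the query. The sketch below shows that idea on a single-layer neighbor graph built by brute force. It is a deliberate simplification and not HNSW itself, which adds multiple layers and a candidate list so that the search is less likely to stall in a local minimum. All names here are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_neighbor_graph(vectors, m):
    """Link every vector to its m nearest neighbors (brute force, for illustration)."""
    graph = {}
    for i, v in enumerate(vectors):
        others = [j for j in range(len(vectors)) if j != i]
        others.sort(key=lambda j: l2_distance(v, vectors[j]))
        graph[i] = others[:m]
    return graph


def greedy_search(vectors, graph, query, entry):
    """Walk the graph, always moving to the neighbor closest to the query.

    Stops when no neighbor improves on the current node, so the result is
    approximate: it may be a local minimum rather than the true nearest vector.
    """
    current = entry
    current_dist = l2_distance(vectors[current], query)
    while True:
        best, best_dist = current, current_dist
        for neighbor in graph[current]:
            d = l2_distance(vectors[neighbor], query)
            if d < best_dist:
                best, best_dist = neighbor, d
        if best == current:
            return current, current_dist
        current, current_dist = best, best_dist


# Illustrative data only: ten points on a line.
vectors = [[float(i), 0.0] for i in range(10)]
graph = build_neighbor_graph(vectors, m=2)
found, _ = greedy_search(vectors, graph, [7.2, 0.0], entry=0)
print(found)  # 7
```

    The search only computes distances to the neighbors of the nodes it visits, not to the whole collection, which is where the speed-up over a flat scan comes from.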

    DiskANN, developed by Microsoft Research, extends graph-based search to datasets that are too large to fit in memory. It stores most of the index structure on a solid-state drive (SSD) and keeps only a small working set in memory. This allows billion-scale collections on hardware with modest memory, at the cost of somewhat higher response times because each query must read from storage.
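
    A quick back-of-the-envelope calculation shows why billion-scale collections outgrow memory. The helper below is a hypothetical sketch that counts only the raw vectors, assuming 4-byte (float32) values and an illustrative dimensionality of 128; graph links and other index overhead would add to the total.

```python
def raw_vector_gb(num_vectors, dim, bytes_per_value=4):
    """Size of the uncompressed vectors in gigabytes (1 GB = 10**9 bytes)."""
    return num_vectors * dim * bytes_per_value / 10**9


# Assumed figures for illustration: 128 dimensions, float32 values.
print(raw_vector_gb(1_000_000, 128))      # 0.512
print(raw_vector_gb(1_000_000_000, 128))  # 512.0
```

    Under these assumptions, one million vectors need about half a gigabyte, while one billion need about 512 GB before any index structure is added.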

    IVF and IVF-PQ: cluster-based indexes

    IVF (Inverted File Index) groups vectors into clusters and, for each query, searches only the clusters whose centers are closest to the query, skipping the rest. Searching more clusters raises accuracy at the cost of speed. IVF uses less memory than HNSW because it stores no graph links, and it is straightforward to update incrementally: a new vector is simply assigned to its nearest existing cluster, without rebuilding the entire index. It works well for medium to large datasets and is a practical choice when memory is limited.
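
    The cluster-then-probe idea can be sketched as follows. This is a minimal illustration, not a production implementation: the cluster centers are supplied by hand (a real index would typically learn them, for example with k-means), and the names `build_ivf`, `add_vector`, `ivf_search`, and `nprobe` are assumptions chosen for this example.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroid(vector, centroids):
    """Index of the centroid closest to the given vector."""
    return min(range(len(centroids)), key=lambda c: l2_distance(vector, centroids[c]))


def build_ivf(vectors, centroids):
    """Assign every vector id to the list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        lists[nearest_centroid(v, centroids)].append(i)
    return lists


def add_vector(vectors, centroids, lists, vector):
    """Incremental insert: store the vector and append its id to one list."""
    vectors.append(vector)
    new_id = len(vectors) - 1
    lists[nearest_centroid(vector, centroids)].append(new_id)
    return new_id


def ivf_search(vectors, centroids, lists, query, k, nprobe):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    probed = heapq.nsmallest(
        nprobe, range(len(centroids)), key=lambda c: l2_distance(query, centroids[c])
    )
    candidates = (i for c in probed for i in lists[c])
    return heapq.nsmallest(k, ((l2_distance(vectors[i], query), i) for i in candidates))


# Illustrative data only: two hand-picked centroids and five vectors.
centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = [[0.0, 1.0], [1.0, 0.0], [9.0, 10.0], [10.0, 9.0], [11.0, 11.0]]
lists = build_ivf(vectors, centroids)
hits = ivf_search(vectors, centroids, lists, [9.5, 9.5], k=2, nprobe=1)
print([index for _, index in hits])  # [2, 3]

new_id = add_vector(vectors, centroids, lists, [0.5, 0.5])
```

    With `nprobe=1` the query above never looks at the cluster around the origin. A true nearest neighbor that falls just across a cluster boundary can be missed this way, which is the source of IVF's approximation.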

    IVF-PQ combines IVF clustering with product quantization (PQ), which compresses the vectors themselves: each vector is split into sub-vectors, and each sub-vector is replaced by a short code. This allows very large datasets (hundreds of millions to billions of vectors) to fit into far less memory than would otherwise be needed. The trade-off is some reduction in accuracy compared to uncompressed indexes, which is often an acceptable price at this scale.
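
    The compression step is usually product quantization: split each vector into sub-vectors and store, for each one, only the id of the closest entry in a small codebook. The sketch below illustrates the encode and decode steps under simplifying assumptions: the codebooks are written by hand (in practice they are learned from the data), and the function names are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pq_encode(vector, codebooks):
    """Replace each sub-vector with the id of its nearest codeword."""
    step = len(vector) // len(codebooks)
    codes = []
    for s, codebook in enumerate(codebooks):
        sub = vector[s * step:(s + 1) * step]
        codes.append(min(range(len(codebook)), key=lambda c: l2_distance(sub, codebook[c])))
    return codes


def pq_decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen codewords."""
    approx = []
    for s, code in enumerate(codes):
        approx.extend(codebooks[s][code])
    return approx


# Illustrative codebooks: 2 sub-spaces, 2 codewords each (hand-picked, not learned).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [5.0, 5.0]],
]
codes = pq_encode([0.9, 1.1, 4.8, 5.3], codebooks)
print(codes)                        # [1, 1]
print(pq_decode(codes, codebooks))  # [1.0, 1.0, 5.0, 5.0]
```

    The decoded vector is close to the original but not identical, which is where the accuracy loss comes from. The memory saving comes from storing small integer codes instead of floats: with 256 codewords per sub-space, for example, each code fits in a single byte.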

    Choosing the right index for your situation

    The choice of index comes down to three factors: dataset size, available memory, and accuracy requirements.

    For small datasets (under one million vectors), both a flat index (for exact results) and HNSW (for fast approximate results) are good options. For medium datasets (one million to one hundred million vectors), HNSW with scalar quantization to save memory is typically the best balance. For large datasets (hundreds of millions of vectors), IVF-PQ offers memory efficiency with manageable accuracy loss. For truly massive datasets (billions of vectors), DiskANN makes search feasible on commodity hardware by keeping most of the index on storage instead of requiring all of it in memory.
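
    These rules of thumb can be written down as a small helper. The function below is a hypothetical sketch that encodes only the size-based guidance in this section; the exact cut-offs (one million, one hundred million, one billion) are assumptions for illustration, and a real decision should also weigh memory budget and accuracy targets.

```python
def suggest_index(num_vectors, need_exact=False):
    """Map a dataset size to the index type suggested by the guidance above."""
    if num_vectors < 1_000_000:
        # Small: exact search is affordable; HNSW if speed matters more.
        return "flat" if need_exact else "HNSW"
    if num_vectors <= 100_000_000:
        # Medium: graph index, with scalar quantization to save memory.
        return "HNSW + scalar quantization"
    if num_vectors < 1_000_000_000:
        # Large: cluster-based index with compressed vectors.
        return "IVF-PQ"
    # Massive: keep most of the index on SSD.
    return "DiskANN"


print(suggest_index(200_000, need_exact=True))  # flat
print(suggest_index(50_000_000))                # HNSW + scalar quantization
print(suggest_index(400_000_000))               # IVF-PQ
print(suggest_index(3_000_000_000))             # DiskANN
```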
