Indexing

    What is approximate nearest neighbor (ANN) search?

    The technique at the heart of most large-scale vector search: accepting that a small number of true matches may occasionally be missed, in exchange for returning answers in milliseconds instead of seconds.

    Why checking everything is too slow

    The most obvious way to find the most similar items in a database is to compare your query to every single item and pick the closest ones. This works fine when the database contains a few thousand items, but it becomes impractically slow at production scale.

    Consider a database of 100 million product images, each represented as a 768-number vector. Finding exact nearest neighbors requires computing 100 million distances for every search query. That is roughly 77 billion multiply-add operations and, at 4 bytes per number, a scan over about 307 GB of vector data. Even on a fast server, this can take over a second per query. Many AI applications need results in under 10 milliseconds, which makes exact search impractical for large datasets.
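
    To make the cost concrete, here is a minimal sketch of exact (brute-force) search using NumPy. The function name exact_search, the random test data, and the choice of Euclidean distance are assumptions made for illustration; the dataset is kept small so the example runs quickly, and the last lines simply redo the arithmetic for the 100-million-vector case.

```python
import numpy as np

def exact_search(query, vectors, k):
    """Return indices of the k vectors closest to query (Euclidean), nearest first."""
    # One distance per stored vector: this is the cost that grows with dataset size.
    dists = np.linalg.norm(vectors - query, axis=1)
    # argpartition selects the k smallest distances without sorting everything.
    nearest = np.argpartition(dists, k - 1)[:k]
    # Sort only those k candidates so the closest comes first.
    return nearest[np.argsort(dists[nearest])]

rng = np.random.default_rng(0)
vectors = rng.standard_normal((10_000, 768)).astype(np.float32)  # toy dataset
query = vectors[42] + 0.01 * rng.standard_normal(768).astype(np.float32)

top5 = exact_search(query, vectors, k=5)

# Back-of-the-envelope cost at the scale discussed in the text.
n_items, dim = 100_000_000, 768
mults_per_query = n_items * dim        # 76.8 billion multiply-adds per query
bytes_float32 = n_items * dim * 4      # about 307 GB of float32 data to scan
```

    Every query touches every stored vector, so the work per query grows linearly with the size of the database.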

    The approximate trade-off

    Approximate Nearest Neighbor (ANN) search solves this by building an index: a data structure prepared offline in a step called indexing. The index organizes the data so that at search time, you can navigate to the most promising region of the data and check only a small fraction of the items, rather than checking all of them.
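
    The idea of indexing offline and then checking only part of the data can be sketched with a deliberately simplified cluster-based index. This is an illustration only, not how any particular product implements it: the names build_index, search, n_lists, and n_probe are hypothetical, and the centroids are plain random samples, where real systems typically train them with k-means.

```python
import numpy as np

def sq_dists(a, b):
    """Squared Euclidean distances between each row of a and each row of b."""
    return (a * a).sum(1)[:, None] - 2.0 * a @ b.T + (b * b).sum(1)[None, :]

def build_index(vectors, n_lists, rng):
    """Offline step: group vectors into n_lists buckets around sampled centroids."""
    # Assumption: centroids are random data points (no k-means training).
    centroids = vectors[rng.choice(len(vectors), n_lists, replace=False)]
    # Each vector goes into the bucket of its closest centroid.
    assign = np.argmin(sq_dists(vectors, centroids), axis=1)
    lists = [np.flatnonzero(assign == c) for c in range(n_lists)]
    return centroids, lists

def search(query, vectors, centroids, lists, k, n_probe):
    """Search time: scan only the n_probe buckets nearest to the query."""
    probe = np.argsort(sq_dists(query[None, :], centroids)[0])[:n_probe]
    candidates = np.concatenate([lists[c] for c in probe])
    d = sq_dists(query[None, :], vectors[candidates])[0]
    # Return the k best candidates and how many vectors were actually compared.
    return candidates[np.argsort(d)[:k]], len(candidates)

rng = np.random.default_rng(0)
vectors = rng.standard_normal((20_000, 64))
centroids, lists = build_index(vectors, n_lists=100, rng=rng)

query = vectors[7] + 0.01 * rng.standard_normal(64)
ids, scanned = search(query, vectors, centroids, lists, k=5, n_probe=8)
print(f"scanned {scanned} of {len(vectors)} vectors")
```

    Raising n_probe scans more buckets, which makes it less likely that a true neighbor is missed but costs more time per query. Setting it equal to n_lists degenerates into exact search.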

    The "approximate" part means that ANN can occasionally miss one or two of the true nearest neighbors. In practice, the results are usually very close to exact. An ANN search that finds 99 of the 100 true nearest neighbors while running 1,000 times faster than an exact search is an excellent trade-off for most real-world applications. Two common index types used for ANN search are HNSW (Hierarchical Navigable Small World), which is graph-based, and IVF (inverted file), which groups vectors into clusters.

    How ANN performance is measured

    ANN algorithms are evaluated on two things: how fast they respond (queries per second, or QPS) and what fraction of the true nearest neighbors they return (called recall). Recall is measured by comparing ANN results to the exact results computed by brute force on the same query.
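
    Both metrics are simple to compute. The sketch below assumes hypothetical helper names (recall_at_k, measure_qps) and made-up result IDs; it is not taken from any benchmark suite.

```python
import time

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k results that the ANN search also returned."""
    exact = set(exact_ids)
    return len(exact.intersection(approx_ids)) / len(exact)

def measure_qps(search_fn, queries):
    """Run every query once and report queries per second (single-threaded)."""
    start = time.perf_counter()
    for q in queries:
        search_fn(q)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return len(queries) / max(elapsed, 1e-12)

# Made-up example: brute force found these 10 neighbors, ANN missed one (28).
exact_ids = [12, 7, 31, 4, 19, 55, 2, 90, 63, 28]
approx_ids = [12, 7, 31, 4, 19, 55, 2, 90, 63, 71]
recall = recall_at_k(approx_ids, exact_ids)  # 9 of 10 found, so recall is 0.9
```

    Benchmarks usually report the two numbers together, because most indexes can trade one for the other by changing a search-time parameter.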

    Benchmark projects like ann-benchmarks.com and VectorDBBench measure both metrics on standard datasets so that different systems can be fairly compared. HNSW-based indexes are regularly among the leaders in in-memory benchmarks for the combination of speed and recall. Endee's indexing engine consistently ranks among the top performers on VectorDBBench across multiple dataset sizes.
