What recall means
Recall measures how complete your search results are. It answers the question: of all the truly relevant items in the database, how many did the search actually return?
Here is a concrete example. Suppose a query is submitted and the database contains 10 items that are genuinely its closest matches. If the search returns 9 of those 10 (and misses 1), recall is 90%. Because it is measured over the top 10 results, this metric is written recall@10 (recall at 10 results). Recall of 100% means the search found every correct item; recall of 80% means 2 out of 10 correct items were missed.
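The arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration, not part of any particular library: the function name recall_at_k and the item ids are assumptions made up for the example, and the ground truth is assumed to come from an exact (brute-force) search.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the truly relevant items that appear in the first k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant_set = set(relevant)
    if not relevant_set:
        raise ValueError("relevant must contain at least one item")
    # Use a set so a duplicated result is not counted twice.
    hits = len(set(retrieved[:k]) & relevant_set)
    return hits / len(relevant_set)


# Hypothetical ids: the 10 true nearest neighbors are items 0..9.
true_neighbors = list(range(10))
# The search found 9 of them and returned one wrong item (id 42).
search_results = [0, 1, 2, 3, 4, 5, 6, 7, 8, 42]

recall = recall_at_k(search_results, true_neighbors, k=10)
print(f"recall@10 = {recall:.0%}")
```

In practice the score is averaged over many queries rather than reported for a single one.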
For approximate nearest neighbor search, recall is the primary quality metric, because an approximate index trades some exactness for speed. An index that is extremely fast but returns wrong results is useless. The goal is to tune the index so that recall stays above a target (typically 95 to 99%) while keeping response times low.
What precision means
Precision measures how relevant the returned results are. It asks: of the items the search returned, how many were actually relevant?
In vector search, a top-k query is usually evaluated against the k true nearest neighbors, so the number of returned results equals the number of correct answers and recall and precision are equal. Precision becomes important when a system might return a different number of results per query, or when a stricter definition of relevance is applied. For ranking quality beyond just "found or not found," metrics like Mean Reciprocal Rank (MRR) and Normalized Discounted Cumulative Gain (NDCG) measure whether the most relevant items appear at the top of the list.
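A short sketch can make the relationship between precision, recall, and a rank-aware metric concrete. The helper names and the letter ids below are hypothetical, chosen only for illustration, and relevance is treated as binary (an item is either relevant or not).

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the first k returned items that are relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    return len(set(top_k) & set(relevant)) / len(top_k)


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear in the first k results."""
    return len(set(retrieved[:k]) & set(relevant)) / len(set(relevant))


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant item, or 0.0 if none was returned."""
    relevant_set = set(relevant)
    for rank, item in enumerate(retrieved, start=1):
        if item in relevant_set:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(results_per_query, relevant_per_query):
    """Average reciprocal rank over a batch of queries."""
    ranks = [reciprocal_rank(res, rel)
             for res, rel in zip(results_per_query, relevant_per_query)]
    return sum(ranks) / len(ranks)


relevant = ["a", "b", "c", "d"]      # 4 truly relevant items (hypothetical ids)
full_page = ["x", "a", "b", "c"]     # 4 results returned, 3 of them relevant
short_page = ["a", "b"]              # only 2 results returned, both relevant

# Same number returned as relevant: precision and recall coincide.
p_full = precision_at_k(full_page, relevant, k=4)
r_full = recall_at_k(full_page, relevant, k=4)

# Fewer results returned: precision is perfect, recall is not.
p_short = precision_at_k(short_page, relevant, k=4)
r_short = recall_at_k(short_page, relevant, k=4)

mrr = mean_reciprocal_rank([full_page, short_page], [relevant, relevant])
```

The first query has equal precision and recall because it returns exactly as many items as there are relevant ones. The second shows how the two diverge once the result count differs. Reciprocal rank also penalizes the first query for placing an irrelevant item at the top.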
Balancing accuracy and speed
Recall and speed are in tension: the more thoroughly an index searches, the higher the recall, but the longer each query takes. In production, the right balance is found through calibration. You test the index with representative queries at different search thoroughness settings and find the minimum setting that meets the recall target within the required response time.
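The calibration loop described above might look like the following sketch. Everything here is an assumption made for illustration: calibrate is a hypothetical helper, toy_search is a stand-in for a real approximate index (it simply examines only the first `setting` points), and the 1-second latency budget is deliberately loose so the example is not sensitive to machine speed.

```python
import time


def calibrate(search_fn, queries, ground_truth, settings, k,
              recall_target, latency_budget_s):
    """Return (setting, mean_recall, mean_latency_s) for the smallest setting
    that meets both targets, or None if no setting does."""
    for setting in sorted(settings):
        recalls, latencies = [], []
        for query, truth in zip(queries, ground_truth):
            start = time.perf_counter()
            results = search_fn(query, k, setting)
            latencies.append(time.perf_counter() - start)
            recalls.append(len(set(results[:k]) & set(truth)) / len(truth))
        mean_recall = sum(recalls) / len(recalls)
        mean_latency = sum(latencies) / len(latencies)
        if mean_recall >= recall_target and mean_latency <= latency_budget_s:
            return setting, mean_recall, mean_latency
    return None


# Toy data: the integers 0..100 in a scrambled but deterministic order.
points = [(i * 37) % 101 for i in range(101)]


def exact_search(query, k):
    """Brute-force search over all points; used to build the ground truth."""
    return sorted(points, key=lambda p: abs(p - query))[:k]


def toy_search(query, k, setting):
    """Stand-in for an approximate index: examines only the first `setting`
    points, so a larger setting means a more thorough (and slower) search."""
    candidates = points[:setting]
    return sorted(candidates, key=lambda p: abs(p - query))[:k]


queries = [10, 50, 90]
ground_truth = [exact_search(q, 5) for q in queries]

best = calibrate(toy_search, queries, ground_truth,
                 settings=[10, 25, 50, 101], k=5,
                 recall_target=0.95, latency_budget_s=1.0)
if best is not None:
    setting, mean_recall, mean_latency = best
    print(f"setting={setting} recall={mean_recall:.2f}")
```

With a real index, search_fn would wrap the index's query call and pass the thoroughness setting through. The queries should be a representative sample of production traffic, and it is usually worth comparing a tail latency (such as the 95th percentile) against the budget as well as the mean.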
For HNSW indexes, this thoroughness setting is commonly called ef_search (some libraries name it ef or efSearch). For IVF indexes, it is called nprobe. A common production target is 95 to 99% recall while staying under 10 milliseconds per query. Endee consistently achieves high recall with low latency, ranking among the best performers on independent vector database benchmarks.