Learning Center
Vector Database Glossary
Clear definitions of the 20 most important concepts in vector search and production AI, from HNSW graphs to RAG pipelines, written for engineers who build real systems.
AI & ML
5 terms: Core concepts in AI, machine learning, and retrieval-augmented generation
Vector Database
A database built to find data by meaning rather than by exact match: the core storage layer behind modern AI applications like chatbots, search engines, and recommendation systems.
RAG
Retrieval-Augmented Generation: an AI architecture that retrieves relevant information from a knowledge base before generating an answer, making responses more accurate, up to date, and grounded in source material.
Embeddings
Lists of numbers that represent the meaning of data: the universal format that lets AI systems compare text, images, audio, and other content by how similar they are in meaning.
Context Window
The maximum amount of text, measured in tokens, that an AI language model can read and use at once. This limit explains why vector databases matter for AI systems that need access to large knowledge bases.
Multimodal Retrieval
The ability to search across different types of data (text, images, audio, video) using a single unified system, including cross-type searches like finding images using a text description.
Indexing
6 terms: Data structures and algorithms that power fast vector similarity search
HNSW
Hierarchical Navigable Small World: one of the most widely used graph-based algorithms for fast in-memory similarity search, offering a strong balance between speed and accuracy.
IVF
Inverted File Index: a cluster-based approach to fast similarity search that groups similar vectors together, so a query scans only the nearest clusters. Its low index overhead makes it memory-efficient and suitable for very large datasets.
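The cluster-based idea can be illustrated with a minimal sketch. It assumes two hand-picked centroids and a handful of 2-D vectors (all hypothetical); a real IVF index would typically learn its centroids, for example with k-means, and the name `nprobe` is used here only as an illustrative parameter for how many clusters a query scans.

```python
import math

# Hypothetical, hand-picked centroids; a real index would learn these from data.
centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = {"a": [0.5, 0.2], "b": [1.0, 1.5], "c": [9.0, 9.5], "d": [10.5, 10.0]}

# Build the inverted lists: each vector id is filed under its nearest centroid.
inverted = {i: [] for i in range(len(centroids))}
for vid, vec in vectors.items():
    nearest = min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))
    inverted[nearest].append(vid)

def ivf_search(query, nprobe=1, top_k=1):
    """Scan only the nprobe clusters closest to the query, then rank their members."""
    probe = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))[:nprobe]
    candidates = [vid for i in probe for vid in inverted[i]]
    return sorted(candidates, key=lambda vid: math.dist(query, vectors[vid]))[:top_k]

result = ivf_search([9.2, 9.4])
```

Raising `nprobe` scans more clusters, which trades speed for a lower chance of missing a true neighbor that sits just across a cluster boundary.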
Product Quantization
A compression technique that dramatically shrinks the size of stored vectors by splitting each one into sub-vectors and representing it as a short sequence of codebook references, making billion-scale search feasible on limited memory.
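As a rough sketch of what "a short sequence of codebook references" means, the example below assumes two tiny hand-written codebooks (hypothetical values) for a 4-dimensional vector split into two halves. In practice the codebooks would be trained on the data and would usually hold 256 entries each, so that every reference fits in one byte.

```python
import math

# Hypothetical codebooks, one per sub-vector; real ones are learned from data.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # candidates for the first half of the vector
    [[0.0, 1.0], [1.0, 0.0]],  # candidates for the second half
]

def pq_encode(vec):
    """Replace each sub-vector with the index of its nearest codebook entry."""
    sub = len(vec) // len(codebooks)
    codes = []
    for j, book in enumerate(codebooks):
        chunk = vec[j * sub:(j + 1) * sub]
        codes.append(min(range(len(book)), key=lambda i: math.dist(chunk, book[i])))
    return codes

def pq_decode(codes):
    """Rebuild an approximate vector by concatenating the referenced entries."""
    return [x for j, c in enumerate(codes) for x in codebooks[j][c]]

codes = pq_encode([0.9, 1.1, 0.1, 0.8])
approx = pq_decode(codes)
```

The stored form is just `codes`; the decoded vector is an approximation of the original, which is where the accuracy cost of the compression comes from.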
Scalar Quantization
A simple compression technique that stores each number in a vector at lower precision, typically as an 8-bit integer instead of a 32-bit float, reducing memory usage by about 4x with little loss in search accuracy.
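The 4x figure follows from storing one byte per value instead of four. The sketch below assumes values are known to lie in a fixed range `[lo, hi]` (an assumption for illustration; real systems usually derive the range from the data) and maps each float to an integer code from 0 to 255.

```python
from array import array

def quantize(vec, lo, hi):
    """Map each float in [lo, hi] to an integer code in 0..255 (one byte each)."""
    scale = (hi - lo) / 255.0
    codes = bytes(min(255, max(0, round((x - lo) / scale))) for x in vec)
    return codes, scale

def dequantize(codes, lo, scale):
    """Recover approximate floats from the byte codes."""
    return [lo + c * scale for c in codes]

vec = [0.12, -0.53, 0.98, -1.0, 0.0]
lo, hi = -1.0, 1.0
codes, scale = quantize(vec, lo, hi)
restored = dequantize(codes, lo, scale)

# Size of the same values stored as 32-bit floats, for comparison.
float32_bytes = len(vec) * array("f").itemsize
max_error = max(abs(a - b) for a, b in zip(vec, restored))
```

With this scheme the rounding error per value is at most half a quantization step (`scale / 2`), which is why search accuracy usually changes only slightly.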
ANN
Approximate Nearest Neighbor search: the technique at the heart of most vector search, which accepts that a small fraction of the true nearest neighbors may be missed in exchange for returning answers in milliseconds instead of seconds.
Index Types
A practical guide to the five main ways to organize vector data for fast search, and how to choose the right one based on your dataset size, memory budget, and accuracy needs.
Search
7 terms: Techniques for high-precision, high-recall retrieval at production scale
Recall & Precision
The two metrics that measure how good a vector search result is: recall asks what fraction of the correct answers were found, and precision asks what fraction of the returned answers were actually correct.
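Both metrics reduce to simple set arithmetic. The sketch below uses made-up result ids and a made-up ground-truth set; the function name `recall_precision` is illustrative only.

```python
def recall_precision(returned, relevant):
    """Return (recall, precision) for a list of returned ids against the correct ids."""
    returned_set, relevant_set = set(returned), set(relevant)
    hits = len(returned_set & relevant_set)
    recall = hits / len(relevant_set) if relevant_set else 0.0
    precision = hits / len(returned_set) if returned_set else 0.0
    return recall, precision

returned = ["a", "b", "c", "d"]  # what the search returned
relevant = ["a", "c", "e"]       # the true correct answers
recall, precision = recall_precision(returned, relevant)
```

Here two of the three correct answers were found (recall of 2/3), and two of the four returned answers were correct (precision of 0.5).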
Dense vs Sparse Vectors
Dense vectors capture meaning across all dimensions; sparse vectors capture keyword presence in only a few dimensions. Both have strengths, and combining them often produces better search results than either alone.
Hybrid Search
A search strategy that runs both meaning-based (semantic) search and keyword-based search at the same time, then combines the results to give more relevant answers than either approach alone.
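The definition does not say how the two result lists are combined. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the method any particular database uses; the document ids are made up and `k=60` is a conventional default constant.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

semantic = ["doc3", "doc1", "doc7"]  # hypothetical semantic-search ranking
keyword = ["doc1", "doc9", "doc3"]   # hypothetical keyword-search ranking
fused = reciprocal_rank_fusion([semantic, keyword])
```

Documents that appear near the top of both lists (here `doc1` and `doc3`) rise above documents that only one search found.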
Distance Metrics
The three mathematical ways vector databases measure similarity between vectors: each answers a slightly different question about how "close" two vectors are.
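The summary above does not name the three metrics. The sketch below assumes they are the three most commonly offered by vector databases: dot product, Euclidean distance, and cosine similarity. The example vectors are made up.

```python
import math

def dot(a, b):
    """Dot product: grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean distance: straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine similarity: compares direction only, ignoring vector length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [1.0, 2.0, 2.0]
b = [2.0, 4.0, 4.0]  # same direction as a, twice the length

d = dot(a, b)
e = euclidean(a, b)
c = cosine_similarity(a, b)
```

The two vectors point the same way, so cosine similarity treats them as identical, while Euclidean distance still reports a gap because their lengths differ.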
Filtered Search
Vector similarity search with additional hard constraints on data attributes, such as finding similar products that are also in stock, under a certain price, or belong to a specific category.
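A minimal sketch of the product example follows, assuming a pre-filtering strategy (apply the constraints first, then rank the survivors by similarity). The product records, field names, and price threshold are all hypothetical, and real systems may instead filter during or after the vector search.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den

# Hypothetical catalog: each product has an embedding plus filterable attributes.
products = [
    {"id": "p1", "vec": [0.9, 0.1], "in_stock": True, "price": 20.0},
    {"id": "p2", "vec": [0.8, 0.2], "in_stock": False, "price": 15.0},
    {"id": "p3", "vec": [0.1, 0.9], "in_stock": True, "price": 10.0},
    {"id": "p4", "vec": [0.95, 0.05], "in_stock": True, "price": 80.0},
]

def filtered_search(query, items, predicate, top_k=2):
    """Keep only items that satisfy the hard constraints, then rank by similarity."""
    candidates = [item for item in items if predicate(item)]
    candidates.sort(key=lambda item: cosine(query, item["vec"]), reverse=True)
    return [item["id"] for item in candidates[:top_k]]

results = filtered_search(
    [1.0, 0.0], products, lambda item: item["in_stock"] and item["price"] < 50
)
```

Note that `p4` is the closest match by similarity alone but is excluded by the price constraint, and `p2` is excluded because it is out of stock.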
Re-ranking
A two-step search pattern where a fast initial search narrows candidates down, then a more precise model reorders them, combining the speed of vector search with deeper relevance analysis.
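The two-step pattern can be sketched as below. The first stage ranks by dot product over toy embeddings; the second stage uses a simple word-overlap score as a hypothetical stand-in for a more precise model such as a cross-encoder. The documents, vectors, and function names are illustrative assumptions.

```python
# Hypothetical corpus: each document has a toy embedding and its text.
docs = {
    "d1": {"vec": [0.9, 0.1], "text": "reset your password from the login page"},
    "d2": {"vec": [0.85, 0.15], "text": "password policy for administrators"},
    "d3": {"vec": [0.1, 0.9], "text": "quarterly sales report"},
}

def first_stage(query_vec, k):
    """Fast step: keep the k documents with the highest dot product."""
    def score(doc_id):
        return sum(q * x for q, x in zip(query_vec, docs[doc_id]["vec"]))
    return sorted(docs, key=score, reverse=True)[:k]

def precise_score(query, text):
    """Stand-in for a precise model: fraction of query words present in the text."""
    words = query.lower().split()
    text_words = set(text.lower().split())
    return sum(w in text_words for w in words) / len(words)

def search(query, query_vec, k=2, top_n=1):
    """Narrow to k candidates quickly, then reorder them with the precise scorer."""
    candidates = first_stage(query_vec, k)
    reranked = sorted(candidates, key=lambda d: precise_score(query, docs[d]["text"]), reverse=True)
    return reranked[:top_n]

candidates = first_stage([1.0, 0.0], 2)
best = search("password policy", [1.0, 0.0])
```

The first stage places `d1` ahead of `d2`, but the second stage promotes `d2` because it matches the whole query, which is the reordering the definition describes.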
Semantic Search
Search that understands what you mean rather than just matching the words you typed, powered by AI embeddings that capture the intent behind a query.
Security & Edge
2 terms: Privacy-preserving and resource-constrained vector AI
Queryable Encryption
A cryptographic technique that allows a server to search through encrypted data without ever seeing what the data actually contains, enabling AI on sensitive information with strong privacy guarantees.
Edge AI
AI that runs directly on a local device, such as a camera, robot, or medical scanner, rather than sending data to the cloud, enabling instant responses, offline operation, and on-device privacy.
Ready to put these concepts into practice?
Endee is the highest-throughput vector database available. Run it locally in one command.