Learning Center

Vector Database Glossary

Clear definitions of the 20 most important concepts in vector search and production AI, from HNSW graphs to RAG pipelines, written for engineers who build real systems.

AI & ML

5 terms

Core concepts in AI, machine learning, and retrieval-augmented generation

Vector Database

AI & ML

A database built to find data by meaning rather than by exact match: the core storage layer behind modern AI applications like chatbots, search engines, and recommendation systems.

Read definition →

RAG

AI & ML

An AI architecture that looks up relevant information from a knowledge base before generating an answer, making AI responses more accurate, up to date, and grounded in real facts.

Read definition →

Embeddings

AI & ML

Lists of numbers that represent the meaning of data: the universal format that lets AI systems compare text, images, audio, and other content by how similar they are in meaning.

Read definition →

Context Window

AI & ML

The maximum amount of text an AI language model can read and use at once: understanding this limit explains why vector databases are essential for AI systems that need access to large knowledge bases.

Read definition →

Multimodal Retrieval

AI & ML

The ability to search across different types of data (text, images, audio, video) using a single unified system, including cross-type searches like finding images using a text description.

Read definition →

Indexing

6 terms

Data structures and algorithms that power fast vector similarity search

HNSW

Indexing

Hierarchical Navigable Small World: a graph-based index and one of the most widely used algorithms for fast in-memory similarity search, offering a strong balance between query speed and accuracy.

Read definition →

IVF

Indexing

Inverted File Index: a cluster-based approach to fast similarity search that groups similar vectors together so a query scans only the closest clusters, keeping index overhead low and making it suitable for very large datasets.

Read definition →
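The cluster-and-probe idea can be illustrated with a minimal sketch. It assumes the centroids are already given (in practice they are usually learned with k-means), and the names `build_ivf`, `ivf_search`, and `nprobe` are illustrative, not taken from any particular library.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_ivf(vectors, centroids):
    """Assign every vector id to the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, v in enumerate(vectors):
        c = min(range(len(centroids)), key=lambda i: sq_dist(v, centroids[i]))
        lists[c].append(vid)
    return lists

def ivf_search(query, vectors, centroids, lists, k=1, nprobe=1):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: sq_dist(query, centroids[i]))
    candidates = [vid for c in order[:nprobe] for vid in lists[c]]
    return sorted(candidates, key=lambda vid: sq_dist(query, vectors[vid]))[:k]

# Toy data: two hand-picked centroids (an assumption for illustration).
centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = [[0.1, 0.2], [0.3, -0.1], [9.8, 10.1], [10.2, 9.9], [4.9, 4.9]]
lists = build_ivf(vectors, centroids)

fast = ivf_search([5.2, 5.2], vectors, centroids, lists, k=1, nprobe=1)
thorough = ivf_search([5.2, 5.2], vectors, centroids, lists, k=1, nprobe=2)
```

With `nprobe=1` the search scans only one cluster and misses the true nearest vector (id 4), which sits in the other cluster; with `nprobe=2` it finds it. Probing more clusters trades speed for accuracy.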

Product Quantization

Indexing

A compression technique that dramatically shrinks stored vectors by splitting each one into subvectors and representing each subvector as a short codebook reference, making billion-scale search feasible on limited memory.

Read definition →
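A minimal sketch of the encoding step may help. It assumes the codebooks already exist (they are normally learned per subspace, for example with k-means); the hand-picked centroids and the function names `pq_encode` and `pq_decode` are illustrative only.

```python
def nearest(sub, codebook):
    """Index of the codebook centroid closest to a subvector (squared L2)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(sub, codebook[i])),
    )

def pq_encode(vec, codebooks):
    """Split vec into len(codebooks) equal subvectors; keep one centroid index each."""
    m = len(codebooks)
    d = len(vec) // m
    return [nearest(vec[j * d:(j + 1) * d], codebooks[j]) for j in range(m)]

def pq_decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the referenced centroids."""
    out = []
    for j, c in enumerate(codes):
        out.extend(codebooks[j][c])
    return out

# Toy codebooks: 2 subspaces with 4 centroids each (hand-picked for illustration).
codebooks = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [-1.0, -1.0]],
]
vec = [0.9, 0.1, 0.4, 0.6]
codes = pq_encode(vec, codebooks)    # [1, 1]
approx = pq_decode(codes, codebooks)  # [1.0, 0.0, 0.5, 0.5]
```

Here four floats are stored as two small integers. With 4 centroids per subspace each code needs only 2 bits; real systems commonly use 256 centroids, so each code fits in one byte.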

Scalar Quantization

Indexing

A simple compression technique that stores each number in a vector at lower precision, for example as an 8-bit integer instead of a 32-bit float, reducing memory usage by about 4x, usually with little loss in search accuracy.

Read definition →
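The memory saving can be seen in a short sketch. It assumes one common scheme, mapping each 32-bit float to an 8-bit integer using the vector's own minimum and maximum; other schemes exist, and the function names are illustrative.

```python
from array import array

def quantize(vec):
    """Map floats to int8 codes using a per-vector min/max range (assumed scheme)."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = array("b", (round((x - lo) / scale) - 128 for x in vec))
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Recover approximate float values from the int8 codes."""
    return [(c + 128) * scale + lo for c in codes]

original = array("f", [0.12, -0.53, 0.98, 0.07, -0.31, 0.44, -0.86, 0.25])
codes, lo, scale = quantize(original)
restored = dequantize(codes, lo, scale)

bytes_before = original.itemsize * len(original)  # 4 bytes per float32 value
bytes_after = codes.itemsize * len(codes)         # 1 byte per int8 code
max_error = max(abs(a - b) for a, b in zip(original, restored))
```

The codes take a quarter of the space of the original values, and the reconstruction error per component stays within about half a quantization step (`scale / 2`). The two extra floats (`lo`, `scale`) stored per vector are negligible at realistic dimensions.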

ANN

Indexing

Approximate Nearest Neighbor search, the technique at the heart of most large-scale vector search: deliberately accepting that a small number of the true nearest results may be missed in exchange for returning answers in milliseconds instead of seconds.

Read definition →

Index Types

Indexing

A practical guide to the five main ways to organize vector data for fast search, and how to choose the right one based on your dataset size, memory budget, and accuracy needs.

Read definition →

Search

7 terms

Techniques for high-precision, high-recall retrieval at production scale

Recall & Precision

The two metrics that measure the quality of a vector search result: recall is the fraction of all correct answers that were found, and precision is the fraction of returned answers that were actually correct.

Read definition →
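Both metrics reduce to simple set arithmetic, as this small sketch shows. The document ids and the ground-truth set are made up for illustration.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one query's results."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

retrieved = ["d1", "d2", "d3", "d4", "d5"]  # what the search returned
relevant = ["d1", "d3", "d6", "d7"]         # hypothetical ground truth

recall, precision = recall_precision(retrieved, relevant)
# recall = 2 / 4 = 0.5, precision = 2 / 5 = 0.4
```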

Dense vs Sparse Vectors

Dense vectors capture meaning across all dimensions; sparse vectors capture keyword presence in only a few dimensions. Both have strengths, and combining them often produces better search results than either alone.

Read definition →

Hybrid Search

A search strategy that runs both meaning-based (semantic) search and keyword-based search at the same time, then combines the results to give more relevant answers than either approach alone.

Read definition →
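One common way to combine the two result lists is reciprocal rank fusion (RRF). The sketch below assumes RRF with the conventional constant `k=60`; this is one option among several (weighted score blending is another), not necessarily the method any given database uses.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

semantic = ["a", "b", "c"]  # hypothetical results from the embedding search
keyword = ["c", "a", "d"]   # hypothetical results from the keyword search

fused = reciprocal_rank_fusion([semantic, keyword])  # ['a', 'c', 'b', 'd']
```

Documents that appear in both lists ("a" and "c") rise above those found by only one method, and because RRF uses ranks only, the two searches do not need comparable score scales.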

Distance Metrics

The three mathematical ways vector databases measure similarity between vectors: each answers a slightly different question about how "close" two vectors are.

Read definition →
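The entry above does not name the three metrics; the sketch below assumes they are the three most commonly offered: cosine similarity, Euclidean distance, and dot product. The toy vectors are chosen so the metrics visibly disagree.

```python
import math

def dot(a, b):
    """Dot product: rewards both alignment and magnitude."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean (L2) distance: straight-line distance between the points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine similarity: angle between the vectors, ignoring magnitude."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 1.0]  # perpendicular to a

same_direction = cosine_similarity(a, b)  # 1.0: identical direction
gap = euclidean(a, b)                     # 1.0: yet the points are one unit apart
magnitude_aware = dot(a, b)               # 2.0: grows with vector length
unrelated = cosine_similarity(a, c)       # 0.0: perpendicular vectors
```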

Filtered Search

Vector similarity search with additional hard constraints on data attributes, such as finding similar products that are also in stock, under a certain price, or belong to a specific category.

Read definition →
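A minimal sketch of the product example, assuming the simplest strategy: apply the hard constraints first (pre-filtering), then rank the survivors by distance with a brute-force scan. Production systems typically integrate the filter with the index instead; the item fields and function name here are hypothetical.

```python
def filtered_search(query, items, predicate, k=2):
    """Pre-filter on metadata, then rank the survivors by squared L2 distance."""
    allowed = [it for it in items if predicate(it)]
    allowed.sort(key=lambda it: sum((q - x) ** 2 for q, x in zip(query, it["vector"])))
    return [it["id"] for it in allowed[:k]]

# Hypothetical catalog with toy 2-dimensional embeddings.
items = [
    {"id": "p1", "vector": [0.1, 0.9], "in_stock": True, "price": 40},
    {"id": "p2", "vector": [0.2, 0.8], "in_stock": False, "price": 25},
    {"id": "p3", "vector": [0.9, 0.1], "in_stock": True, "price": 15},
    {"id": "p4", "vector": [0.15, 0.85], "in_stock": True, "price": 120},
]

hits = filtered_search(
    [0.1, 0.9],
    items,
    lambda it: it["in_stock"] and it["price"] < 50,
)  # ['p1', 'p3']
```

Items p2 and p4 are closer to the query than p3, but they fail the constraints (out of stock, over budget), so they never enter the ranking.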

Re-ranking

A two-step search pattern where a fast initial search narrows candidates down, then a more precise model reorders them, combining the speed of vector search with deeper relevance analysis.

Read definition →

Semantic Search

Search that understands what you mean rather than just matching the words you typed, powered by AI embeddings that capture the intent behind a query.

Read definition →

Security & Edge

2 terms

Privacy-preserving and resource-constrained vector AI

Queryable Encryption

Security & Edge

A cryptographic technique that allows a server to search through encrypted data without ever seeing what the data actually contains, enabling AI on sensitive information with strong privacy guarantees.

Read definition →

Edge AI

Security & Edge

AI that runs directly on a local device, such as a camera, robot, or medical scanner, rather than sending data to the cloud, enabling instant responses, offline operation, and on-device privacy.

Read definition →

Ready to put these concepts into practice?

Endee is the highest-throughput vector database available. Run it locally in one command.
