    What is hybrid search?

    A search strategy that runs both meaning-based (semantic) search and keyword-based search at the same time, then combines the results to give more relevant answers than either approach alone.

    Two searches running in parallel

    Hybrid search runs two independent searches over the same data at the same time. The first is a semantic search that uses dense vector embeddings to find items similar in meaning to the query. The second is a keyword search that uses sparse, term-based representations, typically scored with BM25, to find items containing the exact words in the query. Each search returns its own ranked list of results, and a merging algorithm combines the two into a single final list.

    Because the two searches run in parallel, the added cost is mostly the wait for the slower one plus the merge step. That is typically 10 to 30 milliseconds extra, and the improvement in result quality usually justifies it.
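
    As a rough illustration of the parallel step, the sketch below submits two searches to a thread pool and collects both ranked lists. Everything in it is hypothetical: the toy DOCS corpus, the keyword_search function (a simple word-overlap count standing in for BM25), and the semantic_search function (a fixed, made-up ranking standing in for a dense-vector lookup). It is not any particular product's API.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus (hypothetical). A real system would hold dense embeddings
# and an inverted index instead of raw strings.
DOCS = {
    "doc_a": "lightweight laptop for travel",
    "doc_b": "thinkpad x1 carbon gen 11",
    "doc_c": "heavy gaming desktop",
}


def keyword_search(query):
    """Stand-in for BM25: rank documents by how many query words they contain."""
    terms = set(query.lower().split())
    hits = {doc_id: len(terms & set(text.split())) for doc_id, text in DOCS.items()}
    matching = [doc_id for doc_id, count in hits.items() if count > 0]
    return sorted(matching, key=lambda doc_id: (-hits[doc_id], doc_id))


def semantic_search(query):
    """Stand-in for a dense-vector search: returns a fixed, made-up ranking."""
    return ["doc_a", "doc_b", "doc_c"]


def hybrid_candidates(query):
    """Run both searches concurrently and return their two ranked lists."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        semantic_future = pool.submit(semantic_search, query)
        keyword_future = pool.submit(keyword_search, query)
        return semantic_future.result(), keyword_future.result()


semantic_hits, keyword_hits = hybrid_candidates("thinkpad x1 carbon")
```

    The two lists are deliberately left unmerged here; merging is a separate step, so either search can be swapped out without touching the other.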

    How the two result lists are merged

    The most common merging approach is called Reciprocal Rank Fusion (RRF). It scores each item by its rank position in each individual search and ignores the raw score that search assigned, so the two score scales never need to be compared. Items that appear near the top of both lists receive a high combined score; items that appear in only one list, or near the bottom of both, receive lower scores.
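
    A minimal sketch of RRF, assuming each search returns a list of item IDs ordered best first: every appearance of an item contributes 1 / (k + rank) to its fused score. The function name, the sample document IDs, and the default k = 60 (a commonly used constant) are choices made for this illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of item IDs into one list, best first."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Ranks start at 1, so the top item of a list contributes 1 / (k + 1).
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by item ID for a stable order.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Made-up result lists from the two searches.
semantic = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

    In this example doc_b wins because it ranks second and first, which narrowly beats doc_a at first and third. The two items found by only one search land at the bottom.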

    RRF is popular because it works consistently across different types of queries and domains without needing manual tuning for each use case. An alternative approach is to assign a weight (such as 70% to semantic search and 30% to keyword search) and combine the normalized scores from each system accordingly.
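
    The weighted alternative could look roughly like the sketch below. It assumes each search returns a mapping from item ID to raw score, uses min-max normalization to bring both onto a 0 to 1 scale (one of several possible normalization choices), and treats an item missing from one list as scoring 0 there. The function names and sample scores are illustrative only.

```python
def min_max_normalize(scores):
    """Rescale raw scores to the range 0..1 (all 1.0 if every score is equal)."""
    if not scores:
        return {}
    lowest, highest = min(scores.values()), max(scores.values())
    if highest == lowest:
        return {item_id: 1.0 for item_id in scores}
    return {
        item_id: (value - lowest) / (highest - lowest)
        for item_id, value in scores.items()
    }


def weighted_fusion(semantic_scores, keyword_scores, semantic_weight=0.7):
    """Combine normalized scores, e.g. 70% semantic and 30% keyword."""
    semantic = min_max_normalize(semantic_scores)
    keyword = min_max_normalize(keyword_scores)
    keyword_weight = 1.0 - semantic_weight
    combined = {
        item_id: semantic_weight * semantic.get(item_id, 0.0)
        + keyword_weight * keyword.get(item_id, 0.0)
        for item_id in set(semantic) | set(keyword)
    }
    # Highest combined score first; ties broken by item ID.
    return sorted(combined.items(), key=lambda pair: (-pair[1], pair[0]))


# Made-up raw scores on very different scales (cosine similarity vs. BM25-like).
semantic_scores = {"doc_a": 0.92, "doc_b": 0.85, "doc_c": 0.40}
keyword_scores = {"doc_b": 12.0, "doc_d": 7.5, "doc_a": 3.0}

ranked = weighted_fusion(semantic_scores, keyword_scores)
```

    Unlike RRF, this approach depends on the raw score distributions, so the weight and the normalization method usually need tuning per dataset.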

    Where hybrid search adds the most value

    Hybrid search is especially valuable in situations where queries are unpredictable. In e-commerce, one shopper might search by description ("lightweight laptop for travel") while another searches by model number ("ThinkPad X1 Carbon Gen 11"). In enterprise document search, employees might ask conceptual questions or search for specific contract clause numbers. Technical support queries might describe a problem in natural language or paste a specific error code.

    Pure semantic search handles the conceptual queries; pure keyword search handles the exact terms. Hybrid search handles both reliably, making it the safer default for any system where you cannot predict how users will phrase their searches.
