Hybrid Search
Keyword + semantic, combined
Classic keyword search (BM25) + vector-based semantic search, results merged. Covers each method's blind spot.
Pure vector search captures semantic closeness but sometimes misses exact-term matches like "GDPR" or "iPhone 16 Pro Max." Pure keyword search (BM25) nails exact terms but misses a document that says "uyum" (Turkish for "compliance") when you query "compliance."
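The keyword blind spot can be seen in a minimal BM25 sketch. It rests on several assumptions that are not from the text above: whitespace tokenization, the commonly used defaults k1=1.5 and b=0.75, and a made-up three-document corpus. `bm25_scores` is a hypothetical helper, not a library API.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (illustrative)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # no lexical overlap means no contribution
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores

# Made-up corpus: the second document is relevant but shares no query term.
docs = [
    "gdpr article 17 compliance process",
    "uyum süreci for data erasure requests",
    "quarterly sales report",
]
scores = bm25_scores("compliance process", docs)
```

Only the first document gets a non-zero score. The second one is about the same topic, but BM25 has nothing to match on, which is the gap the vector side is meant to cover.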
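Hybrid search runs both and merges results. The standard merge method is Reciprocal Rank Fusion (RRF): each result earns 1/(k + rank) from every list it appears in (k is a smoothing constant, commonly 60), and results are sorted by the summed score. Because only ranks are used, there are no score normalization headaches.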
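A minimal sketch of RRF, assuming each retriever returns a best-first list of document ids. The function name, the ids, and the tie-break by id are illustrative choices; k=60 is the commonly used constant rather than something this text prescribes.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge best-first rankings by summing 1 / (k + rank) per document."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical result lists from the two retrievers.
bm25_hits = ["d1", "d2", "d3"]
vector_hits = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['d1', 'd3', 'd2', 'd4']
```

Documents that appear in both lists (d1, d3) rise above those found by only one retriever, and the raw BM25 or similarity scores never enter the calculation.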
Many production RAG systems are hybrid. Pinecone, Weaviate, Qdrant, and Elastic all offer hybrid search support.
Say you are searching for books on "climate change." Keyword search returns titles containing the exact phrase. Semantic search also returns "global warming" and "carbon footprint." Combine them: exact term + related concepts = the most comprehensive set.
A law firm's RAG. Query: "GDPR Article 17 compliance process."
Vectors only: "GDPR" is fuzzy in the embedding space ("data protection" scores similarly). May miss the exact Article 17 reference.
BM25 only: nails the exact term, but misses a doc that uses the Turkish "uyum süreci" instead of "compliance process."
Hybrid: BM25 → 8 docs (containing "GDPR Article 17"). Vector → 12 docs (conceptually close). RRF merges → top 5 = exact match + concept matches together. Recall up ~30%, precision held.
When to use
- Building RAG: almost always go hybrid
- Domain-specific terms (code, product names, legal/medical terms)
- Multilingual content: many embedding models are English-skewed, while BM25 works in any language given a suitable tokenizer
- Highly varied queries: a mix of short keyword queries and long descriptive ones
When to skip
- Tiny corpus (<1K docs): plain vector search is enough
- Pure semantic similarity (recommendations, clustering)
- Maintaining a BM25 index is hard for your team: it adds ops complexity
Wrong score combination
Vector similarity scores are bounded (typically 0 to 1), while BM25 scores are unbounded, so adding them directly lets BM25 dominate the sum. Use RRF or min-max normalization, or use the built-in fusion in Pinecone or Weaviate.
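For the min-max route, a minimal sketch might look like the following. It assumes each retriever hands back a mapping from document id to raw score; the helper name and the sample scores are made up for illustration.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: raw_score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: avoid dividing by zero and treat each as a top hit.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

# Made-up raw scores on very different scales.
bm25_raw = {"d1": 12.4, "d2": 7.1, "d3": 1.8}
vector_raw = {"d1": 0.82, "d3": 0.79, "d4": 0.40}

bm25_norm = min_max_normalize(bm25_raw)
vector_norm = min_max_normalize(vector_raw)
```

After this step both score sets live on the same 0 to 1 scale and can be blended. Note that min-max is sensitive to outliers in a single result list, which is one reason rank-based fusion is often preferred.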
Ignoring weight tuning
The default is often an even 50/50 split between BM25 and vector scores, but your domain might want 30/70. A/B test to find out which queries favor BM25 and which favor vectors.
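To make the effect of the weight concrete, here is a small sketch of a weighted blend. It assumes both score sets are already normalized to the 0 to 1 range; `weighted_fusion`, the `bm25_weight` parameter, and the document ids are hypothetical names chosen for this example.

```python
def weighted_fusion(bm25_norm, vector_norm, bm25_weight=0.5):
    """Blend two normalized {doc_id: score} maps; a missing doc counts as 0."""
    if not 0.0 <= bm25_weight <= 1.0:
        raise ValueError("bm25_weight must be between 0 and 1")
    vector_weight = 1.0 - bm25_weight
    doc_ids = set(bm25_norm) | set(vector_norm)
    combined = {
        doc_id: bm25_weight * bm25_norm.get(doc_id, 0.0)
        + vector_weight * vector_norm.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Highest blended score first; ties broken by id for stable output.
    return sorted(combined, key=lambda doc_id: (-combined[doc_id], doc_id))

# Made-up normalized scores: one exact-match doc, one concept-only doc,
# and one doc that both retrievers found.
bm25_norm = {"exact": 1.0, "both": 0.6}
vector_norm = {"concept": 1.0, "both": 0.7}

even = weighted_fusion(bm25_norm, vector_norm, bm25_weight=0.5)
vector_heavy = weighted_fusion(bm25_norm, vector_norm, bm25_weight=0.3)
print(even)          # ['both', 'concept', 'exact']
print(vector_heavy)  # ['concept', 'both', 'exact']
```

Shifting from 50/50 to 30/70 changes which document comes first, which is why the split is worth testing against real queries rather than leaving at the default.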
Two indexes = two maintenance jobs
When a doc updates, both BM25 and vector indexes must update. Sync issues lead to inconsistent answers across the two paths.
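One way to keep the two paths aligned is to route every write through a single component and periodically check for drift. The sketch below is purely illustrative: `DualIndexWriter` is a hypothetical class, and plain dictionaries stand in for the real BM25 and vector indexes.

```python
class DualIndexWriter:
    """Writes each document version to both indexes and reports drift."""

    def __init__(self):
        # Stand-ins for real indexes: doc_id -> version currently indexed.
        self.bm25_versions = {}
        self.vector_versions = {}

    def upsert(self, doc_id, version):
        # A real system would re-tokenize and re-embed the document here.
        self.bm25_versions[doc_id] = version
        self.vector_versions[doc_id] = version

    def out_of_sync(self):
        """Return ids whose indexed version differs between the two indexes."""
        doc_ids = set(self.bm25_versions) | set(self.vector_versions)
        return sorted(
            doc_id
            for doc_id in doc_ids
            if self.bm25_versions.get(doc_id) != self.vector_versions.get(doc_id)
        )

writer = DualIndexWriter()
writer.upsert("policy-1", version=1)
writer.upsert("policy-2", version=1)

# Simulate a failed vector update: only the BM25 side sees version 2.
writer.bm25_versions["policy-1"] = 2
stale = writer.out_of_sync()
print(stale)  # ['policy-1']
```

A drift check like this can run as a scheduled job and trigger re-indexing for the ids it returns.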