Vector Database
Similarity search store
A specialized database that stores vectors and searches by semantic similarity in milliseconds.
A classical DB stores rows and queries by equality (WHERE name = 'X'). A vector DB stores vectors and answers "which 10 vectors are closest to this one?" in milliseconds, even across millions of vectors.
It does this via ANN (Approximate Nearest Neighbor) algorithms such as HNSW, IVF, and ScaNN. These don't guarantee exact results (hence "approximate"), but they can be 100× faster than exact brute-force search. For most use cases the trade-off is fine.
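To make the trade-off concrete, here is a minimal sketch comparing exact brute-force search with an HNSW index in FAISS; the corpus size, dimensionality, and graph parameter are illustrative, not taken from the text:

```python
# pip install faiss-cpu numpy
import numpy as np
import faiss

d, n = 384, 100_000  # embedding dimension and corpus size (illustrative)
rng = np.random.default_rng(0)
corpus = rng.random((n, d), dtype=np.float32)
query = rng.random((1, d), dtype=np.float32)

# Exact search: compares the query against all n vectors.
flat = faiss.IndexFlatL2(d)
flat.add(corpus)
exact_dists, exact_ids = flat.search(query, 10)

# Approximate search: an HNSW graph visits only a small fraction of the vectors.
hnsw = faiss.IndexHNSWFlat(d, 32)  # 32 = graph connectivity parameter M
hnsw.add(corpus)                   # the graph construction happens here
approx_dists, approx_ids = hnsw.search(query, 10)

# The two top-10 lists usually overlap heavily but are not guaranteed identical.
print(set(exact_ids[0]) & set(approx_ids[0]))
```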
Popular options: Pinecone (managed), Weaviate, Qdrant, Chroma (self-hosted), pgvector (Postgres extension), Milvus, LanceDB. Hybrid search, combining vector similarity with metadata filtering, is common.
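As an example of hybrid search, here is a pgvector sketch driven from Python with psycopg; the table, columns, and connection string are hypothetical:

```python
# pip install psycopg pgvector  (assumes Postgres with the pgvector extension)
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

query_emb = np.random.rand(384).astype(np.float32)  # stand-in for a real embedding

with psycopg.connect("dbname=app") as conn:  # hypothetical DSN
    register_vector(conn)  # lets psycopg pass numpy arrays as pgvector values
    rows = conn.execute(
        """
        SELECT id, text
        FROM chunks                        -- hypothetical table of embedded chunks
        WHERE tenant_id = %s               -- metadata filter, evaluated by Postgres
        ORDER BY embedding <=> %s          -- pgvector's cosine-distance operator
        LIMIT 10
        """,
        (42, query_emb),
    ).fetchall()
```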
Think of a library. A classical DB is the alphabetical shelf: fast if you know the exact title. A vector DB is a "books similar in topic to this one" shelf: it finds by meaning, not keyword, and it scans a million books per second.
A startup has internal support docs: 5000 wiki pages, video transcripts, old tickets. You're building a support bot:
1. Chunk everything and embed it.
2. Load the vectors into Qdrant (200K vectors, ~5GB).
3. An employee types "VPN connection issue": the query is embedded, and Qdrant returns the 10 closest chunks in 50ms.
4. Those chunks become context for the LLM.
Try this with Postgres full-text search and it takes 2 seconds and returns less relevant results, because it matches keywords rather than meaning.
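A minimal sketch of that flow, assuming an in-memory Qdrant instance and the all-MiniLM-L6-v2 embedding model (both illustrative choices, not prescribed above):

```python
# pip install qdrant-client sentence-transformers
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # produces 384-dim embeddings
chunks = [
    "Reset your VPN profile under Settings > Network.",
    "Payroll questions go to the #hr channel.",
]  # stand-ins for the real wiki/transcript/ticket chunks

client = QdrantClient(":memory:")  # swap for url="http://localhost:6333" in production
client.create_collection(
    collection_name="support_docs",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

# Steps 1-2: embed each chunk and load it as a point, keeping the text as payload.
client.upsert(
    collection_name="support_docs",
    points=[
        PointStruct(id=i, vector=model.encode(c).tolist(), payload={"text": c})
        for i, c in enumerate(chunks)
    ],
)

# Step 3: embed the employee's query and fetch the closest chunks.
hits = client.search(
    collection_name="support_docs",
    query_vector=model.encode("VPN connection issue").tolist(),
    limit=10,
)

# Step 4: the retrieved texts become context for the LLM prompt.
print([h.payload["text"] for h in hits])
```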
Where it's used:
- The retrieval layer of a RAG system
- Semantic search
- Recommendation engines (user/item vector matching)
- Anomaly detection (vectors outside any cluster; see the sketch after this list)
- Visual/audio similarity search
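For the anomaly-detection case, one simple approach is to flag points whose distance to their nearest stored neighbors is unusually large. A sketch with FAISS, where the data and the number of neighbors are illustrative:

```python
import numpy as np
import faiss

rng = np.random.default_rng(1)
normal = rng.normal(0, 1, (10_000, 64)).astype(np.float32)    # clustered "normal" data
outlier = (rng.normal(0, 1, (1, 64)) + 8).astype(np.float32)  # far from every cluster

index = faiss.IndexFlatL2(64)
index.add(normal)

def anomaly_score(points):
    # Mean distance to the 5 nearest stored vectors; points far from all
    # neighbors sit outside any cluster and are anomaly candidates.
    dists, _ = index.search(points, 5)
    return dists.mean(axis=1)

print(anomaly_score(normal[:1]))  # small: inside a cluster
print(anomaly_score(outlier))     # large: outside every cluster
```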
When not to use one:
- Exact-match lookups only: Postgres/MySQL is simpler
- Fewer than 100K vectors: pgvector or in-memory FAISS is enough; a dedicated DB is overkill
- Strict transactional guarantees: vector DBs are usually eventually consistent
Trusting similarity blindly
The top 10 by vector distance isn't always what the user actually needs. Add a reranker (a cross-encoder) for more precise ordering of the retrieved candidates.
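A sketch of that reranking step with a cross-encoder from sentence-transformers; the model name is one common public choice, not mandated here:

```python
# pip install sentence-transformers
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "VPN connection issue"
candidates = [
    "Reset your VPN profile under Settings > Network.",
    "Payroll questions go to the #hr channel.",
]  # the top-k chunks returned by the vector DB

# A cross-encoder reads query and passage together, so it judges relevance
# more precisely than the embedding distance used for the initial retrieval.
scores = reranker.predict([(query, c) for c in candidates])
reranked = [c for _, c in sorted(zip(scores, candidates), reverse=True)]
print(reranked[0])
```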
Skipping metadata filtering
Fetching the global top 10 from 1M vectors and then applying WHERE user_id = X is wrong: most of the 10 hits get discarded, the user may get zero results, and a forgotten filter leaks another user's data. Filter first (ideally with the DB's built-in pre-filtering), then rank.
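In Qdrant, for example, the filter travels with the query so only matching points are considered during ranking. A self-contained sketch; the collection, field names, and toy vectors are illustrative:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import (Distance, FieldCondition, Filter, MatchValue,
                                  PointStruct, VectorParams)

client = QdrantClient(":memory:")
client.create_collection(
    collection_name="support_docs",
    vectors_config=VectorParams(size=4, distance=Distance.COSINE),
)
client.upsert(
    collection_name="support_docs",
    points=[
        PointStruct(id=1, vector=[0.1, 0.2, 0.3, 0.4], payload={"user_id": 42}),
        PointStruct(id=2, vector=[0.1, 0.2, 0.3, 0.4], payload={"user_id": 7}),
    ],
)

# The filter is applied while ranking, not after: points belonging to other
# users are never candidates, so their data cannot surface in the results.
hits = client.search(
    collection_name="support_docs",
    query_vector=[0.1, 0.2, 0.3, 0.4],
    query_filter=Filter(must=[FieldCondition(key="user_id", match=MatchValue(value=42))]),
    limit=10,
)
print([h.id for h in hits])  # only point 1
```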
Underestimating index cost
Building an HNSW index for 1M vectors takes ~30 min and ~10GB of RAM. Don't build it on the live request path; plan batch updates instead.
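A common pattern is to build the index in an offline batch job, persist it, and have the serving process load the finished artifact. A FAISS sketch; the corpus, parameters, and file path are illustrative:

```python
import numpy as np
import faiss

# Offline batch job: pay the construction cost away from the serving path.
vectors = np.random.rand(100_000, 384).astype(np.float32)  # illustrative corpus
index = faiss.IndexHNSWFlat(384, 32)
index.add(vectors)                         # the expensive graph build
faiss.write_index(index, "support.hnsw")   # hypothetical artifact path

# Serving process: load the prebuilt index; no build cost at request time.
ready = faiss.read_index("support.hnsw")
dists, ids = ready.search(np.random.rand(1, 384).astype(np.float32), 10)
```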