Vector Database
Similarity search store
A specialized database that stores vectors and searches by semantic similarity in milliseconds.
A classical DB stores rows and queries by equality (WHERE name = 'X'). A vector DB stores vectors and answers "which 10 vectors are closest to this one?" in milliseconds, even across millions of vectors.
It does this via ANN (Approximate Nearest Neighbor) algorithms such as HNSW, IVF, and ScaNN. These don't guarantee exact results (hence "approximate"), but they can be 100× faster than exact brute-force search. For most use cases the trade-off is fine.
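To make the trade-off concrete, here is a minimal sketch comparing exact brute-force search with an HNSW index in FAISS; the corpus size, dimensionality, and graph parameter are illustrative, not taken from the text:

```python
# pip install faiss-cpu numpy
import numpy as np
import faiss

d, n = 384, 100_000  # embedding dimension and corpus size (illustrative)
rng = np.random.default_rng(0)
corpus = rng.random((n, d), dtype=np.float32)
query = rng.random((1, d), dtype=np.float32)

# Exact search: compares the query against all n vectors.
flat = faiss.IndexFlatL2(d)
flat.add(corpus)
exact_dists, exact_ids = flat.search(query, 10)

# Approximate search: an HNSW graph visits only a small fraction of the vectors.
hnsw = faiss.IndexHNSWFlat(d, 32)  # 32 = graph connectivity parameter M
hnsw.add(corpus)                   # the graph construction happens here
approx_dists, approx_ids = hnsw.search(query, 10)

# The two top-10 lists usually overlap heavily but are not guaranteed identical.
print(set(exact_ids[0]) & set(approx_ids[0]))
```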
Popular options: Pinecone (managed), Weaviate, Qdrant, Chroma (self-hosted), pgvector (Postgres extension), Milvus, LanceDB. Hybrid search, combining vector similarity with metadata filtering, is common.
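As an example of hybrid search, here is a pgvector sketch driven from Python with psycopg; the table, columns, and connection string are hypothetical:

```python
# pip install psycopg pgvector  (assumes Postgres with the pgvector extension)
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

query_emb = np.random.rand(384).astype(np.float32)  # stand-in for a real embedding

with psycopg.connect("dbname=app") as conn:  # hypothetical DSN
    register_vector(conn)  # lets psycopg pass numpy arrays as pgvector values
    rows = conn.execute(
        """
        SELECT id, text
        FROM chunks                        -- hypothetical table of embedded chunks
        WHERE tenant_id = %s               -- metadata filter, evaluated by Postgres
        ORDER BY embedding <=> %s          -- pgvector's cosine-distance operator
        LIMIT 10
        """,
        (42, query_emb),
    ).fetchall()
```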
Think of a library. A classical DB is the alphabetical shelf: fast if you know the exact title. A vector DB is a "books similar in topic to this one" shelf: it finds by meaning, not keyword, and it scans a million books per second.
A startup has internal support docs: 5000 wiki pages, video transcripts, old tickets. You're building a support bot:
1. Chunk everything and embed it.
2. Load the vectors into Qdrant (200K vectors, ~5GB).
3. An employee types "VPN connection issue": the query is embedded, and Qdrant returns the 10 closest chunks in 50ms.
4. Those chunks become context for the LLM.
Try this with Postgres full-text search and it takes 2 seconds and returns less relevant results, because it matches keywords rather than meaning.
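A minimal sketch of that flow, assuming an in-memory Qdrant instance and the all-MiniLM-L6-v2 embedding model (both illustrative choices, not prescribed above):

```python
# pip install qdrant-client sentence-transformers
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # produces 384-dim embeddings
chunks = [
    "Reset your VPN profile under Settings > Network.",
    "Payroll questions go to the #hr channel.",
]  # stand-ins for the real wiki/transcript/ticket chunks

client = QdrantClient(":memory:")  # swap for url="http://localhost:6333" in production
client.create_collection(
    collection_name="support_docs",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

# Steps 1-2: embed each chunk and load it as a point, keeping the text as payload.
client.upsert(
    collection_name="support_docs",
    points=[
        PointStruct(id=i, vector=model.encode(c).tolist(), payload={"text": c})
        for i, c in enumerate(chunks)
    ],
)

# Step 3: embed the employee's query and fetch the closest chunks.
hits = client.search(
    collection_name="support_docs",
    query_vector=model.encode("VPN connection issue").tolist(),
    limit=10,
)

# Step 4: the retrieved texts become context for the LLM prompt.
print([h.payload["text"] for h in hits])
```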
Where it's used:
- The retrieval layer of a RAG system
- Semantic search
- Recommendation engines (user/item vector matching)
- Anomaly detection (vectors outside any cluster; see the sketch after this list)
- Visual/audio similarity search
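For the anomaly-detection case, one simple approach is to flag points whose distance to their nearest stored neighbors is unusually large. A sketch with FAISS, where the data and the number of neighbors are illustrative:

```python
import numpy as np
import faiss

rng = np.random.default_rng(1)
normal = rng.normal(0, 1, (10_000, 64)).astype(np.float32)    # clustered "normal" data
outlier = (rng.normal(0, 1, (1, 64)) + 8).astype(np.float32)  # far from every cluster

index = faiss.IndexFlatL2(64)
index.add(normal)

def anomaly_score(points):
    # Mean distance to the 5 nearest stored vectors; points far from all
    # neighbors sit outside any cluster and are anomaly candidates.
    dists, _ = index.search(points, 5)
    return dists.mean(axis=1)

print(anomaly_score(normal[:1]))  # small: inside a cluster
print(anomaly_score(outlier))     # large: outside every cluster
```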
When not to use one:
- Exact-match lookups only: Postgres/MySQL is simpler
- Fewer than 100K vectors: pgvector or in-memory FAISS is enough; a dedicated DB is overkill
- Strict transactional guarantees: vector DBs are usually eventually consistent
Trusting similarity blindly
The top 10 by vector distance isn't always what the user actually needs. Add a reranker (a cross-encoder) for more precise ordering of the retrieved candidates.
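A sketch of that reranking step with a cross-encoder from sentence-transformers; the model name is one common public choice, not mandated here:

```python
# pip install sentence-transformers
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "VPN connection issue"
candidates = [
    "Reset your VPN profile under Settings > Network.",
    "Payroll questions go to the #hr channel.",
]  # the top-k chunks returned by the vector DB

# A cross-encoder reads query and passage together, so it judges relevance
# more precisely than the embedding distance used for the initial retrieval.
scores = reranker.predict([(query, c) for c in candidates])
reranked = [c for _, c in sorted(zip(scores, candidates), reverse=True)]
print(reranked[0])
```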
Skipping metadata filtering
Fetching the global top 10 from 1M vectors and then applying WHERE user_id = X is wrong: most of the 10 hits get discarded, the user may get zero results, and a forgotten filter leaks another user's data. Filter first (ideally with the DB's built-in pre-filtering), then rank.
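In Qdrant, for example, the filter travels with the query so only matching points are considered during ranking. A self-contained sketch; the collection, field names, and toy vectors are illustrative:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import (Distance, FieldCondition, Filter, MatchValue,
                                  PointStruct, VectorParams)

client = QdrantClient(":memory:")
client.create_collection(
    collection_name="support_docs",
    vectors_config=VectorParams(size=4, distance=Distance.COSINE),
)
client.upsert(
    collection_name="support_docs",
    points=[
        PointStruct(id=1, vector=[0.1, 0.2, 0.3, 0.4], payload={"user_id": 42}),
        PointStruct(id=2, vector=[0.1, 0.2, 0.3, 0.4], payload={"user_id": 7}),
    ],
)

# The filter is applied while ranking, not after: points belonging to other
# users are never candidates, so their data cannot surface in the results.
hits = client.search(
    collection_name="support_docs",
    query_vector=[0.1, 0.2, 0.3, 0.4],
    query_filter=Filter(must=[FieldCondition(key="user_id", match=MatchValue(value=42))]),
    limit=10,
)
print([h.id for h in hits])  # only point 1
```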
Underestimating index cost
Building an HNSW index for 1M vectors takes ~30 min and ~10GB of RAM. Don't build it on the live request path; plan batch updates instead.
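A common pattern is to build the index in an offline batch job, persist it, and have the serving process load the finished artifact. A FAISS sketch; the corpus, parameters, and file path are illustrative:

```python
import numpy as np
import faiss

# Offline batch job: pay the construction cost away from the serving path.
vectors = np.random.rand(100_000, 384).astype(np.float32)  # illustrative corpus
index = faiss.IndexHNSWFlat(384, 32)
index.add(vectors)                         # the expensive graph build
faiss.write_index(index, "support.hnsw")   # hypothetical artifact path

# Serving process: load the prebuilt index; no build cost at request time.
ready = faiss.read_index("support.hnsw")
dists, ids = ready.search(np.random.rand(1, 384).astype(np.float32), 10)
```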