AI Dictionary
Intermediate · ~1 min read · #vector-db #rag #search

Vector Database

Similarity search store

A specialized database that stores vectors and searches by semantic similarity in milliseconds.

Stored in a vector DB:

  KEY   EMBEDDING                  DIM
  cat   [0.21, -0.45, 0.78, …]    1536
  dog   [0.18, -0.41, 0.81, …]    1536
  car   [-0.62, 0.33, -0.05, …]   1536
  sun   [0.04, 0.92, 0.11, …]     1536

It finds the nearest vector among millions in milliseconds.
Definition

A classical DB stores rows and queries by equality (WHERE name = 'X'). A vector DB stores vectors and answers "which 10 vectors are closest to this one?" in milliseconds — even across millions of vectors.

It does this via ANN (Approximate Nearest Neighbor) algorithms such as HNSW, IVF, and ScaNN. Results are approximate rather than guaranteed exact, but search is orders of magnitude faster than a brute-force scan. For most use cases the trade-off is fine.
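To make the query concrete, here's the exact computation that ANN indexes approximate at scale: score every stored vector against the query and keep the top k. This minimal sketch uses only the standard library and the 3-dimensional slices of the embeddings from the table above (real embeddings have ~1536 dimensions):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy 3-dimensional slices of the embeddings in the table above.
store = {
    "cat": [0.21, -0.45, 0.78],
    "dog": [0.18, -0.41, 0.81],
    "car": [-0.62, 0.33, -0.05],
    "sun": [0.04, 0.92, 0.11],
}

def top_k(query, k=2):
    """Exact k-NN: score every stored vector, sort, take the top k.
    O(n) per query -- fine for thousands of vectors; ANN indexes
    like HNSW exist precisely to avoid this full scan at millions."""
    ranked = sorted(store.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [key for key, _ in ranked[:k]]

print(top_k([0.20, -0.44, 0.79]))  # a query embedded near "cat" → ['cat', 'dog']
```

Note that "cat" and "dog" land next to each other while "car" ends up far away despite the similar spelling: the search is by meaning, not by characters.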

Popular options: Pinecone (managed), Weaviate, Qdrant, Chroma (self-host), pgvector (Postgres extension), Milvus, LanceDB. Hybrid vector + metadata filtering is common.

Analogy

A library: classical DB = alphabetical shelf (fast if you know the exact title). Vector DB = "books similar in topic to this one" shelf — finds by meaning, not keyword. And it scans a million books per second.

Real-world example

A startup has internal support docs: 5,000 wiki pages, video transcripts, old tickets. You're building a support bot:
  1. Chunk everything and embed it.
  2. Load the vectors into Qdrant (200K vectors, ~5GB).
  3. An employee types "VPN connection issue": the query is embedded, and Qdrant returns the 10 closest chunks in ~50ms.
  4. Those chunks become context for the LLM.

Try the same query with Postgres full-text search and you'll likely wait seconds for less relevant, keyword-only matches.
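The pipeline above can be sketched end to end. Everything here is a stand-in: `embed()` replaces a real embedding model (normally an API or model call), and the in-memory list replaces the Qdrant collection — it shows the flow, not a production setup:

```python
# Hypothetical embed(): a real system calls an embedding model here.
# This stub just counts occurrences of a tiny vocabulary so the
# example runs without any model or network access.
VOCAB = ["vpn", "connection", "printer", "vacation", "password"]

def embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(sum(w == v for w in words)) for v in VOCAB]

def norm(v):
    return sum(x * x for x in v) ** 0.5

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((norm(a) * norm(b)) or 1.0)  # guard against zero vectors

# Step 1: chunk the docs and embed each chunk (stands in for 5,000 pages).
chunks = [
    "how to fix a vpn connection problem on macos",
    "printer setup for the berlin office",
    "vacation request policy and approval flow",
]
index = [(c, embed(c)) for c in chunks]  # stands in for the Qdrant collection

# Steps 2-3: embed the employee's query, return the closest chunks.
def search(query: str, k: int = 2) -> list[str]:
    qv = embed(query)
    ranked = sorted(index, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

# Step 4: the retrieved chunks become the LLM's context.
context = "\n".join(search("vpn connection issue"))
```

In the real version, `index` and `search()` are replaced by upserts and queries against the vector DB; the shape of the flow stays the same.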

When to use
  • The retrieval layer of a RAG system
  • Semantic search
  • Recommendation engines (user/item vector matching)
  • Anomaly detection (vectors outside any cluster)
  • Visual/audio similarity search
When not to use
  • Exact-match lookups only — Postgres/MySQL is simpler
  • Fewer than 100K vectors — pgvector or in-memory FAISS is enough; a dedicated DB is overkill
  • Strict transactional guarantees — vector DBs are usually eventually consistent
Common pitfalls

Trusting similarity blindly

The top 10 by cosine similarity isn't always the top 10 by relevance to the user. Add a reranker (cross-encoder) as a second stage for more precise ordering.
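A sketch of the two-stage pattern. `cross_encoder_score()` is a hypothetical stand-in for a real cross-encoder model (which reads query and document together and outputs a relevance score); here it just counts shared words so the example is runnable:

```python
def cross_encoder_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: a real model scores
    the (query, doc) pair jointly. Here: shared-word count."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))

def rerank(query: str, candidates: list[str], k: int = 3) -> list[str]:
    """Stage 2: re-order the vector DB's top-N with the slower,
    more precise scorer, then keep only the top k."""
    return sorted(candidates,
                  key=lambda d: cross_encoder_score(query, d),
                  reverse=True)[:k]

# Stage 1 (the vector DB) returned these candidates; stage 2 re-orders them.
hits = [
    "vpn split tunneling explained",
    "fix vpn connection issue on windows",
    "office wifi password rotation",
    "vpn connection issue on macos",
]
print(rerank("vpn connection issue", hits, k=2))
```

The cross-encoder is too slow to run over millions of vectors, which is why it only sees the short candidate list the vector DB already produced.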

Skipping metadata filtering

Fetching the top 10 from 1M vectors and only then applying WHERE user_id = X is wrong: the filter may discard all 10 hits, and skipping it entirely leaks another user's data. Filter first (or use the DB's built-in pre-filtering), then rank.
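A sketch of the pre-filter pattern in plain Python. Real vector DBs such as Qdrant expose this as a filter attached to the search call, so only matching points are ever scored:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Each point carries a payload; user_id is the field we must filter on.
points = [
    {"id": 1, "user_id": "alice", "vector": [0.9, 0.1]},
    {"id": 2, "user_id": "bob",   "vector": [0.8, 0.2]},
    {"id": 3, "user_id": "alice", "vector": [0.1, 0.9]},
]

def search(query, user_id, k=2):
    """Filter FIRST, then rank: only user_id's points are ever scored,
    so another user's data cannot appear in the results -- and the
    result list is never emptied by a post-hoc filter."""
    candidates = [p for p in points if p["user_id"] == user_id]
    ranked = sorted(candidates, key=lambda p: dot(query, p["vector"]), reverse=True)
    return [p["id"] for p in ranked[:k]]

print(search([1.0, 0.0], user_id="alice"))  # → [1, 3]  (never bob's point 2)
```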

Underestimating index cost

Building an HNSW index over 1M vectors can take tens of minutes and several GB of RAM, depending on dimensionality and index parameters. Don't rebuild it on the live request path; plan batch or incremental updates.