Vector database
Also known as: vector store, vector index, ANN database
A specialized database for storing embeddings (high-dimensional vectors) and finding the nearest neighbors to a query vector — fast.
What it means
A vector database stores embeddings, typically arrays of 768 to 3,072 floating-point numbers per item, and answers one core question very quickly: "given this query vector, which k vectors in the index are closest?" Closeness is measured with a distance or similarity metric such as cosine similarity or Euclidean distance. That's k-nearest-neighbor (k-NN) search over high-dimensional space, and doing it in milliseconds across millions or billions of vectors is the main reason these databases exist.
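As a rough illustration of the question above, here is a minimal brute-force k-NN sketch in plain Python. The three-dimensional vectors, the document ids, and the choice of cosine similarity are assumptions made for the example; a real index would hold far larger vectors.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_knn(query, vectors, k):
    """Return the k (id, score) pairs most similar to the query.

    Every stored vector is scored, so the cost grows linearly with index size.
    """
    scored = ((vec_id, cosine_similarity(query, vec)) for vec_id, vec in vectors.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy 3-dimensional "embeddings" (illustrative values only).
index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
    "doc-d": [0.0, 0.0, 1.0],
}

top = exact_knn([1.0, 0.05, 0.0], index, k=2)
top_ids = [vec_id for vec_id, _ in top]  # doc-a first, then doc-b
```

The scan touches every vector on every query, which is fine for thousands of items and becomes the bottleneck at millions.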
Under the hood, exact k-NN has to compare the query against every stored vector, which is too slow at production scale, so vector DBs use approximate nearest neighbor (ANN) methods such as HNSW, IVF, and ScaNN that trade a small amount of recall for large speedups. They also handle the things real systems need: metadata filtering (only search vectors tagged with user_id=42), hybrid search (combine vector similarity with keyword scores), namespaces, and incremental indexing.
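The recall-for-speed trade can be seen in a toy version of the IVF idea: partition vectors by nearest centroid, then scan only the few partitions closest to the query. This is a simplified sketch, not how any particular database implements it. Real IVF indexes learn their centroids (typically with k-means), whereas the centroids, the 2-D points, and the parameter name `nprobe` here are hand-picked for illustration.

```python
import math
from collections import defaultdict


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = defaultdict(list)
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vec_id)
    return lists


def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    probe_order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [vid for c in probe_order[:nprobe] for vid in lists.get(c, [])]
    candidates.sort(key=lambda vid: l2(query, vectors[vid]))
    return candidates[:k]


# Hand-picked 2-D data (assumption): two partitions along the x-axis.
centroids = [[0.0, 0.0], [10.0, 0.0]]
vectors = {"a": [0.5, 0.0], "b": [4.0, 0.0], "c": [5.1, 0.0], "d": [9.0, 0.0]}
lists = build_ivf(vectors, centroids)

query = [4.8, 0.0]
exact = sorted(vectors, key=lambda vid: l2(query, vectors[vid]))[:2]
fast = ivf_search(query, vectors, centroids, lists, k=2, nprobe=1)
thorough = ivf_search(query, vectors, centroids, lists, k=2, nprobe=2)
```

With `nprobe=1` the search scans half the data and misses "c", the true nearest neighbor, because it sits just across the partition boundary. With `nprobe=2` it matches the exact result at the cost of scanning everything. Tuning that kind of knob is how ANN indexes balance recall against latency.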
The 2026 landscape: Pinecone for managed simplicity, Qdrant and Weaviate for self-hosted production workloads, Chroma for prototyping and small projects, pgvector for "I already have Postgres and don't want another system." Cloud providers also have native offerings (Amazon OpenSearch Service with k-NN, Azure AI Search, Vertex AI Vector Search). For under a million vectors, you usually don't need a dedicated vector DB: pgvector or even an in-memory FAISS index is fine.
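A quick back-of-the-envelope calculation shows why a million vectors is manageable without dedicated infrastructure. The sketch below assumes 4-byte float32 values and 1,536 dimensions, and it counts only the raw vectors; index structures and metadata add overhead that is not modeled here.

```python
def raw_vector_bytes(num_vectors, dims, bytes_per_value=4):
    """Bytes needed for the raw vectors alone (float32 by default)."""
    return num_vectors * dims * bytes_per_value


one_million = raw_vector_bytes(1_000_000, 1536)
print(f"{one_million / 1e9:.2f} GB")  # 6.14 GB
```

Roughly 6 GB of raw vectors fits in memory on a single ordinary server, which is why a Postgres extension or an in-process index is often enough at this size.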
A vector DB is a component, not a strategy. It does retrieval, not understanding. The quality of your search is bounded by your embedding model and your chunking; the vector DB just makes the lookup fast.
Example
You embed 2 million help-center paragraphs into 1,536-dimensional vectors with an OpenAI embedding model, store them in Pinecone, and at query time the DB returns the 20 nearest paragraphs in about 30 ms.
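The same embed, store, and query flow can be sketched with stand-ins. Everything below is hypothetical: `toy_embed` is a word-count placeholder for a real embedding model, and `ToyVectorStore` with its `upsert` and `query` methods is an in-memory illustration, not Pinecone's or any other vendor's API.

```python
import math


def toy_embed(text, vocab):
    """Hypothetical stand-in for an embedding model: normalized word counts."""
    vec = [0.0] * len(vocab)
    for word in text.lower().split():
        if word in vocab:
            vec[vocab[word]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class ToyVectorStore:
    """Hypothetical in-memory store; method names are illustrative only."""

    def __init__(self):
        self._items = {}

    def upsert(self, item_id, vector, metadata):
        self._items[item_id] = (vector, metadata)

    def query(self, vector, top_k, where=None):
        """Return up to top_k ids by dot product among items matching `where`."""
        where = where or {}
        scored = []
        for item_id, (vec, meta) in self._items.items():
            if all(meta.get(key) == value for key, value in where.items()):
                scored.append((sum(a * b for a, b in zip(vector, vec)), item_id))
        scored.sort(reverse=True)
        return [item_id for _, item_id in scored[:top_k]]


# Made-up help-center paragraphs with a metadata tag each.
paragraphs = {
    "p1": ("how to reset your password", {"product": "web"}),
    "p2": ("reset your password on mobile", {"product": "mobile"}),
    "p3": ("update billing details and invoices", {"product": "web"}),
}
words = sorted({w for text, _ in paragraphs.values() for w in text.split()})
vocab = {word: i for i, word in enumerate(words)}

store = ToyVectorStore()
for pid, (text, meta) in paragraphs.items():
    store.upsert(pid, toy_embed(text, vocab), meta)

hits = store.query(toy_embed("reset password", vocab), top_k=2, where={"product": "web"})
```

Here `hits` ranks "p1" ahead of "p3", and "p2" is excluded by the metadata filter even though its text also matches the query. A production system swaps the toy pieces for a real embedding model and an ANN-backed store, but the shape of the calls stays similar.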
Why it matters
Every RAG system needs somewhere to put the embeddings. Picking the right vector DB shapes your latency, cost, and operational burden. Choosing too big (Pinecone for 10k docs) wastes money; choosing too small (Chroma for 100M docs) risks failures in production.