Semantic search

What it means

Traditional keyword search (BM25, TF-IDF) ranks documents by how often query terms appear, weighted by how rare those terms are across the collection. Semantic search ranks by meaning: it embeds the query and the documents into the same vector space, then returns the documents whose vectors are closest. A query for "how do I cancel my subscription" matches a help article titled "Ending your plan": zero word overlap, but the embeddings are neighbors. This is the core trick that makes RAG feel magical when it works.

The embedding model has been trained (often via contrastive learning) to push semantically similar text close together in vector space and unrelated text far apart. Cosine similarity or dot product over those vectors is enough to rank.

Semantic search alone has known weaknesses. It's bad at exact-match queries: search "error E-1042" and the embedding might happily return articles about other errors that "feel similar." It can be fooled by surface fluency (a well-written but irrelevant paragraph beats a terse but correct one). And it inherits whatever biases the embedding model was trained on. That's why production systems often pair semantic search with BM25 in a hybrid setup, then rerank.

Under the hood, semantic search is what powers Perplexity-style answer engines, "smart" file search in Notion or Slack, and basically every "ask your documents" product. The user types a sentence; somewhere a vector DB does k-NN; the LLM gets the top results.
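The ranking step and the hybrid pairing described above can be sketched in a few lines of Python. This is a toy illustration, not a production recipe: the three-dimensional vectors are hand-picked stand-ins for real embeddings, the document ids and the "pretend BM25" list are hypothetical, and reciprocal rank fusion (with its conventional constant of 60) is just one common way to merge rankings. The rerank stage is left out.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_top_k(query_vec, doc_vecs, k=3):
    """Return the ids of the k documents closest to the query by cosine similarity."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]

def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked id lists; each list adds 1 / (k + rank) to an id's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical embeddings, chosen by hand for illustration only.
DOC_VECS = {
    "ending-your-plan": [0.9, 0.1, 0.0],
    "error-e-2077": [0.1, 0.3, 0.9],
    "error-e-1042": [0.0, 0.5, 0.8],
}
QUERY_VEC = [0.1, 0.2, 0.9]  # stands in for the query "error E-1042"

# Semantic side: nearest neighbors by cosine similarity.
semantic_ranking = semantic_top_k(QUERY_VEC, DOC_VECS)

# Keyword side: pretend BM25 matched only the article containing "E-1042".
keyword_ranking = ["error-e-1042"]

# Hybrid: fuse both rankings so exact-match evidence can lift a document.
hybrid_ranking = reciprocal_rank_fusion([semantic_ranking, keyword_ranking])
```

With these toy vectors the semantic ranking puts the E-2077 article first, which mirrors the exact-match weakness; after fusion the E-1042 article moves to the top because it appears in both lists.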

Example

Search "ways to reduce cloud spend" against an engineering wiki, and semantic search returns articles titled "FinOps playbook" and "Rightsizing EC2 instances." Keyword search would likely miss both, because neither title shares a single word with the query.
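The keyword-side miss in this example is easy to check. The sketch below uses a deliberately naive tokenizer (an assumption for illustration, much simpler than a real BM25 analyzer) and compares only the titles, not the article bodies.

```python
import re

def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

query = "ways to reduce cloud spend"
titles = ["FinOps playbook", "Rightsizing EC2 instances"]

# Words each title shares with the query; an empty set means no keyword match.
overlaps = {title: tokens(title) & tokens(query) for title in titles}
```

Both overlap sets come back empty, so a term-matching ranker has nothing to score on the titles alone.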

Why it matters

Semantic search is the retrieval engine inside RAG. If you're picking embedding models, designing search UX, or debugging why your bot can't find an obvious document, you're doing semantic search work whether you call it that or not.
