
Knowledge cutoff

Also known as: training cutoff, data cutoff

The date after which a model has no training data. For anything that happened after the cutoff, the model either doesn't know or hallucinates with confidence.

What it means

Every base LLM has a knowledge cutoff: the date its training corpus was frozen. Ask GPT-4-class or Claude-class models about events, prices, sports scores, or product releases after that date and you get one of three responses: an honest "I don't know," a stale answer based on the last data they saw, or a confidently wrong fabrication. The third is the dangerous one.

Cutoffs in 2026 are typically 6 to 18 months behind the current date for frontier models, and even longer for smaller or open-source models. Vendors update them, but slowly: pre-training runs are expensive, and they prioritize capability over recency. Some models have a "knowledge cutoff" that's earlier than their "training cutoff" because the most recent data was used for safety/RLHF tuning rather than absorbed as world knowledge.

This is the structural reason RAG matters. Even a perfect frontier model can't tell you about yesterday's earnings call, this morning's PR release, or your private wiki; none of that was in the training data. RAG (along with tool use and web search) is how you make a frozen-knowledge model useful for real-time questions. Search-grounded products like Perplexity and ChatGPT's web mode exist almost entirely to paper over the cutoff.

Failure mode to know: models will often give you a plausible answer about a "recent" event that's actually older than they realize, because they don't know what today's date is unless you tell them. Always pass the current date in the system prompt for any time-sensitive task, and pair frontier models with retrieval (or web search) for any "current" information.

Example

Ask a model with a January 2025 knowledge cutoff "Who won the 2026 Super Bowl?" and it will either decline, guess based on 2024 trends, or confidently invent a winner. Wire it to web search and it answers correctly.
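A minimal sketch of the routing this example implies: if a question references a year at or past the cutoff, or uses recency language, send it through web search instead of answering from frozen weights. The cutoff date and the heuristic are assumptions for illustration, not any product's actual logic:

```python
import re
from datetime import date

MODEL_CUTOFF = date(2025, 1, 31)  # hypothetical cutoff for the model in use

def needs_retrieval(query: str) -> bool:
    """Heuristic: route to web search when the query mentions a year at or
    past the cutoff, or uses recency words a frozen model can't resolve."""
    years = [int(y) for y in re.findall(r"\b(?:19|20)\d{2}\b", query)]
    if any(y >= MODEL_CUTOFF.year for y in years):
        return True
    recency_words = ("latest", "today", "this year", "current", "recent")
    return any(w in query.lower() for w in recency_words)
```

Real products use classifiers or let the model itself decide via tool calls, but the principle is the same: anything past the cutoff must come from retrieval, not recall.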

Why it matters

The cutoff is why every serious AI product layers retrieval, web search, or tool use on top of the base model. Understanding it is also the antidote to a lot of magical thinking — these models are smart, but they're not omniscient and they're not live.
