In-context learning
Also known as: ICL, context-based learning, few-shot learning (in LLMs)
The phenomenon where an LLM "learns" a new task from examples shown in the prompt, without any gradient updates or training — the model adapts purely from context.
What it means
In-context learning (ICL) is the surprising capability that makes LLMs feel general-purpose: you can teach them a new task just by showing a few examples in the prompt. No fine-tuning, no gradient updates, no parameter changes. Show GPT or Claude three examples of "translate English to a made-up language" and it will attempt the fourth in your invented dialect. The model isn't learning in the traditional sense (its weights never change), but it behaves as if it had. This is what makes few-shot prompting work, and it's the core mechanism behind most prompt engineering.
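To make that concrete, here is what a few-shot prompt looks like as raw text. This is a minimal sketch in Python; the invented "Zorvan" vocabulary and the example pairs are made up purely for illustration:

```python
# A few-shot prompt is just text: labeled examples followed by an
# unfinished one. "Zorvan" and its words are invented for this sketch.
examples = [
    ("hello", "zola"),
    ("goodbye", "zemi"),
    ("thank you", "zorath"),
]

prompt = "Translate English to Zorvan.\n\n"
for english, zorvan in examples:
    prompt += f"English: {english}\nZorvan: {zorvan}\n\n"
prompt += "English: good morning\nZorvan:"  # the model continues the pattern

print(prompt)
```

Sent to any capable LLM, a prompt like this typically elicits a plausible "Zorvan" word: the model treats the three completed pairs as a task specification rather than as ordinary text to continue.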
ICL emerged spontaneously as language models scaled up. Small models show almost none of it; somewhere around the GPT-3 scale, models began treating examples in their context as task specifications rather than just text to continue. Why this happens is still actively researched. The leading theories involve the model implementing something like gradient descent inside the forward pass, pattern-matching to similar tasks it saw during training, or building implicit task representations from the examples. Probably some combination of all three.
The practical implications are huge. ICL is why you can prototype an LLM-powered feature in an afternoon: write a prompt with three examples and ship it. It's also why prompt engineering matters: the examples you choose, how you order them, and how you format them all affect what task the model "learns" to do. ICL is fragile and bounded (it can't teach the model truly new knowledge or capabilities, only how to apply what it already knows), but within those bounds it's one of the most flexible adaptation mechanisms in machine learning.
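Because the format of the examples effectively is the task specification, production prompts are usually rendered from a single consistent template. A hedged sketch of that pattern follows; the helper name, field labels, and sentiment examples are invented for illustration:

```python
# A sketch of the prompt-builder behind many few-shot features.
# The "Input:/Output:" template is illustrative, not a standard.
def build_prompt(examples: list[tuple[str, str]], query: str,
                 template: str = "Input: {x}\nOutput: {y}\n") -> str:
    """Render labeled examples with one consistent template, then the query.

    Consistency matters: mixing formats across examples can change what
    task the model infers, and reordering examples can shift its answers.
    """
    shots = "".join(template.format(x=x, y=y) for x, y in examples)
    return shots + template.format(x=query, y="").rstrip()

prompt = build_prompt(
    [("great battery life", "positive"),
     ("screen cracked on day one", "negative")],
    query="arrived quickly and works perfectly",
)
print(prompt)  # ends with "Output:" so the model fills in the label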
Example
You want to extract product names from reviews. Instead of fine-tuning, you put four examples in your prompt: "Review: ... Product: AirPods Pro." The model picks up the pattern and extracts product names from new reviews you paste in. You "trained" an extractor in 30 seconds, no GPU required.
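A sketch of that loop end to end, assuming the OpenAI Python SDK (`pip install openai`). The example reviews, the products other than AirPods Pro, and the model name are placeholders; any chat-completion API would work the same way:

```python
# Few-shot product extraction via the OpenAI chat-completions API.
# Reviews, most product names, and the model name below are placeholders.
from openai import OpenAI

FEW_SHOT = """Extract the product name from each review.

Review: These earbuds have amazing noise cancellation.
Product: AirPods Pro

Review: The stand mixer handles bread dough with no strain.
Product: KitchenAid Artisan

Review: Battery lasts two days even with the always-on display.
Product: Apple Watch Ultra

Review: The keyboard's low-profile switches feel great for typing.
Product: MX Keys

Review: {review}
Product:"""

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_product(review: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable chat model works
        messages=[{"role": "user", "content": FEW_SHOT.format(review=review)}],
        max_tokens=20,
    )
    return response.choices[0].message.content.strip()

print(extract_product("The robot vacuum maps my apartment perfectly."))
```

No training run, no labeled dataset beyond the four in-prompt examples; swapping the examples swaps the task.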
Why it matters
In-context learning is what makes LLMs feel like a different kind of software. A traditional ML model does one thing; an LLM can do thousands of things, provided you can describe the task in its prompt. Understanding ICL (what it can do, what it can't, when it breaks) is the difference between treating LLMs as a chat box and building production systems with them.