Multi-agent system

Also known as: multi-agent, agent swarm, subagents

A setup where multiple specialized LLM agents collaborate — one plans, another codes, another reviews — instead of one model trying to do everything.

What it means

A multi-agent system uses two or more LLM agents with distinct roles, prompts, and tools that pass work between each other. Common shapes: a planner that decomposes a goal into subtasks and delegates to specialists; a debate where two agents argue and a judge picks the winner; a pipeline where each agent owns a stage (research → draft → critique → revise). Frameworks like CrewAI, AutoGen, LangGraph, and Anthropic's "subagents" pattern make this easy to wire up.

The intuitive pitch is appealing: specialization beats generality, just like in human teams. In practice the gains are narrower than the hype suggests. Most "multi-agent" demos can be replicated by one good model with a structured prompt, at a fraction of the latency and cost. Every additional agent adds round trips, token spend, error compounding, and a new place for things to go wrong. Anthropic's own write-ups and various 2025 evals point the same way: simpler is usually better, and a well-designed single-agent loop beats a poorly coordinated multi-agent swarm.

Where multi-agent earns its keep: long-horizon work where the context doesn't fit in one window (subagents each handle a slice and report back); tasks with genuinely orthogonal skills (a coder agent + a security-review agent + a deploy agent); and self-critique patterns where one agent's output is checked by another with a different prompt and the cycle measurably improves quality. Anthropic's own Claude Code uses subagents for parallel exploration of large codebases: a real use case, not a demo.

The 2025-2026 trend is "subagents": the same model delegated to itself with a fresh context window for a focused subtask. This sidesteps most of the coordination overhead because there's only one underlying personality, just multiple working sessions. It's the version of multi-agent that's actually shipping in production.
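The planner-plus-specialists shape described above can be sketched in a few lines. This is a minimal illustration, not any framework's API: `call_model` is a stub standing in for a real LLM call, and the role prompts, the fixed three-stage plan, and the routing convention are all illustrative assumptions.

```python
# Minimal sketch of a planner that decomposes a goal and delegates
# each subtask to a specialist agent with its own role prompt.

def call_model(role_prompt: str, task: str) -> str:
    # Stub: a real system would send role_prompt + task to an LLM API.
    return f"[{role_prompt}] {task}"

def plan(goal: str) -> list[str]:
    # Planner agent, stubbed as a fixed pipeline decomposition;
    # a real planner would prompt the model to produce subtasks.
    return [f"{goal}: research", f"{goal}: draft", f"{goal}: review"]

ROLE_PROMPTS = {
    "research": "You are a researcher. Gather the relevant facts.",
    "draft": "You are a writer. Produce a draft from the research.",
    "review": "You are a critic. Flag errors in the draft.",
}

def run(goal: str) -> str:
    results = []
    for subtask in plan(goal):
        role = subtask.rsplit(": ", 1)[1]           # route to a specialist
        results.append(call_model(ROLE_PROMPTS[role], subtask))
    return "\n".join(results)                        # parent merges the outputs

print(run("explain multi-agent systems"))
```

Note that even in this toy form, every stage is an extra model round trip; collapsing the three role prompts into one structured prompt to a single agent is often the cheaper baseline to beat.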

Example

A research agent in Claude Code spawns three subagents to explore three subdirectories of a large repo in parallel. Each returns a summary, the parent merges them and writes the answer. Same model, three context windows — much faster than scanning serially.
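The fan-out-and-merge shape of that example can be sketched with plain threads. This is a hypothetical illustration of the pattern, not Claude Code's implementation: `summarize_dir` is a stub standing in for a subagent that would get a fresh context window and actually read the files.

```python
# Sketch of parallel subagent exploration: the parent fans out one
# stubbed "subagent" per directory, then merges the summaries.
from concurrent.futures import ThreadPoolExecutor

def summarize_dir(path: str) -> str:
    # Stub: a real subagent would scan the files under `path` in its
    # own context window and return a condensed summary.
    return f"summary of {path}"

def explore(dirs: list[str]) -> str:
    with ThreadPoolExecutor(max_workers=len(dirs)) as pool:
        summaries = list(pool.map(summarize_dir, dirs))  # fan out in parallel
    return "\n".join(summaries)                          # parent merges

print(explore(["src/", "tests/", "docs/"]))
```

Because each subagent returns only a summary, the parent's context holds three short digests instead of three directories' worth of files, which is what makes the pattern pay off on large repos.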

Why it matters

Multi-agent is the most overhyped pattern in 2026 agent design. Knowing when it actually helps (parallel exploration, true specialization, self-critique) versus when it's just orchestration overhead with a fancy diagram saves you a lot of token spend and debugging pain.
