Back to posts
AINews

Claude Sonnet 5 is cheap enough to be your default — check the token bill first

Anthropic shipped Sonnet 5 on June 30 at $2/$10 per million tokens through August, close to Opus 4.8 quality. The catch is a new tokenizer that counts up to 1.35x more tokens, so the rate card understates your real spend. Here is how to decide which work to move, how to set it up, and what to do before the price step on September 1.

Anthropic released Claude Sonnet 5 on June 30, and it is already the default model on Free and Pro. If you run agents, build with the API, or live in Claude Code, you have a decision in front of you: which work moves to Sonnet 5, what stays on Opus 4.8, and what the switch actually does to your bill.

The short version is that Sonnet 5 lands close to Opus 4.8 on agentic and knowledge-work evaluations and clears Sonnet 4.6 across the board, at a fraction of Opus pricing. The longer version has one number in it that the rate card hides. Here is the practical walk-through.

The pricing, and the part the rate card does not show

Sonnet 5 is $2 per million input tokens and $10 per million output tokens — but only through August 31. On September 1 it steps to $3 input and $15 output, the same list price Sonnet 4.6 carried.

That is the rate card. The number it does not show is the tokenizer. Anthropic updated how Sonnet 5 splits text into tokens, and the same input now counts as roughly 1.0 to 1.35x more tokens than before. So two jobs at the identical advertised rate can cost meaningfully different amounts, because Sonnet 5 sees more tokens in the same prompt.

The practical consequence: do not assume your per-task spend matches Sonnet 4.6 just because the rate looks similar. Run one real task on each, compare the actual token counts on your dashboard, and size your budget off the measured number, not the headline rate. On a long-context agent loop that re-reads the same files every turn, the tokenizer difference compounds fast.

Which work to move, and what to keep on Opus

The reason to care is that Sonnet 5 closes most of the gap to Opus 4.8 while costing a lot less per token. That changes the default.

Move to Sonnet 5:

  • Agent loops that run many turns. Tool use, file edits, search-and-fix cycles — the work where you were paying Opus rates per turn and feeling it. This is where the price difference shows up in your monthly bill.
  • Batch and background jobs. Summarizing, classification, extraction, anything you run over a queue. The cost per item drops and the quality holds.
  • Everyday coding and knowledge work where Opus was overkill but Sonnet 4.6 occasionally fell short.

Keep on Opus 4.8:

  • The hardest single-shot reasoning — a thorny architecture call, a proof, a gnarly debugging session where one wrong turn wastes an hour. Opus still leads at the top end, and for a one-off hard problem the few extra cents are not the cost that matters.

A useful middle setting: Anthropic notes Sonnet 5 reaches Opus 4.8 capability levels at medium reasoning effort. So for the borderline jobs, raise the effort to medium on Sonnet 5 before you reach for Opus. You often get the answer you wanted at the lower price.

Setting it up

If you are on Free or Pro in the Claude app, you are already on Sonnet 5 unless you pinned a model. Worth checking your default if you set one earlier.

In the API and in tools that take a model string, point at claude-sonnet-5. If your code hardcodes claude-sonnet-4-6 or an Opus id as the agent driver, that is the line to change. In Claude Code, switch the model with /model and set reasoning effort with the effort control rather than always running high — medium is the setting that buys you most of Opus at Sonnet price.

One thing to verify after you switch: your context budgets. Because the tokenizer counts more tokens, a prompt that fit comfortably before sits closer to your limits now. If you trim context to a fixed token count, re-measure it.

Before September 1

The $2/$10 window is the lever worth pulling deliberately. If you have heavy or batchable work that can run now rather than next quarter — a backfill, an eval suite over a large corpus, a one-time migration pass — running it before August 31 costs a third less on output than it will in September. Pull that work forward where the schedule allows.

After the step, Sonnet 5 still sits well under Opus, so the switch keeps paying off. The window just makes the front-loaded jobs cheaper than they will ever be again.

What to do this week

  • Run one representative task on Sonnet 5 and on your current model, and compare the real token counts, not the rate card.
  • Move your multi-turn agent loops and batch jobs to claude-sonnet-5; keep Opus 4.8 for the hardest one-shot reasoning.
  • For borderline jobs, try medium effort on Sonnet 5 before escalating to Opus.
  • Re-check your context budgets against the new tokenizer.
  • If you have batchable heavy work, schedule it before August 31 while output is $10 per million.

If you want the model-by-model picture next to GPT and Gemini, /tools tracks the current lineup, and /coding goes deeper on running multi-model agent workflows without burning the budget.

Get the next post when it ships

One email on Sunday with the new post and a short list of what shipped that week — new guides, tool updates, and a couple of links worth reading.