Claude Opus or Sonnet 4.6 — which should I use?

Default to Sonnet 4.6 for ~90% of production work. It handles complex reasoning, multi-step planning, agentic loops, and drafting at $3 input / $15 output per million tokens. Reserve Opus 4.8 ($5/$25 per million) for the specific tasks where Sonnet's quality genuinely doesn't suffice: research synthesis on long documents, complex multi-hop reasoning, and ambiguous decisions where the cost of being wrong is high. Defaulting to Opus on every call costs about 1.67x as much and rarely produces measurably better output on standard agent tasks.

How much more expensive is Claude Opus vs Sonnet?

Claude Opus 4.8 costs $5 per million input tokens and $25 per million output tokens. Claude Sonnet 4.6 costs $3 input and $15 output. Because both rates differ by the same 5:3 ratio, Opus costs about 1.67x as much as Sonnet per token on any input/output mix. At production volume (10k+ requests/month), the difference compounds: a workload that costs $90/month on Sonnet runs $150/month on Opus. Whether that's worth it depends entirely on whether the task type benefits from Opus's quality margin.

What is Opus 4.8 actually better at than Sonnet 4.6?

Three task categories where Opus consistently outperforms Sonnet in production: (1) long-document reasoning where you need synthesis across 50k+ tokens of context — Opus holds the thread better; (2) ambiguous multi-step planning where the agent needs to weigh tradeoffs across many constraints; (3) hard debugging where the failure mode isn't obvious from the symptoms. For pattern-matching tasks (extraction, classification, drafting from a template), the quality difference is minimal.

What's the 70/20/10 routing pattern?

The production routing pattern that ships in serious Claude workloads: ~70% of tasks to Claude Haiku 4.5 ($1/$5 per million tokens) — classification, extraction, routing decisions, format conversion; ~20% to Claude Sonnet 4.6 ($3/$15) — the bulk of agent reasoning, drafting, multi-step planning; ~10% to Claude Opus 4.8 ($5/$25) — genuinely hard reasoning, multi-hop planning, anywhere quality compounds. At typical agent workload volume, this split runs roughly half the cost of Sonnet-only and a third of Opus-only, with comparable production quality on most workloads.

Should I use Opus or Sonnet for coding?

Default to Sonnet 4.6 for the bulk of coding work — implementing features, refactoring, fixing bugs where the problem is well-defined. Use Opus 4.8 for architectural decisions, ambiguous refactors where the right approach isn't obvious, and debugging hard race conditions or memory issues where understanding the system holistically matters. In Claude Code specifically, the pattern most serious developers settle into is: Sonnet as the working model, Opus when stuck. Switching mid-session is cheap and worth doing.

Is Opus worth it for Claude Code users?

Conditionally. Claude Code on Claude Max ($100/month) includes meaningful Opus access — enough for ~5-15% of your daily work depending on usage intensity. Using Opus for the right 5-15% of tasks (the hard architectural decisions, the genuinely ambiguous debugging) pays off in quality. Using it as your default model burns through Max's Opus envelope by mid-afternoon and gives you cap-blocked downtime. The advice: use Opus surgically, not by default.

What about Claude Haiku 4.5?

Underused. Haiku 4.5 at $1 input / $5 output per million tokens is the cheapest Claude model and handles the bulk of routine work — classification, extraction, format conversion, structured data parsing, simple replies. Most Claude users default to Sonnet for these tasks and pay 3x what they need to. The 70/20/10 routing pattern's biggest cost savings come from moving the ~70% of mechanical work that doesn't need Sonnet down to Haiku.

When does Opus 4.8 fail vs Sonnet?

Opus's failure mode is slowness. The model takes longer per token than Sonnet, so for high-volume mechanical work, Opus is the wrong choice on both cost and latency. It's also more conservative — Opus is more likely to ask clarifying questions or refuse ambiguous tasks where Sonnet would just attempt the work. For agentic loops that benefit from quick decisions, Sonnet often produces better end-to-end outcomes even though Opus's individual reasoning steps are deeper.

Does prompt caching change the Opus vs Sonnet math?

Yes, meaningfully. Prompt caching reduces input cost by ~90% on cache hits, so the input-price gap between Opus and Sonnet ($5 vs $3 per million) shrinks to roughly $0.50 vs $0.30 on cached tokens when both benefit from caching. The 5:3 ratio is unchanged, but the dollars at stake on input become small. Output costs ($25 vs $15) remain the differentiator. For workflows that re-send the same context repeatedly (agentic loops, conversation interfaces, batch processing with shared system prompts), caching makes Opus's cost penalty smaller in absolute terms, and the quality-per-dollar calculation favors Opus more often than the raw pricing suggests.

Claude Sonnet 4.6 vs older Sonnet versions — what changed?

Sonnet 4.6 (released February 2026) extended context handling to 1M tokens and improved agentic tool use vs the prior generation. The pricing stayed flat at $3/$15 per million. For coding workloads specifically, Sonnet 4.6 closed most of the gap with Opus 4.7 — many tasks that previously required Opus now ship reliably on Sonnet. The practical impact: the 70/20/10 routing split moved more work to Sonnet (and even Haiku) without quality loss, making naive Opus defaults more wasteful than they used to be.

Claude Opus vs Sonnet 4.6 (2026): When Each Model Actually Wins

Claude Opus 4.8 costs about 1.67x as much as Sonnet 4.6 but isn't always 1.67x better. The 70/20/10 routing pattern, specific task types where each wins, and the cost math at production volume.

By Lucas Powell · June 23, 2026 · 8 min read · 1,729 words

Claude Opus 4.8 costs about 1.67x as much as Sonnet 4.6 per token. The question that doesn't get asked often enough: is it 1.67x better on your workload? Usually no.

Anthropic's marketing emphasizes Opus as the flagship, and it is the most capable model on hard reasoning tasks. But Sonnet 4.6 has closed enough of the gap on standard agent work that defaulting to Opus is one of the most common cost mistakes in production Claude deployments.

This guide covers the actual decision: per task type, when does Opus's quality margin justify the cost, and when does Sonnet (or Haiku) ship comparable results for a fraction of the spend.

The pricing in 2026

Model	Input	Output	Context	Tier
Claude Opus 4.8	$5/M	$25/M	1M tokens	Frontier
Claude Sonnet 4.6	$3/M	$15/M	1M tokens	Balanced
Claude Haiku 4.5	$1/M	$5/M	200k tokens	Value

The ratios that matter:

Opus costs 1.67x Sonnet per token on any input/output mix (both the input and output rates are 5/3 of Sonnet's)
Sonnet costs 3x Haiku per token
Opus costs 5x Haiku per token

At production volume (10k tasks/month with mixed token sizes), a workload that costs $30 on Haiku runs $90 on Sonnet and $150 on Opus. The differences compound.
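
The arithmetic behind those figures can be sketched in a few lines. This is a minimal illustration, not a billing tool: the dictionary keys are hypothetical labels rather than API model identifiers, and the per-request token counts (1,500 input, 300 output) are an assumed workload chosen because it reproduces the $30 / $90 / $150 figures at 10,000 requests.

```python
# Per-million-token prices (USD) copied from the pricing table above.
# The keys are illustrative labels, not official API model IDs.
PRICES = {
    "opus-4.8": {"input": 5.00, "output": 25.00},
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "haiku-4.5": {"input": 1.00, "output": 5.00},
}


def monthly_cost(model: str, requests: int, input_tokens: int, output_tokens: int) -> float:
    """Monthly spend in USD for `requests` calls of a fixed per-request size."""
    price = PRICES[model]
    per_request = (
        input_tokens * price["input"] + output_tokens * price["output"]
    ) / 1_000_000
    return requests * per_request


if __name__ == "__main__":
    # Assumed workload: 10,000 requests/month, 1,500 input + 300 output tokens each.
    for model in PRICES:
        print(f"{model}: ${monthly_cost(model, 10_000, 1_500, 300):.2f}")
```

Swapping in your own request count and token sizes gives a first-order estimate. It ignores caching and batch discounts, so treat the result as an upper bound on list-price spend.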

The question that determines whether those differences are worth paying: does the task genuinely benefit from the quality margin, or is it a pattern-matching task where any frontier-class model produces the same output?

Where Opus 4.8 genuinely wins

Three task categories where Opus consistently outperforms Sonnet in our production work:

1. Long-document synthesis (50k+ tokens of context). When the task is "read this 200-page report and produce a structured summary with cross-references," Opus holds the thread across the document better than Sonnet. Sonnet handles 1M context technically but reasoning quality degrades faster as context grows. For research synthesis, legal document analysis, and similar work where the whole context has to inform the output, Opus's quality margin shows up clearly.

2. Ambiguous multi-step planning. When the agent has to weigh tradeoffs across many constraints — "design this system to handle these five requirements with these three constraints under this budget" — Opus's deeper reasoning produces better plans. Sonnet's plans are competent but often miss tradeoffs that Opus catches.

3. Hard debugging where the symptoms don't reveal the cause. Race conditions, memory issues, distributed system failures where the bug is a system-level emergent property rather than a localized error. Opus's ability to hold the entire system in context while reasoning about possible failure modes consistently outperforms Sonnet on this class of problem.

These three categories make up roughly 5-15% of most production agent workloads. That's roughly the share of traffic worth routing to Opus.

Where Sonnet 4.6 is the right answer

The 80-90% of work where Sonnet ships output indistinguishable from Opus:

Code implementation against a clear spec. "Implement this function to do X with these inputs and these outputs" — both models produce production-quality code. Sonnet ships it faster and cheaper.

Drafting content from briefs. Blog posts, email drafts, documentation, internal memos. Sonnet 4.6 is genuinely strong at long-form writing and the quality difference vs Opus is minimal.

Structured data extraction. Pulling structured fields from unstructured text — Sonnet handles this at near-perfect accuracy for most schemas. Opus doesn't measurably improve it.

Multi-step agent workflows with clear goals. When the agent's task is well-defined ("triage this support ticket, route it, and draft a response"), Sonnet's agentic capability is sufficient. The quality difference shows up only on edge cases that account for a small fraction of total traffic.

Conversational AI. Customer support agents, internal helpers, anything chat-shaped. From the user's side, Sonnet's conversational quality matches Opus's. Opus's marginally better answers don't justify 1.67x the cost at scale.

Where Haiku 4.5 is actually the right answer (and most builders miss it)

The most common Claude-cost mistake in production isn't using Opus when Sonnet would do — it's using Sonnet when Haiku would do.

Tasks where Haiku 4.5 produces equivalent output for a third of Sonnet's cost:

Classification. "Is this email spam / sales inquiry / support request?" Haiku gets it right at near-perfect accuracy.
Routing decisions. "Which team should handle this ticket?" Haiku is sufficient.
Format conversion. "Reformat this CSV as JSON with this schema." Haiku ships it.
Short-form replies. "Draft a 50-word acknowledgment for this support request." Haiku writes this fine.
Tool-call routing. "Which of these 12 tools is the right one to call for this user request?" Haiku handles the dispatch decision; reserve Sonnet for the actual work.

Most builders default to Sonnet for these because Sonnet feels like the "safe" choice. At low volume that's fine. At 10k+ requests/month, it's leaving meaningful money on the table.

The 70/20/10 routing pattern

The production routing split that ships in serious Claude deployments:

~70% to Haiku 4.5 — classification, extraction, routing decisions, format conversion, short-form replies, structured data parsing
~20% to Sonnet 4.6 — the bulk of agent reasoning, drafting, multi-step planning, content generation
~10% to Opus 4.8 — genuinely hard reasoning, ambiguous decisions, multi-hop planning, long-document synthesis

A real cost example: a customer-support deflection agent processing 30,000 tickets/month with mixed token sizes.

Routing strategy	Monthly Claude spend
All Opus 4.8	~$450
All Sonnet 4.6	~$270
70/20/10 split	~$135

The all-Opus figure follows from the price ratio: the same tokens cost 5/3 as much on Opus as on Sonnet, so $270 becomes about $450. The 70/20/10 split costs half what a Sonnet-only strategy does and under a third of an Opus-only strategy, with comparable production quality on the 30,000-ticket workload. Where the quality difference shows up: the 10% of tickets you reserve for Opus. Reserving Opus for the right 10% is the editorial work that makes the routing pay off.
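
A rough way to sanity-check a routing split is to weight each tier's relative price by its share of traffic. The sketch below assumes every task uses the same token counts regardless of tier, which is a simplification the article does not claim; `blended_cost` and the tier labels are hypothetical names. Under that assumption a 70/20/10 split comes to about 60% of the Sonnet-only cost, so a figure nearer half would be consistent with Haiku-routed tickets being shorter than average.

```python
# Relative per-token price of each tier, with Sonnet as 1.0
# (Haiku is 1/3 of Sonnet and Opus is 5/3 of Sonnet, per the pricing table).
RELATIVE_PRICE = {"haiku": 1 / 3, "sonnet": 1.0, "opus": 5 / 3}


def blended_cost(split: dict[str, float], sonnet_only_cost: float) -> float:
    """Cost of a routed workload, assuming uniform token counts per task."""
    if abs(sum(split.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sonnet_only_cost * sum(
        share * RELATIVE_PRICE[tier] for tier, share in split.items()
    )


split = {"haiku": 0.70, "sonnet": 0.20, "opus": 0.10}
routed = blended_cost(split, sonnet_only_cost=270.0)
opus_only = blended_cost({"opus": 1.0}, sonnet_only_cost=270.0)
print(f"70/20/10: ${routed:.0f}  Opus-only: ${opus_only:.0f}")
```

Changing the shares in `split` shows how sensitive the total is to the Opus fraction: each percentage point moved from Haiku to Opus adds more than one moved from Haiku to Sonnet.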

When Opus is worth defaulting to

A few specific scenarios where defaulting to Opus is the right call:

Research and analysis agents. When the agent's job is reasoning across documents and producing synthesized insights, Opus's quality margin matters on every call. Don't try to cheap out here.

Architecture and design work. When the agent is making decisions that affect a system's long-term shape, the cost of being wrong is high. Opus's deeper reasoning is worth the premium.

Low-volume, high-value tasks. If the agent runs 100 times/month and each output goes to a senior decision-maker, the Opus premium is rounding error and the quality margin is real.

Tasks where the human review cost dominates the model cost. If a human spends 10 minutes reviewing each agent output, the model cost differential ($0.05 per request) is dwarfed by the human time. Use the better model.

The common thread: Opus wins when output quality has compounding value and the volume is low enough that cost doesn't matter much. Sonnet wins when volume is high and the quality difference doesn't move the outcome.
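
The review-cost point is easy to check with arithmetic. The sketch below uses the 10 minutes and $0.05 from the text; the $60/hour reviewer rate is purely an assumed figure for illustration, and `model_premium_share` is a hypothetical helper name.

```python
def model_premium_share(review_minutes: float, hourly_rate: float, model_premium: float) -> float:
    """Fraction of total per-request cost that comes from the model premium."""
    review_cost = review_minutes / 60 * hourly_rate
    return model_premium / (review_cost + model_premium)


# 10 minutes of review at an assumed $60/hour, plus a $0.05 model premium.
share = model_premium_share(review_minutes=10, hourly_rate=60.0, model_premium=0.05)
print(f"model premium is {share:.2%} of per-request cost")
```

At any plausible reviewer rate the premium stays well under 1% of the per-request total, which is why the better model is the cheaper choice once it saves even a little review time.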

When Sonnet is wasted (downgrade to Haiku)

The opposite mistake — using Sonnet for tasks Haiku handles:

Single-turn classification with clear categories
Extracting structured fields from short text
Generating short replies from templates
Simple routing decisions
Reformatting structured data

If the agent's job is "look at X, output Y in this format" with no real reasoning required, Haiku is the right choice and Sonnet is overspend.

How to actually route in Claude Code

For Claude Code users specifically, model switching is built in. The /model command lets you change between Haiku, Sonnet, and Opus mid-session. The pattern most serious developers settle into:

Default to Sonnet for the working day
Switch to Haiku when doing high-volume mechanical work (reading files, running tests, processing structured output)
Switch to Opus when stuck — hard architectural decisions, debugging that's not yielding to standard approaches, or any task where you'd ask a senior engineer for a second opinion

On Claude Max ($100/month), all three tiers are available within the subscription envelope. Switching is cheap; the cost is realizing you should have switched.

How prompt caching changes the math

Prompt caching reduces input cost by ~90% on cache hits, so the input-price gap between Opus and Sonnet ($5 vs $3 per million) shrinks to roughly $0.50 vs $0.30 on cached tokens when both benefit from caching. The 5:3 ratio is unchanged, but the dollars at stake on input become small. For workflows that re-send the same context repeatedly (agentic loops, conversation interfaces, batch processing with shared system prompts), caching makes Opus's cost penalty smaller in absolute terms.

The output cost difference ($25 vs $15) remains the differentiator. For workflows where the output is short (classification, structured extraction), caching makes Opus much more viable. For workflows where the output is long (drafting, synthesis), the output cost dominates and Sonnet's economic advantage stays large.

Practical impact: if you're using Claude through the API with prompt caching enabled and your output is short, Opus is more affordable than the raw pricing suggests. If your output is long, the math still strongly favors Sonnet for most work.
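
To see how output length drives the result, the per-request Opus premium can be computed for a few request shapes. This is a sketch under stated assumptions: cache hits are billed at about 10% of the base input rate (the ~90% saving quoted above), cache-write surcharges are ignored, and the request shapes (20k tokens of shared context, 500 fresh tokens) are hypothetical.

```python
# (input, output) prices in USD per 1M tokens, from the pricing table.
PRICES = {"opus": (5.00, 25.00), "sonnet": (3.00, 15.00)}

# Assumption: a cache hit bills input at ~10% of the base rate.
# Cache-write surcharges are deliberately left out of this sketch.
CACHED_INPUT_FACTOR = 0.10


def request_cost(model: str, cached_in: int, fresh_in: int, out: int) -> float:
    """USD cost of one request with `cached_in` tokens served from cache."""
    in_price, out_price = PRICES[model]
    total = (
        cached_in * in_price * CACHED_INPUT_FACTOR
        + fresh_in * in_price
        + out * out_price
    )
    return total / 1_000_000


def opus_premium(cached_in: int, fresh_in: int, out: int) -> float:
    """Extra USD per request for choosing Opus over Sonnet."""
    return request_cost("opus", cached_in, fresh_in, out) - request_cost(
        "sonnet", cached_in, fresh_in, out
    )


no_cache_short = opus_premium(0, 20_500, 100)    # short output, nothing cached
cached_short = opus_premium(20_000, 500, 100)    # short output, context cached
cached_long = opus_premium(20_000, 500, 2_000)   # long output, context cached
print(f"short output, no cache: ${no_cache_short:.4f}")
print(f"short output, cached:   ${cached_short:.4f}")
print(f"long output, cached:    ${cached_long:.4f}")
```

With these shapes the premium drops sharply once the context is cached and the output is short, then climbs again as output grows, because output tokens are never discounted.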

The decision rule

When you're deciding between Opus and Sonnet for a specific task:

Is the output's quality compounding? (Architecture, design, research, decisions with downstream consequences) → Opus
Is the task ambiguous or under-specified? → Opus
Is the input long (50k+ tokens) and reasoning across the whole context matters? → Opus
Is the task well-defined with a clear expected output shape? → Sonnet
Is the volume high (1k+ runs/day)? → Sonnet, and audit whether parts could move to Haiku
Is the task pattern-matching (classification, extraction, routing)? → Haiku, not Sonnet, not Opus
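
The checklist above can be sketched as a small routing function. Everything here is illustrative: the `Task` fields and `choose_tier` are hypothetical names, and the order in which the rules are checked (pattern-matching first, then the Opus conditions, then Sonnet as the fallback) is an assumption, since the list does not say which rule wins when several apply. The volume rule is left out because it reads as a prompt to audit rather than a per-task test.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a task, filled in by whatever triages work."""
    pattern_matching: bool = False      # classification, extraction, routing
    compounding_quality: bool = False   # architecture, research, consequential decisions
    ambiguous: bool = False             # under-specified or open-ended
    input_tokens: int = 0
    needs_whole_context: bool = False   # reasoning must span the entire input


def choose_tier(task: Task) -> str:
    """Return 'haiku', 'sonnet', or 'opus' following the decision rule."""
    if task.pattern_matching:
        return "haiku"
    if task.compounding_quality or task.ambiguous:
        return "opus"
    if task.input_tokens >= 50_000 and task.needs_whole_context:
        return "opus"
    # Well-defined work with a clear expected output shape.
    return "sonnet"


print(choose_tier(Task(pattern_matching=True)))
print(choose_tier(Task(input_tokens=120_000, needs_whole_context=True)))
print(choose_tier(Task(input_tokens=2_000)))
```

In practice the flags would come from heuristics or a cheap classifier call, and the thresholds would be tuned against your own quality checks rather than taken as fixed.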

The mistake to avoid: defaulting to Opus because it's the "best" model. The cost compounds, the quality difference doesn't show up on most tasks, and you'll burn through your subscription envelope on work that didn't need the premium tier.

What to read next

The How to Avoid Hitting Claude Usage Limits guide covers the broader optimization patterns — prompt caching, model routing, subagents — that extend your effective Claude usage. The Claude Pro vs Max vs API decision guide covers when each Claude billing tier is the right one for your usage volume. The Claude Code vs Cursor comparison covers the tooling layer if you're picking a coding agent stack.

For broader model-routing patterns across providers, the AI agent model routing guide covers the Haiku/Sonnet/Opus split plus the equivalent decisions for OpenAI and Google models. The cost calculator lets you size any specific workflow against current Claude pricing.

About the author

Lucas Powell

Founder, Growth 8020 · Editor, Agent Shortlist

Founder of Growth 8020, an AI-first B2B marketing studio. Editor of Agent Shortlist — the publication he wished existed when his team had to pick AI tools.
