Short answer. Sonnet 4.6 wins for everyday production code. Opus 4.6 wins for architecture, hard debugging, and anything where reasoning matters more than speed. Haiku 4.5 wins for mechanical edits, file ops, and high-volume small tasks. If you're using Opus for everything, you're paying 5x more than you need to. The trick isn't picking one model. It's routing tasks to the cheapest one that can do the job. At Formaum, I route Opus for reasoning, Sonnet for production code, and Haiku for mechanical tasks . And the system spends maybe 20% of what running Opus-only would.

The honest answer (it depends . Here's the matrix)

There is no single best model. There is a best model per task. The current Claude lineup splits cleanly across three jobs.

ModelSWE-bench VerifiedPrice (in / out per 1M tokens)Best for
Opus 4.680.8%$5 / $25Architecture, hard debugging, multi-file refactors
Sonnet 4.679.6%$3 / $15Everyday feature code, bug fixes, code review
Haiku 4.5~40%$1 / $5Mechanical edits, file ops, formatting, commit messages

Sonnet sits 1.2 points behind Opus on benchmarks and costs 60% less to run. That gap is the whole story. For most of the work I ship . CRM migrations, API integrations, full-stack features . Sonnet matches Opus output and saves me money on every prompt.

When Opus 4.6 wins

Opus is the one I reach for when reasoning is the bottleneck, not typing speed.

If a task takes more than 10 minutes to reverse when wrong, I use Opus. The extra cost is cheap compared to fixing a bad architectural call three weeks later.

When Sonnet 4.6 wins

Sonnet is the daily driver. It's what runs when I'm shipping client work and don't want to think about model selection.

In Anthropic's own testing, developers preferred Sonnet 4.6 over the previous Opus generation 59% of the time inside Claude Code. That number tracks with what I see.

When Haiku 4.5 wins

Haiku is the one most people sleep on. It's the difference between a $50 day and a $300 day on the API.

Haiku struggles on real coding (SWE-bench around 40%). But the tasks above aren't real coding. They're mechanical. Using Sonnet or Opus on them is paying a senior engineer rate to format a file.

Cost-per-task routing (my actual setup)

Here's the routing rule I run in production. It lives in my Claude Code config and applies to every session.

TaskModelWhy
Copy swaps, HTML edits, file editsHaikuMechanical. No reasoning.
File verification after an editHaikuJust confirming text matches.
Bash, git operationsHaikuExecution.
Initial HTML generation from specHaikuTemplate application.
Code review against a rules checklistHaikuChecklist, not judgment.
Everyday feature code, bug fixesSonnetDaily driver. Best ratio.
Test generation, refactors inside a moduleSonnetBounded reasoning.
Architecture, system designOpusReasoning matters.
Hard debugging, race conditionsOpusNeeds depth.
Factual accuracy verificationOpusNeeds to verify, not invent.

The result: a typical day routes 60% of calls to Haiku, 30% to Sonnet, 10% to Opus. The 10% is where the value gets created. The 60% is where the budget would have leaked if I'd run Opus on everything.

Tool integration: when Claude Code picks for you . And when to override

If you use Claude Code, the harness routes tasks across models automatically. It defaults to Sonnet for the main conversation, fires Haiku for cheap sub-agent calls (Bash, file reads, mechanical edits), and reserves Opus for explicit deep-reasoning work.

Override the default when:

The harness is smart. It's not psychic. If you know the task type, override.

Common mistakes (the ones that burn budget)

The bottom line

The best Claude model for coding isn't a model. It's a routing strategy. Sonnet for most of the work. Opus when reasoning is the constraint. Haiku for everything mechanical. The teams running this hybrid setup report 60-80% cost savings against running Opus straight through, with no quality drop on the work that matters.

Pick the cheapest model that can do the job. Override the default when you know the task type. Iterate in the cheapest format that holds the meaning. That's the whole game.

Run on a stack that's holding you back?

Book a 45-minute discovery call. I'll map what moves, what stays, and what makes sense for your operation.

Book a call

Frequently Asked Questions

Is Claude Sonnet 4.6 actually as good as Opus 4.6 for coding?
On benchmarks, Sonnet 4.6 scores 79.6% on SWE-bench Verified and Opus 4.6 scores 80.8%. A 1.2-point gap. In practice, Sonnet matches Opus on 80-90% of everyday coding work — features, bug fixes, single-file refactors. The gap shows up on architecture decisions, multi-file refactors with non-obvious dependencies, and hard debugging. For those, Opus is worth the extra cost. For the rest, Sonnet is the better economic choice.
When should I use Claude Haiku for coding?
Haiku is the right choice for mechanical work that does not require reasoning. File edits and verification. Copy swaps. Bash and git commands. Initial HTML generation from a spec. Commit messages. Lint-level formatting. Haiku struggles on real coding (around 40% SWE-bench), but those tasks are not real coding — they are pattern application. Using Sonnet or Opus on them wastes money.
What does it cost to run Claude on coding tasks?
Per million tokens, current pricing is Opus 4.6 at $5 input / $25 output, Sonnet 4.6 at $3 input / $15 output, and Haiku 4.5 at $1 input / $5 output. A real-world day routing 60% Haiku, 30% Sonnet, 10% Opus typically costs a third to a fifth of what running pure Opus would cost — for the same shipped output.
Does Claude Code pick the model for me, or do I need to set it?
Claude Code routes automatically. The main conversation defaults to Sonnet. Sub-agent calls for cheap operations (file reads, Bash, mechanical edits) run on Haiku. Opus is used when you explicitly ask for deeper reasoning. Override the default at the start of a session when you know the task type — for architecture, force Opus up front; for bulk mechanical work, force Haiku.
Why not just always use the most capable model?
Two reasons. First, cost. Opus is roughly 5x more expensive than Haiku at equivalent token volume. Running it on file verification or copy edits is paying a senior engineer rate for a clerk's job. Second, speed. Haiku runs at 80-100 tokens per second versus 30-40 for Opus. On a day of high-volume mechanical work, the speed difference alone changes how much you ship. Match the model to the task and you get more done for less money.
Genevieve Claire
Genevieve Claire
Founder, Formaum — Claude Code Expert & Full-Stack AI Engineer

Builds bespoke AI automation systems for multi-location operations. Previously EA Sports FIFA ($7B franchise) and Film/TV VFX on Skyfall, Avengers, Game of Thrones. Based in Vancouver, BC.