Message 201 resubmits all of messages 1-200 as input.
That single line is the thing nobody tells you before you build a business on Claude Cowork. The pricing page shows a flat monthly number. The reality is a curve, and the curve is steep on the right side.
Working through client Cowork rollouts, I care about this number the way a restaurant owner cares about the food bill. Here's the real economics, the variables that move it, and where the money actually goes.
Pro $20 vs Max vs API: The Real Ceiling
The official tiers look simple. They are not.
Pro at $20/mo unlocks the full Cowork feature set. Same plugins, same skills, same Live Artifacts as a $150 enterprise seat. The cap is usage, not capability. For hobby use or light personal ops, Pro will hold. For client work, it will not. You will hit the ceiling on a long session and watch the model degrade or rate-limit mid-task.
Max tiers exist for exactly this reason. Higher usage caps, same features. If you are running client deliverables, scheduled tasks, and multi-hour build sessions, Max is the honest tier. Anyone telling you Pro is enough for a real ops business has not stress-tested it under deadline pressure.
API spend is the other lane. Pay per token, model your own caps. Better for agencies running multiple clients, automation pipelines, or anything where you want predictable per-task economics instead of per-seat economics.
The strategic point: feature gating is not how Anthropic monetises Cowork. Usage is. So your real cost is a function of how you work, not which seat you bought. Which means the bill is a downstream artifact of architecture decisions, not something you can fix by shopping for a cheaper subscription.
Where The Spend Actually Goes
If your bill is bigger than you expected, it is almost always one of four causes. I have seen these across client engagements and across every credit-burn post-mortem worth reading.
1. Subagent fan-out. Twenty parallel subagents on a task that needed two. This is the single biggest overspend pattern. Every subagent gets its own context, its own model call, its own tool invocations. Fan-out is powerful when the decision is hard to reverse. It is wildly wasteful on a 200-word edit. Match research depth to decision reversibility, every time.
2. Autocompact cascades on long sessions. The longer the session, the bigger the context, the more tokens each turn costs to resubmit. By message 201, every turn resubmits all 200 earlier messages as input, so cumulative input cost grows roughly with the square of session length. Running /compact every 30-45 minutes on long sessions cuts active context by 60-80%. Skip it and the cost curve bends against you.
3. MCP bloat. Every MCP server you connect injects its tool schemas into every turn. A single bloated server can add 18,000+ tokens per turn. Connect five servers that size and you are paying 90K tokens per message before you have typed anything. Most operators do not know which server is eating their context. Audit quarterly. Kill what you do not use.
4. Context resubmission on retries. Failed tool calls, retry loops, autocompact recoveries. Each one resubmits the whole context. A single misbehaving skill can double your daily spend without you noticing for a week.
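To make the shape of that curve concrete, here is a minimal sketch of a resubmission cost model. It is an illustration under stated assumptions, not how Cowork actually meters usage: every turn is assumed to add a fixed number of tokens to the history, the whole history plus any MCP schema overhead is resubmitted as input on each turn, and compaction is assumed to keep 30% of the history (a 70% cut, inside the 60-80% range above). The function name and all parameters are hypothetical.

```python
def session_input_tokens(turns, tokens_per_turn=500, mcp_overhead=0,
                         compact_every=None, compact_keep=0.3):
    """Total input tokens billed over a session under a simple resubmission model.

    Assumptions (all hypothetical): each turn adds `tokens_per_turn` to the
    history, the full history is resent as input on every turn, and
    `mcp_overhead` tokens of tool schemas are injected into every turn.
    """
    history = 0  # tokens of prior conversation carried into the next turn
    total = 0
    for turn in range(1, turns + 1):
        # Input for this turn: tool schemas + everything so far + the new message.
        total += mcp_overhead + history + tokens_per_turn
        history += tokens_per_turn
        # Optional compaction: shrink the carried history every N turns.
        if compact_every and turn % compact_every == 0:
            history = int(history * compact_keep)
    return total


uncompacted = session_input_tokens(200)
compacted = session_input_tokens(200, compact_every=20)
print(f"200 turns, no compaction:    {uncompacted:,} input tokens")
print(f"200 turns, compact every 20: {compacted:,} input tokens")
```

Under these defaults, turn 201 alone would carry 100,500 input tokens (200 turns of history plus the new message), while the 200 turns before it add up to 10,050,000. The per-turn cost grows linearly and the session total grows quadratically, which is why compaction and MCP pruning matter more the longer a session runs.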
The diagnostic question: If your monthly bill suddenly doubled, which of these four is the cause? If you cannot answer in under 60 seconds, you are not instrumenting your ops. That is the first fix.
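One way to answer that question quickly is to tag usage as it is logged. The sketch below assumes you keep your own usage log in which each record carries an input token count and a cause label. The record shape, the label names, and both functions are hypothetical; nothing here is a Cowork or Anthropic API.

```python
from collections import Counter

# Hypothetical labels matching the four causes described above.
CAUSES = ("subagent_fanout", "long_session", "mcp_overhead", "retry")


def spend_by_cause(records):
    """Sum input tokens per cause label across a list of usage records."""
    totals = Counter()
    for record in records:
        totals[record["cause"]] += record["input_tokens"]
    return totals


def biggest_cause(records):
    """Return (cause, tokens) for the largest contributor, or None if empty."""
    totals = spend_by_cause(records)
    return totals.most_common(1)[0] if totals else None


# Made-up sample data, for illustration only.
sample_log = [
    {"cause": "mcp_overhead", "input_tokens": 90_000},
    {"cause": "retry", "input_tokens": 40_000},
    {"cause": "mcp_overhead", "input_tokens": 90_000},
    {"cause": "subagent_fanout", "input_tokens": 120_000},
]
print(biggest_cause(sample_log))
```

The design choice is to label at write time rather than reconstruct causes after the bill arrives. With labels in place, the 60-second answer is a single aggregation.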
The Benchmarks From Real Users
The most-shared practitioner data on Claude Code and Cowork spend lands here:
Enterprise average: roughly $13 per active developer per day. Active means days they actually used it, not calendar days. Spread across a month, that lands somewhere between $150 and $250 per developer, which works out to roughly 12 to 19 active days.
90% of users stay under $30 per day. The long tail above $30 is almost always running unsupervised pipelines, deep multi-hour Opus sessions, or aggressive subagent fan-out without discipline.
Pro $20 is a real ceiling for hobby use. Operators running client work need Max or API spend modelling. The honest line is: if your work pays for itself, your tooling tier should match the work, not the comfort.
Take these as anchors, not prescriptions. A solo ops operator running scheduled tasks and one or two client artifacts will sit on the low end. An agency with three engineers building production pipelines will sit higher. The shape is the same.
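The arithmetic behind those anchors is simple enough to check. This sketch uses only the figures quoted above ($13 per active day, $150 to $250 per month); the function names are illustrative.

```python
def monthly_spend(daily_rate, active_days):
    """Monthly spend for one person: daily rate times days actually used."""
    return daily_rate * active_days


def implied_active_days(monthly_low, monthly_high, daily_rate):
    """Active days per month implied by a monthly range at a given daily rate."""
    return monthly_low / daily_rate, monthly_high / daily_rate


low, high = implied_active_days(150, 250, 13)
print(f"Implied active days per month: {low:.1f} to {high:.1f}")
print(f"15 active days at $13/day: ${monthly_spend(13, 15)}")
```

At $13 per active day, the $150 to $250 range implies about 11.5 to 19.2 active days a month. Someone using the tool every working day at the average rate would land at or slightly above the top of that range.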
The numbers compress. The work behind them does not. Audit, schema design, failure-mode planning, rollback contracts, and multi-location edge-case handling cost the same whether your daily token bill is $5 or $50. The bill goes up when nobody is tuning the stack. The fix is not a cheaper subscription. It is somebody auditing skill load, MCP bloat, and subagent fan-out once a month against a known set of anti-patterns.
How I Budget For A Client Engagement
When I scope a six-week ops overhaul, the Cowork spend is a line item I plan for the same way I plan for any other tool cost. Here is the structure.
Build phase (weeks 1-4). Heaviest usage. Discovery synthesis, architecture, skill writing, plugin tuning, first artifacts. Plan for the upper benchmark, not the average. This is where Opus earns its keep on hard architectural calls and where Sonnet handles the volume of editing and execution. If you default everything to Opus here, your bill triples and the quality does not.
Tune phase (weeks 5-6). Spend drops sharply. Most of the work is verification, edge-case handling, and skill pruning. This is also where I do the MCP audit, kill servers that are not earning their tokens, and lock the skill library at the leanest possible set.
Retainer phase (ongoing). Spend stabilises into a predictable monthly band. New skills get written occasionally, plugins get retuned monthly, and the cron health check runs weekly. The token cost at this stage is dwarfed by the value of the system running unsupervised.
The economics work because the system replaces operational labour that was costing the client multiples more. Token spend is the cheapest line item in any ops engagement that is built correctly. "Built correctly" is doing the work in that sentence: the schema, the failure modes, the integration contracts, the rollback paths, the multi-location edge cases. Cowork writes the syntax fast once those are defined. Skip the definition step and the bill is the smaller of the two problems you are about to have.
I budget Cowork as a line item per client engagement. The line item assumes the system is tuned. If it is not tuned, the line item doubles and the output quality drops at the same time.
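The phase structure above can be turned into a rough line item. This is a sketch with loud assumptions: five active days per week, the build phase priced at the $30/day upper benchmark, and the tune phase priced at the $13/day average. The tune-phase rate and the active-day count are my illustrative choices, not figures from any client engagement, and the retainer phase is left out because its band depends on the client.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    name: str
    weeks: int
    daily_rate: float              # assumed dollars per active day
    active_days_per_week: int = 5  # assumption: one operator, working days only

    def cost(self):
        return self.weeks * self.active_days_per_week * self.daily_rate


def engagement_budget(phases, untuned=False):
    """Sum phase costs; double the total if the stack is left untuned."""
    total = sum(phase.cost() for phase in phases)
    return total * 2 if untuned else total


phases = [
    Phase("build", weeks=4, daily_rate=30.0),  # upper benchmark, not the average
    Phase("tune", weeks=2, daily_rate=13.0),   # assumed to fall back to the average
]
print(f"Tuned six-week budget:   ${engagement_budget(phases):,.0f}")
print(f"Untuned six-week budget: ${engagement_budget(phases, untuned=True):,.0f}")
```

With these assumptions the six-week line item comes to $730 per operator when tuned and $1,460 when not. Swap in your own rates and active days; the structure matters more than the numbers.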
The Honest Bottom Line
Pro $20 is real for personal use and light tinkering. Max or API is the honest tier for client work. Most overspend traces back to four causes, all of which are fixable in under a day of disciplined work. The benchmarks land between $150 and $250 per active operator per month, and most of that is concentrated in build phases, not steady-state operations.
If you are running a business on Cowork and you do not know which of the four overspend causes is hitting you hardest, that is the first audit to book. Everything else compounds off the answer.
Want this running on your ops? Book a free 45-min ops mapping call. We'll audit your stack, find the bottlenecks, and show you where Cowork moves the needle. cal.com/formaum/45