Claude Opus 4.8 Is Live: Dynamic Workflows, Effort Control, 3× Cheaper Fast Mode — What Changes for Claude Code Users
Anthropic shipped Claude Opus 4.8 on 2026-05-28. Headline numbers: SWE-bench Pro 69.2% (up from 64.3%), USAMO math 96.7% (up from 69.3%), GraphWalks 1M-token recall 68.1% (up from 40.3%). The bigger story is the new agentic primitives — Dynamic Workflows, explicit effort control, and a fast mode that's 2.5× faster and 3× cheaper. Here's the practical breakdown for anyone running Claude Code, Cursor, or an MCP-heavy stack.
Published 2026-05-28
TL;DR
- Model:
claude-opus-4-8. Drop-in replacement forclaude-opus-4-7. - Pricing: Unchanged — $5/$25 per 1M input/output (standard), $10/$50 fast mode at 2.5× speed.
- Best at: Long-horizon agentic coding (SWE-bench Pro 69.2%), math (USAMO 96.7%), 1M-token long-context recall (68.1% GraphWalks F1).
- New agentic primitive: Dynamic Workflows — Claude plans, spawns parallel subagents in a single Claude Code session, verifies, then reports.
- New control: Explicit effort parameter (
low | medium | high | xhigh | max) — choose your token-vs-quality tradeoff per call. - Honesty axis: Anthropic claims Opus 4.8 is its 'most honest' model — more likely to flag uncertainty, less likely to make unsupported claims. Independent benchmarks support this.
What the numbers actually say
| Benchmark | Opus 4.7 | Opus 4.8 | Delta |
|---|---|---|---|
| SWE-bench Pro (agentic coding) | 64.3% | 69.2% | +4.9pt |
| SWE-bench Verified | 87.6% | 88.6% | +1.0pt |
| USAMO 2026 (math) | 69.3% | 96.7% | +27.4pt |
| GraphWalks F1 @ 1M tokens | 40.3% | 68.1% | +27.8pt |
| Multidisciplinary reasoning + tools | 54.7% | 57.9% | +3.2pt |
| Agentic computer use | 82.8% | 83.4% | +0.6pt |
| Knowledge work (composite) | 1753 | 1890 | +137 |
The two outliers — USAMO and GraphWalks — are not coincidences. USAMO reflects deep-reasoning training payoffs. GraphWalks measures how well a model uses its 1M-token context to do *real* multi-hop retrieval, not just say 'the answer is in the haystack somewhere'. Opus 4.8 jumped 28 points there. For codebase-scale work where the agent has to keep dozens of files coherent, this matters more than the SWE-bench number.
Anthropic positions Opus 4.8 ahead of GPT-5.5 and Gemini 3.1 Pro on most agentic-coding metrics, with the exception of agentic terminal coding where OpenAI's model is still ahead.
Dynamic Workflows: the headline agentic feature
Dynamic Workflows (currently in research preview, Enterprise/Team/Max plans) lets Claude Code take on tasks that previously didn't fit in a single session. The pattern:
- Plan. Claude reads the goal, the repo, and produces a structured plan of work units.
- Spawn. It launches hundreds of parallel subagent sessions, each with its own scoped task and tools.
- Verify. Outputs come back; Claude runs the existing test suite (or whatever bar you set) as the acceptance criterion.
- Report. Only after verification does Claude consolidate and present the result.
Anthropic's stated benchmark: codebase-scale migrations across hundreds of thousands of lines of code, from kickoff to merge, with the test suite as the validation gate. For monorepo upgrades — TypeScript 4→5, React 18→19, deprecated-API replacements — this is the first agent primitive that maps directly onto a problem teams actually have.
Effort control: stop overspending on easy tasks
Before 4.8, you bought a fixed effort budget per call — and that budget was high. For 'rename this variable' or 'translate this sentence', you were paying for max-effort reasoning the task didn't need.
Opus 4.8 exposes effort as an explicit parameter. The Claude Code menu has low | medium | high (default) | xhigh | max. The API takes the same values. The model defaults to high, which Anthropic says is its best balance — but for mechanical tasks dropping to low cuts token spend dramatically with no measurable quality loss.
low: mechanical edits, small refactors, simple Q&A. Roughly 1/3 the token spend of high.medium: standard coding tasks, debugging, mid-size refactors.high*(default)*: most of what you do today. Don't overthink this.xhigh: architecture decisions, hard debugging, multi-file changes that need to all hang together.max: research problems, math, the kind of thing you'd run overnight.
Users on claude.ai get this too — a slider in the UI rather than an API parameter. Same idea.
Fast mode: 2.5× speed, 3× cheaper than before
Fast mode pricing stays at $10 / $50 per 1M tokens — but the *throughput* is 2.5× the standard mode, and Anthropic claims it's 3× cheaper than fast mode was on prior Opus generations. For interactive Claude Code use, this is the right default. Reach for standard or xhigh when you actually need the depth.
One caveat: fast mode trades latency for slightly shallower planning. For mechanical Claude Code work it doesn't matter. For multi-step debugging where the agent needs to backtrack, standard mode is still measurably better.
'Most honest model yet': what that actually means
Anthropic's framing for Opus 4.8 is sharper judgment + better self-reporting. Concretely:
- Higher rate of flagging 'I'm not sure' when it actually isn't sure — instead of confidently inventing.
- Better at saying 'I haven't completed X yet' mid-task — instead of declaring success and letting you discover later.
- Less prone to confabulating tool outputs when an MCP call fails — it surfaces the failure rather than hallucinating a plausible result.
The third bullet is the one that matters most for MCP-heavy workflows. Pre-4.8, a flaky MCP server could result in plausibly-wrong agent output. 4.8 is more likely to halt and surface the underlying error. Expect cleaner failure modes when an integration is having a bad day.
Sleeper feature: update instructions mid-task without breaking the cache
Pre-4.8, changing the system prompt or developer instructions invalidated the prompt cache — which is the main reason 5+ minute Claude Code sessions get expensive. 4.8 lets you update Claude's instructions mid-task and keeps the cache warm. If you build agentic harnesses with a 'refine the goal as you go' pattern, your bill just dropped.
What Claude Code users should do today
- Update Claude Code (
claude --versionto check). The CLI picks upclaude-opus-4-8automatically if you have access. - Try fast mode (
/fastin the Claude Code prompt) for your normal coding work for a day. See if you notice a quality drop. Most won't. - Set
--effort lowfor one-off scripts and mechanical edits. Bank the savings. - Test Dynamic Workflows on a non-critical migration if you're on Enterprise/Team/Max. Pick something with good test coverage; the test suite IS the safety net.
- Re-run any agent harness benchmarks you maintain. Effort control changes the meaning of your historical metrics; rebaseline now.
Implications for the MCP ecosystem
- More tool calls per session. Dynamic Workflows + better long-context recall means agents reach for more MCP tools, more often. Server authors should review their rate-limiting and connection-pool behavior.
- Cleaner failure surfaces. The honesty improvements mean flaky MCP servers will get *visibly* flaky — Claude won't paper over them. If you author a server, your error messages now reach the user more reliably. Make them good.
- Bigger tool catalogs are now viable. With 1M-token long-context recall jumping to 68%, fleets of 40+ MCP servers loaded simultaneously stop being a context-overflow concern. Pair this with Claude Code's tool search and lazy loading and you can finally run your real toolbelt without thinking about it.
- Pin your MCP versions anyway. Opus 4.8 doesn't change the supply-chain attack surface. See our Nx Console and codexui-android post-mortems.
Bottom line
Opus 4.8 is a quiet release with two loud features. The numbers are good — but the practical wins for daily Claude Code use are (1) fast mode being the right default at the new price, (2) explicit effort control letting you stop overspending on easy tasks, and (3) Dynamic Workflows finally making codebase-scale migrations a single-session ask. If you're an MCP power user, the long-context recall jump is the under-discussed one — load more servers, trust the recall more.
FAQ
Do I have to do anything to use Opus 4.8 in Claude Code?
claude --version to confirm. The model ID is claude-opus-4-8.Is fast mode safe for production agentic work?
xhigh effort) is still the safer choice. A/B test on your own workflows before standardizing.Will Dynamic Workflows blow up my MCP server bills?
What does 'most honest model yet' mean for hallucinations?
Is Opus 4.8 better than GPT-5.5 or Gemini 3.1 Pro?
What's the model ID for the API?
claude-opus-4-8. Pricing is unchanged from 4.7. The 1M-context variant is available as claude-opus-4-8[1m] on supported tiers.