Claude Opus 4.8 Released: Anthropic's Most Capable Model Yet

May 29, 2026 · 5 min read

Committed to answers at your fingertips

Anthropic just released Claude Opus 4.8 — faster, more honest, and better at agentic tasks. Here's everything new and why it matters for developers.

Claude Opus 4.8 Released cover image showing headline and key highlights

Anthropic released Claude Opus 4.8 this week. It's the most capable model they've made generally available, building on Opus 4.7 with improvements across coding, reasoning, agentic tasks, and honesty. The price stays the same: $5 per million input tokens, $25 per million output tokens.

Here's what changed and what matters for developers building on top of it.

What Changed Since Opus 4.7?

Here's what actually changed:

1. Better Judgment and Honesty

Opus 4.8 is significantly less likely to make unsupported claims or let code flaws slip by unremarked. Anthropic's evals show it's roughly four times less likely than its predecessor to allow bugs in its own code to pass without flagging them. That's the kind of improvement that matters when you're trusting a model to work autonomously.

Early testers reported that it asks the right questions, catches its own mistakes, and pushes back when a plan doesn't make sense.

2. Stronger Agentic Performance

Benchmark comparison chart showing Claude Opus 4.8 vs Opus 4.7 vs GPT-5.5 across Super-Agent, CursorBench, Online-Mind2Web, and Legal Agent benchmarks

Opus 4.8 is the only model to complete every case end-to-end on Anthropic's Super-Agent benchmark, beating prior Opus models and GPT-5.5 at cost parity. On CursorBench, it exceeds previous Opus versions across every effort level, using fewer tool-calling steps for the same intelligence.

It's also the strongest computer-use and browser-agent model Anthropic has tested, scoring 84% on Online-Mind2Web.

3. Faster, More Efficient Tool Calling

The model is less likely to skip a tool call that a task requires, which was a known pain point with Opus 4.7. Long agentic traces also stay on task with fewer derailments after context compaction.

4. Adaptive Thinking That Actually Adapts

With adaptive thinking enabled, Opus 4.8 decides per turn whether reasoning is needed. Simple lookups get direct answers. Complex problems get reasoning before the answer. Fewer wasted tokens compared to Opus 4.7.

New Features Worth Knowing

Effort Control — Now on All Plans

A new control alongside the model selector lets users choose how much effort Claude puts into a response. Opus 4.8 defaults to high effort, with extra and max options for harder tasks. Rate limits in Claude Code have been increased to handle the higher token usage.

Fast Mode — 2.5x Speed, Lower Cost

Fast mode is now available for Opus 4.8 as a research preview on the Claude API. It delivers up to 2.5× higher output tokens per second at three times cheaper cost than previous models.

Mid-Conversation System Messages

The Messages API now accepts role: "system" entries inside the messages array. You can update Claude's instructions mid-task without breaking the prompt cache — useful when permissions or context change during an agentic loop.

Lower Prompt Cache Minimum

The minimum cacheable prompt length dropped to 1,024 tokens. Prompts that were too short to cache on Opus 4.7 now create cache entries without any code changes.

Real-World Benchmarks

Benchmark	Opus 4.8 Performance
Super-Agent	All cases completed end-to-end (only model to do so)
CursorBench	Exceeds all prior Opus models at every effort level
Online-Mind2Web	84% (strongest tested model)
Legal Agent Benchmark	Highest score recorded; first model to break 10% overall

Conceptual illustration of an AI agent autonomously working on a codebase with tools and workflow indicators

Opus 4.8 is strongest where long-horizon autonomy matters — coding agents, research agents, legal workflows, and enterprise knowledge work.

Pricing — Unchanged from Opus 4.7

Mode	Input	Output
Standard	$5 / 1M tokens	$25 / 1M tokens
Fast	$10 / 1M tokens	$50 / 1M tokens

Same price as Opus 4.7, with better performance. The model ID on the API is claude-opus-4-8. It supports the 1M token context window and 128k max output tokens.

What's Next: Mythos-Class Models

Anthropic also hinted at a new class of model with "even higher intelligence than Opus." A small number of organizations are already using Claude Mythos Preview for cybersecurity work through Project Glasswing. The company plans to bring Mythos-class models to all customers in the coming weeks, once safeguards are in place.

Why Model Diversity Matters

New AI models ship every week now. For developers building on top of them, the real question isn't which model is "the best" — it's which model is right for which task, and how to switch between them without friction.

That's the problem Felo AI tackles. Beyond its AI-powered search that pulls from advanced models for real-time answers, Felo offers an LLM Playground where you can call, test, and compare outputs from a wide range of leading models in one place. No juggling API keys, no switching between dashboards. Just pick a model, run your prompt, and see how it performs.

If you're evaluating models for your workflow, or just curious about what's out there, having them all in one interface makes the comparison process a lot less painful.

Try Felo AI for Free → https://felo.ai

What Changed Since Opus 4.7?​

1. Better Judgment and Honesty​

2. Stronger Agentic Performance​

3. Faster, More Efficient Tool Calling​

4. Adaptive Thinking That Actually Adapts​

New Features Worth Knowing​

Effort Control — Now on All Plans​

Fast Mode — 2.5x Speed, Lower Cost​

Mid-Conversation System Messages​

Lower Prompt Cache Minimum​

Real-World Benchmarks​

Pricing — Unchanged from Opus 4.7​

What's Next: Mythos-Class Models​

Why Model Diversity Matters​