Claude Fable 5: What It Is, Benchmarks, Pricing, and Why It Matters
·AI News·Sudeep Devkota

Claude Fable 5 brings Anthropic's Mythos-class capability to general availability, with top benchmark scores, strict safeguards, 1M-token context, and a new safety tradeoff for frontier AI users.


Claude Fable 5 is Anthropic's first generally available Mythos-class model. That is the important sentence.

Until now, Mythos-class Claude models were treated as controlled-access systems because their cyber and scientific capabilities crossed a risk threshold. With Fable 5, Anthropic is making that capability broadly available, but with a major constraint: safety classifiers route high-risk requests away from Fable 5 and toward Claude Opus 4.8.

The result is not just a bigger model launch. It is a release pattern for frontier models that are powerful enough to be useful in long-horizon coding, knowledge work, research, and automation, but sensitive enough that access policy becomes part of the product.

Source trail

This article uses Anthropic's announcement and developer documentation as the factual base, and treats benchmark and customer claims as company-reported unless independently verified.

What Claude Fable 5 is

Claude Fable 5 is the public version of Anthropic's new Mythos-class capability tier. Anthropic describes Mythos-class as above Opus in capability. The same launch also introduced Claude Mythos 5, but Mythos 5 is not generally available. It is reserved for Project Glasswing partners and selected trusted-access users.

The practical distinction is simple:

Model | Access | Safety posture | Best fit
Claude Fable 5 | Generally available | Includes classifiers and fallbacks | Advanced coding, long-horizon work, knowledge work, vision, enterprise automation
Claude Mythos 5 | Restricted | Same underlying model with safeguards lifted in some areas | Vetted cyberdefense and selected research programs
Claude Opus 4.8 | Generally available | Safer fallback model | Complex work where Fable is unavailable, unnecessary, or blocked by safeguards

For developers, the model ID is claude-fable-5. Anthropic's docs list a 1M-token context window, up to 128k output tokens, always-on adaptive thinking, vision support, task budgets, compaction, the memory tool, and context-management features. Pricing is $10 per million input tokens and $50 per million output tokens.
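To make the pricing concrete, here is a minimal sketch that turns the two quoted list prices into a per-request estimate. The function name and the example token counts are illustrative assumptions, and the calculation ignores any discounts, caching rates, or surcharges the developer docs may define.

```python
# Prices quoted in this article for claude-fable-5 (USD per million tokens).
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the list-price cost of one request in US dollars."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000


# Hypothetical request: 200k tokens of context in, 8k tokens out.
print(f"${request_cost_usd(200_000, 8_000):.2f}")
```

At these prices, a request that fills the full 1M-token window and emits the 128k-token output maximum would come to about $16.40, which is why long-context workloads deserve a budget check before rollout.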

Availability is broad at launch: Claude API, Claude Platform on AWS, Amazon Bedrock, Vertex AI, and Microsoft Foundry, with AWS separately confirming the Bedrock and Claude Platform on AWS listings. GitHub also says Fable 5 is rolling into GitHub Copilot across VS Code, Visual Studio, Copilot CLI, cloud agent, JetBrains, Xcode, Eclipse, GitHub.com, and mobile surfaces for eligible plans.

The benchmark picture

Anthropic's benchmark table shows Fable 5 leading most categories against Claude Mythos Preview, Claude Opus 4.8, GPT-5.5, and Gemini 3.1 Pro.

Claude Fable 5 benchmark table

The headline scores from the attached benchmark image:

Benchmark area | Benchmark | Claude Fable 5 / Mythos 5 | Closest listed competitor
Agentic coding | SWE-Bench Pro | 80.3% | Claude Mythos Preview at 77.8%
Agentic coding | FrontierCode Diamond | 29.3% | Claude Opus 4.8 at 13.4%
Knowledge work | GDPval-AA | 1932 | Claude Opus 4.8 at 1890
Knowledge work vision | GDP.pdf, no tools | 29.8% | GPT-5.5 at 24.9%
Spatial reasoning | Blueprint-Bench 2 | 38.6% | GPT-5.5 at 36.2%
Tool use | AutomationBench | 17.4% | Claude Opus 4.8 at 15.5%
Computer use | OSWorld-Verified | 85.0% | Claude Mythos Preview at 85.4%
Legal | Legal Agent Benchmark | 13.3% | Claude Opus 4.8 at 10.4%
Multidisciplinary reasoning | Humanity's Last Exam, no tools | 59.0% | Claude Mythos Preview at 56.8%
Multidisciplinary reasoning | Humanity's Last Exam, with tools | 64.5% | Claude Mythos Preview at 64.7%
Biology | BioMysteryBench, hard | 46.1% | Claude Opus 4.8 at 40.0%
Biology | BioMysteryBench, human solved | 83.9% | Claude Mythos Preview at 82.6%
Agentic coding | Terminal-Bench 2.1 | 88.0% | GPT-5.5 Codex CLI at 83.4%
Cybersecurity | ExploitBench, capture percentage | 78.0% | Claude Mythos Preview at 69.0%
Health | HealthBench Professional | 66.0% | Claude Mythos Preview at 64.7%

The pattern is clear: Fable 5's biggest advantage appears in long-horizon agentic work, difficult coding, cyber evaluation, and research-style tasks. It does not win every row. Mythos Preview is slightly ahead on OSWorld-Verified and tool-enabled Humanity's Last Exam. But Fable 5's average position is strong enough that the launch changes the top of the public model stack.

The methodology note matters. The attached image says Fable 5 and Mythos 5 scores are usually within 1 to 3 percentage points of each other, but starred benchmarks can differ more because Fable 5 has blocking safeguards for cyber and biology-related questions. In those cases, Fable may behave closer to Opus 4.8 because requests can fall back.

That means buyers should not read the table as "Fable is always Mythos." It is better read as: Fable is Mythos-class capability for normal use, with policy gates that intentionally reduce capability in risky domains.

Why the benchmarks matter

The strongest signal is not one score. It is the mix of score types.

SWE-Bench Pro and Terminal-Bench 2.1 point to coding agents that can operate across real repositories and terminal workflows. FrontierCode Diamond suggests performance on production-quality coding tasks, not just small algorithm puzzles. GDPval-AA, GDP.pdf, and Humanity's Last Exam point to knowledge work that requires document reasoning, visual interpretation, and tool-aware synthesis.

That combination matters because the market is moving from chatbots to agents. A frontier model is no longer judged only by whether it can answer questions. It is judged by whether it can sustain a task, use tools, recover from failure, and keep a coherent plan over a long context.

Anthropic is explicitly positioning Fable 5 for that kind of work. The company says the model's lead grows as tasks become longer and more complex. Early customer claims in the launch post point to codebase migrations, finance analysis, spreadsheet work, legal redlining, physics research, and app prototyping.

Those are exactly the workflows where small differences in planning quality compound. A model that is slightly better at every step can become much better over a 40-step task.
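The compounding effect is easy to check with a back-of-the-envelope calculation. The sketch below assumes every step succeeds independently and that a single failed step sinks the task, which is a simplification: real agents can sometimes recover. The 98% and 99% per-step rates are made-up illustrations, not measured figures for any model.

```python
def task_success_rate(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return per_step_success ** steps


# Illustrative per-step success rates only; not benchmark data.
for p in (0.98, 0.99):
    print(f"per-step {p:.0%} -> 40-step task {task_success_rate(p, 40):.1%}")
```

Under those assumptions, a one-point gain per step lifts the 40-step completion rate from roughly 45% to roughly 67%.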

The safety layer is part of the product

The core tradeoff in Fable 5 is not price. It is capability versus controlled access.

Anthropic says Fable 5 includes classifiers for cybersecurity, biology and chemistry, and distillation. When those classifiers trigger, the request is handled by Claude Opus 4.8 instead of Fable 5. Anthropic says more than 95% of Fable sessions do not involve fallback, but also says the safeguards are tuned conservatively and may catch benign requests.

That creates a new user experience pattern:

  1. For most ordinary work, users get the full Fable 5 experience.
  2. For sensitive domains, the model may transparently fall back to Opus 4.8.
  3. For developers, refusals and fallback behavior must be handled as normal API paths, not edge-case errors.
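One way to treat these as normal paths is to sort every response into "served", "fallback", or "refused" before any business logic runs. The sketch below is an assumption-heavy illustration: the response is modeled as a plain dict, the "model" and "stop_reason" keys and the "refusal" value are hypothetical placeholders, and "claude-opus-4-8" is an invented ID used only for the example. Check the developer documentation for how fallback is actually reported.

```python
from dataclasses import dataclass

REQUESTED_MODEL = "claude-fable-5"


@dataclass
class Outcome:
    kind: str   # "served", "fallback", or "refused"
    model: str  # model that produced the response, if reported


def classify_response(response: dict, requested: str = REQUESTED_MODEL) -> Outcome:
    """Sort a response into one of three expected paths.

    ASSUMPTION: `response` is a dict with hypothetical "model" and
    "stop_reason" keys; the real response shape may differ.
    """
    served_by = response.get("model", "")
    if response.get("stop_reason") == "refusal":
        return Outcome("refused", served_by)
    if served_by and served_by != requested:
        return Outcome("fallback", served_by)
    return Outcome("served", served_by)


# Hypothetical responses, for illustration only.
print(classify_response({"model": "claude-fable-5", "stop_reason": "end_turn"}).kind)
print(classify_response({"model": "claude-opus-4-8", "stop_reason": "end_turn"}).kind)
print(classify_response({"model": "claude-fable-5", "stop_reason": "refusal"}).kind)
```

Logging the outcome kind per request also gives teams their own fallback rate to compare against the company-reported figure.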

The API docs also note a 30-day data-retention requirement for Fable 5 and Mythos 5 traffic. Anthropic says this retention supports safety monitoring and jailbreak detection, and that the data is not used to train new Claude models. Still, this is a real procurement issue. Some customers that require zero data retention will need to review whether Fable 5 fits their data policy.

What changed for builders

For developers, the immediate question is not "Should I replace every model with Fable 5?" The better question is where Fable 5's incremental capability justifies its higher cost and policy requirements.

Good candidates:

  • Large codebase refactors
  • Agentic coding in existing repositories
  • Multi-file debugging and migration work
  • Long document analysis
  • Finance, legal, and research workflows with dense context
  • Vision-heavy document and screenshot understanding
  • High-value workflows where fewer turns are worth more than cheaper tokens

Weak candidates:

  • Simple classification
  • Routine summarization
  • Low-risk support macros
  • Short extraction tasks
  • High-volume workloads where Haiku, Sonnet, or cheaper models already meet quality targets
  • Workloads that require zero data retention

Fable 5 is expensive relative to Opus 4.8, and much more expensive than smaller models. But if it completes a task in fewer attempts, uses fewer retries, and reduces human correction, the unit economics may still work. Token price alone is the wrong metric. Cost per completed workflow is the metric that matters.
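A rough way to compare models on that metric is to divide the cost of one attempt, including human review, by the success rate. The sketch below assumes attempts are independent and are retried until one succeeds, so the expected number of attempts is 1 / success rate. Every dollar figure and success rate in it is hypothetical and exists only to show the shape of the comparison.

```python
def cost_per_completed_task(cost_per_attempt: float,
                            success_rate: float,
                            review_cost_per_attempt: float = 0.0) -> float:
    """Expected spend to get one successful result, retrying until success.

    With independent attempts, the expected number of attempts is
    1 / success_rate (geometric distribution).
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return (cost_per_attempt + review_cost_per_attempt) / success_rate


# Hypothetical figures, for illustration only.
cheaper = cost_per_completed_task(1.00, success_rate=0.50, review_cost_per_attempt=5.00)
pricier = cost_per_completed_task(2.00, success_rate=0.80, review_cost_per_attempt=5.00)
print(f"cheaper model: ${cheaper:.2f} per completed task")
print(f"pricier model: ${pricier:.2f} per completed task")
```

In this made-up case the model with double the token bill is cheaper per completed task, about $8.75 against $12.00, because human review dominates the cost of each attempt. The conclusion flips if the success rates are close, which is why the inputs should come from your own workflow.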

What changed for enterprises

Enterprise buyers now have to evaluate frontier models on four dimensions, not one:

DimensionQuestion to ask
CapabilityDoes it beat the current model on our actual workflow, not just public benchmarks?
PolicyWhat requests are blocked, refused, or routed to fallback?
DataIs 30-day retention acceptable for this use case?
EconomicsDoes the model reduce total cost per resolved task despite higher token prices?

This launch also pressures AI governance teams. Fable 5 is powerful enough that "we allow Claude" is no longer a precise policy. Companies will need model-level permissions. Some teams may allow Sonnet and Opus for broad use, but restrict Fable 5 to approved workflows because of data retention, cost, or sensitive-domain behavior.

That is not a weakness. It is how serious AI deployment should work. More capable models need clearer operating boundaries.
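As a rough illustration of model-level permissions, a governance team could start from an allowlist keyed by model rather than by vendor. Everything in this sketch is hypothetical: the team names, the placeholder IDs for the broadly allowed models, and the deny-by-default rule are assumptions, not a described Anthropic feature.

```python
# Hypothetical policy: which teams may call which models.
# IDs other than "claude-fable-5" are placeholders, not confirmed model IDs.
MODEL_POLICY: dict[str, set[str]] = {
    "sonnet-placeholder": {"*"},   # "*" means any team
    "opus-placeholder": {"*"},
    "claude-fable-5": {"platform-eng", "legal-research"},
}


def is_allowed(team: str, model: str) -> bool:
    """Return True if the team may use the model; unknown models are denied."""
    allowed = MODEL_POLICY.get(model, set())
    return "*" in allowed or team in allowed


print(is_allowed("support", "opus-placeholder"))      # broad-use model
print(is_allowed("support", "claude-fable-5"))        # restricted model
print(is_allowed("legal-research", "claude-fable-5"))
```

Denying unknown models by default means a newly launched model stays off until someone reviews its retention, cost, and fallback behavior.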

The competitive read

The benchmark table places Fable 5 ahead of GPT-5.5 and Gemini 3.1 Pro in many of the listed categories. The bigger competitive move is distribution.

Anthropic is not only launching through Claude. Fable 5 is showing up in the Claude API, Claude Platform on AWS, Amazon Bedrock, Vertex AI, Microsoft Foundry, GitHub Copilot, and vertical products like Harvey. That matters because frontier model competition is increasingly decided by where the model can be used safely inside existing workflows.

GitHub Copilot availability gives Fable 5 a direct path into developer work. AWS and Bedrock availability give enterprise cloud teams a procurement and governance path. Harvey's announcement matters for legal workflows because it shows vertical AI products are already trying to turn Fable 5 into domain-specific execution.

The model race is now a distribution race, a policy race, and an evaluation race.

Bottom line

Claude Fable 5 is Anthropic's strongest public model and the first broad release of Mythos-class capability. The benchmark table shows clear leadership in agentic coding, long-context knowledge work, vision-heavy reasoning, legal tasks, cyber evaluation, biology, and health benchmarks.

But the release should not be reduced to "better benchmark scores." The important shift is that Anthropic is packaging a restricted frontier capability into a public product by adding classifiers, fallback behavior, usage-policy controls, and 30-day safety retention.

For builders, Fable 5 is worth testing on tasks where long-horizon reliability matters more than raw token cost. For enterprises, it should be evaluated as a high-capability model with explicit governance requirements. For the market, it signals that the next phase of AI competition will be about controlled deployment of models that are powerful enough to need real access design.

The practical advice: benchmark it on your hardest workflow, measure completed-task cost, inspect fallback behavior, and check data-retention fit before rolling it out broadly.
