🧭 Claude Fable 5 & Mythos 5: Anthropic's First Public Mythos-Class Models
Anthropic has launched Claude Fable 5 — the first model in the new Mythos tier, which sits above Opus in capability — together with Claude Mythos 5, a higher-clearance variant available to select government and critical-infrastructure partners. The models were formally unveiled at the Code with Claude Tokyo opening keynote, timed to the start of the conference on June 10. Fable 5 is immediately accessible to all paid subscribers (Pro, Max, Team, Enterprise) at no extra charge through June 22, after which it becomes an explicitly billed model at its standard API rate.
Benchmark highlights
- SWE-Bench Pro: 80.3% — versus Opus 4.8 at 69.2% and GPT-5.5 at 58.6%
- Hex analytical benchmark: 91.4% — the first model to exceed 90% on this test
- Long-horizon task cohesion: Fable 5 maintains consistent intent across multi-million-token autonomous runs — Anthropic describes this as the model's most practically significant capability jump over Opus 4.8
- One early design partner reported that Fable 5 completed a frontier physics research task in 36 hours, using one-third of the reasoning tokens that GPT-5.5 spent over four days to reach only an approximate result; because of that token efficiency, the effective cost was lower despite the higher per-token rate
Pricing and model IDs
- API model ID: claude-fable-5 (and claude-mythos-5 for Glasswing partners)
- $10 / $50 per million input / output tokens — exactly 2× Opus 4.8 rates
- 90% prompt cache discount applies (same as Opus 4.8), making long-context repeated-prefix workloads far more economical
- Previous Mythos Preview pricing was ~$30/$150; Fable 5 costs about one-third of that
Safeguards and the Mythos 5 split
Fable 5 ships with conservative output guardrails that trigger on fewer than 5% of sessions on average; affected queries silently fall back to Claude Opus 4.8. Claude Mythos 5 uses the same underlying weights but with safeguards partially lifted across specific domains — it is initially deployed exclusively through Project Glasswing in collaboration with the US government for cyber-defence and infrastructure work.
Is Fable 5 an alias or a new model family?
"Fable" and "Mythos" are distinct capability tiers, not marketing aliases for existing models. Fable 5 represents the first generalised-public release of a Mythos-class system; Mythos 5 is the restricted-access counterpart. Anthropic has stated that future Mythos-class releases will follow this dual-track pattern: a public Fable variant and a higher-capability Mythos variant gated behind safety agreements.
Claude Fable 5
Claude Mythos 5
Mythos-class
SWE-Bench Pro
long-horizon autonomy
API pricing
Project Glasswing
🧭 Code with Claude Tokyo Day 1: F1 Agents, Long-Horizon Demo & Keynote Highlights
Day 1 of Code with Claude Tokyo opened at 09:15 JST with a Dario Amodei keynote that demonstrated Fable 5's long-horizon autonomy before a live audience for the first time. The most-shared moment: an interactive F1 race-strategy simulation in which four parallel specialist Claude agents (analysing aerodynamics, tyre temperature, power-unit telemetry, and driver safety margins) coordinated through a central grading agent to produce a real-time pit-stop recommendation under simulated race pressure. The agents ran for approximately 22 minutes of wall-clock time against a multi-million-token shared context window, with no human intervention after the initial task injection.
Session highlights from the three tracks
- Research track — "Mythos-Class Safety at Scale": Anthropic researchers walked through how Constitutional AI and Model Spec Midtraining (MSM) were adapted for Fable 5's additional capability headroom, with specific focus on preventing instrumental convergence in long-horizon runs.
- Platform track, "Scaling Agentic Workflows at Enterprise": Practitioners from LG CNS, Rakuten, and SoftBank shared production metrics from their Claude deployments; LG CNS reported a 4× throughput increase on contract review after switching from single-turn to multi-agent orchestration.
- Claude Code track — "Long-Horizon Tasks in Production": Case studies from teams running Claude Code on multi-day, multi-repo migrations; the session covered checkpoint design, context compaction triggers, and how to structure CLAUDE.md for orchestrator/sub-agent role clarity.
- MCP Server Workshop: The 90-minute hands-on session (capped at 80 in-person attendees) walked through deploying a private MCP server behind mTLS; all materials were published to github.com/anthropics/cc-tokyo-2026 during the session.
The orchestration pattern behind the F1 demo
The demo used Claude's parallel tool calls combined with a structured task-dispatch agent: one orchestrator decomposed the race problem into four domain scopes, spawned sub-agents with scoped system prompts, then fanned in their outputs through a grading agent that scored feasibility and surfaced the highest-confidence recommendation. The same pattern (decompose → dispatch → grade → synthesise) is directly applicable to any high-stakes decision workflow where domain isolation reduces context contamination.
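The decompose → dispatch → grade → synthesise flow can be sketched in a few lines of Python. This is a minimal illustration, not the demo's actual code: the sub-agents are stubs that return dummy confidence scores instead of calling a model, and every name here (Proposal, run_sub_agent, orchestrate, and so on) is hypothetical.

```python
import asyncio
from dataclasses import dataclass

# Domain scopes taken from the demo description.
DOMAINS = ["aerodynamics", "tyre temperature", "power-unit telemetry", "driver safety margins"]


@dataclass
class Proposal:
    domain: str
    recommendation: str
    confidence: float  # 0.0 to 1.0, self-reported by the sub-agent


def decompose(problem: str) -> list[str]:
    """Orchestrator step: create one scoped task per domain."""
    return [f"[{domain}] {problem}" for domain in DOMAINS]


async def run_sub_agent(domain: str, task: str) -> Proposal:
    """Stub for a model call made with a domain-scoped system prompt."""
    await asyncio.sleep(0)  # placeholder for network latency
    confidence = 0.5 + 0.1 * DOMAINS.index(domain)  # dummy score for illustration
    return Proposal(domain, f"pit recommendation based on {domain}", confidence)


def grade(proposals: list[Proposal]) -> Proposal:
    """Grading step: surface the highest-confidence proposal."""
    return max(proposals, key=lambda p: p.confidence)


def synthesise(best: Proposal) -> str:
    """Synthesis step: turn the winning proposal into the final answer."""
    return f"{best.recommendation} (source: {best.domain}, confidence {best.confidence:.1f})"


async def orchestrate(problem: str) -> str:
    tasks = decompose(problem)
    # Dispatch: sub-agents run concurrently, each seeing only its own task.
    proposals = await asyncio.gather(
        *(run_sub_agent(domain, task) for domain, task in zip(DOMAINS, tasks))
    )
    return synthesise(grade(list(proposals)))


if __name__ == "__main__":
    print(asyncio.run(orchestrate("Should the car pit on lap 30?")))
```

Keeping each sub-agent's input limited to its own task string is what provides the domain isolation described above; in a real system the stub would be replaced by a model call carrying only that domain's context.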
Code with Claude Tokyo
multi-agent
F1 demo
long-horizon
MCP
orchestration
developer conference
🧭 Twelve Days to Evaluate Fable 5 for Free — Here's How
From today until June 22, claude-fable-5 is included on all paid plans (Pro, Max, Team, seat-based Enterprise) at no additional cost. After June 22 it becomes an explicitly billed API model at $10/$50 per million tokens — 2× Opus 4.8 rates. That leaves twelve days to run a structured evaluation and decide whether Fable 5 belongs in your production stack. Here is a practical framework for making that call.
Step 1 — Identify your highest-value workloads
Fable 5's strongest performance gains over Opus 4.8 are in:
- Complex multi-step software engineering (SWE-Bench Pro: 80.3% vs 69.2%)
- Long-context analytical tasks — reports, contracts, and research spanning hundreds of pages where maintaining coherence across the full context matters
- Multi-day autonomous agent runs — tasks that span hours or days without human re-prompting
- Scientific and technical reasoning — especially tasks where token efficiency on reasoning steps matters more than raw per-token cost
Step 2 — Switch your eval environment to claude-fable-5
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
your_eval_prompt = "Replace with a prompt from your eval suite"

response = client.messages.create(
    model="claude-fable-5",  # explicit model ID, no alias yet
    max_tokens=4096,
    messages=[{"role": "user", "content": your_eval_prompt}],
)

# Check for safeguard fallback, which indicates a topic triggered the Opus 4.8 backstop.
# This assumes the fallback is visible in response.model. Model IDs use hyphens
# (claude-opus-4-8), and a fallback can end with any stop_reason, so only the model is checked.
if "claude-opus-4-8" in response.model:
    print("Safeguard fallback triggered: response from Opus 4.8, not Fable 5")
else:
    print(f"Fable 5 response: {response.content[0].text[:200]}")
Step 3 — Compare on your own benchmarks, not Anthropic's
SWE-Bench Pro is a useful signal but not your benchmark. Run your existing eval suite against both claude-opus-4-8 and claude-fable-5, tracking accuracy, token count per task, and latency. If Fable 5 uses significantly fewer tokens per correct completion on your tasks, the effective cost may already be at or below Opus 4.8 parity — even at 2× the listed rate.
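One way to make that comparison concrete is to compute cost per correct completion for each model. The sketch below is illustrative only: the Fable 5 rates come from the pricing quoted earlier, the Opus 4.8 rates are inferred from the "2×" statement rather than quoted directly, and the token totals are made-up placeholders to be replaced with your own measurements.

```python
from dataclasses import dataclass

# USD per million (input, output) tokens.
# Opus 4.8 rates are an inference from "2x Opus 4.8 rates", not a quoted price.
RATES = {
    "claude-fable-5": (10.0, 50.0),
    "claude-opus-4-8": (5.0, 25.0),
}


@dataclass
class EvalRun:
    model: str
    input_tokens: int   # total across all tasks in the suite
    output_tokens: int  # total across all tasks in the suite
    correct: int        # number of tasks completed correctly


def cost_per_correct(run: EvalRun) -> float:
    """Total spend for the run divided by the number of correct completions."""
    if run.correct == 0:
        raise ValueError("no correct completions; cost per correct task is undefined")
    input_rate, output_rate = RATES[run.model]
    total = (run.input_tokens * input_rate + run.output_tokens * output_rate) / 1_000_000
    return total / run.correct


# Hypothetical totals for illustration only.
fable = EvalRun("claude-fable-5", input_tokens=2_000_000, output_tokens=400_000, correct=80)
opus = EvalRun("claude-opus-4-8", input_tokens=2_000_000, output_tokens=1_200_000, correct=69)

print(f"Fable 5:  ${cost_per_correct(fable):.3f} per correct task")
print(f"Opus 4.8: ${cost_per_correct(opus):.3f} per correct task")
```

With these placeholder numbers, Fable 5 comes out cheaper per correct task because it emits a third of the output tokens; whether that holds for a real workload depends entirely on the measured token counts and accuracy.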
Step 4 — Audit safeguard fallback rate in your domain
The published average fallback rate (under 5% of sessions) varies considerably by domain. Content dealing with cybersecurity, chemistry, biology, or geopolitics may trigger fallbacks more frequently. If your use case falls in those areas, measure your specific fallback rate during the free window; a high rate suggests that Mythos 5 (if you qualify for Glasswing access) or Opus 4.8 may be the better choice.
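A per-domain fallback rate is simple to compute once each eval response is logged with the model that served it. The sketch below assumes, as the Step 2 snippet does, that a fallback shows up in the response's model field; the function name and the sample records are hypothetical.

```python
from collections import defaultdict

FALLBACK_MODEL = "claude-opus-4-8"  # the model the text says handles fallbacks


def fallback_rates(records):
    """Map each domain to the fraction of its responses served by the fallback model.

    `records` is an iterable of (domain, served_model) pairs collected during evals.
    """
    totals = defaultdict(int)
    fallbacks = defaultdict(int)
    for domain, served_model in records:
        totals[domain] += 1
        if FALLBACK_MODEL in served_model:
            fallbacks[domain] += 1
    return {domain: fallbacks[domain] / totals[domain] for domain in totals}


# Hypothetical log entries; in practice, record response.model for each eval call.
sample = [
    ("cybersecurity", "claude-opus-4-8"),
    ("cybersecurity", "claude-fable-5"),
    ("contracts", "claude-fable-5"),
    ("contracts", "claude-fable-5"),
]

for domain, rate in fallback_rates(sample).items():
    print(f"{domain}: {rate:.0%} fallback")
```

A few hundred prompts per domain is a more useful sample than a handful, since a rate near the published average is hard to distinguish from zero on small samples.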
Prompt caching makes Fable 5 competitive on long-context tasks
The 90% cache discount applies to claude-fable-5 just as it does to Opus 4.8. For workloads with a large, stable system prompt or document prefix (e.g., feeding a 200-page contract once and asking multiple questions), the cached-token cost drops to $1 per million input tokens. That is still twice Opus 4.8's cached rate ($0.50, given that Opus 4.8 lists at half Fable 5's price), but it is well below Opus 4.8's uncached input rate of $5, and the absolute premium shrinks to $0.50 per million cached input tokens. At that cost level, Fable 5's accuracy advantage comes at a small premium for repeated-prefix workloads.
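The effect of caching on a repeated-prefix workload can be estimated with some simple arithmetic. The sketch below uses the $10 input rate and 90% discount quoted in this article; the workload size is hypothetical, and the model is deliberately simplified (the first request pays full price for the prefix, later requests read it from cache, and any cache-write surcharge or cache expiry is ignored).

```python
FABLE_INPUT_RATE = 10.0  # USD per million input tokens, from the pricing above
CACHE_DISCOUNT = 0.90    # discount on cached input tokens, as stated above


def input_cost(prefix_tokens, question_tokens, questions, cached):
    """Input-side cost in USD for asking `questions` questions over one shared prefix."""
    per_token = FABLE_INPUT_RATE / 1_000_000
    # First request: the prefix is not in the cache yet, so it is billed in full.
    first = (prefix_tokens + question_tokens) * per_token
    # Later requests: the prefix is billed at the discounted rate when cached.
    prefix_rate = per_token * (1 - CACHE_DISCOUNT) if cached else per_token
    later = (questions - 1) * (prefix_tokens * prefix_rate + question_tokens * per_token)
    return first + later


# Hypothetical workload: a 150,000-token contract and 20 questions of 200 tokens each.
uncached = input_cost(150_000, 200, 20, cached=False)
cached = input_cost(150_000, 200, 20, cached=True)
print(f"uncached: ${uncached:.2f}, cached: ${cached:.2f}")
```

Output tokens are billed the same either way, so the saving applies to the input side only; the more questions asked against the same prefix, the closer the average input cost gets to the cached rate.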
Claude Fable 5
evaluation
API migration
prompt caching
June 22 deadline
cost optimisation
benchmarks