2026-06-10 🧭 Daily News

Claude Fable 5 Launches, Tokyo Day 1 Keynote & Developer Action Guide

🧭 Claude Fable 5 & Mythos 5: Anthropic's First Public Mythos-Class Models

Anthropic has launched Claude Fable 5 — the first model in the new Mythos tier, which sits above Opus in capability — together with Claude Mythos 5, a higher-clearance variant available to select government and critical-infrastructure partners. The models were formally unveiled at the Code with Claude Tokyo opening keynote, timed to the start of the conference on June 10. Fable 5 is immediately accessible to all paid subscribers (Pro, Max, Team, Enterprise) at no extra charge through June 22, after which it becomes an explicitly billed model at its standard API rate.

Benchmark highlights

SWE-Bench Pro: 80.3% — versus Opus 4.8 at 69.2% and GPT-5.5 at 58.6%
Hex analytical benchmark: 91.4% — the first model to exceed 90% on this test
Long-horizon task cohesion: Fable 5 maintains consistent intent across multi-million-token autonomous runs — Anthropic describes this as the model's most practically significant capability jump over Opus 4.8
One early design partner reported that Fable 5 completed a frontier physics research task in 36 hours using one-third of the reasoning tokens that GPT-5.5 spent over four days to reach an approximate result; because of that token efficiency, the effective cost was lower despite the higher per-token rate

Pricing and model IDs

API model ID: claude-fable-5 (and claude-mythos-5 for Glasswing partners)
$10 / $50 per million input / output tokens — exactly 2× Opus 4.8 rates
90% prompt cache discount applies (same as Opus 4.8), making long-context repeated-prefix workloads far more economical
Previous Mythos Preview pricing was ~$30/$150; Fable 5 costs about one-third of that

Safeguards and the Mythos 5 split

Fable 5 ships with conservative output guardrails that trigger on fewer than 5% of sessions on average; affected queries silently fall back to Claude Opus 4.8. Claude Mythos 5 uses the same underlying weights but with safeguards partially lifted across specific domains — it is initially deployed exclusively through Project Glasswing in collaboration with the US government for cyber-defence and infrastructure work.

Is Fable 5 an alias or a new model family?

"Fable" and "Mythos" are distinct capability tiers, not marketing aliases for existing models. Fable 5 represents the first generalised-public release of a Mythos-class system; Mythos 5 is the restricted-access counterpart. Anthropic has stated that future Mythos-class releases will follow this dual-track pattern: a public Fable variant and a higher-capability Mythos variant gated behind safety agreements.

⭐⭐⭐ anthropic.com

🧭 Code with Claude Tokyo Day 1: F1 Agents, Long-Horizon Demo & Keynote Highlights

Day 1 of Code with Claude Tokyo opened at 09:15 JST with a Dario Amodei keynote that led with a live on-stage demonstration of Fable 5's long-horizon autonomy. The most-shared moment was an interactive F1 race-strategy simulation in which four parallel specialist Claude agents (analysing aerodynamics, tyre temperature, power-unit telemetry, and driver safety margins) coordinated through a central grading agent to produce a real-time pit-stop recommendation under simulated race pressure. The agents ran for approximately 22 minutes of wall-clock time against a multi-million-token shared context window, with no human intervention after the initial task injection.

Session highlights from the three tracks

Research track — "Mythos-Class Safety at Scale": Anthropic researchers walked through how Constitutional AI and Model Spec Midtraining (MSM) were adapted for Fable 5's additional capability headroom, with specific focus on preventing instrumental convergence in long-horizon runs.
Platform track — "Scaling Agentic Workflows at Enterprise": Practitioners from LG CNS, Rakuten, and Softbank shared production metrics from their Claude deployments — LG CNS reported a 4× throughput increase on contract review after switching from single-turn to multi-agent orchestration.
Claude Code track — "Long-Horizon Tasks in Production": Case studies from teams running Claude Code on multi-day, multi-repo migrations; the session covered checkpoint design, context compaction triggers, and how to structure CLAUDE.md for orchestrator/sub-agent role clarity.
MCP Server Workshop: The 90-minute hands-on session (capped at 80 in-person) walked through deploying a private MCP server behind mTLS; all materials published to github.com/anthropics/cc-tokyo-2026 during the session.

The orchestration pattern behind the F1 demo

The demo used Claude's parallel tool calls combined with a structured task-dispatch agent: one orchestrator decomposed the race problem into four domain scopes, spawned sub-agents with scoped system prompts, then fanned in their outputs through a grading agent that scored feasibility and surfaced the highest-confidence recommendation. The same pattern (decompose → dispatch → grade → synthesise) applies directly to any high-stakes decision workflow where domain isolation reduces context contamination.
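The decompose → dispatch → grade → synthesise flow can be sketched in a few lines of Python. This is a minimal illustration, not the code used on stage: the sub-agents are stubbed with canned answers instead of model calls, and every name here (`Proposal`, `run_sub_agent`, `orchestrate`, the confidence values) is hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Proposal:
    domain: str
    recommendation: str
    confidence: float  # 0.0 to 1.0, as reported by the sub-agent


# Canned outputs standing in for real model calls (illustrative values only).
CANNED = {
    "aerodynamics": ("Stay out two more laps", 0.55),
    "tyre temperature": ("Pit this lap for hard tyres", 0.82),
    "power-unit telemetry": ("Stay out and manage engine modes", 0.40),
    "driver safety": ("Pit within three laps", 0.70),
}


def decompose(problem: str) -> list[str]:
    # Decompose: split the problem into the four domain scopes from the demo.
    return list(CANNED)


async def run_sub_agent(domain: str, problem: str) -> Proposal:
    # Dispatch: a real version would call the model with a domain-scoped
    # system prompt; this stub just yields control and returns canned data.
    await asyncio.sleep(0)
    recommendation, confidence = CANNED[domain]
    return Proposal(domain, recommendation, confidence)


def grade(proposals: list[Proposal]) -> list[Proposal]:
    # Grade: rank proposals, highest confidence first.
    return sorted(proposals, key=lambda p: p.confidence, reverse=True)


def synthesise(ranked: list[Proposal]) -> str:
    # Synthesise: surface the top-ranked recommendation.
    best = ranked[0]
    return f"{best.recommendation} (from {best.domain}, confidence {best.confidence:.2f})"


async def orchestrate(problem: str) -> str:
    domains = decompose(problem)
    # Fan out: all sub-agents run concurrently; gather fans the results back in.
    proposals = await asyncio.gather(*(run_sub_agent(d, problem) for d in domains))
    return synthesise(grade(list(proposals)))


result = asyncio.run(orchestrate("Should the car pit this lap?"))
print(result)
```

Keeping each sub-agent's input limited to its own scope is what provides the domain isolation; only the grader sees all four outputs side by side.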

⭐⭐⭐ claude.com

🧭 Twelve Days to Evaluate Fable 5 for Free — Here's How

From today until June 22, claude-fable-5 is included on all paid plans (Pro, Max, Team, seat-based Enterprise) at no additional cost. After June 22 it becomes an explicitly billed API model at $10/$50 per million tokens — 2× Opus 4.8 rates. That leaves twelve days to run a structured evaluation and decide whether Fable 5 belongs in your production stack. Here is a practical framework for making that call.

Step 1 — Identify your highest-value workloads

Fable 5's strongest performance gains over Opus 4.8 are in:

Complex multi-step software engineering (SWE-Bench Pro: 80.3% vs 69.2%)
Long-context analytical tasks — reports, contracts, and research spanning hundreds of pages where maintaining coherence across the full context matters
Multi-day autonomous agent runs — tasks that span hours or days without human re-prompting
Scientific and technical reasoning — especially tasks where token efficiency on reasoning steps matters more than raw per-token cost

Step 2 — Switch your eval environment to `claude-fable-5`

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder: substitute a prompt from your own eval suite
your_eval_prompt = "Explain the trade-offs between optimistic and pessimistic locking."

response = client.messages.create(
    model="claude-fable-5",   # explicit model ID; no alias yet
    max_tokens=4096,
    messages=[{"role": "user", "content": your_eval_prompt}]
)

# Check for safeguard fallback: response.model names the model that answered,
# so an Opus 4.8 ID here indicates a topic triggered the Opus 4.8 backstop
if "claude-opus-4-8" in response.model:
    print("Safeguard fallback triggered: response from Opus 4.8, not Fable 5")
else:
    print(f"Fable 5 response: {response.content[0].text[:200]}")

Step 3 — Compare on your own benchmarks, not Anthropic's

SWE-Bench Pro is a useful signal but not your benchmark. Run your existing eval suite against both claude-opus-4-8 and claude-fable-5, tracking accuracy, token count per task, and latency. If Fable 5 uses significantly fewer tokens per correct completion on your tasks, the effective cost may already be at or below Opus 4.8 parity — even at 2× the listed rate.
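One way to make that comparison concrete is to compute cost per correct completion for each model. The sketch below is an assumption-laden illustration: the Fable 5 rates come from the pricing above, the Opus 4.8 rates are inferred from the "2×" statement and are not quoted directly, and the `EvalRun` records are made-up sample data, not measured results.

```python
from dataclasses import dataclass


@dataclass
class EvalRun:
    correct: bool
    input_tokens: int
    output_tokens: int


# USD per million (input, output) tokens.
# Opus 4.8 figures are inferred as half of Fable 5's listed rates.
RATES = {
    "claude-fable-5": (10.0, 50.0),
    "claude-opus-4-8": (5.0, 25.0),
}


def cost_per_correct(model: str, runs: list[EvalRun]) -> float:
    """Total spend across all runs divided by the number of correct runs."""
    in_rate, out_rate = RATES[model]
    total_usd = sum(
        r.input_tokens * in_rate + r.output_tokens * out_rate for r in runs
    ) / 1_000_000
    n_correct = sum(r.correct for r in runs)
    return total_usd / n_correct if n_correct else float("inf")


# Hypothetical sample: Fable 5 solves all four tasks with 3,000 output tokens
# each; Opus 4.8 solves three of four and spends 9,000 output tokens each.
fable_runs = [EvalRun(True, 2_000, 3_000) for _ in range(4)]
opus_runs = [EvalRun(i < 3, 2_000, 9_000) for i in range(4)]

fable_cost = cost_per_correct("claude-fable-5", fable_runs)
opus_cost = cost_per_correct("claude-opus-4-8", opus_runs)
print(f"Fable 5:  ${fable_cost:.3f} per correct completion")
print(f"Opus 4.8: ${opus_cost:.3f} per correct completion")
```

Failed runs still count toward spend, so a model with lower accuracy is penalised twice: once for the wasted tokens and once for the smaller denominator.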

Step 4 — Audit safeguard fallback rate in your domain

The published average fallback rate of under 5% varies considerably by domain. Content dealing with cybersecurity, chemistry, biology, or geopolitics may trigger fallbacks more frequently. If your use case falls in those areas, measure your specific fallback rate during the free window; a high rate suggests that Mythos 5 (if you qualify for Glasswing access) or Opus 4.8 may be the better choice.
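A per-domain audit can be as simple as logging which model served each eval request and tallying the results. The sketch below assumes, as the Step 2 snippet does, that a fallback is visible through the model ID reported on the response; the domain labels and the `records` list are hypothetical sample data.

```python
from collections import defaultdict

# Assumption: a fallback response reports an Opus 4.8 model ID.
FALLBACK_MODEL_ID = "claude-opus-4-8"


def is_fallback(served_model: str) -> bool:
    return FALLBACK_MODEL_ID in served_model


def fallback_rate_by_domain(records: list[tuple[str, str]]) -> dict[str, float]:
    """records holds (domain label, model ID reported on the response)."""
    totals: dict[str, int] = defaultdict(int)
    fallbacks: dict[str, int] = defaultdict(int)
    for domain, served_model in records:
        totals[domain] += 1
        if is_fallback(served_model):
            fallbacks[domain] += 1
    return {domain: fallbacks[domain] / totals[domain] for domain in totals}


# Hypothetical log of eval requests, labelled by your own domain taxonomy.
records = [
    ("contracts", "claude-fable-5"),
    ("contracts", "claude-fable-5"),
    ("contracts", "claude-fable-5"),
    ("contracts", "claude-fable-5"),
    ("cybersecurity", "claude-fable-5"),
    ("cybersecurity", "claude-opus-4-8"),
    ("cybersecurity", "claude-opus-4-8"),
    ("cybersecurity", "claude-fable-5"),
]

rates = fallback_rate_by_domain(records)
for domain, rate in rates.items():
    print(f"{domain}: {rate:.0%} fallback")
```

With only a handful of requests per domain the rate is noisy, so it is worth running enough prompts in each sensitive area before drawing conclusions.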

Prompt caching makes Fable 5 competitive on long-context tasks

The 90% cache discount applies to claude-fable-5 just as it does to Opus 4.8. For workloads with a large, stable system prompt or document prefix (e.g., feeding a 200-page contract once and asking multiple questions), the cached-token cost drops to $1 per million input tokens. Because Fable 5's list prices are 2× Opus 4.8's, that is still twice Opus 4.8's cached rate, but the absolute gap shrinks to about $0.50 per million cached input tokens. At that cost level, Fable 5's accuracy advantage comes at a small premium for repeated-prefix workloads.
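The effect on a repeated-prefix workload can be worked through with simple arithmetic. This sketch covers only the input-side cost of the shared prefix and makes several simplifying assumptions: the first request pays the full input rate, every later request hits the cache, and any cache-write surcharge or cache expiry is ignored. The 150,000-token size for the contract is a hypothetical figure, not one from the announcement.

```python
INPUT_RATE_USD_PER_MTOK = 10.0  # Fable 5 input rate from the pricing above
CACHED_FRACTION = 0.10          # 90% discount means paying 10% of the rate


def prefix_cost(prefix_tokens: int, questions: int, cached: bool) -> float:
    """Input-side cost (USD) of sending the same prefix with every question."""
    if questions <= 0:
        return 0.0
    full = prefix_tokens * INPUT_RATE_USD_PER_MTOK / 1_000_000
    if not cached:
        return full * questions
    # First request pays full price; the rest read the prefix from cache.
    return full + full * CACHED_FRACTION * (questions - 1)


# Hypothetical workload: a contract of roughly 150,000 tokens, 20 questions.
uncached = prefix_cost(150_000, 20, cached=False)
cached = prefix_cost(150_000, 20, cached=True)
print(f"Without caching: ${uncached:.2f}")
print(f"With caching:    ${cached:.2f}")
```

Output tokens and the per-question input are billed the same either way, so the saving scales with how large the shared prefix is relative to the rest of each request.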