🧭 Claude Opus 4.8 Launches: 69.2% on SWE-Bench Pro, 2.5× Fast Mode, and a Measurable Honesty Leap
Anthropic today released Claude Opus 4.8, the new flagship in the Claude 4 family. The release is headlined by a jump to 69.2% on SWE-Bench Pro — outperforming GPT-5.5 and Gemini 3.1 Pro on that benchmark — and by a structural improvement to honesty: Opus 4.8 is four times less likely than its predecessor to let code bugs pass without flagging them, and significantly reduces unsupported or confabulated claims. Pricing drops to roughly one-third of the prior flagship cost.
Benchmark snapshot
- SWE-Bench Pro: 69.2% (up from ~61% for Opus 4.7)
- Code generation: outperforms GPT-5.5 and Gemini 3.1 Pro on HumanEval and LiveCodeBench
- Honesty: 4× reduction in "silent code-flaw acceptance" vs. Opus 4.7; lower hallucination rate on factual Q&A
- Cost: ~3× cheaper per million tokens than Opus 4.7
Fast Mode — 2.5× throughput
A new Fast Mode runs Opus 4.8 at 2.5× the tokens-per-second rate of standard mode at no additional cost, using a speculative-decoding architecture. Fast Mode is opt-in per request via the API parameter "speed": "fast". It trades a small amount of accuracy on complex multi-step reasoning tasks for much lower latency on coding, summarisation, and extraction workloads, which makes it suitable for interactive products where responses need to start in under a second. Anthropic's internal evals show a <2% regression on MMLU-Pro in Fast Mode; on HumanEval the gap is negligible.
# Fast Mode via the Messages API
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-opus-4-8-20260529",
    max_tokens=1024,
    speed="fast",  # new parameter; omit for standard mode
    messages=[{"role": "user", "content": "Review this PR diff for bugs."}],
)
print(message.content[0].text)
Parallel subagent orchestration
Opus 4.8 ships with expanded dynamic workflow support in the Claude API: a single Opus 4.8 orchestrator can now spawn and coordinate up to 500 parallel subagents (up from 64 in 4.7). Each subagent inherits the parent's tool set by default but can be given a scoped subset. This unlocks large-scale refactor pipelines, parallel test-suite analysis, and competitive research workloads that previously required custom orchestration layers. Pricing for subagents is the same per-token rate as direct API calls.
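The two rules in the paragraph above (a cap of 500 subagents, and tool sets that are either inherited or narrowed to a subset) can be checked locally before any work is dispatched. The sketch below is not the Claude API's orchestration interface, which this article does not show. `SubagentSpec`, `plan_subagents`, and the tool names are hypothetical and exist only to illustrate the rules.

```python
from dataclasses import dataclass

MAX_SUBAGENTS = 500  # per-orchestrator limit stated in the article


@dataclass(frozen=True)
class SubagentSpec:
    """Hypothetical local description of one subagent (not an API type)."""
    task: str
    tools: frozenset


def plan_subagents(tasks, parent_tools, scoped_tools=None):
    """Build one spec per task, enforcing the cap and the subset rule.

    scoped_tools maps a task to the subset of parent tools it may use;
    tasks without an entry inherit the full parent tool set.
    """
    if len(tasks) > MAX_SUBAGENTS:
        raise ValueError(
            f"{len(tasks)} tasks exceeds the {MAX_SUBAGENTS}-subagent cap"
        )
    parent = frozenset(parent_tools)
    scoped_tools = scoped_tools or {}
    specs = []
    for task in tasks:
        tools = frozenset(scoped_tools.get(task, parent))
        if not tools <= parent:
            raise ValueError(f"tools for {task!r} are not a subset of the parent's")
        specs.append(SubagentSpec(task=task, tools=tools))
    return specs
```

A plan built this way fails fast on an oversized fan-out or on a subagent that asks for a tool the orchestrator does not hold.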
Honesty improvements — what changed
The four-times reduction in silent code-flaw acceptance comes from a new critique-before-complete training objective: the model is rewarded for flagging issues in code it is asked to extend before it writes the extension. In practice, Opus 4.8 will proactively surface bugs, type errors, and logic issues even when the prompt did not ask for a review. If you are building automated code-review pipelines, add explicit instructions that suppress this behaviour in any pass where you want generation only; otherwise expect more inline comments than prior versions produced.
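One way to apply that advice is to attach the suppressing instruction only to generation-only passes. The sketch below is a minimal illustration: `build_request` and `NO_COMMENTARY` are hypothetical names, and putting the instruction in the Messages API `system` field is an assumption about where it works best, not something the announcement specifies.

```python
# Instruction wording borrowed from the migration note's own example.
NO_COMMENTARY = "Just write the code, no commentary."


def build_request(prompt, generation_only=False,
                  model="claude-opus-4-8-20260529"):
    """Return keyword arguments suitable for client.messages.create()."""
    kwargs = {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    if generation_only:
        # Assumption: a system-level instruction is enough to suppress
        # the proactive review comments in this pass.
        kwargs["system"] = NO_COMMENTARY
    return kwargs
```

A generation pass would then call `client.messages.create(**build_request(prompt, generation_only=True))`, while review passes omit the flag and keep the default behaviour.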
Migration note for claude-opus-4-7 users
Model ID: claude-opus-4-8-20260529. The 4.7 model remains available. Key behavioural change to test: Opus 4.8 is more proactively critical of code it is asked to complete. If your prompts already suppress review comments with explicit instructions such as "just write the code, no commentary", those instructions are still respected. Otherwise, expect more flagged issues in responses, which is the intended behaviour.
Claude Opus 4.8
model launch
SWE-Bench
Fast Mode
honesty
parallel agents
benchmarks
🧭 Claude Platform on AWS Goes GA: Native API Access with IAM Auth, CloudTrail, and Same-Day Feature Parity
Alongside the Opus 4.8 announcement, Anthropic declared Claude Platform on AWS generally available. This is a distinct offering from Claude on Amazon Bedrock: Anthropic operates the service directly within an AWS-integrated endpoint, meaning enterprises get the full native Claude API — managed agents, web search, MCP connectors, Skills, prompt caching, batch processing, and today's Opus 4.8 Fast Mode — while authenticating via AWS IAM and paying through consolidated AWS billing.
Why it's different from Bedrock
Claude on Bedrock routes API calls through Amazon's model-serving infrastructure, so Anthropic has less control over rollout timing; historically this caused a 2–6 week lag between a new Claude feature going live on api.anthropic.com and appearing on Bedrock. Claude Platform on AWS eliminates that lag: features ship to api.anthropic.com and Claude Platform on AWS on the same day. The trade-off is that data processed through Claude Platform on AWS flows through Anthropic's infrastructure rather than Amazon's. That matters for customers with strict AWS-only data residency requirements, who should stay on Bedrock.
Authentication and audit
Claude Platform on AWS uses AWS SigV4 request signing — the same mechanism you use for any AWS service. Your IAM roles, SCPs, and permission boundaries apply directly, so access control to Claude is managed in the same console as your S3 buckets and Lambda functions. Every API call is logged in CloudTrail with full request metadata, enabling cost attribution by team, compliance auditing, and anomaly detection via CloudWatch or your SIEM.
# Calling Claude Platform on AWS with a SigV4-signed request
import json

import boto3
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

# Resolve credentials from the usual AWS chain (env vars, profile, role)
session = boto3.Session()
credentials = session.get_credentials()

payload = {
    "model": "claude-opus-4-8-20260529",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Summarise this quarter's KPIs."}],
}

# Endpoint is per-region; replace us-east-1 as needed
region = "us-east-1"
endpoint = f"https://claude-platform.{region}.amazonaws.com/v1/messages"

# Build the request and sign it with SigV4 for the "claude-platform" service
req = AWSRequest(
    method="POST",
    url=endpoint,
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
)
SigV4Auth(credentials, "claude-platform", region).add_auth(req)

# Send the signed request; fail on a non-2xx status before parsing the body
response = requests.request(
    req.method, req.url, headers=dict(req.headers), data=req.body
)
response.raise_for_status()
print(response.json()["content"][0]["text"])
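The CloudTrail logging described under "Authentication and audit" can feed a simple per-team usage count. The sketch below works on already-exported CloudTrail records (plain dictionaries) and relies on the standard `eventSource` and `userIdentity.arn` record fields. The event source value is an assumption inferred from the signing name used in the example above, and `calls_per_principal` is a hypothetical helper name.

```python
from collections import Counter

# Assumption: inferred from the "claude-platform" signing name; confirm the
# real value against an actual CloudTrail record before relying on it.
ASSUMED_EVENT_SOURCE = "claude-platform.amazonaws.com"


def calls_per_principal(records, event_source=ASSUMED_EVENT_SOURCE):
    """Count CloudTrail records per caller ARN for one event source."""
    counts = Counter()
    for record in records:
        if record.get("eventSource") != event_source:
            continue  # skip events from other AWS services
        arn = record.get("userIdentity", {}).get("arn", "unknown")
        counts[arn] += 1
    return counts
```

If each team calls Claude through its own IAM role, the resulting counts map directly onto teams. Token-level cost attribution would need whatever usage fields the service actually logs, which this article does not list.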
Availability and regions
GA launches in us-east-1, us-west-2, eu-west-1, and ap-northeast-1. AWS GovCloud (US) regions enter public preview today and are expected to reach GA within 60 days. All Claude 4 models (Haiku 4.5, Sonnet 4.6, Opus 4.8) and Claude 3.7 Sonnet are available at launch. AWS Marketplace listing enables one-click procurement for organisations that standardise on Marketplace for vendor management and compliance.
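Because the endpoint is per-region, a small guard can keep callers on the regions listed above. This is a sketch under one assumption: that every GA region follows the same hostname pattern as the us-east-1 example earlier in this article. `GA_REGIONS` and `endpoint_for` are hypothetical names.

```python
# GA regions as listed in the announcement; GovCloud is preview and excluded.
GA_REGIONS = ("us-east-1", "us-west-2", "eu-west-1", "ap-northeast-1")


def endpoint_for(region):
    """Return the Messages endpoint URL for a GA region.

    Assumes other regions mirror the us-east-1 hostname pattern.
    """
    if region not in GA_REGIONS:
        raise ValueError(f"{region!r} is not a GA region: {', '.join(GA_REGIONS)}")
    return f"https://claude-platform.{region}.amazonaws.com/v1/messages"
```

The same region string should also be passed to the SigV4 signer so that the signature scope matches the endpoint.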
Bedrock vs. Claude Platform on AWS — decision guide
Choose Bedrock if any of these apply:
- you have AWS-only data residency requirements
- you need multi-model comparisons from a single endpoint (Bedrock hosts many providers)
- your security policy prohibits non-Amazon infrastructure even within an AWS-authenticated service
Choose Claude Platform on AWS if any of these apply:
- you need day-one access to every Claude feature, including MCP connectors and managed agents
- you want IAM-native auth without managing Anthropic API keys
- you want a single AWS bill line item for Claude usage with CloudTrail audit coverage
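The guide above can be written down as a small function for use in an architecture checklist. This sketch makes one design choice that the guide itself does not state: the three Bedrock criteria are treated as hard constraints that win over any preference for day-one features. `choose_offering` and its parameter names are hypothetical.

```python
def choose_offering(aws_only_residency=False,
                    needs_multi_provider=False,
                    forbids_non_amazon_infra=False):
    """Return the offering suggested by the decision guide.

    Any Bedrock criterion is treated as a hard constraint (an assumption);
    otherwise the native platform is suggested.
    """
    if aws_only_residency or needs_multi_provider or forbids_non_amazon_infra:
        return "Claude on Amazon Bedrock"
    return "Claude Platform on AWS"
```

For example, `choose_offering(aws_only_residency=True)` returns "Claude on Amazon Bedrock", while calling it with no arguments returns "Claude Platform on AWS".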
AWS
Claude Platform
IAM
CloudTrail
enterprise
Bedrock
feature parity