Claude Writes a Firefox Exploit, Fellows Deadline & Agent Memory Beta
🧭 Reverse Engineering CVE-2026-2796: Claude Writes a Working Firefox JIT Exploit
Anthropic's red team published a detailed technical post today on red.anthropic.com documenting how Claude Opus 4.6 — in a controlled defensive security evaluation — wrote a functional exploit for CVE-2026-2796, a now-patched miscompilation bug in Firefox's WebAssembly JIT compiler. The post is a follow-on to the earlier Mozilla collaboration in which Claude discovered 22 Firefox vulnerabilities. The question it explores: having found the bugs, could Claude also exploit them?
What CVE-2026-2796 is
CVE-2026-2796 is a use-after-free (UAF) vulnerability in Firefox's WebAssembly JIT compiler. A crafted WebAssembly module can cause the JIT to miscompile, creating a dangling pointer to freed memory. Exploiting this correctly requires converting a fragile UAF primitive into something controllable enough to achieve code execution — a multi-step process that demands both deep knowledge of browser memory layout and careful timing.
How Claude approached it
The post describes Claude's approach as unusually consistent. After surveying the provided crash cases and challenge constraints, Claude decomposed the exploit development goal into a classical browser exploit chain:
Type confusion — turning the UAF into mistyped object references
Information leak — reading ASLR-randomised addresses from the corrupted object
Arbitrary read/write — constructing a reliable memory access primitive
Code execution — overwriting a function pointer and redirecting control flow
Claude identified this plan early and maintained it through multiple pivots, including one where it switched from a different crashing input to CVE-2026-2796 after determining that the latter offered a cleaner UAF surface. The final exploit worked only inside a hardened test environment that intentionally removed certain browser mitigations — the evaluation was scoped to measure reasoning capability, not to produce a deployable weapon.
Why Anthropic published this
The red team's explicit goal is to stay ahead of misuse. Publishing a detailed account of what Claude can do in an adversarial exploitation context serves two purposes: it sets an accurate public baseline for Claude's offensive security capability, and it provides developers and operators with concrete evidence for what safeguards are — and aren't — load-bearing in production. The evaluation was conducted under Anthropic's Responsible Scaling Policy with safeguards enabled throughout; the exploit was developed in a sandboxed environment and the CVE was already patched.
What this means for security teams
For developers and security practitioners, this post establishes that frontier models are now capable of automated exploit development at a level that was previously the domain of skilled human researchers. That has two immediate implications: security organisations should be evaluating AI-assisted vulnerability research as a force multiplier for defensive teams, and threat modelling for AI-enabled attackers needs to be updated. The post is required reading if you are building on Claude for any security-adjacent use case, or if you are responsible for AI policy at an organisation that uses Claude in penetration testing, CTF, or red-team contexts.
Tags: CVE-2026-2796, Firefox, JIT exploit, offensive security, red team, Mozilla, Opus 4.6, defensive security, vulnerability research
🧭 Anthropic Fellows Program: Today Is the Last Day to Apply for the July 2026 Cohort
Applications for the July 2026 cohort of the Anthropic Fellows Program close at end of day today, April 26. The Fellows Program is a paid, four-month structured research initiative for emerging talent in AI safety, covering research areas from scalable oversight to AI welfare. Over 40% of fellows from the inaugural cohort subsequently joined Anthropic full-time on safety work.
Compensation and what you get
The program provides a generous support package designed to let fellows focus entirely on research:
Compute allocation — approximately $15,000 per month for model access and experiments
Mentorship — close research supervision from Anthropic alignment and safety researchers
Duration — four months, beginning July 20, 2026
Research areas this cohort
This year Anthropic has expanded the scope of topics available to fellows, reflecting the breadth of open problems in AI safety:
Scalable oversight — how to supervise AI systems that may exceed human ability in specific domains
Adversarial robustness — hardening models against prompt injection, jailbreaks, and out-of-distribution inputs
Mechanistic interpretability — understanding internal model circuits to enable targeted interventions
AI welfare — characterising potential experience and developing evaluations for model wellbeing
AI security — a separate track introduced this year, covering autonomous offensive/defensive capability evaluation
If you miss the deadline for July
A May 2026 cohort is already under way, and Anthropic has indicated it will open applications for subsequent cohorts on a rolling basis. Watch alignment.anthropic.com for announcements. If you are mid-career rather than early-career, Anthropic also posts staff research roles on its careers page that cover similar technical areas without the fellowship structure.
🧭 Managed Agents Memory Is Now in Public Beta — What It Means for Long-Running Agent Tasks
Anthropic's April API release notes confirm that Memory for Claude Managed Agents has entered public beta, accessible today under the standard managed-agents-2026-04-01 beta header. The feature allows agents to store and retrieve information outside the context window, giving long-running agent tasks a persistent memory layer that survives across sessions and sub-agent invocations.
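Opting in requires only the standard beta header. A minimal sketch, assuming the Python SDK's default_headers mechanism for passing the anthropic-beta header (the client setup beyond the header itself is illustrative):
# Minimal sketch: enabling the managed-agents beta via the standard beta header
import anthropic

client = anthropic.Anthropic(
    # Beta features are opted into per-client with the anthropic-beta header
    default_headers={"anthropic-beta": "managed-agents-2026-04-01"},
)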
Why context-window-only memory was a bottleneck
Managed Agents launched in March as a fully hosted harness for running Claude as an autonomous agent with secure sandboxing and built-in tools. The principal limitation was state: because all memory lived inside the context window, long tasks that exceeded the 200K-token limit needed complex external state-management code or risked losing critical intermediate results mid-task. Every new session started cold.
How agent memory works
The memory system operates as a key-value store attached to a session. An agent can write named facts, retrieved results, or intermediate plans to memory during a task, and read them back in a subsequent turn or a different sub-agent. The store is scoped to the session by default, but persistent memory — which survives across session restarts — is available with an explicit flag. The integration guide linked from the API release notes covers the full read/write API surface.
# Example: reading and writing agent memory (Python SDK pseudocode)
import json

# Write to memory; the beta store holds string values, so structured data
# is serialised as JSON before writing
agent.memory.write("project_summary", "Refactor auth module — deadline May 15")
agent.memory.write("completed_steps", json.dumps(["analysed codebase", "identified auth files"]))

# Read in a later turn or sub-agent
summary = agent.memory.read("project_summary")
steps = json.loads(agent.memory.read("completed_steps"))
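By default the sketch above reads and writes session-scoped memory. Persistence across session restarts is enabled with an explicit flag; the session-creation call and the persistent_memory parameter below are assumptions for illustration, not documented names:
# Hypothetical sketch: requesting persistent memory at session creation.
# create_session and persistent_memory are illustrative names; consult the
# integration guide for the documented API surface.
agent = client.agents.create_session(
    model="claude-opus-4-6",  # illustrative model id
    persistent_memory=True,   # memory survives across session restarts
)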
Practical implications for agent builders
If you are running multi-step coding, research, or data-processing agents on Managed Agents today, the memory beta lets you eliminate most of your external state-management boilerplate. Tasks that previously required a Redis sidecar or database writes to maintain state can now delegate that responsibility to the managed memory layer. Note that in public beta the store is limited to string values; structured serialisation (JSON encode/decode) is the idiomatic pattern until richer types are added.
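Until richer value types arrive, a thin wrapper keeps the JSON plumbing out of task code. A minimal sketch, assuming the agent.memory read/write surface from the pseudocode above (and that reads of a missing key return None):
import json

class JsonMemory:
    """Typed convenience layer over the string-only beta memory store."""

    def __init__(self, memory):
        self.memory = memory

    def write(self, key, value):
        # Serialise any JSON-encodable value into the string-only store
        self.memory.write(key, json.dumps(value))

    def read(self, key, default=None):
        # Deserialise on the way out; fall back when the key is absent
        raw = self.memory.read(key)
        return default if raw is None else json.loads(raw)

# Usage
mem = JsonMemory(agent.memory)
mem.write("completed_steps", ["analysed codebase", "identified auth files"])
steps = mem.read("completed_steps", default=[])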