Article · 25 April 2026

Introducing Claude Opus 4.7



Our latest model, Claude Opus 4.7, is now generally available.

Opus 4.7 is a notable improvement on Opus 4.6 in advanced software engineering, with particular gains on the most difficult tasks. Users report being able to hand off their hardest coding work—the kind that previously needed close supervision—to Opus 4.7 with confidence. Opus 4.7 handles complex, long-running tasks with rigor and consistency, pays precise attention to instructions, and devises ways to verify its own outputs before reporting back.

The model also has substantially better vision: it can process images at higher resolution. It’s more tasteful and creative when completing professional tasks, producing higher-quality interfaces, slides, and docs. And—although it is less broadly capable than our most powerful model, Claude Mythos Preview—it shows better results than Opus 4.6 across a range of benchmarks:

Last week we announced Project Glasswing, highlighting the risks—and benefits—of AI models for cybersecurity. We stated that we would keep Claude Mythos Preview’s release limited and test new cyber safeguards on less capable models first. Opus 4.7 is the first such model: its cyber capabilities are not as advanced as those of Mythos Preview (indeed, during its training we experimented with efforts to differentially reduce these capabilities). We are releasing Opus 4.7 with safeguards that automatically detect and block requests that indicate prohibited or high-risk cybersecurity uses. What we learn from the real-world deployment of these safeguards will help us work towards our eventual goal of a broad release of Mythos-class models.

Security professionals who wish to use Opus 4.7 for legitimate cybersecurity purposes (such as vulnerability research, penetration testing, and red-teaming) are invited to join our new Cyber Verification Program.

Opus 4.7 is available today across all Claude products and our API, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry. Pricing remains the same as Opus 4.6: $5 per million input tokens and $25 per million output tokens. Developers can use claude-opus-4-7 via the Claude API.
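As a minimal sketch of picking up the new model id: the payload below follows the public Messages API request shape (authentication and the HTTP call itself are omitted; send it with your client of choice).

```python
import json

# Minimal Messages API-style payload selecting the new model id.
# Field names follow the public Messages API; the API key header
# and the actual HTTP request are left out of this sketch.
payload = {
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Summarize the open pull requests in this repo."}
    ],
}

body = json.dumps(payload)  # ready to POST to the Messages endpoint
print(payload["model"])
```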

Testing Claude Opus 4.7

Claude Opus 4.7 has garnered strong feedback from our early-access testers. Below are some highlights and notes from our early testing:

Instruction following. Opus 4.7 is substantially better at following instructions. Interestingly, this means that prompts written for earlier models can sometimes now produce unexpected results: where previous models interpreted instructions loosely or skipped parts entirely, Opus 4.7 takes the instructions literally. Users should re-tune their prompts and harnesses accordingly.

Improved multimodal support. Opus 4.7 has better vision for high-resolution images: it can accept images up to 2,576 pixels on the long edge (~3.75 megapixels), more than three times as many pixels as prior Claude models. This opens up a wealth of multimodal uses that depend on fine visual detail: computer-use agents reading dense screenshots, data extraction from complex diagrams, and work that needs pixel-perfect references.
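The new limit is easy to reason about: 2,576 pixels on the long edge works out to roughly 3.75 megapixels at the corresponding aspect ratio. A small client-side helper (illustrative, not part of any SDK) that downscales image dimensions to fit the limit might look like:

```python
LONG_EDGE_LIMIT = 2576  # max pixels on the long edge for Opus 4.7, per this post

def scale_to_limit(width: int, height: int, limit: int = LONG_EDGE_LIMIT) -> tuple[int, int]:
    """Return (width, height) scaled down so the long edge fits the limit.

    Dimensions already within the limit are returned unchanged; aspect
    ratio is preserved (up to rounding).
    """
    long_edge = max(width, height)
    if long_edge <= limit:
        return (width, height)
    factor = limit / long_edge
    return (round(width * factor), round(height * factor))

print(scale_to_limit(5152, 2912))  # → (2576, 1456)
print(scale_to_limit(1920, 1080))  # within the limit, unchanged: (1920, 1080)
```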

Real-world work. As well as its state-of-the-art score on the Finance Agent evaluation (see table above), our internal testing showed Opus 4.7 to be a more effective finance analyst than Opus 4.6, producing rigorous analyses and models, more professional presentations, and tighter integration across tasks. Opus 4.7 is also state-of-the-art on GDPval-AA, a third-party evaluation of economically valuable knowledge work across finance, legal, and other domains.

Memory. Opus 4.7 is better at using file system-based memory. It retains important notes across long, multi-session work and draws on them when starting new tasks, which therefore need less up-front context.


The charts below display more evaluation results from our pre-release testing, across a range of different domains:


Safety and alignment

Overall, Opus 4.7 shows a similar safety profile to Opus 4.6: our evaluations show low rates of concerning behavior such as deception, sycophancy, and cooperation with misuse. On some measures, such as honesty and resistance to malicious “prompt injection” attacks, Opus 4.7 is an improvement on Opus 4.6; in others (such as its tendency to give overly detailed harm-reduction advice on controlled substances), Opus 4.7 is modestly weaker. Our alignment assessment concluded that the model is “largely well-aligned and trustworthy, though not fully ideal in its behavior”. Note that Mythos Preview remains the best-aligned model we’ve trained according to our evaluations. Our safety evaluations are discussed in full in the Claude Opus 4.7 System Card.

Benchmark notes:

For GPT-5.4 and Gemini 3.1 Pro, we compared against the best reported model version available via API in the charts and table.

MCP-Atlas: The Opus 4.6 score has been updated to reflect revised grading methodology from Scale AI.

SWE-bench Verified, Pro, and Multilingual: Our memorization screens flag a subset of problems in these SWE-bench evals. Excluding any problems that show signs of memorization, Opus 4.7’s margin of improvement over Opus 4.6 holds.

Terminal-Bench 2.0: We used the Terminus-2 harness with thinking disabled. All experiments used 1× guaranteed/3× ceiling resource allocation averaged over five attempts per task.

CyberGym: Opus 4.6’s score has been updated from the originally reported 66.6 to 73.8, as we updated our harness parameters to better elicit cyber capability.

SWE-bench Multimodal: We used an internal implementation for both Opus 4.7 and Opus 4.6. Scores are not directly comparable to public leaderboard scores.

Also launching today

In addition to Claude Opus 4.7 itself, we’re launching the following updates:

More effort control: Opus 4.7 introduces a new xhigh (“extra high”) effort level between high and max, giving users finer control over the tradeoff between reasoning and latency on hard problems. In Claude Code, we’ve raised the default effort level to xhigh for all plans. When testing Opus 4.7 for coding and agentic use cases, we recommend starting with high or xhigh effort.
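As a sketch of how a client might select the new level, assuming effort is exposed as a string-valued request field (the parameter name and enum below are illustrative, drawn only from the levels named above; check the API docs for the exact shape):

```python
# Ordered effort levels; "xhigh" sits between "high" and "max", per this post.
# The request field name ("effort") is an assumption for illustration.
EFFORT_LEVELS = ["low", "medium", "high", "xhigh", "max"]

def build_request(prompt: str, effort: str = "high") -> dict:
    """Build a hypothetical request payload with an effort setting."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort level: {effort!r}")
    return {
        "model": "claude-opus-4-7",
        "max_tokens": 2048,
        "effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_request("Refactor the auth module.", effort="xhigh")
print(req["effort"])  # → xhigh
```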

On the Claude Platform (API): as well as support for higher-resolution images, we’re also launching task budgets in public beta, giving developers a way to guide Claude’s token spend so it can prioritize work across longer runs.
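The beta’s exact request fields aren’t shown here; as a back-of-envelope sketch of the underlying idea (steering token spend across a long run), a purely client-side allocator might split a total budget across subtasks by weight:

```python
def allocate_budget(total_tokens: int, weights: dict[str, float]) -> dict[str, int]:
    """Split a total token budget across named subtasks, proportional to weight.

    Client-side illustration of the task-budget idea only; the public
    beta's actual request fields may differ.
    """
    total_weight = sum(weights.values())
    return {name: int(total_tokens * w / total_weight) for name, w in weights.items()}

plan = allocate_budget(100_000, {"explore": 1.0, "implement": 2.0, "verify": 1.0})
print(plan)  # → {'explore': 25000, 'implement': 50000, 'verify': 25000}
```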

In Claude Code: The new /ultrareview slash command produces a dedicated review session that reads through changes and flags bugs and design issues that a careful reviewer would catch. We’re giving Pro and Max Claude Code users three free ultrareviews to try it out. In addition, we’ve extended auto mode to Max users. Auto mode is a new permissions option where Claude makes decisions on your behalf, meaning that you can run longer tasks with fewer interruptions—and with less risk than if you had chosen to skip all permissions.

Migrating from Opus 4.6 to Opus 4.7
