Claude Opus 4.7: Everything That's New, Improved, and Worth Knowing

Anthropic's previous model, Claude Opus 4.6, had a rough few weeks before its successor arrived. An AMD senior director publicly described it as having "regressed to the point it cannot be trusted to perform complex engineering," and that post gathered significant attention. Anthropic denied deliberate changes but said it would investigate. On 16 April 2026, Claude Opus 4.7 shipped — and the benchmark numbers suggest the regression complaints have been answered.

This is a full breakdown of what changed, what's new, and what developers need to know before switching.

What Is Claude Opus 4.7?

Claude Opus 4.7 is Anthropic's most capable publicly available AI model as of April 2026. The model ID is claude-opus-4-7 and it is the direct successor to Claude Opus 4.6, which launched in February 2026. It supports a 1 million token context window, 128,000 maximum output tokens, and adaptive thinking — the same structural specs as its predecessor.

One clarification worth making up front: Opus 4.7 is not Anthropic's most powerful model overall. That title belongs to Claude Mythos Preview, a restricted system that Anthropic is only making available to a small group of cybersecurity organisations under its Project Glasswing initiative. Opus 4.7 is the strongest model that anyone can actually use.

Pricing is unchanged from Opus 4.6: $5 per million input tokens and $25 per million output tokens. It is available across claude.ai (Pro, Max, Team, and Enterprise plans), the Anthropic API, Amazon Bedrock, Google Cloud Vertex AI, Microsoft Foundry, Claude Code, Cursor, and GitHub Copilot.

Benchmark Comparisons: Opus 4.7 vs Opus 4.6

The headline numbers are strong. On the benchmarks most relevant to the kind of work Opus is actually used for — complex coding tasks and long-horizon agentic workflows — the gains are some of the largest seen between consecutive Opus releases.

Coding benchmarks

SWE-bench Verified: 87.6% (up from 80.8% on Opus 4.6) — a 6.8-point gain
SWE-bench Pro: 64.3% (up from 53.4% on Opus 4.6) — a 10.9-point gain
CursorBench: 70% (up from 58% on Opus 4.6) — a 12-point gain

To put those numbers in context: the jump from Opus 4.5 to Opus 4.6 was roughly 5 points on SWE-bench Verified. The 4.6-to-4.7 jump nearly doubles that. GitHub Copilot's launch notes describe the model as delivering "stronger multi-step task performance and more reliable agentic execution" over its predecessor, with "meaningful improvement in long-horizon reasoning and complex, tool-dependent workflows."

Cursor, whose internal benchmark showed the 12-point jump, announced 50% off Opus 4.7 inference on launch day to encourage adoption — a move that signals genuine confidence in the numbers. One coding platform's internal 93-task benchmark recorded a 13% lift in task resolution over Opus 4.6, including four tasks that neither Opus 4.6 nor Sonnet 4.6 could solve at all.

Knowledge work and finance

On GDPval-AA, a benchmark designed to measure economically valuable knowledge work in finance and legal domains, Opus 4.7 achieved state-of-the-art results. One law technology firm recorded a 90.9% score on BigLaw Bench at high effort, with the model correctly distinguishing assignment provisions from change-of-control provisions — a task that has historically tripped up frontier models.

Competitive position

At launch, Opus 4.7 outperformed GPT-5.4 and Gemini 3.1 Pro on agentic coding and computer-use tasks. OpenAI followed up with GPT-5.5 one week later (23 April 2026), and the picture now splits: Opus 4.7 still leads on SWE-bench Pro (64.3% vs 58.6%), while GPT-5.5 pulls ahead on Terminal-Bench 2.0 and OSWorld-Verified, two benchmarks focused on agentic computer-use work. Neither model dominates across the board.

Side-by-side comparison of Opus 4.6 and Opus 4.7 maximum image resolution: 1,568px at 1.15 megapixels versus 2,576px at 3.75 megapixels — a 3.25× increase in visual capacity

Opus 4.7 raises the maximum supported image resolution to 3.75MP — more than three times the visual capacity of Opus 4.6. Higher-resolution images use more tokens; Anthropic recommends downsampling when the extra detail isn't needed.

What's New in Opus 4.7

High-resolution image support

This is the most significant architectural change. Opus 4.7 is the first Claude model with high-resolution image support. Maximum image resolution has increased from 1,568px / 1.15MP to 2,576px / 3.75MP — roughly 3.25 times the visual capacity of previous models.

In practical terms, this matters most for:

Computer use and screenshot understanding — the model's coordinates now map 1-to-1 with actual pixels, eliminating the scale-factor arithmetic that was previously required when pointing or clicking on screen elements
Dense document analysis — smaller text, finer detail in scanned documents, slides, and technical diagrams are now readable
Chart and figure analysis — Anthropic specifically cites improved performance on programmatic tool-calling with image-processing libraries such as PIL, including pixel-level data transcription
Low-level visual tasks — pointing, measuring, counting, and bounding-box detection are all improved

One life sciences firm noted major improvements in reading chemical structures and interpreting complex technical diagrams. Higher resolution does come with a cost: more pixels means more tokens per image. If your application does not require the additional fidelity, Anthropic recommends downsampling images before sending them.

xhigh effort level

Opus 4.7 adds a new effort tier called xhigh, which sits between high and max on the effort parameter. Claude Code defaults to xhigh for many coding workflows after this update. It allocates more compute per turn, producing higher-quality outputs but at higher cost. For developers running long-horizon autonomous tasks, this is the setting that unlocks the performance gains seen in the benchmark numbers.

Task budgets (beta)

Task budgets are a new beta feature that addresses a real problem with long-running agents: cost unpredictability. A task budget gives the model a rough estimate of how many tokens to target for a full agentic loop — including thinking, tool calls, tool results, and final output. The model sees a running countdown and uses it to prioritise work and finish gracefully as the budget is consumed.

Developers who missed this feature in the launch coverage should pay attention to it. For anyone building multi-step autonomous workflows, task budgets are the primary mechanism for preventing runaway token costs without sacrificing task completion. If a budget is set too tightly for a given task, the model may complete it less thoroughly or decline it entirely — worth testing with your specific workloads before deploying.

To use task budgets, set the beta header task-budgets-2026-03-13 and add an output_config block to your API call:

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=128000,
    output_config={
        "effort": "high",
        "task_budget": {"type": "tokens", "total": 128000},
    },
    messages=[
        {"role": "user", "content": "Review the codebase and propose a refactor plan."}
    ],
    betas=["task-budgets-2026-03-13"],
)

/ultrareview in Claude Code

A new slash command for deep multi-agent code review. Paid tiers get a limited number of complimentary runs; beyond that, it is a premium mode. Early reports from engineering teams describe it as useful for the kind of thorough cross-file analysis that normally requires a senior engineer to spend several hours reading the codebase.

Breaking Changes and Behaviour Shifts

This section is important for any developer planning to migrate. Opus 4.7 introduces several changes that will break existing integrations if not accounted for.

Sampling parameters removed

Setting temperature, top_p, or top_k to any non-default value in the Messages API now returns a 400 error. This is a hard breaking change. If your prompting pipeline sets any of these parameters — even to what you consider neutral values — your calls will fail after migrating to Opus 4.7. Audit your integrations before switching.

New tokenizer

Opus 4.7 uses a new tokenizer that processes text using roughly 1 to 1.35 times as many tokens as the previous model, depending on content type. That means your token counts from /v1/messages/count_tokens will return different numbers for the same input, and your effective cost may be higher even though the list price has not changed.

Teams that benchmark on throughput or set tight max_tokens limits should test with real traffic before committing to the upgrade. Anthropic recommends updating max_tokens parameters to give additional headroom.

Thinking content omitted by default

Starting with Opus 4.7, thinking content is omitted from the response by default. Thinking blocks still appear in the response stream, but their thinking field will be empty unless you explicitly opt in. If your product streams reasoning to users, this will appear as a long pause before output begins. To restore visible progress during thinking, set "display": "summarized" in your output config.

Extended thinking budgets removed

The extended thinking budget parameter that existed in Opus 4.6 has been removed. Reasoning depth is now managed through the effort level and task budget system instead.

Cybersecurity Safeguards

Opus 4.7 ships with automatic detection and blocking of requests that indicate prohibited or high-risk cybersecurity uses. This is a direct consequence of Project Glasswing — Anthropic's announcement in early April 2026 that its most powerful model, Claude Mythos Preview, can find and exploit software vulnerabilities at a level that rivals skilled human security researchers.

Anthropic's stated approach is to test new cyber safeguards on less capable models before any broader release of Mythos-class capabilities. Opus 4.7 is the first model to carry these safeguards. The model's own cyber capabilities were reportedly reduced during training compared to Mythos Preview.

For legitimate security professionals — penetration testers, vulnerability researchers, red teamers — Anthropic has launched a Cyber Verification Program. Verified researchers gain access to use the model for offensive security work. For the vast majority of developers and general users, these safeguards are invisible and will not affect normal workflows.

What Users Are Actually Saying

Launch reception split sharply along use-case lines, and it is worth being honest about that.

For agentic coding and structured enterprise work, the response has been strongly positive. Replit's president described it as achieving "the same quality at lower cost" on log analysis, bug-finding, and fix proposals compared to Opus 4.6. One autonomous coding platform described a 14% improvement over Opus 4.6 at fewer tokens with a third of the tool errors, and called it "the first model to pass our implicit-need tests." Devin's CEO said it "works coherently for hours, pushes through hard problems rather than giving up."

For general chat and consumer use, the response was more mixed. A Reddit thread titled "Opus 4.7 is not an upgrade but a serious regression" reached roughly 2,300 upvotes within 48 hours. The common complaints: the model feels more confidently wrong, more literal, and more fragile to prompt wording than Opus 4.6 in default consumer chat settings. One community summary put it clearly: "higher ceiling, higher migration cost."

If you're running agents or coding workflows, 4.7 is a step up. If you're dropping it into a consumer chat workflow with unchanged prompts and default settings, many users prefer 4.6.

GitHub Copilot initially priced Opus 4.7 at a 7.5× premium request multiplier versus other models (as promotional pricing through 30 April 2026), which attracted its own criticism from cost-sensitive users.

Claude Design: The Side Launch

Twenty-four hours after Opus 4.7 shipped, Anthropic launched Claude Design — a research-preview visual creation tool available at claude.ai/design, powered by Opus 4.7. It is currently available to Pro plan users. The timing was deliberate: the higher-resolution vision and improved output quality for interfaces and slides in Opus 4.7 directly underpin Claude Design's capabilities.

Who Should Upgrade and When

Upgrade now if you are: running long-horizon agentic coding tasks, building autonomous multi-step workflows, doing computer use or screenshot-dependent automation, working with dense visual documents in finance or legal, or using Claude Code as a core part of your engineering workflow.

Test carefully before upgrading if you are: relying on temperature, top_p, or top_k parameters (they will cause 400 errors), operating on tight token budgets without testing real traffic against the new tokenizer, streaming thinking content to end users without updating your display config, or running consumer-facing chat applications where the prompt style has not been updated.

Stick with Opus 4.6 for now if you are: primarily using Claude for general conversational tasks where predictability and chat quality are the priority, or if the cost increase from the new tokenizer is material to your business case and you have not yet had time to measure impact.

The official model ID is claude-opus-4-7. Anthropic has published a migration guide at platform.claude.com covering all breaking changes in detail.