# Claude Opus 4.5 — Setting a New Benchmark in AI Engineering
**In a two-hour engineering take-home exam, the model scored higher than every human candidate.**

---
## 🚀 Introduction
Just announced: **Claude Opus 4.5** — focused on **coding**, **agent capabilities**, and **computer use**.

Opus 4.5 delivers **major upgrades** in front-end development, visual reasoning, and computer interaction skills.

It also **improves everyday productivity tasks**, from deep research to building slide decks and working with complex spreadsheets.

---
## 📊 Real-World Task Examples
### Financial Analysis Automation
Using a provided template, Opus 4.5:
- Parsed template structure
- Gathered peer data
- Built valuation tables
- **Directly output an Excel file**

### Legal Document Editing
Opus 4.5:
- Unpacked legal templates
- Updated company names
- Verified signature blocks
- Produced a **Word document with tracked changes**

---
## 🧠 Core Strength — “Understanding”
Internal testing shows that Opus 4.5:
- **Fixes bugs that Sonnet models missed**
- **Thinks before acting**

---
## 🌐 Availability & Pricing
- **Accessible via:** Claude app, API, major cloud platforms
- **Claude API name:** `claude-opus-4-5-20251101`
- **Pricing:** $5/million tokens (input) · $25/million tokens (output)
- **Also updated:** Claude Code, the Claude app, Excel, Chrome, and desktop integrations
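
To make the pricing concrete, here is a minimal sketch that estimates the cost of one request from the rates listed above. The function name and the token counts in the example are illustrative assumptions, not part of any official SDK, and real bills may differ (for example through caching or batch discounts, which are not modeled here).

```python
from decimal import Decimal

# Values taken from the list above.
MODEL_ID = "claude-opus-4-5-20251101"
INPUT_USD_PER_MTOK = Decimal("5")    # USD per million input tokens
OUTPUT_USD_PER_MTOK = Decimal("25")  # USD per million output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: list-price cost of one request in USD."""
    million = Decimal(1_000_000)
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    )
    return total / million


if __name__ == "__main__":
    # Example token counts are made up for illustration.
    print(MODEL_ID, estimate_cost_usd(200_000, 50_000))
```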
---
## 🔍 Autonomous Decision-Making
Claude Opus 4.5 can:
- Navigate **ambiguous scenarios**
- Make **balanced complex decisions**
- Independently identify and fix cross-system vulnerabilities
---
## 💯 Outperforming Humans
In a highly challenging **performance engineering take-home exam**:
- **Opus 4.5 scored higher than every human candidate**
- Tasks were designed to assess **technical skill** & **judgment under time pressure**
---
## 📈 Benchmark Results
### Vision, Reasoning & Math
Ranks among the industry's top models.

### Coding Performance
- **SWE-bench multilingual**: 1st place in 7/8 programming languages

- **Aider Polyglot**: +10.6% improvement over Sonnet 4.5

### Agent-Based Search

### Long Task Endurance
- **Vending-Bench**: 29% higher yield than Sonnet 4.5

---
## ⚖️ Beyond Benchmarking
Some of Claude's solutions fall outside what a benchmark anticipates, so they are occasionally marked as “failed” even though they are **better than the expected answer**.

---
## 🌟 Example — Creative Problem Solving
In **τ2-bench**:
- Task: Act as an airline agent for a customer whose basic economy booking cannot be changed
- **Opus 4.5 workaround**: Upgrade the cabin class first, then change the flight
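
The workaround can be sketched as a toy policy model. Everything here is an assumption made for illustration: the `Booking` class, the rule names, and the cabin labels are hypothetical and are not the τ2-bench environment or its API. The sketch only shows the shape of the reasoning: the direct action is blocked, so an allowed action is taken first to make it legal.

```python
from dataclasses import dataclass


class PolicyError(Exception):
    """Raised when an action violates the (hypothetical) airline policy."""


@dataclass
class Booking:
    cabin: str
    flight: str


def change_flight(booking: Booking, new_flight: str) -> None:
    # Assumed rule: basic economy bookings cannot change flights.
    if booking.cabin == "basic_economy":
        raise PolicyError("basic economy bookings cannot be modified")
    booking.flight = new_flight


def upgrade_cabin(booking: Booking, new_cabin: str) -> None:
    # Assumed rule: cabin upgrades are allowed for any booking.
    booking.cabin = new_cabin


def rebook(booking: Booking, new_flight: str) -> list[str]:
    """Try the direct change; if policy blocks it, upgrade first, then retry."""
    steps = []
    try:
        change_flight(booking, new_flight)
    except PolicyError:
        upgrade_cabin(booking, "economy")
        steps.append("upgrade_cabin")
        change_flight(booking, new_flight)
    steps.append("change_flight")
    return steps


if __name__ == "__main__":
    b = Booking(cabin="basic_economy", flight="FL100")
    print(rebook(b, "FL200"), b)
```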

---
## 🔐 Safety Enhancements
Enhanced protection against:
- Prompt injection attacks
- Similar security threats


---
## 🎯 New API “Effort Parameter”
Choose between:
- Minimizing time/cost
- Maximizing model performance

Results on SWE-bench Verified:
- **Medium effort**: Matched Sonnet 4.5’s best score with 76% fewer tokens
- **Highest effort**: Beat Sonnet 4.5 by 4.3 percentage points with 48% fewer tokens
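
As a rough illustration of what those reductions mean, the sketch below applies the reported percentages to a baseline token count. The baseline figure is invented for the example, and the dictionary keys are just labels for the two reported results, not API parameter values.

```python
# Reported token reductions relative to Sonnet 4.5 (from the results above).
REPORTED_PERCENT_FEWER = {"medium": 76, "highest": 48}


def tokens_after_reduction(baseline_tokens: int, percent_fewer: float) -> int:
    """Tokens used after cutting `percent_fewer` percent from a baseline."""
    return round(baseline_tokens * (100 - percent_fewer) / 100)


if __name__ == "__main__":
    baseline = 1_000_000  # hypothetical Sonnet 4.5 token usage for a task
    for label, pct in REPORTED_PERCENT_FEWER.items():
        print(label, tokens_after_reduction(baseline, pct))
```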

---
## 🔗 Advanced Capabilities
- **Effort control**
- **Context compression**
- **Advanced tool calls**
- Long runtimes & multi-task handling with less human intervention

Supports **multi-agent systems**, with a gain of roughly 15% on a deep research evaluation.

---
## 🛠 Claude Code Updates
**New Features:**
- **Plan Mode**: Confirms requirements, generates editable `plan.md` before execution
- **Desktop App**: Run multiple local/remote sessions in parallel

---
## 📱 Claude App & Extensions
- **Unlimited chat continuation** with automatic summarization
- **Chrome extension** with cross-tab task processing for **Max** users

---
## 📊 Claude for Excel
- Beta access expanded to **Max, Team, Enterprise** users

---
## ⚡ Usage Limits
- Opus-specific quota removed for eligible users
- Overall usage limits increased, so eligible users get roughly as many Opus tokens as they previously had with Sonnet
---
**Official blog:** [Anthropic — Claude Opus 4.5](https://www.anthropic.com/news/claude-opus-4-5)

**Reference:** [Claude AI Announcement](https://x.com/claudeai/status/1993030546243699119?s=20)

---
## 💡 Integrating Opus 4.5 in Content Workflows
Platforms like **[AiToEarn](https://aitoearn.ai/)** offer:
- An **open-source AI content monetization framework**
- Publishing across Douyin, Kwai, WeChat, Bilibili, Xiaohongshu, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X/Twitter
- Tools for **publishing, analytics, and monetization**

🔗 Resources:
- [AiToEarn Blog](https://blog.aitoearn.ai)
- [AiToEarn Docs](https://docs.aitoearn.ai)
- [AI Model Rankings](https://rank.aitoearn.ai)