Ilya Speaks Out: The "Brute Force" Era of Large Models Is Over

AI’s Shift: From Scaling Back to Research

> “AI is moving from the scaling era back toward the scientific research era.”

> — Ilya Sutskever

This viewpoint comes from Ilya’s recent ~20,000-word interview, touching on nearly every hot topic in AI today:

  • Why AI still lags behind humans in generalization
  • Safety and alignment challenges
  • Limitations within the pre-training paradigm

Ilya believes the mainstream “pre-training + scaling” approach has hit a bottleneck. His call: stop blindly chasing scale and rethink the research paradigm.

Many in the community agree. For long-time critics such as Yann LeCun, who has been saying "LLMs are dead" for years, the moment brings a frustrating sense of déjà vu.


LeCun even reshared a meme implying:

> “So when I said it, nobody cared?”


---

The Conversation Highlights

Setting the Scene

Ilya reflects on the surreal, sci-fi feel of the AI boom and on a paradox: explosive investment has not yet translated into proportionate, tangible change in everyday life.

> Observation: AI feels abstract to the general public because announcements remain “just big numbers” rather than experiences.

---

Benchmark vs Real-World Impact

  • Models excel in benchmarks yet struggle with real-world reliability.
  • “Vibecoding” bug example: model alternates between two errors without resolution.
  • Possible cause: RL training creates over-focused, single-target behavior.

---

From “All the Data” to RL Environments

  • Early pre-training: use all available data — no need to choose.
  • RL era: must design specific environments for capabilities.
  • Risk: over-optimizing for evaluation metrics → reward hacking by researchers, not just models.

---

Generalization Gap

Two main interpretations:

  • Expand environments — test beyond contests, into building real applications.
  • Invent ways for skills to transfer across domains — true general capabilities.

---

Competitive Programming Analogy

  • Student A: 10,000 hours in one niche → top competitor.
  • Student B: 100 hours + broader experience → better career.
  • Overtraining in narrow domains hurts generalization — mirrors current AI pitfalls.

---

The Two Big Advantages of Pre-Training

  • Massive, natural dataset (human activity and projections of the world into text)
  • No need to choose subsets — “take it all”

Challenge: it is hard to fully understand how models leverage this data, and there may be key gaps in how that knowledge ends up represented.

---

Evolutionary Priors & Learning Ability

  • Certain human skills (e.g., dexterity, vision) deeply rooted in evolution.
  • Language/programming are recent — yet humans still show sample-efficient learning there.
  • Points to a general machine learning ability in humans, beyond evolutionary priors.

---

Human-Like Continuous Learning

  • Teen drivers self-correct without explicit rewards.
  • Models lack comparable internal value systems.
  • Achieving this in AI may be possible, but it involves ideas Ilya is not yet willing to share publicly.

---

Scaling & Recipe Thinking

  • 2012‒2020: Research era
  • 2020‒2025: Scaling era
  • Next: research again — but now with giant compute.
  • The pre-training “recipe” validated the scaling laws (a standard form is sketched after this list).
  • Hard limit ahead: finite data → shift to reinforced pre-training, RL, or new paradigms.
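
For reference, the “scaling law” here refers to the empirical power-law fits from the scaling-law literature (e.g. Hoffmann et al., 2022), not a formula given in the interview. A commonly cited form expresses loss in terms of parameter count N and training tokens D:

```latex
% Compute-optimal scaling fit (Hoffmann et al., 2022), shown for reference only.
% E is the irreducible loss; A, B, alpha, beta are fitted constants,
% with alpha and beta both roughly 0.3 in the published fit.
L(N, D) \;\approx\; E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

The finite-data limit reads off directly from this form: once D stops growing, the B/D^β term becomes a floor that more parameters cannot remove, which is what pushes the field toward reinforced pre-training, RL, or new paradigms.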

---

Efficiency & Value Functions

  • RL consumes extreme amounts of compute; improving efficiency likely requires value functions (a minimal sketch follows this list).
  • Generalization remains the core unsolved problem.
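
To make the value-function point concrete, here is a minimal, illustrative sketch (my example, not from the interview) of tabular TD(0) value estimation on a hypothetical toy random walk. The point is that a learned value function V(s) supplies feedback at every step rather than only at the end of a long trajectory:

```python
import random
from collections import defaultdict

def random_walk_step(state, n_states=5):
    """Hypothetical toy environment: a 1-D random walk over states 0..n_states-1.
    Reward 1.0 only when the walk exits off the right end."""
    nxt = state + random.choice([-1, 1])
    if nxt < 0:
        return 0, 0.0, True            # exited left: episode ends, no reward
    if nxt >= n_states:
        return state, 1.0, True        # exited right: episode ends, reward 1
    return nxt, 0.0, False

def td0_value_estimate(step_fn, start_state=2, episodes=5000,
                       alpha=0.1, gamma=0.99):
    """Tabular TD(0): V(s) is nudged toward a bootstrapped target at every
    step, giving dense feedback instead of waiting for the final outcome."""
    V = defaultdict(float)
    for _ in range(episodes):
        state, done = start_state, False
        while not done:
            nxt, reward, done = step_fn(state)
            target = reward + (0.0 if done else gamma * V[nxt])
            V[state] += alpha * (target - V[state])   # per-step update
            state = nxt
    return V

if __name__ == "__main__":
    values = td0_value_estimate(random_walk_step)
    print({s: round(v, 2) for s, v in sorted(values.items())})
```

In deep RL the table is replaced by a neural network, but the same per-step bootstrapping is what makes value functions attractive for sample efficiency.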

---

Sample Efficiency & Learning Understanding

  • Humans learn new motor skills far faster than robots/models.
  • Possible explanations: evolved priors + intrinsic value systems.

---

Deployment, Market Forces, and Safety

Gradual vs Direct Superintelligence

  • Trade-off: building superintelligence directly avoids market-driven competitive pressure, while gradual deployment lets the public learn about safety through real exposure.
  • Continuous rollout allows systems to improve robustness.

---

Economic Impact & Intelligence Explosion

  • AI that can both learn like humans and scale like computers could drive rapid economic growth.
  • Deployment speed may vary by national regulation.

---

Strategic Research & Compute Use

  • SSI focuses compute on research, not inference-heavy product demands.
  • Differentiated research directions can be validated without extreme scale.
  • Key: how compute is used matters more than sheer volume.

---

Diversity in AI Systems

  • Pretraining yields similar models; RL/fine-tuning drives differentiation.
  • Self-play useful for negotiation, strategy, adversarial skills — but limited.
  • Diverse approaches foster robustness.

---

The Role of “Research Taste”

  • Aesthetic conviction: beauty, simplicity, elegance, brain-inspired features.
  • Guides persistence when experiments contradict intuition.

---

Practical Application: AiToEarn Platform

In parallel to the theoretical debate, the AiToEarn official site shows how AI capabilities can be deployed today:

  • Open-source global AI content monetization
  • Generate, publish, earn across Douyin, Kwai, WeChat, Bilibili, Xiaohongshu, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X
  • Integrates AI generation tools, cross-platform distribution, analytics, and AI model rankings
  • Resources: AiToEarn blog · AiToEarn docs · open-source code on GitHub

This operational scaling illustrates how multi-domain adaptability, diversity, and efficiency can create real-world impact, echoing themes from Ilya’s interview: generalization, careful scaling choices, and integrating AI into coordinated, multi-agent ecosystems.

---


Read more

Harvard CS50: Introduction to Programming with R

Harvard University offers exceptional beginner-friendly computer science courses. We’re excited to announce the release of Harvard CS50’s Introduction to Programming in R, a powerful language widely used for statistical computing, data science, and graphics. This course was developed by Carter Zenke.