Ilya Speaks Out: The "Brute Force" Era of Large Models Is Over
AI’s Shift: From Scaling Back to Research
> “AI is moving from the scaling era back toward the scientific research era.”
> — Ilya Sutskever
This viewpoint comes from Ilya’s recent ~20,000-word interview, touching on nearly every hot topic in AI today:
- Why AI still lags behind humans in generalization
- Safety and alignment challenges
- Limitations within the pre-training paradigm
Ilya believes the mainstream “pre-training + scaling” approach has hit a bottleneck. His call: stop blindly chasing scale and refocus on rethinking the research paradigm.
Many in the community agree — but for long-time critics like Yann LeCun (“LLMs are dead”), the déjà vu is frustrating.

LeCun even reshared a meme implying:
> “So when I said it, nobody cared?”

---
The Conversation Highlights
Setting the Scene
Ilya reflects on the surreal, sci-fi feel of AI’s boom and on a paradox: explosive investment has not yet translated into proportionate, tangible change in everyday life.
> Observation: AI feels abstract to the general public because announcements remain “just big numbers” rather than experiences.
---
Benchmark vs Real-World Impact
- Models excel in benchmarks yet struggle with real-world reliability.
- “Vibe coding” bug example: the model alternates between two bugs, fixing one only to reintroduce the other.
- Possible cause: RL training creates over-focused, single-target behavior.
---
From “All the Data” to RL Environments
- Early pre-training: use all available data — no need to choose.
- RL era: researchers must design specific environments to elicit each capability (a minimal sketch follows this list).
- Risk: over-optimizing for evaluation metrics → reward hacking by researchers, not just models.
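To make concrete what “designing a specific environment” involves, here is a minimal sketch in the familiar reset()/step() shape. The class, task, and reward below are hypothetical, chosen only to show where the design decisions sit; none of it comes from the interview.

```python
# Hypothetical toy environment in the common reset()/step() shape.
# Designing an environment for a capability means choosing the observation,
# the action space, and (crucially) the reward signal.
class FormatFollowingEnv:
    """Toy task: the agent must reply with the exact string 'ANSWER: <n>'."""

    def __init__(self, target: int = 42):
        self.target = target

    def reset(self) -> str:
        # The observation is just the task prompt.
        return f"Reply with 'ANSWER: {self.target}'"

    def step(self, action: str):
        # Reward design is where over-optimization creeps in:
        # exact-match formatting earns 1.0, anything else earns 0.0.
        reward = 1.0 if action.strip() == f"ANSWER: {self.target}" else 0.0
        return None, reward, True  # (next_state, reward, done); single-step episode
```

The point of the sketch: the reward line is a choice made by a researcher, which is exactly where the “reward hacking by researchers” risk above enters.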
---
Generalization Gap
Two main interpretations:
- Expand environments — test beyond contests, into building real applications.
- Invent ways for skills to transfer across domains — true general capabilities.
---
Competitive Programming Analogy
- Student A: 10,000 hours in one niche → top competitor.
- Student B: 100 hours + broader experience → better career.
- Overtraining in narrow domains hurts generalization — mirrors current AI pitfalls.
---
The Two Big Advantages of Pre-Training
- Massive, natural dataset (human activity and the world itself, projected into text)
- No need to choose subsets — “take it all”
Challenge: it is hard to fully understand how models leverage this data, and there may be key gaps in how the knowledge ends up represented.
---
Evolutionary Priors & Learning Ability
- Certain human skills (e.g., dexterity, vision) deeply rooted in evolution.
- Language/programming are recent — yet humans still show sample-efficient learning there.
- Points to a general machine learning ability in humans, beyond evolutionary priors.
---
Human-Like Continuous Learning
- Teen drivers self-correct without explicit rewards.
- Models lack comparable internal value systems.
- Achieving this in AI may be possible — but involves ideas not yet openly shareable.
---
Scaling & Recipe Thinking
- 2012‒2020: Research era
- 2020‒2025: Scaling era
- Next: research again — but now with giant compute.
- Pre-training “recipe” proved the scaling law (the canonical form is sketched after this list).
- Hard limit ahead: finite data → shift to reinforced pre-training, RL, or new paradigms.
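For reference, the scaling laws this recipe validated are usually written as power laws in model size and data. A commonly cited Chinchilla-style form is shown below; the symbols are empirically fitted constants, and the equation is standard background rather than something stated in the interview.

```latex
% Chinchilla-style pre-training scaling law (illustrative background, not from the interview).
% Loss = irreducible term E plus terms that shrink with parameter count N and training tokens D;
% A, B, \alpha, \beta are fitted constants.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

The “finite data” limit in the last bullet corresponds to D hitting a ceiling: once the data term can no longer be driven down, further gains must come from something other than plain pre-training.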
---
Efficiency & Value Functions
- RL consumes extreme amounts of compute; value functions could restore efficiency by giving feedback at every step rather than only at the end of a trajectory (see the sketch after this list).
- Generalization remains the core unsolved problem.
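To make “value function” concrete: it is an estimate of expected future reward from a state, which gives the learner a signal at every step instead of only at the end of a long rollout. Below is a minimal tabular TD(0) sketch; the environment interface (reset(), step(), actions()) and all names are hypothetical, used purely for illustration.

```python
# Minimal tabular TD(0) value estimation under a random policy (illustrative sketch).
# Assumes a hypothetical env with reset() -> state, actions(state) -> list,
# and step(action) -> (next_state, reward, done), over a small discrete state space.
import random
from collections import defaultdict

def td0_value_estimate(env, episodes=1000, alpha=0.1, gamma=0.99):
    V = defaultdict(float)  # state -> estimated expected return
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            action = random.choice(env.actions(state))  # random policy, for illustration
            next_state, reward, done = env.step(action)
            # TD(0) update: nudge V(state) toward the one-step bootstrapped target.
            target = reward + (0.0 if done else gamma * V[next_state])
            V[state] += alpha * (target - V[state])
            state = next_state
    return V
```

The `target - V[state]` error is a per-step training signal, which is the efficiency argument: compute is not spent waiting for a single scalar reward at the end of every long trajectory.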
---
Sample Efficiency & Learning Understanding
- Humans learn new motor skills far faster than robots/models.
- Possible explanations: evolved priors + intrinsic value systems.
---
Deployment, Market Forces, and Safety
Gradual vs Direct Superintelligence
- Trade-off: a direct path to superintelligence avoids market-driven competitive pressure, while gradual deployment lets the public learn about safety first-hand.
- Continuous rollout also gives systems time to improve their robustness.
---
Economic Impact & Intelligence Explosion
- AI that can both learn like humans and scale like computers could drive rapid economic growth.
- Deployment speed may vary by national regulation.
---
Strategic Research & Compute Use
- SSI (Safe Superintelligence Inc.) focuses its compute on research rather than on inference-heavy product demands.
- Differentiated research can be proven without extreme scale.
- Key: how compute is used matters more than sheer volume.
---
Diversity in AI Systems
- Pre-training yields similar models; RL/fine-tuning drives differentiation.
- Self-play useful for negotiation, strategy, adversarial skills — but limited.
- Diverse approaches foster robustness.
---
The Role of “Research Taste”
- Aesthetic conviction: beauty, simplicity, elegance, brain-inspired features.
- Guides persistence when experiments contradict intuition.

---
Practical Application: AiToEarn Platform
In parallel with the theoretical debate, the official AiToEarn platform exemplifies how AI capabilities can be deployed today:
- Open-source global AI content monetization
- Generate, publish, earn across Douyin, Kwai, WeChat, Bilibili, Xiaohongshu, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X
- Integrates AI generation tools, cross-platform distribution, analytics, and AI model rankings
- Resources: AiToEarn blog · AiToEarn docs · open-source code on GitHub
This operational scaling shows how multi-domain adaptability, diversity, and efficiency can create real-world impact, echoing principles Ilya discussed: generalization, careful scaling choices, and integrating AI into coordinated, multi-agent ecosystems.