Catch Up to GPT-5 with $4.6M? Kimi Team Responds for the First Time, Yang Zhilin Weighs In
Kimi K2 Thinking AMA Recap: Outpacing Giants, Staying Open-Source
Last week, the open-source Kimi K2 Thinking model made headlines by outperforming models from OpenAI and Anthropic on several benchmarks. Social media buzzed with praise, and early tests showed clear gains in agentic tasks, coding, and writing.
Shortly after, founder Yang Zhilin and the Kimi team took questions in a candid, information-dense AMA on Reddit.

▲ Co‑founders: Yang Zhilin, Zhou Xinyu, Wu Yuxin
Beyond fielding tough questions, Kimi teased the upcoming K3 model, detailed their KDA attention mechanism, clarified the widely cited $4.6M cost, and contrasted their lean training approach with OpenAI’s resource-heavy methods.
Key AMA takeaways:
- $4.6M training cost is not accurate — hard to estimate precisely.
- K3 launch date humorously linked to “Altman’s trillion-dollar data center” completion.
- K3 will retain the KDA attention mechanism.
- Visual models still in data collection phase, but under active development.

---
Facing OpenAI: “We Have Our Own Pace”
On K3’s Release
Q: When will K3 be released?
A: “Before Altman’s trillion-dollar data center is completed.”

While humorous, it signals Kimi’s drive to match GPT‑5 performance with far fewer resources.
On OpenAI’s Spending
> “We don’t know — only Altman himself knows.”
> “We have our own way and pace.”
Product Philosophy vs. OpenAI
Kimi rejected the idea of building an AI browser to mirror OpenAI:
> “We don’t need to create another Chromium wrapper to build a better model.”
Their focus remains on training models and showcasing them through a large-model assistant.
Hardware & Costs
- $4.6M figure is incorrect.
- Costs are mostly research and experimentation — hard to quantify.
- Training runs on H800 GPUs with InfiniBand; fewer and lower-spec than top-tier U.S. hardware, but the team maximizes the utilization of every card.

---
Balancing Intelligence and Personality
Kimi K2 Instruct earned praise for being insightful yet less sycophantic, thanks to:
- Pretraining for knowledge.
- Post-training for personality.
- Reinforcement learning tuned to avoid flattery.

▲ LLM EQ ranking — Source: https://eqbench.com/creative_writing.html
Criticism: Some say Kimi’s tone is overly positive — even in violent or adversarial contexts — creating an “AI slop” aftertaste. This stems from RL reward models amplifying certain tonal tendencies.
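How such tuning might work in practice: the minimal sketch below combines a reward-model score with a crude flattery penalty during RL post-training. The phrase list, weights, and function names are hypothetical illustrations, since Kimi has not published its actual reward design.

```python
# Hypothetical sketch: shaping an RL reward to discourage sycophancy.
# The phrase list, weight, and scoring heuristic are illustrative only,
# not Kimi's actual post-training pipeline.

SYCOPHANTIC_PHRASES = [
    "great question", "you're absolutely right", "what a brilliant idea",
]

def sycophancy_score(response: str) -> float:
    """Crude proxy: fraction of known flattery phrases appearing in the response."""
    text = response.lower()
    hits = sum(phrase in text for phrase in SYCOPHANTIC_PHRASES)
    return hits / len(SYCOPHANTIC_PHRASES)

def shaped_reward(rm_score: float, response: str, penalty_weight: float = 0.5) -> float:
    """Reward for policy optimization: reward-model score minus a flattery penalty."""
    return rm_score - penalty_weight * sycophancy_score(response)

# Two candidate responses with the same reward-model score end up ranked differently.
print(shaped_reward(0.8, "Great question! You're absolutely right."))       # penalized
print(shaped_reward(0.8, "Here is a more critical reading of the claim."))  # unpenalized
```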
---
Benchmarks vs. Real‑World Performance
Some questioned whether K2 Thinking was trained specifically to excel at HLE (Humanity's Last Exam) and similar benchmarks, since its scores sometimes outpace its perceived intelligence in daily use.

Kimi’s answer: Gains in autonomous reasoning naturally boosted HLE scores. Next goal — make everyday performance match benchmark excellence.
---
NSFW Content: An Open Question
Users suggested Kimi could apply its writing strengths to NSFW content (like Musk’s Grok in multimedia).

Kimi’s response: Silent smile, called it “interesting.”
Current stance: No NSFW support — age verification & alignment work would be prerequisites.
---
Technical Core: KDA, Long Reasoning, Multimodality
Kimi's paper Kimi Linear: An Expressive, Efficient Attention Architecture introduced Kimi Delta Attention (KDA), a linear attention mechanism used within a hybrid architecture alongside full attention layers.

▲ Paper: https://arxiv.org/pdf/2510.26692
In brief, KDA lets the model selectively retain and forget parts of the context while keeping a fixed-size recurrent state; a minimal sketch follows the list below.
- More efficient on long sequences, which benefits RL tasks.
- Trade-offs exist; full attention remains strongest for long-input, long-output reasoning.
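To make the mechanism concrete, here is a minimal NumPy sketch of a gated delta-rule recurrence, the family of linear attention KDA belongs to. The scalar per-step gate and single-head shapes are simplifying assumptions; the actual KDA uses finer-grained gating and sits in a hybrid stack next to full attention layers.

```python
import numpy as np

def delta_rule_attention(q, k, v, beta, gate):
    """Single-head gated delta-rule linear attention (simplified).

    q, k: (T, d_k); v: (T, d_v); beta, gate: (T,) per-step scalars.
    The recurrent state S replaces a growing KV cache.
    """
    T, d_k = q.shape
    d_v = v.shape[1]
    S = np.zeros((d_k, d_v))            # fixed-size memory, independent of T
    outputs = np.zeros((T, d_v))
    for t in range(T):
        pred = S.T @ k[t]               # what the current memory predicts for this key
        err = v[t] - pred               # delta-rule correction term
        S = gate[t] * S + beta[t] * np.outer(k[t], err)  # decay old memory, write correction
        outputs[t] = S.T @ q[t]         # read out with the query
    return outputs

# Toy usage
rng = np.random.default_rng(0)
T, d_k, d_v = 8, 4, 4
out = delta_rule_attention(rng.normal(size=(T, d_k)), rng.normal(size=(T, d_k)),
                           rng.normal(size=(T, d_v)),
                           beta=np.full(T, 0.5), gate=np.full(T, 0.9))
print(out.shape)  # (8, 4)
```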
Long Reasoning Chains
K2 Thinking can sustain up to 300 sequential tool calls in a single reasoning chain, outperforming GPT-5 Pro in some scenarios:
- It is trained with a larger budget of "thinking tokens" for best results.
- Native INT4 quantization-aware training (QAT) speeds up inference and sustains long chains without the accuracy loss that post-training compression can introduce (sketched after this list).
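A minimal PyTorch sketch of the core QAT operation: symmetric per-group INT4 fake quantization with a straight-through estimator in the forward pass. The group size and scaling scheme are illustrative assumptions, not Kimi's published recipe.

```python
import torch

def fake_quant_int4(w: torch.Tensor, group_size: int = 32) -> torch.Tensor:
    """Symmetric per-group INT4 fake quantization with a straight-through estimator."""
    orig_shape = w.shape
    w = w.reshape(-1, group_size)
    scale = (w.abs().amax(dim=1, keepdim=True) / 7.0).clamp_min(1e-8)  # INT4 symmetric range is [-8, 7]
    q = torch.clamp(torch.round(w / scale), -8, 7)                     # quantize
    dq = q * scale                                                     # dequantize back to float
    # Forward uses the quantized values; backward passes gradients straight through the rounding.
    return (w + (dq - w).detach()).reshape(orig_shape)

# Toy usage: the weight "sees" INT4 precision during training but still receives gradients.
w = torch.randn(64, 64, requires_grad=True)
loss = fake_quant_int4(w).sum()
loss.backward()
print(w.grad.shape)  # torch.Size([64, 64])
```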

▲ Kimi Linear architecture
Visual-Language Progress
Work on a visual-language model is ongoing. Text-only models were prioritized because of:
- Labor‑intensive data acquisition.
- Limited team resources — single focus first.
---
Ecosystem & Monetization Synergies
Open discussion of model capabilities ties into the broader AI creator ecosystem, supported by platforms like the AiToEarn official site:
- AI content generation.
- Cross‑platform publishing (Douyin, Kwai, WeChat, Bilibili, Rednote, FB, IG, LinkedIn, Threads, YouTube, Pinterest, X).
- Analytics & AI model rankings.
This aligns with Kimi’s open, scalable AI vision.
---
Why the 1M Context Model Disappeared
> “The cost is too high.” — Kimi
The team plans to extend context length beyond the current 256K to handle large codebases.
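A rough back-of-envelope shows why long context is costly: with standard attention, KV-cache memory per request grows linearly with context length. The layer, head, and precision numbers below are illustrative placeholders, not K2's actual configuration.

```python
def kv_cache_gib(context_len: int, num_layers: int = 61, num_kv_heads: int = 8,
                 head_dim: int = 128, bytes_per_elem: int = 2) -> float:
    """Approximate KV-cache size for one request, in GiB.

    Formula: 2 (K and V) * layers * kv_heads * head_dim * context_len * bytes.
    All architecture numbers here are made-up placeholders.
    """
    total_bytes = 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_elem
    return total_bytes / (1024 ** 3)

print(f"256K context: {kv_cache_gib(256 * 1024):.0f} GiB per request")
print(f"1M   context: {kv_cache_gib(1024 * 1024):.0f} GiB per request")
```

With these placeholder numbers, quadrupling the context from 256K to 1M quadruples per-request cache memory, which is one reason serving a 1M-context model is far more expensive.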
---
API Pricing Debate
Some developers dislike call-based pricing and would prefer token-based billing, citing unpredictable costs in agent-driven workflows.

▲ Kimi Membership
Kimi’s view: Call-based billing makes usage costs clearer internally, but they are open to reviewing pricing models.
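To see the developers' concern concretely, the toy comparison below prices the same agent run (300 tool-calling steps) under per-token and per-call billing; all rates are made-up placeholders, not Kimi's actual prices.

```python
def token_based_cost(input_tokens: int, output_tokens: int,
                     in_price_per_m: float, out_price_per_m: float) -> float:
    """Cost under per-token pricing, with prices quoted per million tokens."""
    return input_tokens / 1e6 * in_price_per_m + output_tokens / 1e6 * out_price_per_m

def call_based_cost(num_calls: int, price_per_call: float) -> float:
    """Cost under per-call pricing."""
    return num_calls * price_per_call

# A hypothetical agent run: 300 steps with modest traffic per step; all prices are placeholders.
calls, in_tok_per_call, out_tok_per_call = 300, 2_000, 500
print(token_based_cost(calls * in_tok_per_call, calls * out_tok_per_call,
                       in_price_per_m=0.6, out_price_per_m=2.5))
print(call_based_cost(calls, price_per_call=0.01))
```

Depending on how chatty each step is, the two schemes can diverge sharply, which is why agent developers want the billing model to match their workload.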
---
Commitment to Open Source
> “We embrace open source, because we believe general AI should unite, not divide.”
---
AGI Outlook
AGI is hard to define, but “the vibe” is here already, with stronger models coming soon.
---
Final Thoughts
Compared with last year's flashier marketing, Yang Zhilin's team now shows more confidence in moving at its own pace, betting on open source in a capital-intensive, race-like AI sector.
Platforms like the AiToEarn official site highlight how open-source efforts can be paired with practical monetization, letting creators generate, publish, and profit from AI content in a competitive global market.
---
Summary:
Kimi’s AMA gave rare, candid insight into their tech, philosophy, and roadmap. By combining lean hardware use, strong core mechanisms like KDA, and a distinct personality, they aim to rival industry giants without matching their burn rates — staying true to open-source principles in the process.