China’s Only Winner: Alibaba Qwen Wins Best Paper at Top Global AI Conference

November 28 News: Tongyi Qianwen Wins NeurIPS 2025 Best Paper Award

At NeurIPS 2025, one of the world’s most prestigious AI conferences, Alibaba’s Tongyi Qianwen (Qwen) team stood out among more than 20,000 submissions to win a Best Paper Award, making it the only winning team from China.

This groundbreaking paper is the first in the industry to reveal how the attention gating mechanism impacts the performance and training of large models—a breakthrough seen by experts as a major step toward overcoming current bottlenecks in large model training.

---

Significance of NeurIPS

NeurIPS is among the most influential AI conferences globally, where landmark innovations such as the Transformer and AlexNet were first presented.

This year’s stats:

  • 20,000+ submissions from leading institutions: Google, Microsoft, OpenAI, Alibaba, MIT, and more
  • Acceptance rate: ~25%
  • Best Paper Awards: only 4 papers selected (a selection rate below 0.02%)

---

Background: Transformer & Attention Limitations

In 2017, Google’s NeurIPS paper introduced the Transformer architecture and self-attention, enabling AI to focus on key information—fundamental to modern large model research.
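
For reference, the self-attention operation from that paper is the standard scaled dot-product formulation, with queries $Q$, keys $K$, values $V$, and key dimension $d_k$:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

The softmax weights determine how strongly each token attends to every other token, which is what lets the model focus on key information.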

Despite significant advancements, current attention mechanisms have drawbacks:

  • Over-focusing on certain details while ignoring other critical information
  • Performance degradation in broader tasks
  • Instability during training

---

What Is Gating in Attention?

The gating mechanism, often described as an AI model’s “intelligent valve”, does two things (a minimal sketch follows below):

  • Filters out irrelevant information
  • Enhances overall performance
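
To make the “intelligent valve” idea concrete, here is a minimal sketch (an illustration of the general idea, not the team’s actual implementation; the class name is hypothetical): a learned sigmoid gate produces a value between 0 and 1 for each feature and scales that feature accordingly, so the model can suppress signals it deems irrelevant.

```python
import torch
import torch.nn as nn

class SigmoidGate(nn.Module):
    """Minimal illustrative gate: a learned 'valve' in (0, 1) per feature.

    This is a generic sketch of output gating, not the exact design
    from the Qwen paper.
    """
    def __init__(self, dim: int):
        super().__init__()
        # The gate values are computed from the input itself.
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate = torch.sigmoid(self.proj(x))  # each value lies in (0, 1)
        return gate * x                     # scale (filter) each feature

# Usage: gate a batch of token representations.
x = torch.randn(2, 16, 64)   # (batch, sequence, hidden)
gated = SigmoidGate(64)(x)
print(gated.shape)           # torch.Size([2, 16, 64])
```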

Previous Attempts

  • Applied in models like AlphaFold2 and the Forgetting Transformer
  • Lacked a clear theoretical explanation and large-scale experimental evidence

---

Tongyi Qianwen’s Contribution

image

The Tongyi Qianwen team:

  • Ran dozens of experiments on:
    • a 1.7B-parameter dense model
    • a 15B-parameter Mixture-of-Experts (MoE) model
  • Trained on up to 3.5 trillion tokens in a single experiment

Key discovery:

  • Gating the output of each attention head yields the largest performance improvement (see the sketch below)
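
Below is a hedged sketch of what that placement could look like inside a standard multi-head attention block (illustrative only: the class name is hypothetical and the paper’s exact formulation may differ). The sigmoid gate is applied to each head’s attention output before the final output projection:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedHeadAttention(nn.Module):
    """Multi-head self-attention with a sigmoid gate on the head outputs.

    Illustrative sketch of the placement described above, not the paper's
    exact architecture.
    """
    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.gate = nn.Linear(dim, dim)  # per-channel gate; channels group into heads
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, tokens, head_dim) for per-head attention.
        q, k, v = (z.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
                   for z in (q, k, v))
        attn = F.scaled_dot_product_attention(q, k, v)   # standard attention
        attn = attn.transpose(1, 2).reshape(b, t, d)     # concatenate heads
        gate = torch.sigmoid(self.gate(x))               # learned "valve" in (0, 1)
        return self.out(gate * attn)                     # gate head outputs, then project

# Usage
x = torch.randn(2, 16, 64)
y = GatedHeadAttention(dim=64, num_heads=8)(x)
print(y.shape)  # torch.Size([2, 16, 64])
```

In a sketch like this, the gate adds only one extra linear projection per attention layer, which is in line with the small parameter overhead reported in the results below.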

Results:

  • Only 1% increase in parameter count
  • Perplexity reduction: >0.2
  • MMLU benchmark improvement: +2 points
  • Even more effective at larger scales

---

Real-World Impact

  • Implemented in the Qwen3-Next model, boosting performance and robustness
  • The technical solution, experimental models, and production-level models have been open-sourced
  • NeurIPS reviewers: “This work will see broad application and greatly enhance understanding of attention in LLMs.”

---

Team Statement

> “A deep understanding of gated attention mechanisms not only guides large language model architecture design but also lays the foundation for building more stable, efficient, and controllable large models.”

---

Tongyi Qianwen’s Open-Source Achievements

  • 300+ models open-sourced across all modalities and sizes
  • 700M+ downloads globally
  • 180,000+ derivative models created
  • Ranked #1 worldwide among open-source model families

---

Platforms like AiToEarn are critical for open-sourcing and cross-platform dissemination of AI research.

AiToEarn features:

  • Global AI content monetization
  • Multi-platform publishing: Douyin, Kwai, WeChat, Bilibili, Rednote (Xiaohongshu), Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X (Twitter)
  • Integrated AI generation tools
  • Analytics and AI model rankings

Potential Role:

AiToEarn could help turn cutting-edge AI research like Tongyi Qianwen’s work into widely shared, monetizable knowledge.

