Musk Quietly Releases Grok 4.1, Dominates All Leaderboards in the Large Model Arena

Grok 4.1: Elon Musk’s AI Leap to the Top of the Leaderboard

Just now, Elon Musk released Grok 4.1, simultaneously taking first and second place in the Large Model Arena rankings.

---

🚀 How Did They Pull It Off?

  • Grok 4.1 Thinking Mode leads the chart with an Elo score of 1483, 31 points ahead of the highest-ranked non-xAI model.
  • Grok 4.1 Non-Thinking Mode takes second place with 1465 points, ahead of every other model on the public leaderboard, including those running in full reasoning mode.

Just months earlier, the previous Grok 4 ranked only 33rd. In less than half a year, xAI has made a massive leap forward.
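
To give a feel for what these point gaps mean, here is a minimal sketch that converts a rating difference into an expected head-to-head preference rate. It assumes the conventional Elo logistic curve (base 10, scale 400); the arena's exact rating procedure is not described in this article, so treat the numbers as illustrative only. The function name is hypothetical.

```python
def expected_win_probability(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the conventional Elo formula
    (logistic curve, base 10, scale 400). This is an assumption about how
    the quoted scores are scaled, not a statement of the arena's method."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Scores quoted in the article: Thinking Mode 1483, Non-Thinking Mode 1465.
p_thinking_vs_non_thinking = expected_win_probability(1483, 1465)

# A 31-point lead implies the best non-xAI model sits at 1483 - 31 = 1452.
p_vs_best_other = expected_win_probability(1483, 1483 - 31)

print(f"Thinking vs. Non-Thinking: {p_thinking_vs_non_thinking:.3f}")
print(f"Thinking vs. best non-xAI model: {p_vs_best_other:.3f}")
```

Under that assumption, a lead of a few dozen points corresponds to being preferred only slightly more than half the time in a direct comparison, so a top rank reflects a consistent edge rather than a lopsided one.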

---

Dominating New “Expert” & “Professional” Rankings

In the newly added Expert and Professional leaderboards, Grok 4.1 Thinking Mode also dominates.

Expert Leaderboard

Contains questions expected to be asked only by top specialists in their fields.

Professional Leaderboard

Split into eight subcategories:

  • Software & IT Services
  • Writing, Literature & Language
  • Life Sciences, Physical Sciences & Social Sciences
  • Entertainment, Sports & Media
  • Business, Management & Financial Operations
  • Mathematics
  • Legal & Government
  • Healthcare

---

Performance Summary:

  • Grok 4.1 ranks #1 in six of eight categories.
  • It trails only Gemini 2.5 in Writing, Literature & Language, and Claude 4.5 and o3 in Mathematics.

> Note: Scores are still labeled Preliminary due to low vote counts. Reliability will improve as more votes are collected.
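
Why low vote counts make a score preliminary can be sketched with a simple confidence interval on a win rate. The example below uses the Wilson score interval, which is a swapped-in textbook technique and not necessarily how the arena computes its own uncertainty. The 60% win rate and the vote counts are hypothetical.

```python
import math


def wilson_interval(wins: int, votes: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a win rate of wins/votes."""
    if votes <= 0:
        raise ValueError("votes must be positive")
    p = wins / votes
    denom = 1.0 + z * z / votes
    center = (p + z * z / (2 * votes)) / denom
    half_width = z * math.sqrt(p * (1 - p) / votes + z * z / (4 * votes * votes)) / denom
    return center - half_width, center + half_width


# Hypothetical: the same 60% win rate observed at three different vote counts.
for votes in (100, 1_000, 10_000):
    low, high = wilson_interval(round(0.6 * votes), votes)
    print(f"{votes:>6} votes: {low:.3f} to {high:.3f}")
```

The interval narrows as votes accumulate, which is why a ranking based on few votes can still shift.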

---

Strong Emotional Intelligence: EQ-Bench Test

On EQ-Bench, an emotional intelligence benchmark judged by LLMs, Grok 4.1 outperforms Kimi K2 (non-Thinking version).

EQ-Bench Measures:

  • Proactive social awareness
  • Emotional comprehension
  • Insight & empathy
  • Interpersonal skills

---

Quiet Rollout & User Preference Testing

Grok 4.1 was tested silently starting November 1, with gradual rollout via blind A/B trials.

  • 64.78% of users preferred the new model.

xAI’s official site now offers side-by-side comparisons between Grok 4.1 and earlier versions.

---

Example Comparisons

Emotional Response

[Image: side-by-side emotional response example]

Creative Writing

[Image: side-by-side creative writing example]

---

Technical Improvements

According to xAI’s technical report, Grok 4.1 delivers:

  • Better creativity
  • Enhanced emotional engagement
  • Improved collaborative interaction
  • Subtle intent detection
  • Consistent personality maintenance
  • Retained intelligence and reliability of Grok 4

Reinforcement Learning at New Scale

Dustin Tran, head of post-training at xAI, explained:

> Our small team rebuilt reinforcement learning algorithms using real user conversation data, combined with scoring from strong reasoning reward models.

> We scaled RL by an order of magnitude, far beyond Grok 4’s pre-training scale.

---

Fast-Response Mode & Hallucination Reduction

  • Fast-response mode skips chain-of-thought reasoning, cutting the average token count from ~2,300 to ~850 (roughly 63% fewer).
  • Post-training placed special focus on reducing factual hallucinations in information-retrieval prompts.

Measured Improvements:

  • FActScore test (500 biography questions) shows clear gains in non-reasoning mode vs. Grok 4.
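
As a rough illustration of what a FActScore-style number measures, the sketch below averages the fraction of atomic facts judged as supported across responses. It assumes the commonly described definition of the metric and skips the hard parts (splitting each biography into atomic facts and checking them against a knowledge source). The judgments in `sample` are hypothetical, not xAI's data.

```python
from typing import Sequence


def factscore(per_response_support: Sequence[Sequence[bool]]) -> float:
    """Average, over responses, of the fraction of atomic facts judged supported.

    Each inner sequence holds one True/False judgment per atomic fact in a
    single generated biography. Responses with no facts are skipped.
    """
    fractions = [sum(flags) / len(flags) for flags in per_response_support if flags]
    if not fractions:
        raise ValueError("no scorable responses")
    return sum(fractions) / len(fractions)


# Hypothetical judgments for three generated biographies.
sample = [
    [True, True, False, True],
    [True, True],
    [False, True, True, True, True],
]
score = factscore(sample)
print(f"FActScore: {score:.3f}, unsupported share: {1 - score:.3f}")
```

A lower unsupported share on the same prompts is the kind of gain the test above is meant to capture.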

---

Why Grok 4.1 Matters

In the rapidly evolving AI landscape, Grok 4.1 shows how:

  • Advancements in RLHF
  • Emotional intelligence tuning
  • Personality alignment

can quickly boost real-world performance.

For creators aiming to stay competitive, platforms like the AiToEarn official site provide:

  • AI-powered content generation
  • Instant multi-platform publishing
  • Monetization & analytics tracking
  • Model ranking insights

---

Multi-Modal Output Capability

Grok 4.1 can produce rich responses that combine images and text.

---

Availability

Grok 4.1 is available on:

  • grok.com
  • X
  • The iOS and Android apps

> Tip: Grok 4.1 runs in automatic mode by default and can also be selected manually in the model picker.

---

References

  • https://x.ai/news/grok-4-1
  • https://x.com/arena/status/1990530984014676155
  • https://x.com/dustinvtran/status/1990532663258853720
