Karpathy Forms Large Model “Parliament” with GPT‑5.1, Gemini 3 Pro as Ultimate Think Tank

Large Models Competing Like Fighting Crickets

From short videos to AI models, people's content consumption habits are shifting again — toward speed and efficiency.

---

Changing Reading Habits

When facing long-form articles, academic papers, or large volumes of data, an increasing number of people no longer read start-to-finish.

Instead, they jump straight to high-density, quickly digestible knowledge — often by asking a large language model (LLM) to generate a summary.

A typical example: someone comments “@Yuanbao, summarize this” — a routine interaction in 2025.

This isn’t a flaw.

It’s an upgrade in human capability in the AI era — acquiring information faster and more efficiently than ever.

---

Karpathy’s Admission

Even AI leaders share this habit.

Andrej Karpathy, OpenAI co-founder and former Director of AI at Tesla, recently posted on X (formerly Twitter):

> "I’ve started using LLMs to read everything."

Like many, Karpathy combines personal insights with LLM summaries to deepen understanding.

---

The “LLM Parliament” Concept

With so many LLM options — each excelling in different areas — Karpathy wanted higher-quality results.

So he assembled four leading models into his own multi-model “LLM Parliament”.

How It Works

Karpathy describes the process as ambient computing:

  • Distribute the question to multiple models via OpenRouter:
      • `openai/gpt-5.1`
      • `google/gemini-3-pro-preview`
      • `anthropic/claude-sonnet-4.5`
      • `x-ai/grok-4`
  • Peer review:
      • Each model sees anonymized answers from the others.
      • It reviews and ranks them.
  • Final synthesis:
      • A Chairman LLM uses the ranked answers as context to produce the final output.
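
In code, the fan-out step could look roughly like the minimal sketch below. It assumes OpenRouter's OpenAI-compatible chat completions endpoint, the `openai` Python SDK, and an `OPENROUTER_API_KEY` environment variable; `ask_council` and `COUNCIL_MODELS` are illustrative names, not Karpathy's actual code.

```python
# Minimal sketch of the fan-out step (assumptions: OpenRouter's
# OpenAI-compatible endpoint, the openai Python SDK, and an API key in
# OPENROUTER_API_KEY; the names here are illustrative, not Karpathy's).
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# The four council members listed above, by their OpenRouter slugs.
COUNCIL_MODELS = [
    "openai/gpt-5.1",
    "google/gemini-3-pro-preview",
    "anthropic/claude-sonnet-4.5",
    "x-ai/grok-4",
]


def ask_council(question: str) -> dict[str, str]:
    """Send the same question to every council model and collect the answers."""
    answers: dict[str, str] = {}
    for model in COUNCIL_MODELS:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": question}],
        )
        answers[model] = response.choices[0].message.content
    return answers
```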

---

Comparison to PewDiePie’s Experiment

This is reminiscent of PewDiePie’s “Large Model Committee” experiment, where eight instances of the same model (given different prompts and personalities) produced answers and voted on them.

Difference: Karpathy uses different models, yielding greater diversity.

---

The “Cyber Cricket Fight”

Placing multiple LLM answers side by side, and letting the models vote on them, is like watching a digital debate or an AI cricket fight.

Sometimes one model openly admits another’s answer is better — making this approach both fun and a novel evaluation method.

Example:

When Karpathy reads books with it, the parliament often rates GPT‑5.1 highest and Claude lowest, with Gemini and Grok in between.

Karpathy disagrees slightly: he prefers Gemini’s condensed summaries to GPT‑5.1’s verbosity, and finds Claude overly minimalistic.

---

Multi-Model Workflow Beyond Fun

This collaborative answering style has real applications for content creators and analysts.

Platforms like AiToEarn enable creators worldwide to:

  • Generate AI-assisted content
  • Publish across Douyin, Kwai, WeChat, Bilibili, Xiaohongshu, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X (Twitter)
  • Analyze performance
  • Rank AI models

Such ecosystems fit naturally with LLM Parliament workflows — merging diverse AI outputs into powerful, cross-platform production pipelines.

---

Karpathy’s LLM Parliament – Three Key Stages

Stage 1: Initial Opinions

  • Send the user’s query to all models.
  • Collect and display responses in a tab view for easy comparison.

Stage 2: Peer Review

  • Each LLM sees anonymized responses from the others.
  • Each ranks them by accuracy and insightfulness.

Stage 3: Final Response

  • The Chairman LLM synthesizes all responses and rankings into the final answer.
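
Continuing the sketch above, stages 2 and 3 might be wired together roughly as follows. The prompts, the `CHAIRMAN_MODEL` choice, and the helper names are assumptions made for illustration; they are not taken from Karpathy's repository.

```python
# Continues the fan-out sketch above (reuses `client` and `COUNCIL_MODELS`).
# Prompts and the chairman choice are illustrative assumptions.
CHAIRMAN_MODEL = "google/gemini-3-pro-preview"  # arbitrary pick for this sketch


def peer_review(question: str, answers: dict[str, str]) -> dict[str, str]:
    """Stage 2: each model ranks the anonymized answers of its peers."""
    # Hide model identities behind neutral labels: Answer A, Answer B, ...
    labeled = {chr(ord("A") + i): text for i, text in enumerate(answers.values())}
    bundle = "\n\n".join(f"Answer {label}:\n{text}" for label, text in labeled.items())
    rankings: dict[str, str] = {}
    for model in COUNCIL_MODELS:
        response = client.chat.completions.create(
            model=model,
            messages=[{
                "role": "user",
                "content": (
                    f"Question: {question}\n\n{bundle}\n\n"
                    "Rank these answers from best to worst by accuracy and "
                    "insight, and briefly justify the order."
                ),
            }],
        )
        rankings[model] = response.choices[0].message.content
    return rankings


def chairman_synthesis(question: str, answers: dict[str, str],
                       rankings: dict[str, str]) -> str:
    """Stage 3: the Chairman model merges answers and rankings into one reply."""
    context = "\n\n".join(
        [f"Original question: {question}"]
        + [f"Council member answer:\n{a}" for a in answers.values()]
        + [f"Council member ranking:\n{r}" for r in rankings.values()]
    )
    response = client.chat.completions.create(
        model=CHAIRMAN_MODEL,
        messages=[{
            "role": "user",
            "content": context + "\n\nWrite the single best final answer.",
        }],
    )
    return response.choices[0].message.content
```

Which model chairs the council is an open design choice; this sketch simply reuses one of the four members.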

---

Could This Be a Benchmark?

Some believe this multi-model council could evolve into a benchmarking tool.

However, the design space for multi-model integration is still wide open and underexplored.

---

Try It Yourself

Karpathy has open-sourced the project.

Note: No support is provided; the code is shared as-is and won’t be updated.

We previously used vibe coding to recreate a similar project with two deployed models.

Should we also consider open-sourcing ours?

---

Integrating Multi-Model Councils with AiToEarn

Tools like AiToEarn can:

  • Connect multi-model councils to content creation pipelines
  • Automate cross-platform publishing
  • Provide analytics and model ranking
  • Support a wider multi-model reasoning system

Full resource list: AiToEarn Docs →
