Karpathy Forms Large Model “Parliament” with GPT‑5.1, Gemini 3 Pro as Ultimate Think Tank
Large Models Competing Like Fighting Crickets

From short videos to AI models, people's content consumption habits are shifting again — toward speed and efficiency.
---
Changing Reading Habits
When facing long-form articles, academic papers, or large volumes of data, an increasing number of people no longer read from start to finish.
Instead, they jump straight to high-density, quickly digestible knowledge — often by asking a large language model (LLM) to generate a summary.
A typical example: someone comments “@Yuanbao, summarize this” (Yuanbao is Tencent's AI assistant), a routine interaction in 2025.
This isn’t a flaw.
It’s an upgrade in human capability in the AI era — acquiring information faster and more efficiently than ever.
---
Karpathy’s Admission
Even AI leaders share this habit.
Andrej Karpathy, OpenAI co-founder and former Tesla AI Director, recently posted on X (Twitter):
> "I’ve started using LLMs to read everything."

Like many, Karpathy combines personal insights with LLM summaries to deepen understanding.
---
The “LLM Parliament” Concept
With so many LLM options — each excelling in different areas — Karpathy wanted higher-quality results.
So he assembled four leading models into his own multi-model “LLM Parliament”.

How It Works
Karpathy describes the process as ambient computing:
- Distribute the question to multiple models via OpenRouter (a minimal sketch follows this list):
  - `openai/gpt-5.1`
  - `google/gemini-3-pro-preview`
  - `anthropic/claude-sonnet-4.5`
  - `x-ai/grok-4`
- Peer review:
  - Models see anonymized answers from the others.
  - They review and rank them.
- Final synthesis:
  - A Chairman LLM uses the ranked answers as context to produce the final output.
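
The fan-out step is simply the same question sent to each of the OpenRouter model IDs listed above. Below is a minimal sketch, assuming the `openai` Python client pointed at OpenRouter's OpenAI-compatible endpoint and an `OPENROUTER_API_KEY` environment variable; it is a rough illustration, not Karpathy's actual implementation.

```python
# Minimal sketch of the fan-out step (not Karpathy's implementation).
# Assumes the `openai` Python package and an OPENROUTER_API_KEY env variable.
import os

from openai import OpenAI

COUNCIL_MODELS = [
    "openai/gpt-5.1",
    "google/gemini-3-pro-preview",
    "anthropic/claude-sonnet-4.5",
    "x-ai/grok-4",
]

# OpenRouter exposes an OpenAI-compatible API, so one client reaches all four models.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

def ask_council(question: str) -> dict[str, str]:
    """Stage 1: send the same question to every council model and collect answers."""
    answers: dict[str, str] = {}
    for model in COUNCIL_MODELS:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": question}],
        )
        answers[model] = resp.choices[0].message.content
    return answers
```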
---
Comparison to PewDiePie’s Experiment
This is reminiscent of PewDiePie’s “Large Model Committee” experiment, in which eight instances of the same model (given different prompts and personalities) produced answers and then voted on them.
The difference: Karpathy uses distinct models, which yields greater diversity.
---
The “Cyber Cricket Fight”
Placing multiple LLM answers side-by-side — and letting them vote — is like watching a digital debate or AI cricket fight.
Sometimes one model openly admits another’s answer is better — making this approach both fun and a novel evaluation method.
Example:
When reading books, Karpathy’s parliament often rates GPT‑5.1 highest, Claude lowest, with Gemini and Grok in between.
Karpathy’s own take differs slightly: he prefers Gemini’s more condensed summaries to GPT‑5.1’s verbosity, and finds Claude overly minimalistic.
---
Multi-Model Workflow Beyond Fun
This collaborative answering style has real applications for content creators and analysts.
Platforms like the AiToEarn official site enable creators worldwide to:
- Generate AI-assisted content
- Publish across Douyin, Kwai, WeChat, Bilibili, Xiaohongshu, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, and X (Twitter)
- Analyze performance
- Rank AI models
Such ecosystems fit naturally with LLM Parliament workflows — merging diverse AI outputs into powerful, cross-platform production pipelines.
---
Karpathy’s LLM Parliament – Three Key Stages
Stage 1: Initial Opinions
- Send the user’s query to all models.
- Collect and display responses in a tab view for easy comparison.
Stage 2: Peer Review
- Each LLM sees the anonymized responses of the others.
- Each ranks them by accuracy and insightfulness.
Stage 3: Final Response
- The Chairman LLM synthesizes all responses and rankings into the final answer, as sketched below.
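
Stages 2 and 3 can be layered on top of the `ask_council` sketch above. The prompt wording, the letter-based anonymization, and the choice of chairman model below are illustrative assumptions rather than details taken from the llm-council repository.

```python
# Hedged sketch of Stages 2 and 3, reusing `client` and `ask_council` from the
# earlier snippet. Prompts, anonymization, and the chairman choice are assumptions.
CHAIRMAN_MODEL = "google/gemini-3-pro-preview"  # assumption: any strong model can chair

def peer_review(question: str, answers: dict[str, str]) -> dict[str, str]:
    """Stage 2: each model ranks the anonymized answers (labeled A, B, C, ...)."""
    labeled = "\n\n".join(
        f"Response {chr(65 + i)}:\n{text}" for i, text in enumerate(answers.values())
    )
    reviews: dict[str, str] = {}
    for model in answers:
        prompt = (
            f"Question: {question}\n\n{labeled}\n\n"
            "Rank these responses from best to worst by accuracy and insight, "
            "with a brief justification for the ranking."
        )
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        reviews[model] = resp.choices[0].message.content
    return reviews

def chairman_synthesis(question: str, answers: dict[str, str], reviews: dict[str, str]) -> str:
    """Stage 3: the Chairman LLM merges the answers and rankings into one final reply."""
    all_answers = "\n\n".join(answers.values())
    all_reviews = "\n\n".join(reviews.values())
    context = (
        f"Question: {question}\n\n"
        f"Candidate answers:\n{all_answers}\n\n"
        f"Peer rankings:\n{all_reviews}\n\n"
        "Using the rankings as guidance, write the single best final answer."
    )
    resp = client.chat.completions.create(
        model=CHAIRMAN_MODEL,
        messages=[{"role": "user", "content": context}],
    )
    return resp.choices[0].message.content

# Example end-to-end run (hypothetical question):
question = "Summarize the key ideas of the chapter pasted below: ..."
answers = ask_council(question)
reviews = peer_review(question, answers)
print(chairman_synthesis(question, answers, reviews))
```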
---
Could This Be a Benchmark?
Some believe this multi-model council could evolve into a benchmarking tool.
However, the design space for multi-model integration is still wide open and underexplored.
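
As one hedged illustration of the benchmark idea: if each Stage 2 review were parsed into an ordered ballot of model names, those ballots could be aggregated into per-model scores across many questions. The Borda-style aggregation below is purely hypothetical and not part of llm-council.

```python
# Purely illustrative: one way council peer-rankings could become a benchmark score.
# Each ballot is an ordered list of model names, best first (e.g. parsed from Stage 2).
from collections import defaultdict

def borda_scores(ballots: list[list[str]]) -> dict[str, float]:
    """Average Borda count per model: first place earns the most points."""
    totals: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    for ballot in ballots:
        n = len(ballot)
        for rank, model in enumerate(ballot):
            totals[model] += n - 1 - rank  # best gets n-1 points, worst gets 0
            counts[model] += 1
    return {model: totals[model] / counts[model] for model in totals}

# Hypothetical ballots collected over two questions:
ballots = [
    ["openai/gpt-5.1", "google/gemini-3-pro-preview", "x-ai/grok-4", "anthropic/claude-sonnet-4.5"],
    ["openai/gpt-5.1", "x-ai/grok-4", "google/gemini-3-pro-preview", "anthropic/claude-sonnet-4.5"],
]
print(borda_scores(ballots))
```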
---
Try It Yourself
Karpathy has open-sourced the project:
- GitHub: https://github.com/karpathy/llm-council
- X announcement: https://x.com/karpathy/status/1992381094667411768
Note: No support is provided; the code is shared as-is and won’t be updated.
We previously used vibe coding to recreate a similar project with two deployed models.
Should we also consider open-sourcing ours?
---
Integrating Multi-Model Councils with AiToEarn
Tools like AiToEarn can:
- Connect multi-model councils to content creation pipelines
- Automate cross-platform publishing
- Provide analytics and AI model ranking
- Support a wider multi-model reasoning system
Full resource list: AiToEarn Docs