Hands-on Test of MGX: Can a Team of Agents Coding Together Bring Us Closer to AGI than “Model-as-Agent”?

Honghao Wang

09 Oct 2025 — 5 min read

Multi-Agent Systems Can Solve Many Problems, but Knowing How to Call Them Is the New Challenge

---

What Makes Humans Unique?

It’s hard to answer what truly sets humans apart from animals, but three essential traits have shaped our evolution as the “spirit of all living things”:

Tool Usage — From gripping a stick to igniting a flame, our ancestors set humanity on a path distinct from other species.
Division of Labor and Collaboration — Hunters hunted, craftsmen made tools, and society advanced through collective roles.
Reflection — Unlike animals that adapt slowly via genetic changes, humans can anticipate risks and iterate toward better solutions.

---

AI and the Mirror of Human Evolution

With large language models, we see echoes of human progress. GPT-5 didn’t entirely meet AGI expectations, prompting a rethink: how can we extend the capabilities of existing models?

A single model is like a talented but clumsy apprentice — able to write and code but poor at collaboration and self-correction. A new paradigm is emerging.

---

MGX: From Intelligence to Simulated Society

MGX is not a single model. Instead, it is a virtual team of multiple specialized agents:

Some handle requirement analysis
Others design architecture
Others code or perform in-depth research

These agents use tools, divide tasks, collaborate, reflect, and fix errors — just as humans do.

> If GPT is a replication of intelligence, MGX is a simulation of society.

DeepWisdom, MGX's developer, has proven technical credentials. They created OpenManus in three hours with five programmers, and their open-source multi-agent framework MetaGPT is widely recognized.

Founder and CEO Wu Chenglin previously led large-scale AI projects at Tencent and authored MetaGPT.

---

1. Field Test — MGX’s AI Team in Action

MGX (MetaGPT X) is marketed as a “24/7 AI development team.” Users simply enter a requirement, and MGX automatically forms a virtual team.

Homepage Overview:

Yellow dot: Team Leader Mike
Blue dot: Engineer Alex
Purple dot: Product Manager Emma
Green dot: Data Analyst David
White dot: Architect Bob

In the original screenshot, Area B (the input box) lets you summon specific agents, and Area C lets you disable multi-agent mode.

---

Test 1: Building a Data-Driven Travel Website

Prompt:

> Create a National Day travel guide website. When users input a destination, the system automatically generates cultural, natural, and food routes.

Outcome:

Mike summarized the request.
Alex built a demo (only Beijing & Shanghai data).
Two-pane MGX view: left shows agent tasks, right previews the project.
Data Analyst David provided a Jupyter Notebook report (metrics, visualization, correlation analysis).
Emma wrote a detailed product requirements doc (user stories, competitor analysis, recommendation algorithm, commercialization).

> Example prompt:

> `@David Analyze tourist-attraction data for popular domestic cities across the web, produce a report, and support the website development`

Emma proposed a clear recommendation framework:

> Algorithm idea: City types → assemble candidates → personalized re-ranking

> Extensible functions: `generateRoute` and `calculatePersonalizedScore`.
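
The three-step idea above (city type, candidate assembly, personalized re-ranking) can be sketched roughly as follows. This is a minimal illustration, not MGX's or Emma's actual code: the sample data, the city-type table, and the 50/50 scoring weights are all made-up assumptions, and `generate_route` and `calculate_personalized_score` are Python-style stand-ins for the `generateRoute` and `calculatePersonalizedScore` functions named in the requirements doc.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attraction:
    name: str
    city: str
    category: str      # "culture", "nature", or "food"
    popularity: float  # 0.0 to 1.0, invented values for illustration


# Assumed lookup tables; the article does not publish the real ones.
CITY_TYPES = {"Beijing": "historic", "Shanghai": "metropolitan"}
TYPE_CATEGORIES = {
    "historic": {"culture", "nature", "food"},
    "metropolitan": {"culture", "food"},
}

ATTRACTIONS = [
    Attraction("Forbidden City", "Beijing", "culture", 0.95),
    Attraction("Great Wall at Mutianyu", "Beijing", "nature", 0.90),
    Attraction("Peking duck dinner", "Beijing", "food", 0.80),
    Attraction("The Bund", "Shanghai", "culture", 0.90),
    Attraction("Xiaolongbao tasting", "Shanghai", "food", 0.85),
]


def calculate_personalized_score(item, interests):
    """Blend general popularity with the user's interest in the item's category."""
    return 0.5 * item.popularity + 0.5 * interests.get(item.category, 0.0)


def generate_route(city, interests, limit=3):
    """Step 1: look up the city type. Step 2: assemble candidates. Step 3: re-rank."""
    city_type = CITY_TYPES.get(city)
    if city_type is None:
        return []  # no data for this city, as in the two-city demo
    allowed = TYPE_CATEGORIES[city_type]
    candidates = [
        a for a in ATTRACTIONS if a.city == city and a.category in allowed
    ]
    ranked = sorted(
        candidates,
        key=lambda a: calculate_personalized_score(a, interests),
        reverse=True,
    )
    return [a.name for a in ranked[:limit]]


# A food-focused traveler sees the food stop ranked first.
food_lover_route = generate_route("Beijing", {"food": 1.0})
```

Keeping scoring in its own function means the weights can change without touching candidate assembly, which is presumably what "extensible" refers to in the requirements doc.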

Final site version added richer content plus a scoring system.

Live demo: https://mgx-w6xvo6ydqlh.mgx.world

MGX also lets you select a visual element on the page and edit it directly, which reduces the randomness of prompt-only edits.

---

Test 2: Deep Research + Slide Output

Prompt:

> Compare Xiaomi 17 series and iPhone 17 series

Process:

MGX created a todo plan after analyzing requirements.
It delivered a research report citing 36 sources, with more Chinese domestic sources than GPT-5’s report, which cited 18 international sources.

Report link: Download here

Key Findings

Positioning: Xiaomi 17 — aggressive specs & affordability; iPhone 17 — balanced stability & ecosystem.
Performance: Xiaomi’s Snapdragon 8 beats iPhone’s A19 Pro in multi-core; iPhone dominates single core.
Camera: Xiaomi excels in night/colour; iPhone wins in pro video tools.
Screen: Both premium; Xiaomi innovates more; iPhone refines features.
Battery/Charging: Xiaomi — bigger battery, faster charging; iPhone — optimized efficiency.

MGX then produced a slide deck. Quality lagged behind dedicated PPT tools (UI overlap issues), but MGX showed self-reflection:

Alex admitted poor UI rendering and missing visual comparisons.
Updated slides gained interactive charts.

Slide demo: https://mgx-yi53lrvz5ac.mgx.world

---

Test 3: "Tank Battle" Game — Many Agents, Many Problems?

We tested two setups: forcing all five agents in with @-mentions, and letting MGX auto-select agents.

Prompt 1:

`Build a Tank Battle game @Mike @Emma @Bob @Alex @David`

Prompt 2:

`Build a Tank Battle game`

Expected roles:

Mike — overall coordination
Emma — game design
Bob — architecture
David — data support
Alex — implementation

Result:

The multi-agent version collapsed into chaotic role overlap (for example, the data analyst ended up writing code).

The version Alex built solo worked.

Multi-agent game: View here

Solo build: View here

---

Lessons Learned

Coordination gaps emerged when Agents lacked clear routing.
Effective role enforcement is crucial for success.
Multi-agent systems shine in diverse skill tasks but need robust SOPs for complex, synchronous projects.
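
To make "clear routing" and "role enforcement" concrete, here is a minimal sketch of a router in which every task type has exactly one owner, and any other agent's attempt to claim it is rejected. The agent names come from the tests above. The task types, the `Router` class, and the `RoleViolation` exception are hypothetical and do not describe how MGX works internally.

```python
from dataclasses import dataclass, field

# Assumed routing table: one owning agent per task type.
ROUTES = {
    "coordination": "Mike",
    "requirements": "Emma",
    "architecture": "Bob",
    "data_analysis": "David",
    "implementation": "Alex",
}


class RoleViolation(Exception):
    """Raised when an agent tries to take a task outside its role."""


@dataclass
class Router:
    routes: dict
    log: list = field(default_factory=list)

    def assign(self, task_type, description):
        """Send a task to the single agent that owns its type."""
        if task_type not in self.routes:
            raise KeyError(f"no route for task type {task_type!r}")
        owner = self.routes[task_type]
        self.log.append((owner, task_type, description))
        return owner

    def claim(self, agent, task_type):
        """Allow a claim only from the owning agent."""
        owner = self.routes.get(task_type)
        if agent != owner:
            raise RoleViolation(
                f"{agent} cannot take {task_type!r}; owner is {owner}"
            )
        return True


router = Router(ROUTES)
owner = router.assign("implementation", "tank movement and collision")

# The failure seen in Test 3 (the data analyst writing code) is blocked here.
try:
    router.claim("David", "implementation")
    blocked = False
except RoleViolation:
    blocked = True
```

A hard rejection is the bluntest possible policy. A production system might instead reroute the task to its owner or escalate to the team leader.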

Future multi-agent systems must implement:

Dynamic task division
Intelligent routing
Self-evaluation
Memory management
Cross-environment execution
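
Of the items above, self-evaluation is the easiest to illustrate in a few lines. The sketch below is a generic produce, evaluate, revise loop of the kind Alex appeared to follow when fixing the slides in Test 2. The `refine` helper and the toy `produce` and `evaluate` functions are assumptions made for illustration, not MGX code.

```python
from typing import Callable, List, Tuple


def refine(
    produce: Callable[[List[str]], str],
    evaluate: Callable[[str], List[str]],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Revise a draft until evaluation finds no issues or rounds run out.

    Returns the final draft and the number of evaluation rounds performed.
    """
    draft = produce([])  # first attempt, no feedback yet
    for round_no in range(1, max_rounds + 1):
        issues = evaluate(draft)
        if not issues:
            return draft, round_no
        draft = produce(issues)  # revise using the reported issues
    return draft, max_rounds


# Toy stand-ins: the "slide" passes review only once it contains a chart.
def produce(feedback: List[str]) -> str:
    slide = "Xiaomi 17 vs iPhone 17"
    if "missing visual comparison" in feedback:
        slide += " [chart]"
    return slide


def evaluate(draft: str) -> List[str]:
    return [] if "[chart]" in draft else ["missing visual comparison"]


final_slide, rounds = refine(produce, evaluate)
```

The `max_rounds` cap matters in practice: without it, an evaluator that is never satisfied would keep the agent revising indefinitely.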

---

Broader Implications

Multi-agent approaches lower complexity barriers, enabling users to delegate entire projects to an AI team — pushing AI toward organizational intelligence.

---

Platforms like AiToEarn (official site) apply similar multi-agent orchestration principles to AI-generated content:

Cross-platform publishing: Douyin, Kwai, WeChat, Bilibili, Xiaohongshu, Instagram, YouTube, X, etc.
Integrated analytics and AI model rankings
Open-source collaboration: AiToEarn open-source repository

---


Bottom Line

MGX demonstrates that multi-agent teams can deliver richer output than a single model on tasks that need diverse skills, as in Tests 1 and 2.
But coordination matters: without strict role routing, performance drops, as Test 3 showed.
Next-gen direction: smarter orchestration, dynamic tasking, and continuous self-improvement.
