Doing the Hard but Right AI Infra Innovation — Interview with Liu Tongxuan, Head of the xLLM Domestic Large Model Inference Engine Community
xLLM: Pioneering Open-Source AI Infrastructure in China
In the age of rapidly advancing domestic large models like DeepSeek, AI Infrastructure (AI Infra) has become as essential as water, electricity, and gas in the digital era.
For years, this critical field was dominated by overseas frameworks such as vLLM and TensorRT-LLM.
That began to change at the end of August, when a young team introduced xLLM — a homegrown inference engine designed to bridge domestic chips and large model applications.
---
December 6: Offline Meetup in Beijing
The xLLM community — founded just three months ago — is hosting a Beijing offline Meetup with the theme:
> “Co-building an Open-Source AI Infra Ecosystem”
Positioned as the central nervous system of AI Infra, xLLM operates much like an intelligent operating system, seamlessly connecting:
- Bottom layer: Domestic chips and hardware
- Top layer: Large model applications
This enables efficient translation of computing power into real-world intelligent capabilities.
---

The Origin Story: Breaking the Ice for Chinese AI Infra
Liu Tongxuan, project lead, describes the key moment when xLLM’s path was chosen:
> “We were at a crossroads — do we follow the well-trodden path of vLLM or SGLang, or build a brand-new engine tailored for domestic chips?”
The Decision
- Option A: Incrementally optimize established frameworks
- Option B: Build from scratch — harder, but right
The team chose Option B, setting out to design an engine matching the highest international standards, but deeply optimized for domestic hardware.
---
Key Differentiators
xLLM vs. vLLM / SGLang:
- Supports both large language models and multimodal models:
  - Text generation
  - Generative recommendation
  - Text-to-image and text-to-video AI-generated content (AIGC)
- Hardware compatibility:
  - Deep optimization for multiple families of domestic chips
  - Superior performance on Ascend hardware compared with vllm-ascend
- Complete tech stack:
  - Fully open-sourced large-model serving components
  - Features such as global request dispatch and dynamic PD (prefill-decode) separation
  - Expansion from a bare inference engine to full inference services
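The article does not show xLLM's internals, but "dynamic PD separation" refers to the widely used prefill-decode disaggregation pattern: a request's prompt is processed once in a prefill stage, then handed off to a decode pool that emits one token per step. A minimal illustrative sketch of that scheduling idea, with all class and method names hypothetical (this is not xLLM's actual API):

```python
from dataclasses import dataclass
from collections import deque

@dataclass
class Request:
    req_id: int
    prompt_len: int
    generated: int = 0
    max_new_tokens: int = 16

class PDScheduler:
    """Toy scheduler that separates prefill and decode into two pools,
    mimicking the general PD-separation idea (not xLLM's real code)."""

    def __init__(self) -> None:
        self.prefill_queue: deque[Request] = deque()
        self.decode_pool: list[Request] = []

    def submit(self, req: Request) -> None:
        self.prefill_queue.append(req)

    def step(self) -> list[tuple[int, str]]:
        events: list[tuple[int, str]] = []
        # Prefill stage: process one waiting prompt in full, then hand the
        # request over to the decode pool (in a real system, the KV cache
        # would migrate between prefill and decode workers here).
        if self.prefill_queue:
            req = self.prefill_queue.popleft()
            events.append((req.req_id, "prefill"))
            self.decode_pool.append(req)
        # Decode stage: every active request emits one token per step.
        finished = []
        for req in self.decode_pool:
            req.generated += 1
            events.append((req.req_id, "decode"))
            if req.generated >= req.max_new_tokens:
                finished.append(req)
        for req in finished:
            self.decode_pool.remove(req)
        return events
```

Keeping the two stages in separate pools is what lets a serving layer size and schedule them independently: prefill is compute-bound on long prompts, while decode is latency-sensitive and memory-bound.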
---
The Team Behind the Breakthrough
- Average age: Under 30
- Mostly post-95 engineers with just 2–3 years of experience
- Achieved core engine build through perseverance and grit
> “A group of young people, powered by determination, took on a challenge few dared to face,” Liu notes proudly.
---

Breaking the Cocoon: Three Strategies for xLLM's Ecosystem Moat
After nearly a year of R&D, xLLM launched on GitHub at the end of August — quickly igniting community interest.
> “We didn’t expect such attention. It’s still early, with much room to improve architecture and usability,” Liu admits.
Key Early Wins
- First deployment at the Hangzhou Computing Center
- Alignment with the State Council's "AI+" initiative, accelerating efficient model training and inference
- Real-world applications, including power plant deployments in Xinjiang on domestic integrated machines
---
2025 Roadmap
xLLM’s focus for the next year:
- Scenario Depth
  - Breakthroughs in complex generative systems:
    - Text-to-video
    - Generative recommendation
- Model Alliance
  - Deepening partnerships with domestic large model vendors
  - Rapid response to frontline model updates
- Chip Synergy
  - Extreme performance tuning for domestic chips
Ultimate Goal:
Evolve into a data-center-level “intelligent operating system” under the AI for System paradigm.
---
Ecosystem Synergy: AiToEarn
Open platforms such as AiToEarn complement innovations like xLLM by providing:
- AI content generation and publishing across platforms:
  - Douyin, Kwai, WeChat, Bilibili, Xiaohongshu, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, and X (Twitter)
- Analytics tools for performance tracking
- AI model rankings
This connection between core technical breakthroughs and global content reach amplifies the real-world impact of AI innovation.
---
Final Word from Liu Tongxuan
> “From filling ecosystem gaps to powering Xinjiang’s energy infrastructure, xLLM is turning bottlenecks into accelerators.”
This young team is redefining China's position in global AI Infra, moving from technology follower to standard setter. Every line of code on GitHub and every optimized power plant system is a direct contribution to the national Artificial Intelligence Plus strategy.

---
Bottom line:
By combining high-performance domestic AI Infra (xLLM) with open content monetization ecosystems (AiToEarn), China is building a self-reliant, globally competitive technological foundation.