Data-Driven Intelligent Diagnostic Systems: Multi-Agent Technology Implementation in Production Environments | AICon Beijing
AICon Beijing 2025 — Intelligent Multi-Agent Diagnostics
Date: 2025-11-24 13:41 (Beijing)

Explore the core of Alibaba Cloud’s Intelligent Diagnostics system and decode the practical path for multi-agent collaborative troubleshooting.

---
Event Overview
Dates: December 19–20, 2025
Location: Beijing
Theme: Exploring the Boundaries of AI Applications
Key Topics
- Enterprise-level agent deployment
- Context engineering
- AI product innovation
- Large-model applications enhancing R&D and business operations
The conference will feature experts from Alibaba, ByteDance, Huawei, JD.com, Kuaishou, Meituan, and top startups, who will share first-hand experience deploying large models. The goal is to uncover more possibilities for enterprise AI and open new paths to business growth.
---
Featured Speaker
Zhao Qingjie — Alibaba Cloud Serverless Infrastructure Lead & AgentRun Product R&D Lead
Presentation: Data-Driven Intelligent Diagnostic Systems: Multi-Agent Technology Implementation in Production Environments
Session Focus
Traditional diagnostics struggle in highly dynamic, high-dimensional production environments. Zhao Qingjie will present Alibaba Cloud’s multi-agent system, combining:
- Full-stack observability data (metrics, logs, traces, eBPF events)
- LLM reasoning capabilities
- Collaborative multi-agent architecture
This system enables end-to-end fault detection, root cause analysis, and repair suggestion generation in production, reducing:
- MTTR by over 40%
- Invalid alerts by 65%
- Manual interventions by 60%
---
Core Design Principles
- Role-Based Agent Division: Perception Agent, Inference Agent, Validation Agent, Execution Agent
- Data–Model–Action Closed Loop: Converting real-time data into semantic context and automated actions
- Seamless SRE Integration: Safely replacing manual inspections and alert responses
---
Speaker Background
Zhao Qingjie specializes in:
- Serverless architecture
- AI agents
- PaaS development
- Large-scale distributed systems
Previously: Core PaaS Platform Lead at Baidu, where the platform supported 80% of Baidu’s online services under high-concurrency, high-availability conditions.
---
Presentation Outline
Part 1 — Introduction
- New challenges in intelligent operations
- Complexity trends in cloud-native environments
- From single-model assistance to multi-agent collaboration
Part 2 — System Architecture
- Observability data layer
- Agent coordination engine
- Action execution layer
---
Intelligent Agent Roles
- Perception Agent: Data collection & anomaly detection
- Inference Agent: Root cause analysis
- Validation Agent: Hypothesis testing
- Execution Agent: Automated repair & recommendations
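The four roles above can be read as stages of a pipeline. The following is a minimal sketch of that hand-off, not the system presented in the talk: the rule table stands in for LLM reasoning, and every name in it (`Anomaly`, `Hypothesis`, `CAUSE_RULES`, `diagnose`, the metric names, the `min_ratio` margin) is a hypothetical chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Anomaly:
    service: str
    metric: str
    value: float
    threshold: float


@dataclass
class Hypothesis:
    anomaly: Anomaly
    root_cause: str


# Hypothetical rule table standing in for LLM reasoning. The metric names are
# invented to mirror the production scenarios named in the talk outline.
CAUSE_RULES = {
    "cold_start_ms": "too few pre-warmed instances",
    "slow_queries_per_min": "slow query storm on the database",
    "cpu_throttle_ratio": "container resource contention",
}


def perceive(service: str, samples: dict[str, float],
             thresholds: dict[str, float]) -> list[Anomaly]:
    """Perception Agent: flag every metric that exceeds its threshold."""
    return [
        Anomaly(service, metric, value, thresholds[metric])
        for metric, value in samples.items()
        if metric in thresholds and value > thresholds[metric]
    ]


def infer(anomaly: Anomaly) -> Hypothesis:
    """Inference Agent: propose a root cause for one anomaly."""
    return Hypothesis(anomaly, CAUSE_RULES.get(anomaly.metric, "unknown"))


def validate(hypothesis: Hypothesis, min_ratio: float = 1.5) -> bool:
    """Validation Agent: keep hypotheses with a known cause and a clear breach."""
    a = hypothesis.anomaly
    return hypothesis.root_cause != "unknown" and a.value >= min_ratio * a.threshold


def recommend(hypothesis: Hypothesis) -> str:
    """Execution Agent: emit a recommendation only; no repair is performed here."""
    a = hypothesis.anomaly
    return (f"{a.service}: {a.metric}={a.value} (limit {a.threshold}); "
            f"suspected {hypothesis.root_cause}")


def diagnose(service: str, samples: dict[str, float],
             thresholds: dict[str, float]) -> list[str]:
    """Run perception, inference, validation, and execution in order."""
    hypotheses = [infer(a) for a in perceive(service, samples, thresholds)]
    return [recommend(h) for h in hypotheses if validate(h)]


if __name__ == "__main__":
    for line in diagnose(
        "fn-a",
        {"cold_start_ms": 2500.0, "cpu_throttle_ratio": 0.25},
        {"cold_start_ms": 1000.0, "cpu_throttle_ratio": 0.2},
    ):
        print(line)
```

Keeping validation as a separate stage means a weak hypothesis is dropped before any recommendation is produced, which is the point of giving each role its own agent.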
---
Data Closed Loop
Unifying metrics, logs, traces, and eBPF events into a consistent agent context model.
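One way to picture such a context model is a single record type for all four signal kinds, grouped per service and time window. The sketch below is an assumption-laden illustration only: `Signal`, `AgentContext`, `build_context`, and the field names are hypothetical and are not taken from the talk.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The four signal kinds named in the session description.
SIGNAL_KINDS = ("metric", "log", "trace", "ebpf")


@dataclass
class Signal:
    kind: str         # one of SIGNAL_KINDS
    service: str
    timestamp: float  # seconds since epoch
    payload: dict


@dataclass
class AgentContext:
    service: str
    start: float
    end: float
    signals: dict[str, list[Signal]] = field(
        default_factory=lambda: {kind: [] for kind in SIGNAL_KINDS}
    )

    def summary(self) -> str:
        """Render a compact text view that could be placed in an LLM prompt."""
        counts = ", ".join(f"{kind}={len(self.signals[kind])}" for kind in SIGNAL_KINDS)
        return f"service={self.service} window=[{self.start}, {self.end}] {counts}"


def build_context(service: str, start: float, end: float,
                  raw: list[Signal]) -> AgentContext:
    """Group one service's signals inside [start, end], ordered by time."""
    ctx = AgentContext(service, start, end)
    for signal in sorted(raw, key=lambda s: s.timestamp):
        if signal.kind not in SIGNAL_KINDS:
            raise ValueError(f"unknown signal kind: {signal.kind}")
        if signal.service == service and start <= signal.timestamp <= end:
            ctx.signals[signal.kind].append(signal)
    return ctx


if __name__ == "__main__":
    raw = [
        Signal("log", "api", 12.0, {"msg": "timeout"}),
        Signal("metric", "api", 10.0, {"p99_ms": 900}),
        Signal("trace", "db", 11.0, {"span": "query"}),
    ]
    print(build_context("api", 10.0, 20.0, raw).summary())
```

Because every agent reads the same `AgentContext`, a hypothesis raised from a metric can be checked against logs or traces from the same window without a second query format.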
---
Technical Implementation Highlights
- Multi-Agent Communication: Dynamic task decomposition & consensus protocols
- LLM + Domain Knowledge: Prompt design, tool invocation, hallucination suppression
- Security & Reliability: Permission controls, audit logs, manual circuit breakers
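The security controls in the last bullet can be combined in a single gate in front of every automated action. The sketch below shows one possible shape under stated assumptions: `GuardedExecutor`, its method names, and the outcome strings are hypothetical, and the callable passed as `run` stands in for whatever repair action a real system would perform.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class GuardedExecutor:
    permissions: dict[str, set[str]]  # agent name -> allowed action names
    halted: bool = False              # manual circuit breaker state
    audit_log: list[dict] = field(default_factory=list)

    def _audit(self, actor: str, action: str, target: str, outcome: str) -> None:
        """Append one audit record for every decision, allowed or not."""
        self.audit_log.append({
            "ts": time.time(), "actor": actor, "action": action,
            "target": target, "outcome": outcome,
        })

    def halt(self, operator: str) -> None:
        """A human opens the breaker: all automated actions stop."""
        self.halted = True
        self._audit(operator, "halt", "*", "breaker_open")

    def resume(self, operator: str) -> None:
        """A human closes the breaker: automated actions may run again."""
        self.halted = False
        self._audit(operator, "resume", "*", "breaker_closed")

    def execute(self, agent: str, action: str, target: str,
                run: Callable[[str], None]) -> str:
        """Run an action only if the breaker is closed and the agent is permitted."""
        if self.halted:
            outcome = "blocked"
        elif action not in self.permissions.get(agent, set()):
            outcome = "denied"
        else:
            try:
                run(target)
                outcome = "executed"
            except Exception as exc:  # record the failure instead of hiding it
                outcome = f"failed: {exc}"
        self._audit(agent, action, target, outcome)
        return outcome


if __name__ == "__main__":
    executor = GuardedExecutor({"execution-agent": {"restart_pod"}})
    print(executor.execute("execution-agent", "restart_pod", "pod-1", print))
    print(executor.execute("execution-agent", "drop_table", "orders", print))
```

Checking the breaker before the permission table means a human stop takes effect even for actions the agent is otherwise allowed to run, and every refusal still leaves an audit record.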
---
Production Best Practices
Scenarios:
- Serverless cold start anomalies
- DB slow query storms
- Container resource contention
Results:
- MTTR ↓ 40%+
- Invalid alerts ↓ 65%
- Human intervention ↓ 60%
Lessons Learned:
- Moving from POC to large-scale deployment
- Cold start optimization
- Cost vs latency trade-offs
---
Summary & Outlook
Audience Takeaways:
- Framework for enterprise-grade intelligent agent reliability
- Patterns for multi-agent observability design
- Integration with open-source diagnostic standards like OpenAgentTracing
---
Conference Extras
Additional sessions:
- Software dev in the LLM era
- Context Engineering
- Large-model system engineering
- Enterprise Agent design
- Search, advertising, and recommendation (SAR) systems in large-model contexts
- Multimodal AI innovations
Over 50 experts will share live insights at AICon Beijing.
Special Offer: 10% off tickets (save 580 RMB) — contact: 13269078023
