Data-Driven Intelligent Diagnostic Systems: Multi-Agent Technology Implementation in Production Environments | AICon Beijing

Honghao Wang

24 Nov 2025 — 3 min read

AICon Beijing 2025 — Intelligent Multi-Agent Diagnostics

Date: 2025-11-24 13:41 (Beijing)

Explore the core of Alibaba Cloud’s Intelligent Diagnostics system and decode the practical path for multi-agent collaborative troubleshooting.

---

Event Overview

Dates: December 19–20, 2025

Location: Beijing

Theme: Exploring the Boundaries of AI Applications

Key Topics

Enterprise-level agent deployment
Context engineering
AI product innovation
Large-model applications enhancing R&D and business operations

The conference will feature experts from Alibaba, ByteDance, Huawei, JD.com, Kuaishou, Meituan, and top startups, who will share first-hand experience putting large models into production. The goal is to uncover more possibilities for enterprise AI and open new paths to business growth.

---

Featured Speaker

Zhao Qingjie — Alibaba Cloud Serverless Infrastructure Lead & AgentRun Product R&D Lead

Presentation: Data-Driven Intelligent Diagnostics System: Technical Implementation of Multi-Agent Systems in Production Environments

Session Focus

Traditional diagnostics struggle in highly dynamic, high-dimensional production environments. Zhao Qingjie will present Alibaba Cloud’s multi-agent system, combining:

Full-stack observability data (metrics, logs, traces, eBPF events)
LLM reasoning capabilities
Collaborative multi-agent architecture

This system enables end-to-end fault detection, root cause analysis, and repair suggestion generation in production — reducing:

MTTR by over 40%
False alerts by 65%
Manual interventions by 60%

---

Core Design Principles

Role-Based Agent Division: Perception Agent, Inference Agent, Validation Agent, Execution Agent
Data–Model–Action Closed Loop: Converting real-time data into semantic context and automated actions
Seamless SRE Integration: Safely replacing manual inspections and alert responses

---

Speaker Background

Zhao Qingjie specializes in:

Serverless architecture
AI agents
PaaS development
Large-scale distributed systems

Previously: Core PaaS Platform Lead at Baidu, where the platform supported 80% of Baidu’s online services under high-concurrency, high-availability conditions.

---

Presentation Outline

Part 1 — Introduction

New challenges in intelligent operations
Complexity trends in cloud-native environments
From single-model assistance to multi-agent collaboration

Part 2 — System Architecture

Observability data layer
Agent coordination engine
Action execution layer

---

Intelligent Agent Roles

Perception Agent: Data collection & anomaly detection
Inference Agent: Root cause analysis
Validation Agent: Hypothesis testing
Execution Agent: Automated repair & recommendations
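
The announcement does not say how these roles are implemented. As a minimal sketch of the division of labor, assume each agent is a plain function that enriches a shared incident record and hands it to the next one. Every name here (`Incident`, `diagnose`, the 500 ms threshold, the `cold_start` rule) is hypothetical, and the hard-coded rule stands in for the LLM reasoning the talk describes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Incident:
    """Shared state handed from one agent to the next (hypothetical shape)."""
    service: str
    latency_ms: List[float]
    anomaly: bool = False
    hypothesis: Optional[str] = None
    validated: bool = False
    recommendation: Optional[str] = None


def perception_agent(incident: Incident, threshold_ms: float = 500.0) -> Incident:
    # Anomaly detection: flag the incident when the latest sample
    # exceeds a fixed threshold (an assumed, illustrative rule).
    incident.anomaly = incident.latency_ms[-1] > threshold_ms
    return incident


def inference_agent(incident: Incident) -> Incident:
    # Root cause analysis: a single hard-coded rule stands in for LLM reasoning.
    if incident.anomaly:
        incident.hypothesis = "cold_start"
    return incident


def validation_agent(incident: Incident) -> Incident:
    # Hypothesis testing: a cold start should look like a sudden jump,
    # so require the latest sample to be at least 3x the previous one.
    if incident.hypothesis == "cold_start" and len(incident.latency_ms) >= 2:
        prev, last = incident.latency_ms[-2], incident.latency_ms[-1]
        incident.validated = last >= 3 * prev
    return incident


def execution_agent(incident: Incident) -> Incident:
    # This sketch only produces a recommendation; it never acts on the system.
    if incident.validated:
        incident.recommendation = f"Pre-warm instances for {incident.service}"
    return incident


def diagnose(incident: Incident) -> Incident:
    # Run the four roles in order: perceive, infer, validate, execute.
    for agent in (perception_agent, inference_agent, validation_agent, execution_agent):
        incident = agent(incident)
    return incident


result = diagnose(Incident(service="checkout-fn", latency_ms=[120.0, 130.0, 900.0]))
print(result.recommendation)
```

Keeping validation as its own step means an unconfirmed hypothesis never reaches the execution stage, which is one plausible reading of why the roles are separated.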

---

Data Closed Loop

Unifying metrics, logs, traces, and eBPF events into a consistent agent context model.
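
The talk's actual context model is not published here. One way to picture the idea, under the assumption that every signal can be reduced to a kind, a service, a timestamp, and a free-form body, is the sketch below. `Signal` and `build_context` are illustrative names, not part of any Alibaba Cloud API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class Signal:
    """One observability record normalized to a common shape (hypothetical)."""
    kind: str          # "metric", "log", "trace", or "ebpf"
    service: str
    timestamp: float   # seconds, on any shared clock
    body: Dict[str, Any] = field(default_factory=dict)


def build_context(
    signals: List[Signal], service: str, start: float, end: float
) -> Dict[str, List[Signal]]:
    """Group one service's signals within [start, end] by kind, ordered by time."""
    context: Dict[str, List[Signal]] = {"metric": [], "log": [], "trace": [], "ebpf": []}
    for s in sorted(signals, key=lambda s: s.timestamp):
        if s.service == service and start <= s.timestamp <= end:
            # setdefault tolerates signal kinds beyond the four listed above.
            context.setdefault(s.kind, []).append(s)
    return context


signals = [
    Signal("log", "checkout-fn", 12.0, {"msg": "timeout calling db"}),
    Signal("metric", "checkout-fn", 10.0, {"name": "latency_ms", "value": 900}),
    Signal("ebpf", "checkout-fn", 11.0, {"event": "tcp_retransmit"}),
    Signal("trace", "other-fn", 11.5, {"span": "GET /cart"}),
]
ctx = build_context(signals, "checkout-fn", start=9.0, end=13.0)
print({kind: len(items) for kind, items in ctx.items()})
```

A single time-ordered, per-service view like this is what would let a reasoning step see a latency spike, a network event, and an error log side by side instead of querying four separate backends.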

---

Technical Implementation Highlights

Multi-Agent Communication: Dynamic task decomposition & consensus protocols
LLM + Domain Knowledge: Prompt design, tool invocation, hallucination suppression
Security & Reliability: Permission controls, audit logs, manual circuit breakers
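
The three safeguards in the last item (permission controls, audit logs, manual circuit breakers) can be combined in a single wrapper around agent actions. The sketch below is an assumption about how such a guard might look, not the system described in the talk; `GuardedExecutor`, the action names, and the log fields are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Set


class GuardedExecutor:
    """Runs agent actions behind an allow-list, an audit log, and a manual breaker."""

    def __init__(self, allowed_actions: Set[str]) -> None:
        self.allowed_actions = allowed_actions
        self.breaker_open = False  # True means a human has halted automation
        self.audit_log: List[Dict[str, object]] = []

    def trip_breaker(self) -> None:
        # Manual circuit breaker: an operator stops all automated actions.
        self.breaker_open = True

    def run(self, agent: str, action: str, fn: Callable[[], str]) -> str:
        if self.breaker_open:
            outcome = "blocked: breaker open"
        elif action not in self.allowed_actions:
            outcome = "denied: not permitted"
        else:
            outcome = fn()
        # Every attempt is recorded, including blocked and denied ones.
        self.audit_log.append(
            {"ts": time.time(), "agent": agent, "action": action, "outcome": outcome}
        )
        return outcome


executor = GuardedExecutor(allowed_actions={"restart_pod"})
print(executor.run("execution-agent", "restart_pod", lambda: "ok"))
print(executor.run("execution-agent", "drop_table", lambda: "ok"))
executor.trip_breaker()
print(executor.run("execution-agent", "restart_pod", lambda: "ok"))
```

Checking the breaker before the allow-list means a human halt overrides everything, and logging refused attempts keeps a trail of what the agents tried to do as well as what they did.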

---

Production Best Practices

Scenarios:

Serverless cold start anomalies
DB slow query storms
Container resource contention

Results:

MTTR ↓ 40%+
Invalid alerts ↓ 65%
Human intervention ↓ 60%

Lessons Learned:

From POC to large-scale deployment
Cold start optimization
Cost vs latency trade-offs

---

Summary & Outlook

Audience Takeaways:

Framework for enterprise-grade intelligent agent reliability
Patterns for multi-agent observability design
Integration with open-source diagnostic standards like OpenAgentTracing

---

Conference Extras

Additional sessions:

Software dev in the LLM era
Context Engineering
Large-model system engineering
Enterprise Agent design
Search, advertising, and recommendation (SAR) systems in large-model contexts
Multimodal AI innovations

Over 50 experts will share live insights at AICon Beijing.

Special Offer: 10% off tickets (save 580 RMB) — contact: 13269078023
