# QCon 2025 – AIGC-Powered Visual Generation in E-Commerce
- *Date*: **2025-11-03 13:31 Beijing**
- *Speaker*: **Li Yan**, Head of Visual & AIGC Department, JD Retail
- *Editor*: **Kitty**
- *Planning*: **QCon Global Software Development Conference**

---
## Introduction
As **AIGC (AI-Generated Content)** reshapes every industry, **visual generation technology** is becoming a key driver in rebuilding the e-commerce ecosystem.

As e-commerce shifts from **“product display”** to **“content-driven”** selling, brands face an unprecedented demand for **massive, diverse, and highly targeted visual assets**. Traditional manual production cannot keep pace with the efficiency and cost this requires.

**Large model–driven AIGC** offers a breakthrough:
- Bulk generation of product images and videos.
- Hyper-personalized materials for specific user profiles.
- 90% reduction in production costs.
- 30% increase in conversion rates.

**Dr. Li Yan** explains the technical framework and deployment strategies for *extreme personalization* (one product, a thousand looks) in the e-commerce 2.0 era. The talk covers **two core models**, merchant enablement practices, and an outlook on what comes next.

---
## Talk Outline

1. **History of E-Commerce**
2. **Key Features of E-Commerce 2.0**
3. **Personalized Material Generation (“Thousand Users, Thousand Materials”)**
4. **AIGC Technologies Empowering Merchants**
---
## Evolution of E-Commerce

- **Pre-1960s**: Offline trade only; barter remained common in some places.
- **1960s–1990s**: Computing advances → EDP/EDI systems → Digital marketplaces like **Amazon**, **JD.com**, **Alibaba**.
- **Post-2005**: Mobile internet era → Shelf-based + Content-based e-commerce → personalized search/recommendations (“One Person, One Interface”) → **E-Commerce 1.0**.
- **2022 onwards**: ChatGPT, Midjourney, large models, embodied intelligence, 3D/XR → **E-Commerce 2.0**.
---
## E-Commerce 2.0 Characteristics
1. **Supply–Demand Matching Shift**: From "people search for goods" → "goods search for people" using large models and behavioral profiling.
2. **Optimized Supply Chain**: Dynamic scheduling, drones/unmanned delivery for last-mile fulfillment.
3. **Full-Process AI Services**: AI-driven pre-sales to after-sales with multimodal large models.
4. **Immersive Shopping Experiences**: Beyond 2D interfaces into interactive, virtual environments.
5. **Extreme Personalization**: From **search personalization** (*thousand users, thousand faces*) → **product content personalization** (*thousand users, thousand materials*).
---
## Thousand Users, Thousand Materials – Concept

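The same product is presented with different materials depending on the buyer type:

- **Functionality-Oriented Buyer**: Performance specs are emphasized.
- **Style-Focused Buyer**: Aesthetic pairing and OOTD (outfit of the day) examples are shown.
- **Price-Sensitive Buyer**: Promotions and discounts are highlighted.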
---
## Technical Pipeline Overview

### Input
- **Goods data**: Product images, specifications, Q&A, reviews, metadata, external knowledge via RAG.
- **Person data**: User behavior history, profile data, external context if available.
### Process
1. **Understanding Core**: E-commerce-specific multimodal LLM generates presentation instructions.
2. **Generation Core**: Controllable visual model outputs multiple asset versions.
3. **Efficiency Core**: Filters low-quality results.
4. **Distribution Core**: Manages traffic allocation via recommendation/search systems.
5. **Feedback Loop**: User data improves all cores.
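
The first four cores above can be sketched as a chain of plain functions. This is a minimal illustration under stated assumptions, not JD's implementation: every name here (`Candidate`, `understanding_core`, `run_pipeline`, and so on) is hypothetical, the multimodal LLM and the visual model are replaced by deterministic stand-ins, and the feedback loop is left out.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    instruction: str  # presentation instruction from the understanding core
    asset_id: str     # placeholder for a generated image or video
    quality: float    # score in [0, 1] used by the efficiency core


def understanding_core(goods: dict, person: dict) -> list:
    """Stand-in for the multimodal LLM: one instruction per user interest."""
    return [f"Present {goods['title']} for a {tag} shopper" for tag in person["interests"]]


def generation_core(instructions: list, versions: int = 2) -> list:
    """Stand-in for the visual model: several asset versions per instruction."""
    candidates = []
    for i, instruction in enumerate(instructions):
        for v in range(versions):
            # Quality is faked deterministically; a real system would score the output.
            quality = 0.4 if v == 0 else 0.7
            candidates.append(Candidate(instruction, f"asset-{i}-{v}", quality))
    return candidates


def efficiency_core(candidates: list, min_quality: float = 0.5) -> list:
    """Drop low-quality results before they reach users."""
    return [c for c in candidates if c.quality >= min_quality]


def distribution_core(candidates: list, slots: int) -> list:
    """Pick which assets receive traffic; here simply the top-scoring ones."""
    return sorted(candidates, key=lambda c: c.quality, reverse=True)[:slots]


def run_pipeline(goods: dict, person: dict, slots: int = 2) -> list:
    instructions = understanding_core(goods, person)
    candidates = generation_core(instructions)
    return distribution_core(efficiency_core(candidates), slots)


served = run_pipeline({"title": "black coffee"}, {"interests": ["fitness", "office"]})
print([c.asset_id for c in served])
```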
### Practical Adaptation
- Current reality is closer to **“thousand users, hundred materials”** or **“thousand users, ten materials”** due to scalability limits.
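
One way to picture "thousand users, ten materials" is to pre-generate a small set of materials and route each user to the closest one. The sketch below assumes users and materials are both described by tag sets and uses Jaccard overlap as the similarity measure; the tags, the material names, and the `pick_material` helper are hypothetical, and the talk does not say how JD performs this matching.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two tag sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pick_material(user_tags: set, materials: dict) -> str:
    """Return the name of the pre-generated material closest to the user."""
    return max(materials, key=lambda name: jaccard(user_tags, materials[name]))


# A handful of materials stands in for the "ten" shared by many users.
MATERIALS = {
    "spec_focus": {"performance", "specs"},
    "style_focus": {"style", "ootd"},
    "deal_focus": {"discount", "promotion"},
}

print(pick_material({"discount", "promotion", "style"}, MATERIALS))
```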
---
## Implementation Case – JD Jingzao Black Coffee

### Steps
1. Feed product metadata and system instructions to the multimodal LLM.
2. Identify potential consumer types (fitness, office, students, low sugar, outdoors).
3. Generate audience-specific scenarios.
4. Visual generation yields personalized materials ready for targeted ads.
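
Steps 1 to 3 can be sketched as simple prompt construction. The audience list comes from the case above; the prompt wording, the `build_scenario_prompts` helper, and the product metadata fields are assumptions made for illustration, and the actual model call is omitted.

```python
AUDIENCES = ["fitness", "office", "students", "low sugar", "outdoors"]

# Hypothetical system instruction; the real one is not given in the talk.
SYSTEM_INSTRUCTION = (
    "You are an e-commerce visual director. "
    "Describe one scene that makes the product appealing to the given audience."
)


def build_scenario_prompts(product: dict, audiences=AUDIENCES) -> dict:
    """Return one prompt per audience, ready to send to a multimodal LLM."""
    specs = "; ".join(f"{key}: {value}" for key, value in product["specs"].items())
    return {
        audience: (
            f"{SYSTEM_INSTRUCTION}\n"
            f"Product: {product['name']} ({specs})\n"
            f"Audience: {audience}"
        )
        for audience in audiences
    }


# Placeholder metadata; only the product name is taken from the case study.
product = {"name": "JD Jingzao Black Coffee", "specs": {"category": "black coffee"}}
prompts = build_scenario_prompts(product)
print(prompts["fitness"])
```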
---
## Core Models
### 1. E-Commerce Retail Multimodal Understanding Model

- **Framework**: Vision-Language Model (VLM).
- **Architecture**: MoE-structured Decoder-only LLM.
- **Challenge**: Activate retail reasoning without degrading general performance.

- **Post-Training**: GRPO (Group Relative Policy Optimization) strategy with reward dimensions (logic, clarity, semantic similarity, format accuracy).

- **CoT Sample**: A chain-of-thought (CoT) reasoning chain over multimodal product information.

- **OxygenVLM Evaluation**: Match general benchmarks, excel in retail-specific tasks.
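
To make the post-training step more concrete, here is a minimal sketch of how the four reward dimensions could feed GRPO. GRPO scores a group of responses sampled for the same prompt and normalizes each reward against the group's mean and standard deviation. The weights and the weighted-sum combination below are assumptions; the talk names the dimensions but not how they are combined.

```python
from statistics import mean, pstdev

# Assumed weights for the four reward dimensions named in the talk.
WEIGHTS = {
    "logic": 0.4,
    "clarity": 0.2,
    "semantic_similarity": 0.3,
    "format_accuracy": 0.1,
}


def combined_reward(scores: dict) -> float:
    """Weighted sum of per-dimension scores, each expected in [0, 1]."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def group_relative_advantages(rewards: list, eps: float = 1e-8) -> list:
    """Normalize each reward against its sampling group (the core GRPO idea)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Two sampled responses for one prompt, scored on every dimension.
group = [
    {"logic": 1.0, "clarity": 1.0, "semantic_similarity": 1.0, "format_accuracy": 1.0},
    {"logic": 0.0, "clarity": 0.0, "semantic_similarity": 0.0, "format_accuracy": 0.0},
]
rewards = [combined_reward(scores) for scores in group]
print(group_relative_advantages(rewards))
```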
---
### 2. Controllable Visual Generation Model

- **Multi-condition diffusion**: Inputs include timestep, text prompt, product subject, layout, stickers.
- **Future Direction**: Unify control signals into natural language; tightly couple understanding + generation.
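
The future direction of folding control signals into natural language can be illustrated with a small sketch that turns structured conditions into one text instruction. The `ControlSignals` fields mirror the inputs listed above, except the timestep, which is internal to the diffusion process; the class, the `to_instruction` helper, and the sentence templates are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ControlSignals:
    text_prompt: str
    product_subject: str
    layout: Optional[str] = None
    stickers: List[str] = field(default_factory=list)


def to_instruction(signals: ControlSignals) -> str:
    """Flatten structured control signals into a single natural-language prompt."""
    parts = [
        signals.text_prompt.strip().rstrip(".") + ".",
        f"Keep the product subject ({signals.product_subject}) visually unchanged.",
    ]
    if signals.layout:
        parts.append(f"Layout: {signals.layout}.")
    if signals.stickers:
        parts.append("Add stickers: " + ", ".join(signals.stickers) + ".")
    return " ".join(parts)


signals = ControlSignals(
    text_prompt="A cup on a gym bench",
    product_subject="black coffee box",
    layout="product centered",
    stickers=["Low sugar"],
)
print(to_instruction(signals))
```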
---
## Merchant Pain Points

1. **High SKU Volume**: Thousands per store, tens of thousands per buyer.
2. **Budget Constraints**: Visual production costs are significant.
3. **Frequent Campaigns**: Uncertain ROI due to resource variability.
---
## JD Diandian AIGC Content Platform

- **Scenarios Served**: 30+ business cases.
- **Merchants Supported**: 800,000+.
- **Usage**: 10+ million daily invocations.
- **Efficiency Gains**: 95%+ improvement, major cost reduction.
---
## OxygenVision – New JD Diandian Platform

### Upgrades
1. Natural language interaction.
2. LLM-driven task planning.
3. Algorithmic product consistency.
4. Integrated AB testing.
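
Integrated AB testing usually needs two pieces: a stable assignment of users to variants and a metric to compare them. The sketch below uses a hash of the experiment name and user ID for assignment and click-through rate as the metric. Both choices, along with the function names, are assumptions; the talk does not describe how OxygenVision implements its tests.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, variants: list) -> str:
    """Deterministically map a user to one variant of an experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate, defined as 0.0 when nothing was shown."""
    return clicks / impressions if impressions else 0.0


variants = ["material_a", "material_b"]
# The same user always lands in the same bucket for a given experiment.
print(assign_variant("user-1", "hero-image", variants))
print(ctr(clicks=5, impressions=100))
```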
---
## Multi-Agent System Architecture

- 10% LLM, 90% software engineering: State management, load balancing, memory, event bus, context processors.
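
The "90% software engineering" part can be illustrated with a tiny event bus and shared state that let agents cooperate without calling each other directly. This is a sketch only: the `EventBus` class, the topic names, and the planner and generator agents are hypothetical, and the LLM planning step is replaced by a fixed rule.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process publish/subscribe bus."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic: str, handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload) -> None:
        # Copy the list so handlers may subscribe during delivery.
        for handler in list(self._handlers[topic]):
            handler(payload)


bus = EventBus()
state = {}  # shared state that the agents read and write


def planner_agent(request: dict) -> None:
    """Stand-in for LLM-driven task planning: emits a one-step plan."""
    state["plan"] = [f"generate:{request['product']}"]
    bus.publish("plan.ready", state["plan"])


def generator_agent(plan: list) -> None:
    """Stand-in for the generation agent: turns plan steps into asset names."""
    state["assets"] = [step.split(":", 1)[1] + ".png" for step in plan]


bus.subscribe("task.requested", planner_agent)
bus.subscribe("plan.ready", generator_agent)
bus.publish("task.requested", {"product": "black-coffee"})
print(state)
```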
---
## Upcoming Features
1. **Batch Material Generation**
2. **Video Generation** (5s product clips, 30s marketing videos)
3. **Business Outcome–Driven Generation**
4. **External Merchant Support** (multi-language, cross-border)

Try it at: [oxygen-vision.jd.com](http://oxygen-vision.jd.com)

---
## AI Content Ecosystem – AiToEarn
In parallel, open-source platforms like **[AiToEarn (official site)](https://aitoearn.ai/)** enable AI-powered generation and cross-platform publishing to Douyin, Kwai, WeChat, Bilibili, Xiaohongshu, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, and X (Twitter).

**Features**:
- Integrated AI creation tools.
- Multi-platform distribution.
- Analytics & model ranking.
- Open-source resources at [GitHub](https://github.com/yikart/AiToEarn).
---
## Conference Preview
**AICon 2025 Final Stop** – Beijing, Dec 19–20
Topics: Agents, context engineering, AI product innovation, expert exchanges.