From Model to Agent: Snowflake’s Enterprise-Grade Agentic AI Engineering Journey

Honghao Wang

24 Nov 2025

QCon 2025 — Agentic AI Deployment in Enterprises

Date: 2025‑11‑24 · Location: Beijing

---

As Large Language Models (LLMs) evolve toward Agentic AI, enterprises must navigate significant challenges — security, efficiency, and trust — on the journey from proof‑of‑concept to large‑scale deployment.

Without a solid data foundation and a systematic engineering methodology, AI risks remaining a theoretical capability instead of becoming an actionable business asset.

At QCon Global Software Development Conference 2025 (Shanghai), Yang Yang — VP, Solutions Engineering (APAC & Japan) at Snowflake — shared how Snowflake’s R&D supports enterprise‑scale Agentic AI deployments, reshaping intelligent productivity and enabling the leap from large language models to controllable intelligent agents.

---

Introduction

> Speaker’s note:

> Moving from proof‑of‑function to enterprise scale is a long and complex journey.

> In this talk, I’ll explain the five R&D pillars Snowflake developed to enable trustworthy, efficient, and scalable Agentic AI for enterprises.

About Snowflake:

Founded in 2012 (13 years ago)
Full data + AI platform built entirely on public cloud
Capabilities: multi-language development, data modeling, data engineering, analytics, AI apps, and secure sharing
12,000+ enterprise customers worldwide; over 50% use our AI products
Widely adopted among Fortune 2000 companies
Ranked #1 in Fortune Future 50 (2025)

---

The 5 Core Pillars of Snowflake AI R&D

1. Intelligent Agent Orchestration

Key Idea: Seamlessly link diverse tools across multiple environments, assigning the right task to the right tool.

Enterprise requirements:

Accurate, secure reading of structured + unstructured data
Strong observability and user trust
Performance optimization to keep costs under control

Analogy: China’s high‑speed rail owes success not just to speed but to its network scale and central coordination system — ensuring passengers arrive both safely and efficiently.

Snowflake implementation:

Cortex Analyst and Cortex Search for specialized tasks
Intelligent orchestration automates task decomposition, execution planning, routing to the correct toolset, and real‑time optimization as new information arrives

Example:

> “Why did our dashboard data drop on April 5th?”

The system decomposes the question → verifies the data drop → checks historical levels → factors in date context → concludes that the drop was due to weekend traffic.
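
The decompose-then-route flow in this example can be sketched in a few lines of Python. This is a minimal illustration, not Snowflake's implementation: the `plan` function, the `Step` type, and the two stand-in tools are all hypothetical, and a real planner would ask an LLM to produce the steps instead of hard-coding them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    tool: str  # name of the tool that should handle this sub-task
    task: str  # natural-language description of the sub-task


def plan(question: str) -> List[Step]:
    """Hypothetical planner; hard-coded here, LLM-driven in a real system."""
    return [
        Step("analyst", "verify the metric dropped on the given date"),
        Step("analyst", "compare against historical levels"),
        Step("search", "look up calendar context for the date"),
    ]


def run(question: str, tools: Dict[str, Callable[[str], str]]) -> List[str]:
    """Route each planned sub-task to its tool and collect the findings."""
    findings = []
    for step in plan(question):
        handler = tools[step.tool]  # routing: pick the tool named in the plan
        findings.append(handler(step.task))
    return findings


# Stand-in tools: one for structured data, one for unstructured data.
tools = {
    "analyst": lambda task: f"[structured] {task}",
    "search": lambda task: f"[unstructured] {task}",
}
findings = run("Why did our dashboard data drop on April 5th?", tools)
```

Keeping the plan as data (a list of `Step` objects) is what would let an orchestrator revise the remaining steps as new information arrives.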

Design philosophy:

High scalability across industries (commercial, healthcare, etc.)
Academic validation: Healthcare integration boosted Alzheimer’s prediction accuracy to 93.26%.

---

2. Structured Data with Intelligence

Challenge: LLMs can generate SQL easily, but real enterprise use requires:

Resolving ambiguous queries
Locating precise data in massive schemas
Validating correctness

Solution — ReFoRCE system:

Compress/optimize DB schema automatically
Use automated voting between SQL candidates for highest accuracy
Iteratively expand queries until a correct result is found
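
The voting step can be illustrated with a small sketch. It assumes that "voting" means executing each candidate and keeping the result that most candidates agree on; the talk does not spell out the exact rule, so treat this as one plausible reading, shown here against an in-memory SQLite table with made-up data.

```python
import sqlite3
from collections import Counter


def vote(conn, candidates):
    """Execute each candidate and return the SQL behind the most common result."""
    results = {}
    for sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # a candidate that fails to execute gets no vote
        results[sql] = tuple(sorted(rows))
    if not results:
        return None
    winner, _ = Counter(results.values()).most_common(1)[0]
    # Return the first candidate that produced the winning result set.
    return next(sql for sql, rows in results.items() if rows == winner)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "APAC", 10.0), (2, "APAC", 5.0), (3, "EMEA", 7.0)],
)

candidates = [
    "SELECT SUM(amount) FROM orders WHERE region = 'APAC'",
    "SELECT SUM(amount) FROM orders WHERE region = 'APAC' AND id > 0",
    "SELECT SUM(amount) FROM orders",         # runs, but ignores the filter
    "SELECT SUM(amount) FROM missing_table",  # fails to execute
]
best = vote(conn, candidates)
```

Two candidates agree on the APAC total, so the first of them wins; the unfiltered query is outvoted and the invalid one never gets a vote.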

Impact:

+20% efficiency in SQL execution
Ranked #2 in Spider Lite text‑to‑SQL benchmark (Sep 2025), behind Tsinghua University

Case Study — AT&T:

140,000+ employees; 100,000+ benefiting from AI tools
90+ fine-tuned small models; 410 work units; 71 RAG apps; 450M daily API calls
Schema optimization cut token usage from 7M to 156K (roughly a 98% reduction)
Column vectorization for rapid similarity querying in vector DB
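
One way to picture how column vectorization supports schema pruning: embed each column description, embed the question, and keep only the closest columns in the prompt. The sketch below uses a toy bag-of-words embedding so it runs without dependencies; the column names and descriptions are invented, and a production system would use a learned embedding model and a vector database instead.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A stand-in for a learned embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def prune_schema(question, columns, k=2):
    """Keep the k columns whose descriptions are most similar to the question."""
    q = embed(question)
    ranked = sorted(columns, key=lambda c: cosine(q, embed(columns[c])), reverse=True)
    return ranked[:k]


# Hypothetical schema: column name -> short description.
columns = {
    "orders.amount_usd": "order amount in usd",
    "orders.region": "sales region of the order",
    "users.avatar_url": "profile picture link",
    "users.signup_ts": "signup timestamp",
}
kept = prune_schema("total order amount by region", columns, k=2)
```

Only the columns in `kept` would be sent to the model, which is where the token savings come from.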

---

3. Unstructured Data Intelligence (VerDICT)

Goal: Extract precise answers and avoid hallucinations.

VerDICT (Verified Diversification with Consolidation):

Dual‑verification process:
Retriever: relevance feedback — filters irrelevant interpretations
Generator: answerability feedback — ensures answer fully addresses question

Example: Query “What is HP?” → eliminate irrelevant meanings (Harry Potter) → verify final answers match business context.
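
The two feedback signals can be sketched as two filters applied to each candidate interpretation. Everything here is a simplified assumption: the `CORPUS` dictionary stands in for a retriever, the substring match stands in for relevance scoring, and `generator_feedback` stands in for an LLM judging answerability.

```python
from typing import Dict, List, Optional

# Hypothetical corpus: interpretation -> passages a retriever might return.
CORPUS: Dict[str, List[str]] = {
    "HP (Hewlett-Packard)": ["HP is a technology company that sells PCs and printers."],
    "HP (horsepower)": ["Horsepower is a unit of power."],
    "HP (Harry Potter)": [],
}


def retriever_feedback(interpretation: str, context: str) -> List[str]:
    """Relevance check: keep only passages that match the business context."""
    passages = CORPUS.get(interpretation, [])
    return [p for p in passages if context.lower() in p.lower()]


def generator_feedback(passages: List[str]) -> Optional[str]:
    """Answerability check: stand-in for an LLM deciding if the passages suffice."""
    return passages[0] if passages else None


def disambiguate(interpretations: List[str], context: str) -> Dict[str, str]:
    """Keep only interpretations that pass both checks, with their answers."""
    verified = {}
    for interp in interpretations:
        answer = generator_feedback(retriever_feedback(interp, context))
        if answer is not None:
            verified[interp] = answer
    return verified


answers = disambiguate(list(CORPUS), context="company")
```

An interpretation survives only if it is both supported by retrieved evidence and answerable from it, which is the property that keeps unsupported readings out of the final answer.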

Accuracy:

93% with VerDICT
Beats Llama 3.3 and GPT‑4 on unstructured data
Human baseline ~65%

---

4. Traceability & Trustworthiness

Necessity in enterprise AI:

Accuracy
Effectiveness
Compliance & ethics

Snowflake approach:

Full end‑to‑end evaluation → show results for each step
Cross‑environment/model comparison
OpenTelemetry support for full execution trace
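
To make "show results for each step" concrete, here is a dependency-free sketch of per-step tracing. It deliberately does not use the OpenTelemetry API; the `span` helper and the `TRACE` list are stand-ins that only mimic the idea of named, timed, attributed spans, which a real deployment would emit through an OpenTelemetry SDK.

```python
import time
from contextlib import contextmanager

TRACE = []  # finished span records, in completion order


@contextmanager
def span(name, **attributes):
    """Record the name, attributes, status, and duration of one step."""
    start = time.perf_counter()
    record = {"name": name, "attributes": attributes, "status": "ok"}
    try:
        yield record
    except Exception as exc:
        record["status"] = f"error: {exc}"
        raise
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000
        TRACE.append(record)


with span("agent.request", question="Why did the metric drop?"):
    with span("plan") as s:
        s["attributes"]["steps"] = 2  # attach results discovered during the step
    with span("tool.analyst", sql="SELECT 1"):
        pass
    with span("tool.search", query="calendar"):
        pass
```

Inner spans finish before the outer request span, so `TRACE` ends with the request record; that ordering is what lets a viewer reconstruct the full execution tree.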

Recommendation: Present content relevance, data reliability, and answer accuracy clearly — key to building user trust.

---

5. System Optimization

Performance metrics for enterprise AI:

Responsiveness (first-token latency)
Generation speed
Throughput (user volume + budget impact)
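
These three metrics are straightforward to compute from timestamps. The sketch below assumes you can record when a request was sent and when each output token arrived; the function names and the sample timings are illustrative.

```python
def latency_metrics(request_ts, token_ts):
    """Compute first-token latency and generation speed for one request.

    request_ts: time the request was sent (seconds)
    token_ts:   arrival time of each output token (seconds, ascending)
    """
    ttft = token_ts[0] - request_ts  # responsiveness
    decode_time = token_ts[-1] - token_ts[0]
    # Tokens after the first, divided by the time spent producing them.
    tokens_per_s = (len(token_ts) - 1) / decode_time if decode_time > 0 else float("inf")
    return {"ttft_s": ttft, "tokens_per_s": tokens_per_s}


def throughput(total_tokens, wall_clock_s):
    """Aggregate tokens per second across all concurrent requests."""
    return total_tokens / wall_clock_s


m = latency_metrics(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
```

Latency and generation speed are per-request figures, while throughput is measured across the whole system, which is why a setup can score well on one and poorly on the other.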

Limitations in existing methods:

Tensor Parallel: Great latency, poor throughput
Data Parallel: Great throughput, weaker latency

Snowflake innovation:

Arctic Sequence Parallel + Tensor Parallel = Shift Parallelism
Real‑time mode switching based on batch size
KV data layout compatibility
Results: 3.4× faster end-to-end speed, 1.7× higher throughput, and 16×+ gains for embeddings
Open source — community contributions welcome
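
The mode-switching idea can be sketched as a simple policy. This assumes tensor parallelism is chosen for small batches (where latency dominates, per the trade-off above) and sequence parallelism for large ones; the threshold value and all names are illustrative and are not taken from Snowflake's code.

```python
from enum import Enum


class Mode(Enum):
    TENSOR_PARALLEL = "tensor_parallel"
    SEQUENCE_PARALLEL = "sequence_parallel"


# Illustrative cutoff only; a real engine would tune this from measurements.
BATCH_THRESHOLD = 64


def choose_mode(batch_size: int, threshold: int = BATCH_THRESHOLD) -> Mode:
    """Pick a parallelism mode for the next step based on current batch size."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    if batch_size <= threshold:
        return Mode.TENSOR_PARALLEL  # small batch: favor latency
    return Mode.SEQUENCE_PARALLEL  # large batch: favor throughput


schedule = [choose_mode(b) for b in (1, 64, 65, 512)]
```

Switching per step is only practical if both modes can read the same KV cache, which is presumably why KV data layout compatibility appears in the list above.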

---

Applying the 5 Pillars — Snowflake Cortex AI

Cortex centralizes AI capabilities:

Handles structured + unstructured data, voice, images, documents
Role-based permissions for safety/compliance
Guardrails to control model access (e.g., production uses Llama, R&D uses Mistral)
Integrated models: OpenAI, Anthropic, Meta, Mistral, DeepSeek, Snowflake Arctic
Tools: Cortex Analyst, Cortex Search, AISQL (queries structured + unstructured seamlessly)
API-first for dev integration
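
The model-access guardrail can be pictured as an allowlist keyed by role. The sketch below is a hypothetical illustration of that policy, not the Cortex API: the role names, model identifiers, and `resolve_model` function are all assumptions.

```python
# Hypothetical policy: which models each role may call.
ALLOWED_MODELS = {
    "production": {"llama"},
    "research": {"mistral"},
}


class ModelAccessDenied(PermissionError):
    """Raised when a role requests a model outside its allowlist."""


def resolve_model(role: str, requested: str) -> str:
    """Return the requested model if the role may use it, otherwise raise."""
    allowed = ALLOWED_MODELS.get(role, set())  # unknown roles get no access
    if requested not in allowed:
        raise ModelAccessDenied(f"role {role!r} may not use model {requested!r}")
    return requested


model = resolve_model("production", "llama")
```

Denying by default for unknown roles keeps the policy safe when new roles are added before their allowlists are defined.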

---

Demonstration Workflow — Building Quality Issue Management

Scenario: Industrial park managers receive tenant reports of structural issues (photos, descriptions).

Workflow:

Store & index thousands of defect images securely in Snowflake
AISQL query: “Analyze defect images and recommend repair products”
Auto-indexing correlates defects with product DB → outputs list + prices
Business GUI asks follow-up: suppliers, cost, procurement plan, efficiency strategies
Cortex Analyst + Cortex Search merge results
Transparent observability for trust
Final recommendations include sourcing & budget reasoning

Result:

Minutes to process thousands of images
One SQL statement suffices
Fully secure, no cross-system data transfers

---

Conclusion

Snowflake’s five core pillars enable secure, efficient, and scalable Agentic AI deployments:

Intelligent Orchestration — task breakdown & routing
Structured Data Intelligence — optimized querying with ReFoRCE
Unstructured Data Processing — VerDICT verified accuracy
Traceability & Trust — full observability with OpenTelemetry
System Optimization — Shift Parallelism for peak performance

---

Event Recommendation

AICon Global Artificial Intelligence Development & Application Conference (Beijing)

Dates: December 19–20

Topics: LLM training & inference, AI Agents, new dev paradigms, organizational transformation
