# Introducing SAG: SQL-Retrieval Augmented Generation
Hello everyone,
I’m **Jomy**, CEO of **Zleap** and the primary designer behind our products and technology.
In earlier media coverage, I briefly introduced Zleap's core technology — **an intelligent Agent** that automatically organizes and summarizes large volumes of internal enterprise data for CEOs.
Today, I’d like to formally present the underlying technology driving this Agent: **SAG**.

---
## Why SAG Matters
Our product launch attracted significant attention.
But after multiple conversations with friends, clients, and investors, I realized **SAG’s potential is far greater** than powering a manager’s all-in-one Agent:
- It can be **foundational technology** for enterprises, individuals, and the wider *Agent ecosystem*.
- It can help advance the AI industry toward **faster, more accurate, and more connected information processing**.
This post explores:
1. **SAG’s technical principles**
2. **Application directions** across enterprise, personal, and AI contexts
---
## What Is SAG?
**SAG** stands for **SQL‑Retrieval Augmented Generation** — meaning the *retrieval* stage is primarily **SQL-driven**.
### Traditional Vector-Based RAG
- **How it works**:
Queries are converted into vectors → vectors are compared in semantic space → the most relevant chunks are sent to the LLM for answering.
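- **Strengths**: Fast and inexpensive to index and query.
- **Weaknesses**:
Overly reliant on semantic similarity; struggles with **deep retrieval**, where the answer depends on connecting several related pieces of information.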
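
The vector retrieval step described above can be sketched in a few lines. This is a minimal illustration, not any particular library's API: the three-dimensional vectors are hand-written stand-ins for real embeddings, and `cosine` and `top_k` are hypothetical helper names.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunks, k=2):
    """Return the k chunk texts whose vectors are most similar to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:k]]

# Toy "embeddings" (assumption); a real system would call an embedding model.
chunks = [
    {"text": "Q3 revenue grew 12%", "vec": [0.9, 0.1, 0.0]},
    {"text": "Office moved to Shenzhen", "vec": [0.0, 0.2, 0.9]},
    {"text": "Q3 costs fell 3%", "vec": [0.8, 0.3, 0.1]},
]

# The selected chunks are what would be passed to the LLM as context.
context = top_k([1.0, 0.0, 0.0], chunks, k=2)
```

Ranking depends only on similarity to the query, so a chunk that is related to a retrieved chunk but not similar to the query itself is never surfaced.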

### Graph-Based RAG (GraphRAG)
- Uses an LLM to parse source text → extract **entities & relationships** → build a **knowledge graph** → answer queries via the graph.
- **Pros**: Allows deeper parsing of relationships.
- **Cons**:
  - Slow
  - Costly
  - Graph must be rebuilt with every data update

---
## How SAG Improves on RAG & GraphRAG
SAG blends:
- **SQL’s precision**
- **Vector search’s fuzzy semantic matching**

It builds **data relationships in real time**, breaking the “**Impossible Triangle**” of retrieval by achieving **speed, accuracy, and completeness** simultaneously.

---
### Step 1: Turning Data into “Events”
Like GraphRAG, SAG creates an intermediate representation called **events** — atomic chunks of meaning, similar to how our brains break down complex matters into simpler parts.
**Key Difference:**
SAG does *not* link events during preprocessing. Relationships are computed **on demand** during queries, enabling seamless incremental updates.
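
To illustrate why unlinked events make incremental updates cheap, here is a minimal sketch using SQLite. The `events` table and the `add_events` helper are assumptions made for illustration, not the schema used in the SAG repository. Adding new data is a plain append, with no existing rows or relationships rewritten.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per atomic event, with no link columns at all.
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, summary TEXT NOT NULL)")

def add_events(conn, summaries):
    """Append events; nothing already stored is touched or re-linked."""
    conn.executemany(
        "INSERT INTO events (summary) VALUES (?)",
        [(s,) for s in summaries],
    )
    conn.commit()

# Initial load.
add_events(conn, [
    "Alice signed the supplier contract",
    "Bob visited the Shanghai plant",
])
# Later update: just another insert, no graph rebuild.
add_events(conn, ["Alice met Bob in Shanghai"])

event_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```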

---
### Step 2: Adding “Natural Language Vectors”
Each event includes **attributes** — dimensions such as:
- Time
- Location
- Action
- People
These attributes form **natural language vectors**, a human-readable coordinate system of meaning.

- Events share dimensions
- A small model can extract them at low cost
- Dimensions are customizable for different industries
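
One way to picture these attributes is as rows of `(event, dimension, value)` in a relational table. The sketch below is an assumption about how that could look in SQLite, not the actual SAG schema: the table names, the `add_event` helper, and the hand-written attributes (which a small model would extract in practice) are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INTEGER PRIMARY KEY, summary TEXT NOT NULL);
CREATE TABLE event_attributes (
    event_id  INTEGER NOT NULL REFERENCES events(id),
    dimension TEXT NOT NULL,   -- e.g. time, location, action, people
    value     TEXT NOT NULL
);
CREATE INDEX idx_attr ON event_attributes (dimension, value);
""")

def add_event(conn, summary, attributes):
    """Store an event plus one row per (dimension, value) pair."""
    cur = conn.execute("INSERT INTO events (summary) VALUES (?)", (summary,))
    event_id = cur.lastrowid
    rows = [(event_id, dim, val) for dim, vals in attributes.items() for val in vals]
    conn.executemany("INSERT INTO event_attributes VALUES (?, ?, ?)", rows)
    return event_id

# Attributes written by hand here; extraction by a model is out of scope.
eid = add_event(conn, "Alice met Bob in Shanghai on 2024-05-01", {
    "time": ["2024-05-01"],
    "location": ["Shanghai"],
    "action": ["met"],
    "people": ["Alice", "Bob"],
})

people = [r[0] for r in conn.execute(
    "SELECT value FROM event_attributes "
    "WHERE event_id = ? AND dimension = 'people' ORDER BY value",
    (eid,),
)]
```

Because dimensions are just values in a column, adding an industry-specific dimension requires no schema change in this layout.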

---
### Step 3: Real-Time Relationship Construction
**Analogy: Six Degrees of Separation**
Just as any two people are said to be connected within six steps, any two events can be linked through a chain of shared attributes.


Because attributes are stored in SQL, queries use **only SQL retrieval**, making real-time linking **extremely fast**.
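
A minimal sketch of linking at query time, assuming a hypothetical `event_attributes(event_id, dimension, value)` table (the real SAG schema may differ). A single self-join finds every event that shares at least one attribute with a given event, ranked by how many attributes they share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_attributes (event_id INTEGER, dimension TEXT, value TEXT)")
conn.executemany("INSERT INTO event_attributes VALUES (?, ?, ?)", [
    (1, "people", "Alice"), (1, "location", "Shanghai"),
    (2, "people", "Bob"),   (2, "location", "Shanghai"),
    (3, "people", "Alice"), (3, "location", "Berlin"),
    (4, "people", "Carol"), (4, "location", "Paris"),
])

def linked_events(conn, event_id):
    """Events sharing at least one (dimension, value) pair with event_id."""
    rows = conn.execute("""
        SELECT b.event_id, COUNT(*) AS shared
        FROM event_attributes AS a
        JOIN event_attributes AS b
          ON a.dimension = b.dimension AND a.value = b.value
        WHERE a.event_id = ? AND b.event_id <> a.event_id
        GROUP BY b.event_id
        ORDER BY shared DESC, b.event_id
    """, (event_id,))
    return [r[0] for r in rows]

# Event 1 links to event 2 (same location) and event 3 (same person).
neighbors = linked_events(conn, 1)
```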

---
## Precision + Flexibility: SQL Meets Vectors
SQL is great at **exact matching**, but it can't handle unpredictable synonyms.
**Example:** Searching “苹果公司” (Chinese for “Apple Inc.”) will not match “Apple Inc.” with SQL alone.
**Solution:**
Store all attributes in **both SQL and vector databases**.
- **Vectors** match semantically (“苹果公司” → “Apple Inc.” → “iPhone”)
- **SQL** retrieves exact matches using the expanded set
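
The two-stage lookup above might look like the following sketch. Everything here is illustrative: the two-dimensional vectors stand in for real embeddings, the similarity threshold of 0.9 is arbitrary, and `expand` and `search` are hypothetical names rather than SAG functions.

```python
import math
import sqlite3

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy vector index over stored attribute values (assumed embeddings).
attribute_vectors = {
    "Apple Inc.": [0.95, 0.05],
    "iPhone": [0.85, 0.30],
    "Banana Republic": [0.10, 0.95],
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_attributes (event_id INTEGER, dimension TEXT, value TEXT)")
conn.executemany("INSERT INTO event_attributes VALUES (?, ?, ?)", [
    (1, "organization", "Apple Inc."),
    (2, "product", "iPhone"),
    (3, "organization", "Banana Republic"),
])

def expand(query_vec, threshold=0.9):
    """Stage 1: fuzzy match the query against stored attribute values."""
    return [v for v, vec in attribute_vectors.items() if cosine(query_vec, vec) >= threshold]

def search(conn, query_vec):
    """Stage 2: exact SQL match on the expanded set of values."""
    values = expand(query_vec)
    if not values:
        return []
    placeholders = ", ".join("?" for _ in values)
    rows = conn.execute(
        f"SELECT DISTINCT event_id FROM event_attributes "
        f"WHERE value IN ({placeholders}) ORDER BY event_id",
        values,
    )
    return [r[0] for r in rows]

query_vec = [1.0, 0.0]  # stand-in for the embedding of "苹果公司"
hits = search(conn, query_vec)
```

The vector stage only chooses which stored values to look for; the SQL stage then returns every event carrying one of those values, which is where the completeness comes from.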

---
## Why This Matters
- **Vector fuzzy match** + **SQL precision** = best of both worlds.
- The **SQL table structure** in SAG is very simple, so LLM *Text-to-SQL* generation is straightforward.
- Retrieval becomes:
  - **Precise**
  - **Complete**
  - **Statistically analyzable**
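
As an example of what "statistically analyzable" can mean in practice, attributes stored in SQL can be aggregated directly, which similarity search alone does not offer. The sketch below assumes a hypothetical `event_attributes` table and counts how many distinct events mention each person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_attributes (event_id INTEGER, dimension TEXT, value TEXT)")
conn.executemany("INSERT INTO event_attributes VALUES (?, ?, ?)", [
    (1, "people", "Alice"),
    (2, "people", "Alice"),
    (2, "people", "Bob"),
    (3, "location", "Shanghai"),
])

# Count events per person, most frequently mentioned first.
counts = conn.execute("""
    SELECT value, COUNT(DISTINCT event_id) AS n
    FROM event_attributes
    WHERE dimension = 'people'
    GROUP BY value
    ORDER BY n DESC, value
""").fetchall()
```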
---
## Key Advantage: High-Quality Multi-Hop Retrieval
Traditional RAG often suffers from “**Garbage In, Garbage Out**.”
If the first retrieval is weak, multi-hop reasoning fails.
In **SAG**:
- Each atomic retrieval is **high quality**
- LLMs get *connected events*
- Multi-turn reasoning sees **qualitative leaps**
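
A multi-hop expansion over shared attributes can be sketched as a breadth-first walk in which each hop is one SQL query. This is an illustrative sketch under assumptions (a hypothetical `event_attributes` table and `multi_hop` helper); the event linking and pruning used in the SAG repository may work differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_attributes (event_id INTEGER, dimension TEXT, value TEXT)")
conn.executemany("INSERT INTO event_attributes VALUES (?, ?, ?)", [
    (1, "people", "Alice"),
    (2, "people", "Alice"), (2, "location", "Berlin"),
    (3, "location", "Berlin"),
    (4, "people", "Dave"),
])

def multi_hop(conn, seed_ids, max_hops=2):
    """Collect events reachable from the seeds within max_hops shared-attribute links."""
    seen = set(seed_ids)
    frontier = set(seed_ids)
    for _ in range(max_hops):
        if not frontier:
            break
        placeholders = ", ".join("?" for _ in frontier)
        rows = conn.execute(f"""
            SELECT DISTINCT b.event_id
            FROM event_attributes AS a
            JOIN event_attributes AS b
              ON a.dimension = b.dimension AND a.value = b.value
            WHERE a.event_id IN ({placeholders})
        """, tuple(frontier))
        # Keep only events not visited in an earlier hop.
        frontier = {r[0] for r in rows} - seen
        seen |= frontier
    return sorted(seen)

one_hop = multi_hop(conn, [1], max_hops=1)
two_hops = multi_hop(conn, [1], max_hops=2)
```

Here event 3 shares nothing with event 1 directly, but is reached on the second hop through event 2, while the unrelated event 4 is never pulled in.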

---
## Enterprise Applications
### 1. Intelligent Decision-Making Assistant
- Awakens dormant historical data
- Connects with real-time business data
- Delivers decision support via reports, search, and Q&A

### 2. General-Purpose Data Processing
- Reconstructs enterprise data for better algorithms:
  - E-commerce recommendations
  - Financial risk control
  - Advertising optimization

### 3. Low-Cost AI Transformation
- Skips the traditional IT build-out stage and moves directly into AI adoption
- Unified data format for deeper applications
- Can use idle compute time or small models
---
## Personal Applications
### 1. Personal Knowledge Base
- Converts notes, docs, and collections into relational knowledge
### 2. Personal AI Memory Core
- Remembers preferences, habits, and conversation history
### 3. Lightweight Deployment
- Can run offline on smartphones
- Keeps data **in your hands**

---
## SAG for Agents
**Context Engineering** is critical, yet most retrieval methods remain simple (e.g., regex search).

SAG provides **structured, high-quality context** in real time.
Better context reduces the number of steps needed for complex tasks, which improves efficiency and success rates.

---
## Synergy: SAG + AiToEarn
Platforms like **[AiToEarn official site](https://aitoearn.ai/)** allow:
- Open-source AI content monetization
- Unified publishing to:
  - Douyin, Kwai, WeChat, Bilibili, Rednote, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X (Twitter)
- Analytics & model rankings ([AI model rankings](https://rank.aitoearn.ai))

Pairing **SAG’s structured retrieval** with **AiToEarn’s publishing tools** connects data → insights → monetized content.

---
## Open Source
We’ve decided to **open-source SAG**:
[https://github.com/Zleap-AI/SAG](https://github.com/Zleap-AI/SAG)
For algorithmic details (event linking, dynamic pruning), see the repository.
We welcome enterprise & academic collaboration.

---
## Vision
SAG is more than a RAG replacement.
It’s about **transforming all information into a unified multi-dimensional knowledge graph**, dissolving data silos, and making data a tradable asset.

---
## Try SAG
We developed a user-facing product for public + private data aggregation:
- Generates periodic reports
- Supports Q&A and search

**Beta Access**: Invitation codes via Zleap’s **official WeChat**

**Web:** [https://app.zleap.com.cn](https://app.zleap.com.cn)

**App:** Search “Zleap” in the iOS App Store

---
**Zleap’s Vision:**
> **Connect all information, turn all data into assets.**