Zleap Tech Explained: SAG Redefines AI Search in the Post-RAG Era

# Introducing SAG: SQL-Retrieval Augmented Generation

Hello everyone,  

I’m **Jomy**, CEO of **Zleap** and the primary designer behind our products and technology.  

In earlier media coverage, I briefly introduced Zleap's core technology — **an intelligent Agent** that automatically organizes and summarizes large volumes of internal enterprise data for CEOs.  

Today, I’d like to formally present the underlying technology driving this Agent: **SAG**.

---

## Why SAG Matters

Our product launch attracted significant attention.  
But after multiple conversations with friends, clients, and investors, I realized **SAG’s potential is far greater** than powering a manager’s all-in-one Agent:

- It can be **foundational technology** for enterprises, individuals, and the wider *Agent ecosystem*.  
- It can help advance the AI industry toward **faster, more accurate, and more connected information processing**.

This post explores:

1. **SAG’s technical principles**
2. **Application directions** across enterprise, personal, and AI contexts

---

## What Is SAG?

**SAG** stands for **SQL‑Retrieval Augmented Generation** — meaning the *retrieval* stage is primarily **SQL-driven**.

### Traditional Vector-Based RAG

- **How it works**:  
  Queries are converted into vectors → vectors are compared in semantic space → the most relevant chunks are sent to the LLM for answering.
- **Strengths**: Efficient.
- **Weaknesses**:  
  Overly reliant on semantic similarity; struggles with **deep retrieval**, where an answer depends on chaining several related facts rather than on one similar chunk.

![image](images/img_001.png)
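
The vector-retrieval flow described above can be sketched in a few lines. This is a minimal illustration, not any particular library's API: the three-dimensional vectors are hand-written stand-ins for real embeddings, and `top_k`, `cosine`, and `CHUNKS` are hypothetical names.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunks, k=2):
    """Rank chunks by similarity to the query and return the k best texts."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:k]]

# Toy corpus: the vectors are made up for illustration (assumption),
# a real system would obtain them from an embedding model.
CHUNKS = [
    {"text": "Q3 revenue grew 12%", "vec": [0.9, 0.1, 0.0]},
    {"text": "New office opened in Berlin", "vec": [0.1, 0.9, 0.1]},
    {"text": "Q3 costs rose 5%", "vec": [0.8, 0.2, 0.1]},
]

query_vec = [1.0, 0.0, 0.0]  # stand-in for the embedded user query
results = top_k(query_vec, CHUNKS, k=2)
print(results)  # these chunks would be passed to the LLM as context
```

Note that ranking depends only on similarity to the query; nothing here follows a link from one chunk to another, which is the limitation on deep retrieval noted above.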

### Graph-Based RAG (GraphRAG)

- Uses an LLM to parse source text → extract **entities & relationships** → build a **knowledge graph** → answer queries via the graph.
- **Pros**: Allows deeper parsing of relationships.
- **Cons**:
  - Slow
  - Costly
  - Graph must be rebuilt with every data update

![image](images/img_002.png)

---

## How SAG Improves on RAG & GraphRAG

SAG blends:

- **SQL’s precision**
- **Vector search's fuzzy semantic matching**

It builds **data relationships in real time**, breaking the “**Impossible Triangle**” for retrieval — achieving **speed, accuracy, and completeness** simultaneously.

---

### Step 1: Turning Data into “Events”

Like GraphRAG, SAG creates an intermediate representation called **events** — atomic chunks of meaning, similar to how our brains break down complex matters into simpler parts.

**Key Difference:**  
SAG does *not* link events during preprocessing. Relationships are computed **on demand** during queries, enabling seamless incremental updates.

---

### Step 2: Adding “Natural Language Vectors”

Each event includes **attributes** — dimensions such as:

- Time
- Location
- Action
- People

These attributes form **natural language vectors**, a human-readable coordinate system of meaning.

![image](images/img_003.png)

- Events share dimensions  
- A small model can extract them at low cost  
- Dimensions are customizable for different industries

![image](images/img_004.png)
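
One way to picture events and their attributes in SQL is a two-table layout: one row per event and one row per (dimension, value) pair. The sketch below uses Python's built-in `sqlite3`. The schema and the names `events`, `event_attrs`, and `add_event` are assumptions made for illustration, not the schema of the SAG repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: no table stores links between events.
conn.executescript("""
CREATE TABLE events (
    id      INTEGER PRIMARY KEY,
    summary TEXT NOT NULL
);
CREATE TABLE event_attrs (
    event_id  INTEGER NOT NULL REFERENCES events(id),
    dimension TEXT NOT NULL,   -- e.g. time, location, action, people
    value     TEXT NOT NULL
);
CREATE INDEX idx_attr ON event_attrs (dimension, value);
""")

def add_event(conn, summary, attrs):
    """Insert one event and its attribute rows; returns the new event id."""
    cur = conn.execute("INSERT INTO events (summary) VALUES (?)", (summary,))
    event_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO event_attrs (event_id, dimension, value) VALUES (?, ?, ?)",
        [(event_id, dim, val) for dim, val in attrs.items()],
    )
    return event_id

# In practice a small model would extract these attributes; here they are typed by hand.
eid = add_event(
    conn,
    "Alice signed the supplier contract in Shanghai",
    {"time": "2024-03-01", "location": "Shanghai",
     "action": "signed contract", "people": "Alice"},
)
rows = conn.execute(
    "SELECT dimension, value FROM event_attrs WHERE event_id = ? ORDER BY dimension",
    (eid,),
).fetchall()
print(rows)
```

Because dimensions are plain rows rather than columns, an industry-specific dimension can be added without changing the schema.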

---

### Step 3: Real-Time Relationship Construction

**Analogy: Six Degrees of Separation**  
Just as the “six degrees” idea holds that any two people are connected in at most six steps, events can be linked to one another through shared attributes.

![image](images/img_005.png)  
![image](images/img_006.png)

Because attributes are stored in SQL, the linking step needs **only SQL retrieval**, which makes real-time linking **extremely fast**.
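
Linking through shared attributes can be expressed as a single self-join. The sketch below is an assumption about how such a query might look, using a hypothetical `event_attrs` table. It also shows that a newly inserted event becomes linkable immediately, with no rebuild step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical attribute table: one row per (event, dimension, value).
conn.execute("CREATE TABLE event_attrs (event_id INTEGER, dimension TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO event_attrs VALUES (?, ?, ?)",
    [
        (1, "people", "Alice"), (1, "location", "Shanghai"),
        (2, "people", "Alice"), (2, "location", "Berlin"),
        (3, "people", "Bob"),   (3, "location", "Berlin"),
    ],
)

# Events are linked when they share the same value in the same dimension.
LINK_SQL = """
SELECT DISTINCT b.event_id
FROM event_attrs AS a
JOIN event_attrs AS b
  ON a.dimension = b.dimension AND a.value = b.value
WHERE a.event_id = ? AND b.event_id != ?
ORDER BY b.event_id
"""

def linked_events(conn, event_id):
    """Compute, at query time, the events sharing an attribute with event_id."""
    return [row[0] for row in conn.execute(LINK_SQL, (event_id, event_id))]

before = linked_events(conn, 1)
# Incremental update: insert a new event's attribute, nothing else to rebuild.
conn.execute("INSERT INTO event_attrs VALUES (4, 'location', 'Shanghai')")
after = linked_events(conn, 1)
print(before, after)
```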

---

## Precision + Flexibility: SQL Meets Vectors

SQL is great at **exact matching** — but can't handle unpredictable synonyms.  

**Example:** Searching for “苹果公司” (the Chinese name of Apple Inc.) will not match “Apple Inc.” with SQL alone.

**Solution:**  
Store all attributes in **both SQL and vector databases**.

- **Vectors** match semantically (“苹果公司” → “Apple Inc.” → “iPhone”)
- **SQL** retrieves exact matches using the expanded set

![image](images/img_007.png)
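
The two-stage lookup can be sketched as follows: a vector match first expands the query term into known attribute values, then SQL retrieves exact matches for the expanded set. Everything here is illustrative: the two-dimensional vectors are hand-written stand-ins for real embeddings, the similarity threshold is arbitrary, and `expand`, `search`, and `event_attrs` are hypothetical names.

```python
import math
import sqlite3

# Hand-written toy embeddings (assumption); a real system would use an embedding model.
ATTR_VECS = {
    "Apple Inc.": [0.95, 0.05],
    "iPhone": [0.85, 0.30],
    "Banana Republic": [0.05, 0.95],
}
QUERY_VECS = {"苹果公司": [1.0, 0.0]}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def expand(term, threshold=0.9):
    """Stage 1: fuzzy match the term against stored attribute values."""
    q = QUERY_VECS[term]
    return sorted(v for v, vec in ATTR_VECS.items() if cosine(q, vec) >= threshold)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_attrs (event_id INTEGER, dimension TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO event_attrs VALUES (?, ?, ?)",
    [
        (1, "organization", "Apple Inc."),
        (2, "product", "iPhone"),
        (3, "organization", "Banana Republic"),
    ],
)

def search(conn, term):
    """Stage 2: exact SQL match on the expanded value set."""
    values = expand(term)
    if not values:
        return []
    placeholders = ", ".join("?" for _ in values)
    sql = (
        "SELECT DISTINCT event_id FROM event_attrs "
        f"WHERE value IN ({placeholders}) ORDER BY event_id"
    )
    return [row[0] for row in conn.execute(sql, values)]

print(expand("苹果公司"), search(conn, "苹果公司"))
```

The vector stage only decides which values to look for; the set of events returned is still determined by exact SQL matching.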

---

## Why This Matters

- **Vector fuzzy match** + **SQL precision** = best of both worlds.
- **SAG's SQL table structure** is extremely simple, so an LLM can query it easily through *Text-to-SQL*.
- Retrieval becomes:
  - **Precise**
  - **Complete**
  - **Statistically analyzable**
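
As an illustration of the last point, once attributes live in SQL, ordinary aggregation applies to them. The table and data below are hypothetical and only show the kind of query this enables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical attribute table, as in a simple event/attribute layout.
conn.execute("CREATE TABLE event_attrs (event_id INTEGER, dimension TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO event_attrs VALUES (?, ?, ?)",
    [
        (1, "location", "Shanghai"),
        (2, "location", "Berlin"),
        (3, "location", "Berlin"),
        (3, "people", "Bob"),
    ],
)

# Count events per location, most frequent first.
stats = conn.execute(
    """
    SELECT value, COUNT(DISTINCT event_id) AS n
    FROM event_attrs
    WHERE dimension = 'location'
    GROUP BY value
    ORDER BY n DESC, value
    """
).fetchall()
print(stats)
```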

---

## Key Advantage: High-Quality Multi-Hop Retrieval

Traditional RAG often suffers from “**Garbage In, Garbage Out**.”  
If the first retrieval is weak, multi-hop reasoning fails.

In **SAG**:

- Each atomic retrieval is **high quality**
- LLMs get *connected events*
- Multi-turn reasoning sees **qualitative leaps**

![image](images/img_008.png)
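
Multi-hop retrieval over shared attributes can be sketched as a breadth-first expansion, where each hop is one exact SQL lookup. This is an assumed illustration of the idea, not the pruning or linking algorithm used in SAG; the table, data, and `multi_hop` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_attrs (event_id INTEGER, dimension TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO event_attrs VALUES (?, ?, ?)",
    [
        (1, "people", "Alice"),
        (2, "people", "Alice"), (2, "project", "Atlas"),
        (3, "project", "Atlas"), (3, "location", "Berlin"),
        (4, "location", "Berlin"),
        (5, "people", "Carol"),  # shares nothing with the others
    ],
)

# One hop: events sharing a (dimension, value) pair with the given event.
NEIGHBORS_SQL = """
SELECT DISTINCT b.event_id
FROM event_attrs AS a
JOIN event_attrs AS b
  ON a.dimension = b.dimension AND a.value = b.value
WHERE a.event_id = ? AND b.event_id != ?
ORDER BY b.event_id
"""

def multi_hop(conn, start, max_hops):
    """Return {event_id: hop distance} for events reachable within max_hops."""
    seen = {start: 0}
    frontier = [start]
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for event_id in frontier:
            for (neighbor,) in conn.execute(NEIGHBORS_SQL, (event_id, event_id)):
                if neighbor not in seen:
                    seen[neighbor] = hop
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen

two_hops = multi_hop(conn, 1, max_hops=2)
three_hops = multi_hop(conn, 1, max_hops=3)
print(two_hops, three_hops)
```

Each hop returns only events that genuinely share an attribute, so a later hop starts from exact matches rather than from loosely similar chunks.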

---

## Enterprise Applications

### 1. Intelligent Decision-Making Assistant
- Awakens dormant historical data
- Connects with real-time business data
- Delivers decision support via reports, search, and Q&A

![image](images/img_009.png)

### 2. General-Purpose Data Processing
- Restructures enterprise data so downstream algorithms work better:
  - E-commerce recommendations
  - Financial risk control
  - Advertising optimization

![image](images/img_010.png)

### 3. Low-Cost AI Transformation
- Skips the traditional IT stage and enters the AI era directly
- Unified data format for deeper applications
- Can use idle compute time or small models

---

## Personal Applications

### 1. Personal Knowledge Base
- Converts notes, docs, and collections into relational knowledge

### 2. Personal AI Memory Core
- Remembers preferences, habits, and conversation history

### 3. Lightweight Deployment
- Can run offline on smartphones
- Keeps data **in your hands**

![image](images/img_011.png)

---

## SAG for Agents

**Context Engineering** is critical, yet most retrieval methods remain simple (e.g., regex search).  

![image](images/img_012.png)

SAG provides **structured, high-quality context** in real time.  
Better context reduces the number of steps needed for complex tasks, which boosts efficiency and success rates.

---

## Synergy: SAG + AiToEarn

Platforms like **[AiToEarn](https://aitoearn.ai/)** allow:

- Open-source AI content monetization
- Unified publishing to:
  - Douyin, Kwai, WeChat, Bilibili, Rednote, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X (Twitter)
- Analytics & model rankings ([AI model rankings](https://rank.aitoearn.ai))

Pairing **SAG’s structured retrieval** with **AiToEarn’s publishing tools** connects data → insights → monetized content.

---

## Open Source

We’ve decided to **open-source SAG**:  
[https://github.com/Zleap-AI/SAG](https://github.com/Zleap-AI/SAG)

For algorithmic details (event linking, dynamic pruning), see the repository.  
We welcome enterprise & academic collaboration.

---

## Vision

SAG is more than a RAG replacement.  
It’s about **transforming all information into a unified multi-dimensional knowledge graph**, dissolving data silos, and making data a tradable asset.

---

## Try SAG

We have built a user-facing product that aggregates public and private data:

- Generates periodic reports
- Supports Q&A and search

**Beta Access**: Invitation codes via Zleap’s **official WeChat**

**Web:** [https://app.zleap.com.cn](https://app.zleap.com.cn)  
**APP:** Search “Zleap” in the iOS App Store

---

**Zleap’s Vision:**  
> **Connect all information, turn all data into assets.**
