Peking University & Zuoyebang Team Propose New Text-to-SQL Framework Interactive-T2S, Tackling Wide-Table Processing and Low-Resource Alignment Challenges

Transforming LLMs into Intelligent Multi-Turn Database Agents
Instead of using a Large Language Model (LLM) as a one-shot translator that generates complete SQL code, this research turns it into an interactive, multi-round agent capable of iterative problem-solving with a database.


This work, jointly developed by Peking University and Zuoyebang Education Technology (Beijing) Co., Ltd., addresses real-world deployment challenges of LLMs in structured data queries.
It has been accepted at CIKM 2025 (the 34th ACM International Conference on Information and Knowledge Management).
Unlike traditional approaches, the proposed agent operates in repeated Think → Act → Observe cycles — progressively breaking down the problem, fetching relevant data, building queries, and executing them.
This design mitigates the inefficiency and context-length pressure that arise on complex wide-table databases (tables with many columns).
---
📄 Paper Snapshot
- Title: Interactive-T2S: Multi-Turn Interactions for Text-to-SQL with Large Language Models
- Authors: Guanming Xiong (Peking University), Junwei Bao (Zuoyebang), Hongfei Jiang (Zuoyebang), Yang Song (Zuoyebang), Wen Zhao (Peking University; corresponding author)
- Affiliations: Peking University; Zuoyebang Education Technology (Beijing) Co., Ltd.
- Link: arXiv:2408.11062v1
---
Why Is Text-to-SQL Important?
Text-to-SQL bridges natural language and databases, letting users pose questions in plain language, such as:
> “List the names of male professors participating in football.”
No SQL syntax knowledge is required.
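To make the task concrete, here is a minimal sketch of what a Text-to-SQL system ultimately has to produce for the question above. The schema and sample rows are hypothetical (assumptions for illustration only), though `Faculty.Fname`, `Faculty.Lname`, and `Activity.name` echo the tool examples later in this article:

```python
import sqlite3

# Hypothetical miniature schema for illustration (table layout and rows are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Faculty (FacID INTEGER PRIMARY KEY, Fname TEXT, Lname TEXT, Sex TEXT, Rank TEXT);
CREATE TABLE Activity (actid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Participates_in (FacID INTEGER, actid INTEGER);
INSERT INTO Faculty VALUES (1, 'John', 'Smith', 'M', 'Professor'), (2, 'Mary', 'Jones', 'F', 'Professor');
INSERT INTO Activity VALUES (10, 'Soccer'), (11, 'Chess');
INSERT INTO Participates_in VALUES (1, 10), (2, 11);
""")

# The SQL the system must generate for the question above. Note the question says
# "football" while the stored value is 'Soccer'; resolving such mismatches is part
# of what the framework described below handles.
sql = """
SELECT F.Fname, F.Lname
FROM Faculty AS F
JOIN Participates_in AS P ON F.FacID = P.FacID
JOIN Activity AS A ON P.actid = A.actid
WHERE F.Sex = 'M' AND F.Rank = 'Professor' AND A.name = 'Soccer';
"""
print(conn.execute(sql).fetchall())  # [('John', 'Smith')]
```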
Key real-world value:
- Enterprise operations: Marketing teams can query performance data directly.
- Smart education: Students can ask natural questions to query knowledge bases.
- Public services: Citizens can quickly check social insurance or housing fund records.
---
Core Deployment Challenges
- Low efficiency with wide tables
  - Feeding all column metadata into the prompt overwhelms the LLM's context window and drives up cost and latency (a rough sketch of this blow-up follows the list).
- Poor adaptability in low-resource settings
  - Reliance on large labeled datasets hurts generalization when the data distribution shifts.
- Limited interpretability
  - Prior methods either skip intermediate reasoning steps or fragment the workflow into too many micro-tools.
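As a rough, back-of-the-envelope illustration (not a measurement from the paper), the snippet below compares a prompt that serializes every column of a hypothetical 300-column table against one that includes only the handful of retrieved columns; the column count and the 4-characters-per-token estimate are assumptions.

```python
# Rough illustration of the wide-table problem (all numbers are assumptions).
def schema_prompt(columns):
    """Serialize column metadata the way a single-shot prompt typically would."""
    return "\n".join(f"Column `{c}`: type TEXT, sample values: ..." for c in columns)

wide_table = [f"col_{i:03d}" for i in range(300)]       # a wide business table
retrieved  = ["Fname", "Lname", "Sex", "Rank", "name"]  # columns the question needs

estimate_tokens = lambda text: len(text) // 4           # crude ~4 chars/token rule
print("full schema prompt :", estimate_tokens(schema_prompt(wide_table)), "tokens")
print("retrieved columns  :", estimate_tokens(schema_prompt(retrieved)), "tokens")
```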
---
🎯 Interactive-T2S Framework
A multi-turn, tool-chain approach that treats the LLM as an intelligent query agent; a minimal code sketch of the four tools follows the list below.
Four Core Tools
- SearchColumn – Semantic search for relevant columns; returns their statistical features and sample values.
  - Example: Locate `Faculty.Fname` and `Faculty.Lname` for “professor name.”
- SearchValue – BM25 search for cell values in columns and tables.
  - Example: Find “Soccer” in `Activity.name`.
- FindShortestPath – Computes the shortest path over the schema graph to determine the join path between columns.
  - Example: Identify the relationship chain between the professor and activity tables.
- ExecuteSQL – Runs the LLM-generated SQL, returns the results, and allows the model to correct errors.
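How these tools might look in code is sketched below. This is a minimal reconstruction under stated assumptions, not the authors' implementation: the paper's semantic column search and BM25 value search are stood in for by simple fuzzy string matching, the schema graph is a hand-built foreign-key adjacency map traversed with BFS, and every table, column, and function name here is hypothetical.

```python
from collections import deque
from difflib import SequenceMatcher

# Toy schema metadata and foreign-key graph (all names are assumptions for illustration).
SCHEMA = {
    "Faculty":         ["FacID", "Fname", "Lname", "Sex", "Rank"],
    "Activity":        ["actid", "name"],
    "Participates_in": ["FacID", "actid"],
}
FK_EDGES = {  # which tables can be joined directly
    "Faculty": ["Participates_in"],
    "Participates_in": ["Faculty", "Activity"],
    "Activity": ["Participates_in"],
}
CELL_VALUES = {("Activity", "name"): ["Soccer", "Chess", "Baseball"]}

def search_column(keyword: str, top_k: int = 3) -> list[str]:
    """Stand-in for SearchColumn: rank `table.column` names by fuzzy similarity
    to the keyword (the paper uses semantic search plus column statistics)."""
    candidates = [f"{t}.{c}" for t, cols in SCHEMA.items() for c in cols]
    score = lambda name: SequenceMatcher(None, keyword.lower(), name.lower()).ratio()
    return sorted(candidates, key=score, reverse=True)[:top_k]

def search_value(keyword: str, threshold: float = 0.6) -> list[tuple[str, str, str]]:
    """Stand-in for SearchValue: fuzzy-match a keyword against stored cell values
    (the paper uses BM25) and report which table/column contains them."""
    ratio = lambda a, b: SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return [(t, c, v) for (t, c), vals in CELL_VALUES.items()
            for v in vals if ratio(keyword, v) >= threshold]

def find_shortest_path(src: str, dst: str) -> list[str]:
    """Stand-in for FindShortestPath: BFS over the foreign-key graph to get the join path."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in FK_EDGES.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return []

def execute_sql(conn, sql: str):
    """Stand-in for ExecuteSQL: run the generated SQL and return rows, or the error
    message so the LLM can correct itself on the next turn."""
    try:
        return conn.execute(sql).fetchall()
    except Exception as exc:
        return f"SQL error: {exc}"

print(search_column("professor name"))            # the closest column names to the phrase
print(search_value("soccer"))                     # [('Activity', 'name', 'Soccer')]
print(find_shortest_path("Faculty", "Activity"))  # ['Faculty', 'Participates_in', 'Activity']
```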
---
🌀 Multi-Turn Logic Flow
Steps:
- Problem decomposition – Identify the columns and values the question needs.
- Information retrieval – The `SearchColumn` and `SearchValue` tools fetch the relevant schema elements and cell values.
- Path computation – `FindShortestPath` determines the join sequence.
- Execution & verification – `ExecuteSQL` runs the query, and the LLM corrects it based on the results if needed.
Only two labeled exemplars are required for effective few-shot prompting, as in the loop sketched below.
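Putting these steps together, here is a minimal sketch of the Think → Act → Observe cycle, reusing the toy tool functions from the previous sketch. The `llm` stub, the `Action:`/`Final:` reply convention, and the JSON argument encoding are assumptions standing in for the paper's actual prompting protocol.

```python
import json
import sqlite3

def llm(messages: list[dict]) -> str:
    """Placeholder for a chat-model call (e.g. any OpenAI-compatible client);
    it should return the model's next message as plain text."""
    raise NotImplementedError("plug in a real chat-completion client here")

# Map tool names to the toy implementations from the previous sketch.
DB = sqlite3.connect(":memory:")  # hypothetical database connection
TOOLS = {
    "SearchColumn":     lambda arg: search_column(arg),
    "SearchValue":      lambda arg: search_value(arg),
    "FindShortestPath": lambda arg: find_shortest_path(*arg),
    "ExecuteSQL":       lambda arg: execute_sql(DB, arg),
}

SYSTEM_PROMPT = (
    "Answer the question by decomposing it and calling the database tools. "
    "Reply with either 'Action: <ToolName> <json-argument>' or 'Final: <SQL>'."
)

def run_agent(question: str, exemplars: list[dict], max_turns: int = 10):
    """Think -> Act -> Observe loop; `exemplars` holds the two worked examples
    that the paper reports are enough for few-shot prompting."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}, *exemplars,
                {"role": "user", "content": question}]
    for _ in range(max_turns):
        reply = llm(messages)                              # Think: reason and pick an action
        messages.append({"role": "assistant", "content": reply})
        if reply.startswith("Final:"):                     # the model commits to a SQL query
            return reply.removeprefix("Final:").strip()
        tool, _, arg_json = reply.removeprefix("Action:").strip().partition(" ")
        observation = TOOLS[tool](json.loads(arg_json))    # Act: call the chosen tool
        messages.append({"role": "user",                   # Observe: feed the result back
                         "content": f"Observation: {observation}"})
    return None  # gave up after max_turns
```

Feeding execution errors back to the model as observations is what lets the final `ExecuteSQL` step double as a self-correction loop rather than a one-shot gamble.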
---
🧪 Experimental Results
Evaluated on Spider, BIRD, and their variants (Spider-DK, Spider-Syn, Spider-Realistic, BIRD-FinC) under knowledge-free conditions.
- Execution Accuracy (EX):
  - BIRD-Dev: 54.56% vs. the previous best 51.69% (+2.87 points)
  - BIRD-FinC: 49.06% vs. zero-shot 31.13% and DIN-SQL 47.17%
- Efficiency Gains:
  - Spider-Dev token usage: 4.6k vs. DIN-SQL's 12.8k
  - BIRD-Dev token usage: 4.7k vs. 21.6k
- Few-Shot Strength:
  - 2-shot EX: 78.7% (Spider-Syn), 80.7% (Spider-Realistic)
  - Comparable to methods that need 6–7 examples
- Multi-Table Join Advantage:
  - Removing `FindShortestPath` drops EX by 22% (Spider-150) and 12% (BIRD-150) on complex joins.
---
💡 Application Scenarios
- Smart Education: Natural language queries for multi-table join insights.
- Enterprise Analytics: Handle wide business tables quickly.
- Government Data: Allow public to query open datasets effortlessly.
---
🚀 Future Directions
- Optimize tool efficiency (faster graph search in `FindShortestPath`)
- Extend to multi-modal queries (text + table data)
---
🌐 Related Ecosystem — AiToEarn
Platforms like AiToEarn use a similar tool-based modular design — but for AI-powered content monetization.
Features:
- AI content generation
- Simultaneous publishing to Douyin, Kwai, WeChat, Bilibili, Xiaohongshu, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X (Twitter)
- Analytics and AI model rankings
- Open-source: GitHub
---
📅 Conference Highlight
QCon Shanghai: Oct 23–25 — over 95 topics.
Contact ticket manager: 18514549229.

---
📚 Recommended Reading
- America’s DeepSeek raises $2B with ex-Google founders
- Sam Altman on WeChat-like super apps
- “Your Agent? I can build it in a weekend!” Cursor vs DeepSeek
- Altman’s OS ambition faces 3 bottlenecks
- Dropout wave hitting 19–20 year olds in AI startups

---
In today’s AI race, speed, adaptability, and innovation are the crucial moats that let startups hold their own against tech giants.
Platforms like AiToEarn empower creators with global reach and monetization in this fast-paced era.