This Popular Method Is Actually a Big Trap! How to Solve the Redis Big Key Problem?

![image](https://blog.aitoearn.ai/content/images/2025/11/img_001-311.jpg)

# Golang in Practice  
## An Offline Redis Large Key Parsing Solution Based on RDB Files

Large keys in Redis can act as hidden bombs that threaten **cluster stability**.  
After reviewing Redis large-key analysis methods—including `SCAN`-based discovery and the official `redis-cli --bigkeys`—I ultimately chose an **offline RDB-based parsing** approach.  

**Advantages:**
- No impact on online clusters
- Precise memory usage statistics for all keys
- Ideal for **non-real-time** detection (for real-time, use slow query systems)

Recently, I implemented this approach in **Golang**.  
This guide details the **requirements, technical design, and production deployment**.

---

## 1. Why Build This Tool?

Existing tools (like Python’s `redis-rdb-tools`) have two key drawbacks:

1. **Performance bottlenecks**  
   Parsing large RDB files (e.g., 20GB) can take ~1 hour. In Ops, we often need **batch parsing across clusters**, so speed matters.

2. **Insufficient customization**  
   Most tools output generic formats that require extra scripting to integrate with internal Ops tooling (e.g., classifying large keys by business line, syncing results to Grafana dashboards).

**Why Golang?**  
- Concurrency advantages
- High performance
- Standalone binary deployment via compilation

**Goals:** **Fast parsing**, **customizable output**, **easy integration**.

---

## 2. Golang RDB Parsing Approach

### Understand the RDB Binary Format
An RDB file is a binary snapshot of the dataset. Alongside its header and trailing checksum, it stores:
- Database selection markers  
- Key-value data  
- Key expiration times  

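Before decoding any entries, a parser first validates the file header. Here is a minimal sketch using only the standard library (the helper name `checkRDBHeader` is illustrative, not part of any parsing library) that checks the `REDIS` magic string and the four-digit ASCII version every RDB file begins with:

```go
import (
	"fmt"
	"io"
	"os"
)

// checkRDBHeader reads the 9-byte RDB header: the 5-byte magic "REDIS"
// followed by a 4-digit ASCII version number (e.g. "REDIS0011").
func checkRDBHeader(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	header := make([]byte, 9)
	if _, err := io.ReadFull(f, header); err != nil {
		return "", fmt.Errorf("read header: %w", err)
	}
	if string(header[:5]) != "REDIS" {
		return "", fmt.Errorf("not an RDB file: bad magic %q", header[:5])
	}
	return string(header[5:]), nil // the version digits, e.g. "0011"
}
```
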
With the format in mind, the parsing work breaks down into the steps below.

---

### Step 1 — Choose a Parsing Library
Writing an RDB parser by hand means handling every data type and compression format, which is time-consuming.

From the Golang ecosystem, I chose [`github.com/HDT3213/rdb`](https://github.com/HDT3213/rdb):
- **Lightweight**
- **Well-documented**
- Supports the Redis 6.0+ RDB format

**Note:** This library didn’t meet all export requirements → I **forked & patched** it, using `replace` directives to point dependencies to my custom version.
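
For reference, the `replace` wiring in `go.mod` looks like this (module path, fork path, and versions below are placeholders for illustration):

```
module example.com/rdb-bigkey

go 1.21

require github.com/HDT3213/rdb v1.0.0

// Point the dependency at the patched fork instead of upstream.
replace github.com/HDT3213/rdb => github.com/your-org/rdb v1.0.1
```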

---

## 3. Workflow: RDB → Large Key Detection

**High-level process:**
1. Backup RDB files
2. Read RDB files
3. Parse key-value + memory usage
4. Filter by size threshold
5. Output results

---

### **Step 1**: Read RDB & Initialize Parser

```go
func MyFindBiggestKeys(rdbFilename string, output chan<- RedisData, options ...interface{}) error {
	if rdbFilename == "" {
		return errors.New("src file path is required")
	}

	rdbFile, err := os.Open(rdbFilename)
	if err != nil {
		return fmt.Errorf("open rdb %s failed, %v", rdbFilename, err)
	}
	defer rdbFile.Close()

	// Wrap the base decoder with any configured options (filters, etc.).
	var dec decoder = core.NewDecoder(rdbFile)
	if dec, err = wrapDecoder(dec, options...); err != nil {
		return err
	}

	// Stream every parsed object into the output channel; returning
	// false from the callback aborts the parse.
	var sendErr error
	err = dec.Parse(func(object model.RedisObject) bool {
		data := RedisData{Data: object}
		select {
		case output <- data:
			return true
		case <-time.After(5 * time.Second):
			// Keep the timeout in its own variable so Parse's return
			// value cannot silently overwrite it.
			sendErr = errors.New("send to output channel timeout")
			return false
		}
	})
	if sendErr != nil {
		return sendErr
	}
	if err != nil {
		return fmt.Errorf("parse rdb failed: %w", err)
	}
	return nil
}
```

**Key point:** Parsed objects are streamed through a channel, so multiple RDB files can be parsed in parallel and consumed by a single downstream loop.

---

### **Step 2**: Parse & Identify Large Keys

```go
for data := range ch {
	// Log decode errors without aborting the whole scan.
	if data.Err != nil {
		slog.Error("data error", "error", data.Err)
		continue
	}
	// Skip keys at or below the configured size threshold.
	if uint64(data.Data.GetSize()) <= *size {
		continue
	}
	// Store or process the large key.
}
```

**Logic:** Compare parsed key size against the configured threshold.
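
For context, `size` here is a command-line flag holding the byte threshold. A minimal sketch of how the flag and channel might be wired together (flag name, default value, and buffer size are illustrative, not taken from the original tool):

```go
import (
	"flag"
	"log/slog"
)

// Keys larger than this many bytes are reported; 10 MB is an illustrative default.
var size = flag.Uint64("size", 10*1024*1024, "large key threshold in bytes")

func main() {
	flag.Parse()

	// A buffered channel decouples the decoder from the consumer loop above.
	ch := make(chan RedisData, 1024)
	go func() {
		defer close(ch)
		if err := MyFindBiggestKeys("dump.rdb", ch); err != nil {
			slog.Error("parse failed", "error", err)
		}
	}()

	// ... consume ch with the filtering loop shown above.
}
```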

---

### **Step 3**: Output Results
Insert large key records into a database →  
Integrate with Grafana or Ops dashboard to visualize and track trends.

```go
// Persist one large-key record for later visualization.
rediskey := &models.RedisKey{
	JobID:     jobID,
	RedisName: redisName,
	Key:       data.Data.GetKey(),
	Type:      data.Data.GetType(),
	Size:      int64(data.Data.GetSize()),
	CreatedAt: time.Now(),
}
if err := b.ResultV1().CreateTaskResult(ctx, rediskey); err != nil {
	slog.Error("operation failed", "err", err, "key", data.Data.GetKey())
}
```


---

## 4. Performance Optimization

### Optimization 1 — Concurrent Parsing
Parse each Redis shard's RDB file in its own goroutine and use `sync.WaitGroup` for completion control, as sketched below.
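
A minimal sketch of this fan-in pattern, reusing `MyFindBiggestKeys` from Step 1 (the helper name `parseShards` is illustrative):

```go
import (
	"log/slog"
	"sync"
)

// parseShards parses every shard's RDB file concurrently and fans all
// parsed objects into a single output channel.
func parseShards(rdbFiles []string, out chan RedisData) {
	var wg sync.WaitGroup
	for _, f := range rdbFiles {
		wg.Add(1)
		go func(file string) {
			defer wg.Done()
			if err := MyFindBiggestKeys(file, out); err != nil {
				slog.Error("parse shard failed", "file", file, "error", err)
			}
		}(f)
	}
	// Close the channel only after every shard finishes, so the
	// consumer's range loop terminates cleanly.
	go func() {
		wg.Wait()
		close(out)
	}()
}
```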

### Optimization 2 — Reduce Memory Usage
Avoid holding all parsed keys in RAM:
- **Stream parsing:** send large keys directly to channel/DB as they’re found.

---

## 5. Engineering the Tool

### **Compile to Binary**
Example Makefile targets:

```makefile
run:
	@echo "Running program ..."
	go run $(MAIN_FILE) -c configs/rdb-server.yaml

build-linux:
	GOOS=linux GOARCH=amd64 go build -o $(OUTPUT_DIR)/tool-linux-amd64 $(MAIN_FILE)

build-windows:
	GOOS=windows GOARCH=amd64 go build -o $(OUTPUT_DIR)/tool-windows-amd64.exe $(MAIN_FILE)
```

Run the Linux build:

```bash
./rdb-bigkey-linux-amd64 -c configs/rdb-server.yaml
```


---

### **Scheduled Tasks**
- **Daily off-peak cron job:** Back up RDB files, trigger parsing automatically, and store the results.
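
For example, a crontab entry along these lines (script path and schedule are illustrative) runs the pipeline nightly at 03:00:

```
# Back up RDB files and trigger parsing daily at 03:00, off-peak.
0 3 * * * /opt/rdb-bigkey/backup-and-parse.sh >> /var/log/rdb-bigkey.log 2>&1
```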

---

### **Integration**
Use Grafana for visualization:  
- Daily large key count trends  
- Distribution by business line

---

## 6. Pitfalls & Solutions

**Pitfall 1 — RDB Format Incompatibility**  
- Issue: RDB files from Redis versions before 6.0 failed to parse.  
- Solution: Use a library that supports multiple RDB versions, or extend the parser yourself.

**Pitfall 2 — Excessive Memory Usage**  
- Issue: Parsing a 20GB RDB consumed over 40GB of RAM.  
- Solution: Enable streaming parsing and release memory after each database finishes parsing.

---

## 7. Summary & Roadmap

**Results:**  
- 3–5× faster parsing than the Python tooling  
- A 20GB RDB now parses in ~15 min (vs ~60 min before)  
- Supports custom thresholds & output formats

**Planned Features:**
- Auto business-line classification by key prefix
- Large key growth trend alerts


---

**Next Steps:**  
If you're also tackling Redis large-key governance, use this workflow as a starting point: optimize it for your environment, integrate it with your monitoring, and keep the parsing load off production.
