This Popular Method Is Actually a Big Trap! How to Solve the Redis Big Key Problem?

# Golang in Practice
## An Offline Redis Large Key Parsing Solution Based on RDB Files
Large keys in Redis are ticking time bombs for **cluster stability**.
After reviewing Redis large-key analysis methods—including `SCAN`-based discovery and the official `redis-cli --bigkeys`—I ultimately chose an **offline RDB-based parsing** approach.
**Advantages:**
- No impact on online clusters
- Precise memory usage statistics for all keys
- Ideal for **non-real-time** detection (for real-time needs, pair it with slow-query monitoring)
Recently, I implemented this approach in **Golang**.
This guide details the **requirements, technical design, and production deployment**.
---
## 1. Why Build This Tool?
Existing tools (like Python’s `redis-rdb-tools`) have two key drawbacks:
1. **Performance bottlenecks**
Parsing large RDB files (e.g., 20GB) can take ~1 hour. In Ops, we often need **batch parsing across clusters**, so speed matters.
2. **Insufficient customization**
Most emit generic output formats that require extra scripting to integrate with internal Ops tooling (e.g., classifying large keys by business line, syncing results to Grafana dashboards).
**Why Golang?**
- Concurrency advantages
- High performance
- Compiles to a single standalone binary for deployment
**Goals:** **Fast parsing**, **customizable output**, **easy integration**.
---
## 2. Golang RDB Parsing Approach
### Understand RDB Binary Format
Redis RDB files store:
- Database selection markers
- Key-value data
- Key expiration times
Every RDB file begins with the ASCII magic `REDIS` and a four-digit version number, followed by opcodes (database selectors, expiration markers) and length-encoded objects.
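As a quick sanity check, here's a minimal sketch (my own warm-up example, not part of the final tool) that validates the header before handing the file to a real parser:

```go
import (
	"fmt"
	"io"
	"os"
)

// checkRDBHeader validates the 9-byte RDB preamble: the ASCII magic
// "REDIS" followed by a four-digit version, e.g. "REDIS0011".
func checkRDBHeader(path string) (version string, err error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	header := make([]byte, 9)
	if _, err := io.ReadFull(f, header); err != nil {
		return "", fmt.Errorf("read header: %w", err)
	}
	if string(header[:5]) != "REDIS" {
		return "", fmt.Errorf("not an RDB file: magic %q", header[:5])
	}
	return string(header[5:]), nil
}
```

Everything after the header is what the parsing library handles.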
---
### Step 1 — Choose a Parsing Library
Manually writing an RDB parser means handling multiple data types and compression formats—time-consuming.
Surveying the Golang ecosystem, I chose [`github.com/HDT3213/rdb`](https://github.com/HDT3213/rdb):
- **Lightweight**
- **Well-documented**
- Supports Redis 6.0+ RDB format
**Note:** This library didn't cover all of my export requirements, so I **forked & patched** it, then used a `replace` directive in `go.mod` to point the dependency at the custom version.
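The wiring looks roughly like this (the fork path and versions below are placeholders, not my actual fork; note the upstream module path is lowercase `github.com/hdt3213/rdb`):

```
// go.mod — fork path and versions are illustrative
require github.com/hdt3213/rdb v1.0.0

replace github.com/hdt3213/rdb => github.com/my-org/rdb v1.0.0-patched
```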
---
## 3. Workflow: RDB → Large Key Detection
**High-level process:**
1. Backup RDB files
2. Read RDB files
3. Parse key-value + memory usage
4. Filter by size threshold
5. Output results
---
### **Step 1**: Read RDB & Initialize Parser

```go
import (
	"errors"
	"fmt"
	"os"
	"time"

	"github.com/hdt3213/rdb/core"
	"github.com/hdt3213/rdb/model"
)

// RedisData carries one parsed object (or an error) through the pipeline.
type RedisData struct {
	Data model.RedisObject
	Err  error
}

func MyFindBiggestKeys(rdbFilename string, output chan<- RedisData, options ...interface{}) error {
	if rdbFilename == "" {
		return errors.New("src file path is required")
	}
	rdbFile, err := os.Open(rdbFilename)
	if err != nil {
		return fmt.Errorf("open rdb %s failed, %v", rdbFilename, err)
	}
	defer rdbFile.Close()

	// decoder and wrapDecoder are carried over from the library's helper
	// package (kept in my fork); wrapDecoder applies the optional filters
	// passed in via options.
	var dec decoder = core.NewDecoder(rdbFile)
	if dec, err = wrapDecoder(dec, options...); err != nil {
		return err
	}

	// Stream each parsed object to the output channel; returning false
	// from the callback aborts parsing.
	var sendErr error
	err = dec.Parse(func(object model.RedisObject) bool {
		data := RedisData{Data: object}
		select {
		case output <- data:
			return true // keep parsing
		case <-time.After(5 * time.Second):
			// Capture in sendErr: assigning to err here would be
			// clobbered by Parse's own return value below.
			sendErr = errors.New("send to output channel timeout")
			return false // abort parsing
		}
	})
	if err == nil {
		err = sendErr
	}
	if err != nil {
		return fmt.Errorf("parse rdb failed: %w", err)
	}
	return nil
}
```
**Key:** the parser streams each object through a channel, so several RDB files can be parsed in parallel by independent producer goroutines (see the sketch below).
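A minimal single-file usage sketch (the path and buffer size are illustrative): the producer closes the channel so the consumer loop in Step 2 can terminate, and surfaces failures through the `Err` field:

```go
ch := make(chan RedisData, 1024)

go func() {
	defer close(ch) // lets the consumer's range loop exit
	if err := MyFindBiggestKeys("/data/backup/redis-6379.rdb", ch); err != nil {
		ch <- RedisData{Err: err} // surface the failure to the consumer
	}
}()
```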
---
### **Step 2**: Parse & Identify Large Keys

```go
for data := range ch {
	if data.Err != nil {
		slog.Error("data error", "error", data.Err)
		continue
	}
	// size is the configured threshold in bytes (see the flag sketch below).
	if uint64(data.Data.GetSize()) <= *size {
		continue // not a large key
	}
	// Store or process large key
}
```
**Logic:** Compare parsed key size against the configured threshold.
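`size` in the loop above is assumed to be the pointer returned by a CLI flag, along these lines (the 10 MiB default is illustrative):

```go
import "flag"

// size holds the large-key threshold in bytes.
var size = flag.Uint64("size", 10*1024*1024, "large-key threshold in bytes")
```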
---
### **Step 3**: Output Results
Insert large key records into a database →
Integrate with Grafana or Ops dashboard to visualize and track trends.
```go
// Persist each large key so dashboards can query it later.
rediskey := &models.RedisKey{
	JobID:     jobID,
	RedisName: redisName,
	Key:       data.Data.GetKey(),
	Type:      data.Data.GetType(),
	Size:      int64(data.Data.GetSize()),
	CreatedAt: time.Now(),
}
if err := b.ResultV1().CreateTaskResult(ctx, rediskey); err != nil {
	slog.Error("create task result failed", "err", err, "key", data.Data.GetKey())
}
```
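`models.RedisKey` belongs to my internal Ops service; a minimal sketch of what the record might look like (field types are inferred from the usage above):

```go
import "time"

// RedisKey is one large-key record; ORM/SQL mapping omitted for brevity.
type RedisKey struct {
	JobID     string    // parsing job that found the key
	RedisName string    // cluster/shard the key lives in
	Key       string
	Type      string    // string/list/hash/set/zset...
	Size      int64     // serialized size in bytes, from the RDB
	CreatedAt time.Time
}
```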
---
## 4. Performance Optimization
### Optimization 1 — Concurrent Parsing
Parse each Redis shard's RDB in its own goroutine and use `sync.WaitGroup` to know when all producers have finished (see the sketch below).
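A sketch of the fan-in pattern, assuming one RDB file per shard (`parseAllShards` is my illustrative name, reusing `MyFindBiggestKeys` and `RedisData` from Step 1); the channel is closed only after every producer is done:

```go
import "sync"

// parseAllShards parses every shard's RDB concurrently and fans
// results into a single channel for the Step 2 consumer loop.
func parseAllShards(files []string) <-chan RedisData {
	ch := make(chan RedisData, 1024)
	var wg sync.WaitGroup
	for _, f := range files {
		wg.Add(1)
		go func(file string) {
			defer wg.Done()
			if err := MyFindBiggestKeys(file, ch); err != nil {
				ch <- RedisData{Err: err} // surface failures to the consumer
			}
		}(f)
	}
	// Close the channel only after all shards are parsed, so the
	// consumer's range loop terminates cleanly.
	go func() {
		wg.Wait()
		close(ch)
	}()
	return ch
}
```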
### Optimization 2 — Reduce Memory Usage
Avoid holding all parsed keys in RAM:
- **Stream parsing:** send large keys directly to channel/DB as they’re found.
---
## 5. Engineering the Tool
### **Compile to Binary**
Example Makefile targets:

```makefile
run:
	@echo "Running program ..."
	go run $(MAIN_FILE) -c configs/rdb-server.yaml

build-linux:
	GOOS=linux GOARCH=amd64 go build -o $(OUTPUT_DIR)/tool-linux-amd64 $(MAIN_FILE)

build-windows:
	GOOS=windows GOARCH=amd64 go build -o $(OUTPUT_DIR)/tool-windows-amd64.exe $(MAIN_FILE)
```

Run the Linux binary with its config file:

```bash
./rdb-bigkey-linux-amd64 -c configs/rdb-server.yaml
```
---
### **Scheduled Tasks**
- **Daily off-peak cron:** back up RDB files from each shard, trigger parsing automatically, and store the results (an illustrative crontab entry follows).
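The schedule and paths below are assumptions, not the original setup:

```bash
# Run the parser at 03:30 every day, during off-peak hours.
30 3 * * * /opt/rdb-bigkey/rdb-bigkey-linux-amd64 -c /opt/rdb-bigkey/configs/rdb-server.yaml >> /var/log/rdb-bigkey.log 2>&1
```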
---
### **Integration**
Use Grafana for visualization:
- Daily large key count trends
- Distribution by business line
---
## 6. Pitfalls & Solutions
**Pitfall 1 — RDB Format Incompatibility**
- Issue: RDB files from Redis versions below 6.0 failed to parse.
- Solution: use a parsing library that supports multiple RDB versions, or extend the parser to handle the older encodings.
**Pitfall 2 — Excessive Memory Usage**
- Issue: parsing a 20GB RDB consumed over 40GB of RAM.
- Solution: switch to streaming parsing and release memory after each database is parsed.
---
## 7. Summary & Roadmap
**Results:**
- Parsing is 3–5× faster than the Python tools
- 20GB RDB parsed in ~15 min (vs ~60 min before)
- Supports custom thresholds & output formats
**Planned Features:**
- Auto business-line classification by key prefix
- Large key growth trend alerts

---
**Next Steps:**
If you're also tackling Redis large-key governance, use this workflow as a base:
optimize it for your environment, integrate it with your monitoring, and keep the load off your production clusters.

