If Redis Has 100 Million Keys and 100,000 Share a Known Prefix, How to Find Them All?
Redis Interview Question: Finding Keys by Prefix
This is a classic Redis interview question that tests not only command knowledge but also your understanding of Redis fundamentals, performance implications, and production best practices.
We’ll walk through the answers step-by-step — from the wrong answer, to the standard safe answer, and finally to the bonus architectural answer.
---
Level 1 — Wrong or Risky Answer: `KEYS`
The most direct approach is to run:

```
KEYS "your_prefix:*"
```

Why This Is a Wrong Answer
Using `KEYS` in Redis production environments is dangerous because:
- Single-thread model: Redis processes commands on a single thread. While a command executes, all other clients are blocked until it completes.
- Full database traversal: the `KEYS` command iterates over every key in the database. This can mean scanning hundreds of millions of keys.
- Production disaster risk: on large datasets, `KEYS` can stall Redis for seconds or even minutes, causing timeouts and cascading failures in dependent services.
Conclusion
Only use `KEYS` for debugging or very small datasets. It must never be run against a large-scale production instance.
---
Level 2 — Standard and Safe Answer: `SCAN`
Instead of `KEYS`, use `SCAN` — a non-blocking, incremental cursor-based iteration.
Example usage:

```
# First scan, starting from cursor 0
SCAN 0 MATCH "your_prefix:*" COUNT 1000

# Redis returns:
# 1) "1762"                  <-- cursor for the next scan
# 2) 1) "your_prefix:key1"
#    2) "your_prefix:key2"
#    ...

# Next scan, using the prior cursor
SCAN 1762 MATCH "your_prefix:*" COUNT 1000

# Loop until Redis returns cursor "0", meaning the iteration is complete
```

Why This Is the Standard Answer
- Non-blocking: retrieves small batches, freeing Redis for other operations between calls.
- Cursor-based iteration: continuation is handled by passing the returned cursor back in, avoiding a full scan in one hit.
- COUNT hint: controls batch size to balance throughput and responsiveness (the actual number of keys returned may vary).
Implementation tip
In application code, loop `SCAN` until the returned cursor is `"0"`. Collect and process results within each iteration.
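That loop can be sketched in Python. The `FakeRedis` class below is a hypothetical in-memory stand-in that mimics the SCAN contract (`scan(cursor, match, count)` returning a `(next_cursor, batch)` pair, with cursor `0` signalling completion); with the real redis-py client, `r.scan(cursor=..., match=..., count=...)` has the same shape, so `scan_prefix` would work unchanged against a live server.

```python
import fnmatch

class FakeRedis:
    """In-memory stand-in mimicking Redis's SCAN contract:
    scan(cursor, match, count) -> (next_cursor, batch); cursor 0 = done."""
    def __init__(self, keys):
        self._keys = sorted(keys)

    def scan(self, cursor=0, match="*", count=10):
        # Return the next slice of keys, filtered by the MATCH pattern.
        batch = [k for k in self._keys[cursor:cursor + count]
                 if fnmatch.fnmatch(k, match)]
        next_cursor = cursor + count
        if next_cursor >= len(self._keys):
            next_cursor = 0  # iteration complete
        return next_cursor, batch

def scan_prefix(client, prefix, count=1000):
    """Loop SCAN until the cursor comes back as 0, collecting matches."""
    cursor, found = 0, []
    while True:
        cursor, batch = client.scan(cursor=cursor,
                                    match=prefix + "*", count=count)
        found.extend(batch)
        if cursor == 0:
            break
    return found

r = FakeRedis([f"your_prefix:key{i}" for i in range(5)] + ["other:key"])
print(scan_prefix(r, "your_prefix:", count=2))  # prints the five matching keys
```

Note that each `scan` call only touches a small slice of the keyspace, which is exactly why the pattern stays non-blocking on a single-threaded server.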
---
Level 3 — Better Architecture Design (Bonus)
Many interviewers want to see whether you can avoid scanning entirely by designing better data structures.
Approach 1 — Maintain an Index (Set or Hash)
Write time

When saving a new key:

```
SET "your_prefix:123" "some_value"
SADD "index:your_prefix" "123"
```

Read time

Retrieve IDs directly from the index:

```
SMEMBERS "index:your_prefix"
```

Returns suffixes like `"123"`, `"456"`, etc., with O(N) complexity where `N` = index size, not total DB size.
Delete time

Remove from both the DB and the index:

```
DEL "your_prefix:123"
SREM "index:your_prefix" "123"
```

Pros
- Extremely fast reads — no scans needed.
Cons
- Slightly more complex writes/deletes.
- Extra memory for storing indexes.
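The whole pattern can be sketched with an in-memory dict and set standing in for Redis; the comments show the corresponding Redis commands. (With redis-py these would be `r.set`, `r.sadd`, `r.smembers`, `r.delete`, and `r.srem`; the helper names below are illustrative.)

```python
db = {}        # stands in for the main keyspace
index = set()  # stands in for the Set at "index:your_prefix"

def save(suffix, value):
    db[f"your_prefix:{suffix}"] = value  # SET "your_prefix:<id>" <value>
    index.add(suffix)                    # SADD "index:your_prefix" "<id>"

def read_all():
    # SMEMBERS "index:your_prefix" -> O(index size), no keyspace scan
    return {f"your_prefix:{s}": db[f"your_prefix:{s}"] for s in index}

def delete(suffix):
    db.pop(f"your_prefix:{suffix}", None)  # DEL "your_prefix:<id>"
    index.discard(suffix)                  # SREM "index:your_prefix" "<id>"

save("123", "a")
save("456", "b")
delete("123")
print(read_all())  # {'your_prefix:456': 'b'}
```

One caveat worth mentioning in an interview: `DEL` and `SREM` are two separate commands, so wrap them in a `MULTI`/`EXEC` transaction or a Lua script if the index must never drift from the keyspace.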
---
Approach 2 — Operate on a Replica Node
For low-frequency or offline analytics:
- Use `SCAN` (or even `KEYS`) on a replica node instead of the master.
- This isolates costly queries from production traffic, preventing user impact.
---
Interview Answer Summary
When asked this question, respond with:
- Never use `KEYS` in production — it blocks Redis and can cause outages.
- Use `SCAN` for temporary needs — non-blocking, safe, iterative scanning.
- For frequent lookups — maintain an index in a Set or Hash to avoid scanning entirely.
---
Final takeaway:
✅ Use `SCAN` for safe ad-hoc key listing
✅ Maintain indexes for frequent lookups
✅ Consider workload separation (replica nodes)