If Redis Has 100 Million Keys and 100,000 Share a Known Prefix, How to Find Them All?

Redis Interview Question: Finding Keys by Prefix

This is a classic Redis interview question that tests not only command knowledge but also your understanding of Redis fundamentals, performance implications, and production best practices.

We’ll walk through the answers step-by-step — from the wrong answer, to the standard safe answer, and finally to the bonus architectural answer.

---

Level 1 — Wrong or Risky Answer: `KEYS`

The most direct approach is to run:

KEYS "your_prefix:*"

Why This Is a Wrong Answer

Using `KEYS` in Redis production environments is dangerous because:

  • Single-thread model: Redis executes commands on a single thread, so while one command runs, every other client waits until it completes.
  • Full database traversal: `KEYS` is O(N) in the total number of keys, not the number of matches. Here it walks all 100 million keys to return 100,000.
  • Production disaster risk: On large datasets, `KEYS` can stall Redis for seconds or minutes, causing timeouts and cascading failures in dependent services.

Conclusion

Only use `KEYS` for debugging or very small datasets. It must never be run against a large-scale production instance.

---

Level 2 — Standard and Safe Answer: `SCAN`

Instead of `KEYS`, use `SCAN`, which iterates the keyspace incrementally with a cursor. Each call does a small, bounded amount of work, so other clients are served between calls.

Example usage:

# First scan, starting from cursor 0
SCAN 0 MATCH "your_prefix:*" COUNT 1000

# Redis returns:
# 1) "1762"  <-- Cursor for next scan
# 2) 1) "your_prefix:key1"
#    2) "your_prefix:key2"
#    ...

# Next scan, using prior cursor
SCAN 1762 MATCH "your_prefix:*" COUNT 1000

# Loop until Redis returns cursor "0", meaning completion

Why This Is the Standard Answer

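  • Non-blocking in practice: Each call retrieves a small batch and returns, so Redis can serve other clients between calls.
  • Cursor-based iteration: Pass the cursor returned by each call into the next one. The full traversal is spread over many short calls instead of one long one.
  • COUNT hint: Sets roughly how many keys Redis examines per call, balancing throughput against responsiveness. `MATCH` is applied after the keys are examined, so a call may return fewer matches than `COUNT`, or none at all, while the cursor is still nonzero.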

Implementation tip

In application code, loop `SCAN` until the returned cursor is `"0"`, collecting or processing the results of each iteration. `SCAN` may return the same key more than once, so deduplicate if that matters.
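
The loop described above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes a client object whose `scan(cursor, match, count)` method returns a `(next_cursor, keys)` pair, which is how the redis-py client behaves, and the function name `scan_prefix` is made up for illustration. It also assumes the prefix contains no glob characters such as `*` or `?`.

```python
def scan_prefix(client, prefix, count=1000):
    """Return every key starting with `prefix`, using repeated SCAN calls.

    `client` is assumed to expose scan(cursor=, match=, count=) returning
    (next_cursor, keys), as redis-py does. `scan_prefix` is a hypothetical helper.
    """
    pattern = prefix + "*"
    found = set()  # SCAN may return a key more than once, so deduplicate
    cursor = 0
    while True:
        # Each call examines roughly `count` keys and returns only the matches,
        # so `batch` can be empty even though the iteration is not finished.
        cursor, batch = client.scan(cursor=cursor, match=pattern, count=count)
        found.update(batch)
        if cursor == 0:  # cursor 0 means the full iteration is complete
            break
    return found
```

With redis-py this would be called as `scan_prefix(redis.Redis(decode_responses=True), "your_prefix:")`. For 100,000 matches, collecting into a set is reasonable; for much larger result sets, processing each batch inside the loop keeps memory flat.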

---

Level 3 — Better Architecture Design (Bonus)

Many interviewers want to see whether you can avoid scanning entirely by designing better data structures.

Approach 1 — Maintain an Index (Set or Hash)

Write time

When saving a new key:

SET "your_prefix:123" "some_value"
SADD "index:your_prefix" "123"

Read time

Retrieve IDs directly from the index:

SMEMBERS "index:your_prefix"

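This returns the suffixes (`"123"`, `"456"`, and so on) in O(N) time, where N is the size of the index (100,000 here), not the size of the whole database (100 million). For a very large index, `SSCAN` iterates the set in batches the same way `SCAN` iterates the keyspace.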

Delete time

Remove from both DB and index:

DEL "your_prefix:123"
SREM "index:your_prefix" "123"

Pros

  • Extremely fast reads — no scans needed.

Cons

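  • Slightly more complex writes and deletes: the key and its index entry must be updated together (for example inside `MULTI`/`EXEC`), or the index drifts out of sync.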
  • Extra memory for storing indexes.
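
The write, read, and delete steps of Approach 1 could be wrapped like this in Python. It is a sketch under stated assumptions: the client is assumed to behave like redis-py created with `decode_responses=True` (so set members come back as strings), the helper names `save`, `remove`, `list_keys`, and `index_key` are hypothetical, and the `index:<prefix>` naming simply follows the commands above.

```python
def index_key(prefix):
    # Hypothetical naming convention, mirroring "index:your_prefix" above.
    return "index:" + prefix

def save(client, prefix, item_id, value):
    """Write the value and register its ID in the index together."""
    pipe = client.pipeline(transaction=True)  # redis-py wraps this in MULTI/EXEC
    pipe.set(f"{prefix}:{item_id}", value)
    pipe.sadd(index_key(prefix), item_id)
    pipe.execute()

def remove(client, prefix, item_id):
    """Delete the value and drop its ID from the index together."""
    pipe = client.pipeline(transaction=True)
    pipe.delete(f"{prefix}:{item_id}")
    pipe.srem(index_key(prefix), item_id)
    pipe.execute()

def list_keys(client, prefix):
    """Rebuild full key names from the IDs stored in the index."""
    # Assumes string members (decode_responses=True); bytes would need decoding.
    return {f"{prefix}:{item_id}" for item_id in client.smembers(index_key(prefix))}
```

Grouping both commands in one transaction keeps the key and its index entry from diverging if the application crashes between the two writes. Keys that disappear through a TTL are not covered by this sketch and would leave stale IDs in the index.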

---

Approach 2 — Operate on a Replica Node

For low-frequency or offline analytics:

  • Use `SCAN` (or even `KEYS`) on a replica node instead of the master.
  • This isolates costly queries from production traffic, preventing user impact. Replication is asynchronous, so the result may lag slightly behind the master.

---

Interview Answer Summary

When asked this question, respond with:

  • Never use `KEYS` in production — it blocks Redis and can cause outages.
  • Use `SCAN` for temporary needs — non-blocking, safe, iterative scanning.
  • For frequent lookups — maintain an index in a Set or Hash to avoid scanning entirely.

---


Final takeaway:

✅ Use `SCAN` for safe ad-hoc key listing

✅ Maintain indexes for frequent lookups

✅ Consider workload separation (replica nodes)



By Honghao Wang