Airbnb’s Mussel V2: Next-Generation Key-Value Store Unifying Stream and Batch Processing

Airbnb’s Mussel V2: Next-Generation Key-Value Store Unifying Stream and Batch Processing

Airbnb Introduces Mussel v2: Next-Generation Key-Value Storage

Airbnb’s engineering team has launched Mussel v2, a complete overhaul of its internal key-value engine. This new system is designed to:

  • Unify streaming and bulk ingestion
  • Simplify operations
  • Scale to much larger workloads

Reportedly, Mussel v2 can:

  • Sustain 100,000+ streaming writes per second
  • Support tables beyond 100TB
  • Keep p99 read latency under 25ms
  • Bulk ingest tens of terabytes of data

All of this allows product teams to focus on innovation rather than on pipeline management.

---

Limitations of Mussel v1

The earlier Mussel v1 powered internal Airbnb data services but had key limitations:

  • Static hash partition design on Amazon EC2
  • Managed via Chef scripts
  • Separate batch and streaming paths → higher overhead, harder consistency
  • Increasing difficulty as data volume and integrations grew
image

Mussel V1 architecture (Image source: Airbnb Engineering Blog)

---

Design Innovations in Mussel v2

Mussel v2 integrates a NewSQL backend with a Kubernetes-native control plane, offering:

  • Elasticity of object storage
  • Low-latency caching responsiveness
  • Service mesh manageability

Key Features

  • Dynamic range sharding with pre-splitting to reduce hotspots
  • Namespace-level quotas and dashboards for cost transparency
  • Stateless Dispatcher layer that scales horizontally:
  • Routes API calls
  • Handles retries
  • Supports dual-write & shadow-read for migrations

---

Write and Bulk Ingestion Workflows

  • Writes
  • Persisted to Kafka for durability
  • Processed by Replayer & Write Dispatcher sequentially into the backend
  • Bulk loads
  • Via Airbnb’s data warehouse using Airflow jobs
  • S3 staging area
  • Maintains merge-or-replace semantics

---

Efficient Data Expiration

  • Topology-aware expiration service: shards namespaces into range-based tasks
  • Parallel deletion by multiple workers
  • Minimizes impact on real-time queries
  • For write-heavy tables: max version limits & targeted deletion
image

Mussel V2 architecture (Image source: Airbnb Engineering Blog)

---

Migration from v1 to v2

Migrating was challenging but achieved with zero downtime:

Approach

  • Table-granularity blue-green deployment
  • Continuous validation & rollback
  • Bootstrapping from backups & sampled data
  • Planned pre-splits
  • Checksum validation after bootstrap ingestion
  • Apply lagging Kafka events
  • Enable dual writes before cutover

---

Cutover Process

  • Gradually shift reads to v2
  • Use shadow traffic to monitor consistency
  • Roll back to v1 if error rates spike
  • Kafka remained the common log throughout

---

image

Data migration pipeline from Mussel V1 to V2 (Image source: Airbnb Engineering Blog)

---

Extra Operational Complexity

Migrating from eventual consistency to strong consistency required:

  • Write deduplication
  • Retry control
  • Query tuning & workload distribution
  • Automated rollback and monitoring
  • Per-table staging sequences

Result: 1PB+ migrated with no downtime.

---

Full article:

Airbnb’s Mussel V2: Next-Gen Key Value Storage to Unify Streaming and Bulk Ingestion

---

Broader Context: Scalability Lessons for Content Technology

Advanced architectures like Mussel v2 show the value of scalability and operational simplicity — principles equally important in content creation tech.

For creators aiming to leverage AI to produce, publish, and monetize across multiple platforms, open-source tools such as AiToEarn官网 offer:

  • Integrated AI content generation
  • Cross-platform publishing to Douyin, Kwai, WeChat, Bilibili, Rednote, Facebook, Instagram, LinkedIn, Threads, YouTube, Pinterest, X/Twitter
  • Analytics and AI model rankings
  • A pipeline that is transparent and efficient — paralleling Mussel v2’s goals

---

Would you like me to also create a side-by-side comparison table of Mussel v1 vs Mussel v2 to make the improvements even clearer?

Read more