How Disney Hotstar (Now JioHotstar) Scaled Its Infrastructure to Support 60 Million Concurrent Users
Disclaimer
The details in this post are based on publicly shared information by the Disney+ Hotstar (now JioHotstar) Engineering Team.
All technical credit belongs to them. Original articles and sources can be found in the References section at the end.
Our analysis builds upon their shared insights — if you find inaccuracies or omissions, please leave a comment for revisions.
---
Overview: A Record-Scale Streaming Challenge
In 2023, Disney+ Hotstar tackled one of the most ambitious engineering challenges in streaming history — supporting
50–60 million concurrent live streams during the Asia Cup and Cricket World Cup.
For perspective, earlier Hotstar peaks were around 25 million concurrent users handled on two self-managed Kubernetes clusters.
---
“Free on Mobile” — Raising the Stakes
Adding complexity, the platform introduced Free on Mobile streaming, removing subscription requirements for millions.
This drastically increased expected load and forced a fundamental infrastructure rethink.
---
Why Architecture Needed to Change
Simply adding more servers wasn’t viable. The team needed structural change to maintain:
- Reliability
- High speed
- Efficiency under unprecedented demand
---
Introducing the “X Architecture”
The solution was a server-driven “X architecture”, engineered for:
- Flexibility
- Scalability
- Cost-effectiveness
- Global operational efficiency
This reduced excessive dependence on brute-force scaling.
---
Technical Journey — From Gateways to Kubernetes Overhaul
The overhaul included:
- Network redesign
- API gateway optimization
- Migration to Amazon EKS
- Data Center Abstraction
All aimed at keeping millions of streams running smoothly in real time.
---
Architecture Overview
Disney+ Hotstar serves mobile (Android/iOS), web, and smart TV users. Traffic flows:
- User action → request sent to external API gateway via CDN
- CDN layer → performs security + traffic filtering → forwards to internal API gateway
- Internal gateway (ALBs) → load-balances across backend services
- Backend services → deliver features (video, scorecards, chat, profiles) using managed or self-hosted databases
Key point: Each layer must scale together. Any bottleneck affects millions of users.
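To make the layering concrete, here is a minimal sketch of the request path as in-process functions. Names and logic are purely illustrative; the real platform uses CDNs, ALBs, and a dedicated internal gateway, not Python functions.

```python
# Illustrative sketch of the layered request path described above.
# Service names, routes, and checks are hypothetical.

def cdn_layer(request: dict) -> dict:
    """Edge layer: basic security and traffic filtering before forwarding."""
    if not request.get("token"):
        raise PermissionError("blocked at the edge: missing token")
    return request

def internal_gateway(request: dict) -> str:
    """Internal gateway: route to a backend service by path prefix."""
    routes = {"/video": "playback-service", "/score": "scorecard-service"}
    for prefix, service in routes.items():
        if request["path"].startswith(prefix):
            return service
    return "default-service"

def handle(request: dict) -> str:
    """User action -> CDN -> internal gateway -> backend service."""
    return internal_gateway(cdn_layer(request))

print(handle({"path": "/score/live", "token": "abc"}))  # -> scorecard-service
```

If any one of these layers stalls, everything downstream of it stalls too, which is why each layer had to scale in step with the others.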
---
CDN Optimization
During scale-up, the CDNs did more than cache content: they also acted as API gateways, handling rate limiting and token verification.
Findings from past events:
- Top 10 APIs dominated load
- API requests split into:
  - Cacheable — data stable for seconds/minutes (scoreboards, highlights)
  - Non-cacheable — personalized or time-sensitive data
Solution:
- Separate domains for cacheable APIs
- Lighter security + faster routing for cached data
- → Reduced edge server load and boosted throughput.
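A rough sketch of that split, assuming hypothetical domains and endpoints (not Hotstar's actual ones): cacheable APIs go to a dedicated domain with a short edge TTL, everything else stays on the fully secured, uncached path.

```python
# Hypothetical split of APIs across two domains, mirroring the idea of
# routing cacheable responses through a lighter, cache-friendly path.
CACHEABLE_APIS = {
    "/scoreboard": 5,    # seconds the data stays valid at the edge
    "/highlights": 60,
}

def route(path: str) -> tuple[str, dict]:
    """Return (domain, headers) for a request path. Domains are placeholders."""
    for prefix, ttl in CACHEABLE_APIS.items():
        if path.startswith(prefix):
            # Cacheable: serve from a dedicated domain with an edge TTL.
            return "cached-api.example.com", {"Cache-Control": f"public, max-age={ttl}"}
    # Non-cacheable: personalized or time-sensitive, full auth checks, no caching.
    return "api.example.com", {"Cache-Control": "no-store"}

print(route("/scoreboard/live"))
print(route("/profile/me"))
```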
---
Traffic Frequency Tweaks
By lowering the refresh frequency of certain features (e.g., "Watch More" suggestions), i.e., lengthening the interval between refreshes, overall traffic dropped without degrading the user experience.
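A back-of-the-envelope view of why this matters at Hotstar's scale (the numbers below are illustrative, not the team's actual figures):

```python
# Effect of lengthening a feature's refresh interval on polling traffic.
# Numbers are illustrative only.
users = 50_000_000
old_interval_s, new_interval_s = 10, 30   # e.g., a suggestion-refresh cadence

old_rps = users / old_interval_s
new_rps = users / new_interval_s
print(f"{old_rps:,.0f} req/s -> {new_rps:,.0f} req/s "
      f"({(1 - new_rps / old_rps):.0%} less polling traffic)")
```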
---
Infrastructure Scaling Layers
Two main focus areas:
1. NAT Gateway Scaling
   - Problem: imbalanced bandwidth use — one AZ’s NAT gateway was overloaded
   - Fix: shift from one NAT gateway per AZ to one per subnet, distributing load more evenly
2. Kubernetes Worker Node Tuning
   - Problem: internal API gateway pods hogging bandwidth (8–9 Gbps on a single node)
   - Fixes:
     - Use 10 Gbps-capable nodes
     - Apply topology spread constraints — one gateway pod per node
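The spread constraint looks roughly like the fragment below, written as a Python dict mirroring the Kubernetes YAML. The label selector and app name are assumptions for illustration.

```python
import json

# Fragment of a Deployment pod spec that spreads gateway pods across nodes.
gateway_pod_spec = {
    "topologySpreadConstraints": [
        {
            "maxSkew": 1,                             # at most 1 pod of difference between nodes
            "topologyKey": "kubernetes.io/hostname",  # spread across individual nodes
            "whenUnsatisfiable": "DoNotSchedule",     # refuse to schedule rather than break the spread
            "labelSelector": {
                "matchLabels": {"app": "internal-api-gateway"}
            },
        }
    ]
}

print(json.dumps(gateway_pod_spec, indent=2))
```

With as many nodes as gateway replicas, this keeps one gateway pod per node, so no single node's network link carries two heavy pods.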
---
Migration to Amazon EKS
Reason: the self-managed clusters couldn’t reliably scale beyond ~25M concurrent users, and their control planes became fragile under high load.
Benefits of EKS:
- AWS-managed control plane
- Focus engineering on workloads and data plane
- Latest instance families (e.g., Graviton)
- Faster cluster provisioning
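For context, provisioning an EKS cluster programmatically looks roughly like the boto3 call below. Every identifier (region, cluster name, role ARN, subnets, version) is a placeholder, not a Hotstar value.

```python
# Hedged sketch: creating an EKS cluster with boto3.
import boto3

eks = boto3.client("eks", region_name="ap-south-1")

response = eks.create_cluster(
    name="live-streaming-dc1",        # hypothetical cluster name
    version="1.27",                   # illustrative Kubernetes version
    roleArn="arn:aws:iam::123456789012:role/eks-cluster-role",
    resourcesVpcConfig={
        "subnetIds": ["subnet-aaaa", "subnet-bbbb"],
        "securityGroupIds": ["sg-cccc"],
    },
)
print(response["cluster"]["status"])  # typically "CREATING" right after the call
```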
---
Scaling Strategy Refinement
- Avoid API server throttling by stepwise scaling (100–300 nodes per step)
- Stable test results with >400 nodes
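A minimal sketch of that stepwise ramp, assuming a boto3-managed node group (cluster and node-group names, step size, and pause are all placeholders). The point is to grow in bounded increments instead of asking the API server to register hundreds of nodes at once.

```python
import time
import boto3

eks = boto3.client("eks", region_name="ap-south-1")
CLUSTER, NODEGROUP = "live-streaming-dc1", "gateway-nodes"   # placeholders
STEP, TARGET, START = 200, 1000, 200                         # nodes per step / goal / starting size

desired = START
while desired < TARGET:
    desired = min(desired + STEP, TARGET)
    eks.update_nodegroup_config(
        clusterName=CLUSTER,
        nodegroupName=NODEGROUP,
        scalingConfig={"minSize": 1, "maxSize": TARGET, "desiredSize": desired},
    )
    # In practice you would poll node readiness; a fixed pause keeps the sketch short.
    time.sleep(300)
```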
---
The Next Phase — Beyond 50M Users
Legacy constraints:
- Port exhaustion (NodePort range)
- Older Kubernetes version blocking modern tooling
- IP address limits in VPC subnets
- Manual load balancer pre-warming
---
Datacenter Abstraction Architecture
Concept: Logical “data center” = multiple Kubernetes clusters in a region functioning as one.
Benefits:
- Failover between clusters without redeployment complexity
- Uniform scaling + observability
- Central routing + security
- Reduced operational overhead
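One way to picture the abstraction: a logical data center is just a registry of clusters plus a routing decision, so traffic can shift to a healthy cluster without redeploying anything. Cluster names and the health map below are hypothetical.

```python
# Illustrative model of a logical "data center": several clusters in one
# region addressed as a single unit, with failover by health.
DATACENTER = {
    "dc-mumbai": ["eks-cluster-1", "eks-cluster-2", "eks-cluster-3"],
}

healthy = {"eks-cluster-1": False, "eks-cluster-2": True, "eks-cluster-3": True}

def pick_cluster(dc: str) -> str:
    """Route to any healthy cluster in the logical data center."""
    for cluster in DATACENTER[dc]:
        if healthy.get(cluster):
            return cluster
    raise RuntimeError(f"no healthy cluster in {dc}")

print(pick_cluster("dc-mumbai"))  # -> eks-cluster-2 (cluster 1 has failed over)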
---
Key Innovations
Central Proxy Layer — Envoy-based Internal API Gateway
- Unified fleet replaced >200 ALBs
- Routing, authentication, rate-limiting centralized
- Developers cluster-location agnostic
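The routing side of such a gateway can be pictured as the simplified configuration below, written as a Python dict that mirrors Envoy's RouteConfiguration schema. Domains and cluster names are placeholders; the real gateway also centralizes auth and rate limiting.

```python
route_config = {
    "name": "internal_gateway_routes",
    "virtual_hosts": [
        {
            "name": "scorecard",
            "domains": ["scorecard.internal.domain"],   # unified internal endpoint
            "routes": [
                {"match": {"prefix": "/"},
                 "route": {"cluster": "scorecard_backend"}},
            ],
        },
        {
            "name": "playback",
            "domains": ["playback.internal.domain"],
            "routes": [
                {"match": {"prefix": "/"},
                 "route": {"cluster": "playback_backend"}},
            ],
        },
    ],
}
```

Because one such fleet routes for every cluster in the logical data center, callers never need to know which cluster a service actually runs in.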
EKS Across All Clusters
- Unified orchestration via managed Kubernetes
- Access to newest EC2 types
- Rapid, on-demand provisioning
---
Unified Service Endpoints
- Consistent internal pattern:
  - `service.internal.domain` or `service.internal.dc.domain`
- External/public pattern:
  - `service.public-domain`
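A tiny, hypothetical helper shows how uniform the convention is; the domain suffixes are placeholders matching the patterns above.

```python
def internal_endpoint(service: str, dc: str | None = None) -> str:
    """Build `service.internal.domain` or `service.internal.dc.domain` (placeholder domains)."""
    return f"{service}.internal.{dc}.domain" if dc else f"{service}.internal.domain"

print(internal_endpoint("scorecard"))         # scorecard.internal.domain
print(internal_endpoint("scorecard", "dc1"))  # scorecard.internal.dc1.domain
```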
---
One Unified Manifest
- Single config template with overrides for environment-specific tweaks
- Load balancer/DNS/security abstracted by platform
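Conceptually, that is a base template plus per-environment overrides merged at deploy time. The keys and values below are illustrative, not the team's actual template.

```python
import copy

BASE = {
    "replicas": 4,
    "resources": {"cpu": "500m", "memory": "1Gi"},
    "ingress": {"class": "alb", "internal": True},
}

OVERRIDES = {
    "prod-dc1": {"replicas": 40, "resources": {"memory": "4Gi"}},
    "staging":  {"replicas": 2},
}

def render(env: str) -> dict:
    """Return the base manifest with environment-specific values merged in."""
    merged = copy.deepcopy(BASE)
    for key, value in OVERRIDES.get(env, {}).items():
        if isinstance(value, dict):
            merged[key].update(value)   # a shallow merge is enough for this sketch
        else:
            merged[key] = value
    return merged

print(render("prod-dc1"))
```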
---
NodePort Eliminated
- ClusterIP services with AWS ALB Ingress Controller
- Removed port range limits
- Simplified traffic flow and network scaling
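The resulting pattern, sketched as Python dicts mirroring the Kubernetes YAML: a plain ClusterIP Service fronted by an ALB-backed Ingress. Names, hosts, and the annotation follow AWS Load Balancer Controller conventions but are placeholders here.

```python
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "scorecard"},
    "spec": {
        "type": "ClusterIP",                      # no NodePort, so no node port-range limit
        "selector": {"app": "scorecard"},
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}

ingress = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "Ingress",
    "metadata": {
        "name": "scorecard",
        "annotations": {"alb.ingress.kubernetes.io/target-type": "ip"},  # route straight to pod IPs
    },
    "spec": {
        "ingressClassName": "alb",
        "rules": [{
            "host": "scorecard.internal.domain",
            "http": {"paths": [{
                "path": "/",
                "pathType": "Prefix",
                "backend": {"service": {"name": "scorecard", "port": {"number": 80}}},
            }]},
        }],
    },
}
```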
---
Final Transformation Impact
Production environment:
- Six balanced EKS clusters
- Only one P0-critical service per cluster
- Central Envoy gateway replaced hundreds of ALBs
- >200 microservices migrated into Datacenter Abstraction setup
- During the 2023 Cricket World Cup:
  - 61M concurrent users
  - No major incidents or disruptions
---
References
- Scaling Infrastructure for Millions: From Challenges to Triumphs (Part 1)
- Scaling Infrastructure for Millions: Datacenter Abstraction (Part 2)
---
Takeaway
Hotstar’s success reveals three timeless lessons:
- Architecture > Raw hardware scaling
- Unified orchestration simplifies ops
- Stepwise scaling & abstraction prevent bottlenecks