Save $300K a Year! Intern Uses Rust to Cut TikTok Costs Fast — Developers Say It’d Only Save $300 at a Small Company

Rust Rewrite Case Study — How a TikTok Intern Saved $300,000/Year

Organized by: Zheng Liyuan

Produced by: CSDN (ID: CSDNnews)

---

Introduction

In developer circles, you may hear:

> "Don’t bother optimizing — computers are fast enough; don’t waste development time for tiny gains."

There’s some truth here — but it’s far from the whole story. In certain high-scale situations, server bills can be huge.

Recently, TikTok intern Xiaoyun Wu shared a striking example: by rewriting part of a core payment service from Go to Rust, he achieved:

  • Up to 2× throughput on the rewritten endpoints
  • 76% lower latency
  • CPU usage reduced by one-third
  • ~$300,000/year savings (~¥2.13M) in cloud costs

For a giant like TikTok, $300K might seem modest — but this was just one module, rewritten by an intern.

---

Surgical Refactoring — Rewriting Only the Pain Points in Rust

Wu’s blog explains: TikTok LIVE’s payment service was a long-running Go system — stable, concise, great for microservices.

But as TikTok LIVE exploded, symptoms appeared:

  • High CPU usage
  • Frequent stability alerts
  • Continuous scaling to maintain quality
  • Soaring cloud costs

After profiling, the real bottleneck was found:

  • High-frequency APIs for user balance checks and statistics retrieval
  • Computation-heavy & traffic-heavy, dominating CPU time
  • Go-level optimizations maxed out — the language hit its ceiling

Wu followed three precise steps:

---

1. Precise Rewrite — Rust as a Surgical Tool

Rather than rewriting everything, only CPU-intensive endpoints were converted to Rust. All other logic remained in Go.

Why Rust?

  • Near C-level performance
  • Memory safety without garbage collection
  • Ideal for high-concurrency, intense computation

This polyglot architecture preserved Go’s fast development while injecting Rust’s raw speed where it mattered.

> Takeaway: Rust tackles bottlenecks; Go holds the fort.

---

2. Correctness Verification — Fast ≠ Accurate

Performance wins are useless without correct results.

Approach: Run Rust endpoints in shadow mode alongside the production Go service:

  • Feed Rust a live copy of traffic
  • Compare outputs in verification pipeline
  • After weeks with 100% data match, move to load testing
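
The shadow-mode check above can be sketched in a few lines. This is only an illustration of the general technique, not Wu's actual pipeline: `shadow_compare`, `ShadowReport`, `go_get_balance`, `rust_get_balance`, and the sample balances are all hypothetical names and data. The key property is that callers always receive the production (Go) response, while the candidate (Rust) response is only recorded and compared.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List


@dataclass
class ShadowReport:
    """Running tally of how often the candidate agreed with production."""
    total: int = 0
    mismatches: List[dict] = field(default_factory=list)

    @property
    def match_rate(self) -> float:
        if self.total == 0:
            return 1.0
        return (self.total - len(self.mismatches)) / self.total


def shadow_compare(
    requests: Iterable[Any],
    primary: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    report: ShadowReport,
) -> List[Any]:
    """Serve every request from `primary`; mirror it to `candidate` and diff."""
    served = []
    for request in requests:
        expected = primary(request)
        served.append(expected)  # users only ever see the primary's answer
        try:
            actual = candidate(request)
        except Exception as exc:  # a shadow failure must never reach users
            actual = {"shadow_error": repr(exc)}
        report.total += 1
        if actual != expected:
            report.mismatches.append(
                {"request": request, "expected": expected, "actual": actual}
            )
    return served


# Hypothetical stand-ins for the Go and Rust balance endpoints.
BALANCES = {"u1": 120, "u2": 0, "u3": 75}


def go_get_balance(user_id: str) -> dict:
    return {"user_id": user_id, "balance": BALANCES[user_id]}


def rust_get_balance(user_id: str) -> dict:
    return {"user_id": user_id, "balance": BALANCES[user_id]}


report = ShadowReport()
served = shadow_compare(list(BALANCES), go_get_balance, rust_get_balance, report)
print(report.match_rate)  # 1.0 when every response matches
```

In a real deployment the mirroring would likely happen asynchronously so the shadow call adds no latency to the user-facing path; the synchronous loop here is kept only for readability.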

---

3. Load Testing — Push the Limits

Two identical production clusters were built:

  • One running Go endpoints
  • One running Rust endpoints

Testing used 16,000 anonymized real user IDs, increasing load gradually to max capacity while monitoring:

  • QPS
  • Latency
  • CPU usage
  • Memory usage
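
A gradual ramp like the one described can be sketched as a loop that raises the target QPS step by step and stops once a latency or CPU limit is exceeded. This is a rough sketch under stated assumptions: `ramp_steps`, `find_capacity`, and `make_fake_cluster` are hypothetical helpers, the thresholds are arbitrary, and the simulated cluster is a toy model, not TikTok's measurements or tooling. In practice `measure` would drive a real load generator and read metrics from monitoring.

```python
from typing import Callable, Dict, Iterator


def ramp_steps(start_qps: int, max_qps: int, step_qps: int) -> Iterator[int]:
    """Yield increasing target loads, e.g. 20K, 40K, ... up to max_qps."""
    qps = start_qps
    while qps <= max_qps:
        yield qps
        qps += step_qps


def find_capacity(
    measure: Callable[[int], Dict[str, float]],
    steps: Iterator[int],
    p99_limit_ms: float,
    cpu_limit: float,
) -> int:
    """Return the highest tested QPS that stayed within both limits."""
    best = 0
    for qps in steps:
        sample = measure(qps)  # expected keys: "p99_ms" and "cpu" (0.0 to 1.0)
        if sample["p99_ms"] > p99_limit_ms or sample["cpu"] > cpu_limit:
            break
        best = qps
    return best


def make_fake_cluster(saturation_qps: int) -> Callable[[int], Dict[str, float]]:
    """Toy model: CPU grows linearly with load; latency spikes past 85% CPU."""
    def measure(qps: int) -> Dict[str, float]:
        cpu = qps / saturation_qps
        p99_ms = 10.0 if cpu <= 0.85 else 200.0
        return {"p99_ms": p99_ms, "cpu": cpu}
    return measure


# Two simulated clusters with made-up saturation points.
slow_cluster = make_fake_cluster(saturation_qps=100_000)
fast_cluster = make_fake_cluster(saturation_qps=200_000)

slow_cap = find_capacity(slow_cluster, ramp_steps(20_000, 200_000, 20_000), 50.0, 0.85)
fast_cap = find_capacity(fast_cluster, ramp_steps(20_000, 200_000, 20_000), 50.0, 0.85)
print(slow_cap, fast_cap)
```

Running the same ramp against both clusters keeps the comparison fair: the only variable is the endpoint implementation.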

---

Results: Up to Double the Throughput, Lower Costs

  • Endpoint A — 85K QPS → 150K QPS (~1.8× gain)
  • Endpoint B — 105K QPS → 210K QPS (2× gain)

Impact:

  • Each machine can now handle more traffic
  • Directly retired ~400 vCPU cores
  • Annual savings ~$300K in cloud costs
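
As a quick sanity check on these figures, the arithmetic can be worked through directly. The per-core cost below is simply the article's two rounded numbers divided by each other, so treat it as a rough implied rate rather than an actual cloud price; the function names are illustrative.

```python
def throughput_gain(before_qps: float, after_qps: float) -> float:
    """Ratio of new to old throughput on the same hardware."""
    return after_qps / before_qps


def machines_needed_fraction(gain: float) -> float:
    """Share of the original fleet needed to serve the same traffic."""
    return 1.0 / gain


gain_a = throughput_gain(85_000, 150_000)   # Endpoint A
gain_b = throughput_gain(105_000, 210_000)  # Endpoint B

print(round(gain_a, 2))                            # 1.76, i.e. the "~1.8x" above
print(gain_b)                                      # 2.0
print(round(machines_needed_fraction(gain_a), 2))  # 0.57 of the original capacity
print(machines_needed_fraction(gain_b))            # 0.5 of the original capacity

# Implied cost per retired vCPU core, from the article's rounded totals.
annual_savings_usd = 300_000
retired_vcpus = 400
print(annual_savings_usd / retired_vcpus)          # 750.0 USD per vCPU per year
```

The same calculation shows why scale matters: at roughly $750 per core per year, a service that can retire only one or two cores would not come close to covering the engineering effort of a rewrite.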

---

Key Insight — Right Tool for the Right Job

Wu cautions:

> "This is not Rust vs Go. Rust shines in a few bottleneck services; Go’s dev efficiency still wins for 95% of microservices."

---

From NUS to TikTok — An Intern’s Growth Path

Wu is a Computer Science major (minor in Statistics) at National University of Singapore, researching programming languages & parallel computing.

Track Record:

  • TikTok Global LIVE Wallet Team (Dec 2024 – Aug 2025)
      • Rust rewrite for payment APIs → $300K savings
      • Built an AI-driven on-call incident analyzer (LLMs + Golang + vector DB) to cut operational workload
  • TikTok LIVE Money Platform (May – Aug 2024)
      • Front-end build optimization
      • Doubled packaging speed
      • Cut CI/CD time from 15 min to 10 min via Rspack migration
  • Volunteer Welfare Org Computing Dept (2023)
      • Ruby on Rails → Go migration
      • Backend response speed ↑ 5×

This blend of academic rigor + hands-on projects prepared him to optimize large-scale distributed systems like TikTok.

---

Developer Reactions — "Is Optimization Worth It?"

Wu’s blog sparked Reddit debates — the headline "Intern saves TikTok $300K using Rust" became clickbait fodder.

Views diverged:

  • Optimization still matters — scaling multiplies savings
  • Big-company scale only — startups won’t save enough to justify engineering cost

Common takeaway:

Optimization must be timely, targeted, measurable.

> At scale, server cost can surpass people cost — performance becomes a survival issue.

---

Summary Lessons for Engineers

  • Profile first — know your real bottlenecks
  • Targeted rewrites beat wholesale migration
  • Verify correctness before scale tests
  • Account for integration costs in multi-language stacks
  • Optimization payoff grows with scale
