How Nubank Built an Internal Logging Platform to Process 1 Trillion Logs per Day

Crane Worldwide Logistics Joins Streamfest 2025 — Hear Their Story (Sponsored)

Read the full announcement here

Crane Worldwide at Streamfest

---

Event Overview

Join us on November 5–6 for Redpanda Streamfest, a two-day online event exploring the future of streaming data and real-time AI.

Featured Speaker:

Jared Noynaert, VP of Engineering at Crane Worldwide Logistics, will discuss:

  • Fundamentals of modern data infrastructure.
  • Key topics: isolation, auto-scaling, branching, and serverless models.
  • How Redpanda and the broader Kafka ecosystem fit into next-gen architectures.

Event Highlights:

  • Forward-looking keynotes.
  • Live demos.
  • Real-world case studies.

Sign Up Now

---

> Disclaimer: The technical analysis in this article is based on publicly shared information from the Nubank Engineering Team. All credit goes to them. If you spot inaccuracies, please leave a comment so we can correct them.

---

Why Logging Infrastructure Struggles During Rapid Growth

When companies scale quickly, systems often hit operational limits.

Nubank — one of the largest digital banks in the world — faced exactly that scenario with its logging platform.

Logging Challenges at Nubank

  • Relied on an external vendor for log ingestion and storage.
  • Limited visibility into how logs were collected and stored.
  • Rising costs with unpredictable future spending.
  • Alerting and dashboards tied tightly to the vendor’s ecosystem.
  • Spikes in log ingestion slowed query performance, impacting incident response.

These factors pushed Nubank to build an in-house logging platform to regain control, cut costs, and improve reliability.

---

The Initial Logging Architecture

Initially, every application sent logs directly to the vendor’s API or forwarder.

Old Vendor-Based Architecture

Problems:

  • No filtering or routing → low-value logs were ingested anyway, driving up processing costs.
  • Blind spots → limited troubleshooting visibility when the vendor's pipeline failed.
  • Rapid cost growth → every increase in log volume meant paying proportionally more.
  • Vendor lock-in → ingestion, storage, and querying were difficult to change.

---

Nubank’s Two-Phase Platform Strategy

Instead of an all-at-once rebuild, Nubank opted for a two-phase approach:

  1. Observability Stream (Ingestion Pipeline)
     • Focused on collecting, buffering, and processing logs.
     • Enabled filtering, transformations, and metrics collection.
  2. Query & Storage Platform
     • Designed for fast searches across petabytes of logs.
     • Optimized for cost-effective, scalable storage.

Guiding Principles:

  • Reliability — handle spikes without failure.
  • Scalability — support bursts and sustained growth.
  • Cost efficiency — cheaper than vendor solutions with full transparency.

This strategy mirrors a common pattern in large-scale data platforms: decoupling ingestion from querying so each side can scale independently.

---

Phase One: Ingestion Pipeline

Ingestion Architecture

Components:

  • Fluent Bit (Open Source) — lightweight, configurable log forwarder.
  • Data Buffer Service (In-House) — micro-batches logs to absorb spikes.
  • Filter & Process Service (In-House) — scalable, extensible layer for data enrichment and health metrics.
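
The Data Buffer and Filter & Process services are in-house, so their code is not public. Below is a minimal, hypothetical Python sketch of the micro-batching idea they are described as using: logs accumulate into a batch that is flushed downstream when it fills up or when a time window expires, which smooths ingestion spikes, with a simple filtering rule applied before buffering. All names and thresholds are assumptions for illustration.

```python
import time
from typing import Callable

class MicroBatchBuffer:
    """Hypothetical micro-batching buffer: flush when the batch is full
    or when the flush interval elapses, whichever comes first."""

    def __init__(self, flush_fn: Callable[[list[dict]], None],
                 max_batch_size: int = 5_000, max_wait_seconds: float = 2.0):
        self.flush_fn = flush_fn                   # downstream sink (e.g. filter/process stage)
        self.max_batch_size = max_batch_size       # size-based flush trigger
        self.max_wait_seconds = max_wait_seconds   # time-based flush trigger
        self.batch: list[dict] = []
        self.last_flush = time.monotonic()

    def add(self, log_record: dict) -> None:
        self.batch.append(log_record)
        if (len(self.batch) >= self.max_batch_size or
                time.monotonic() - self.last_flush >= self.max_wait_seconds):
            self.flush()

    def flush(self) -> None:
        if self.batch:
            self.flush_fn(self.batch)   # hand the micro-batch to the next stage
            self.batch = []
        self.last_flush = time.monotonic()

# Example: drop DEBUG logs before buffering, mimicking a simple filtering rule.
buffer = MicroBatchBuffer(
    flush_fn=lambda batch: print(f"forwarding {len(batch)} logs"),
    max_batch_size=3,
)
for level in ["INFO", "DEBUG", "ERROR", "INFO"]:
    if level != "DEBUG":
        buffer.add({"level": level, "msg": "example"})
buffer.flush()
```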

Benefits:

  • Decouples ingestion from querying.
  • Handles surges gracefully.
  • Adds system visibility through real-time metrics.

---

Phase Two: Query & Storage Platform

Query Engine — Trino

  • Distributed SQL engine.
  • Supports partitioning → faster queries by scanning only relevant data.
  • Flexible integrations with multiple backends.
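
Nubank has not published its schema, but the partition-pruning benefit can be illustrated with a hypothetical query through the open-source Trino Python client: filtering on partition columns (here an assumed `log_date` and `service` on an assumed `application_logs` table) lets Trino scan only the matching Parquet files rather than the full dataset. Host, catalog, and table names below are placeholders.

```python
import trino  # pip install trino

# Connect to a Trino coordinator (host, catalog, and schema are placeholders).
conn = trino.dbapi.connect(
    host="trino.internal.example.com",
    port=443,
    user="observability",
    catalog="hive",
    schema="logs",
    http_scheme="https",
)

# The WHERE clause on partition columns means only those partitions'
# Parquet files are scanned, which is what makes the query fast.
query = """
SELECT log_timestamp, level, message
FROM application_logs
WHERE log_date = DATE '2024-06-01'
  AND service = 'payments-api'
  AND level = 'ERROR'
LIMIT 100
"""

cursor = conn.cursor()
cursor.execute(query)
for row in cursor.fetchall():
    print(row)
```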

Storage Layer — AWS S3

  • High durability and availability.
  • Petabyte-scale capacity.
  • Cost-effective for long-term retention.

Query & Storage Architecture

---

Data Format — Parquet

  • Columnar storage for efficient querying.
  • ~95% size reduction from compaction and compression → lower storage requirements.
  • Excellent scan performance with compression benefits.
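
As a rough illustration of the columnar format (not Nubank's actual pipeline), the widely used pyarrow library can write a batch of log records to a compressed Parquet file. Repetitive columns such as service or level compress extremely well, and readers can pull back only the columns a query needs. The sample data is invented for the example.

```python
import pyarrow as pa
import pyarrow.parquet as pq

# A small batch of structured log records (illustrative data only).
logs = pa.table({
    "timestamp": ["2024-06-01T12:00:00Z"] * 4,
    "service":   ["payments-api", "payments-api", "auth", "auth"],
    "level":     ["INFO", "ERROR", "INFO", "INFO"],
    "message":   ["request ok", "timeout", "login ok", "login ok"],
})

# Columnar layout plus a codec such as ZSTD is what drives the large
# size reduction on repetitive log fields.
pq.write_table(logs, "logs.parquet", compression="zstd")

# Reading back only the needed columns avoids scanning the rest.
subset = pq.read_table("logs.parquet", columns=["level", "message"])
print(subset.to_pydict())
```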

Parquet Generator (In-House)

Parquet Conversion Service

  • High-throughput transformation of log batches into Parquet.
  • Fully controlled for cost optimization.
  • Scalable and extensible for future needs.
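
The Parquet Generator itself is internal, but a minimal sketch of what such a conversion step might look like is shown below: take a micro-batch, write it as Parquet in memory, and put the object under an S3 prefix that matches a date/service partitioning scheme so Trino can prune partitions later. The bucket name, key layout, and function are assumptions, not Nubank's implementation.

```python
import boto3
import pyarrow as pa
import pyarrow.parquet as pq

def convert_and_upload(batch: list[dict], bucket: str, service: str, log_date: str) -> str:
    """Hypothetical conversion step: micro-batch of logs -> Parquet -> S3."""
    table = pa.Table.from_pylist(batch)

    # Write Parquet into an in-memory buffer instead of a local file.
    sink = pa.BufferOutputStream()
    pq.write_table(table, sink, compression="zstd")

    # Key layout mirrors a date/service partitioning scheme.
    key = f"logs/log_date={log_date}/service={service}/part-000.parquet"
    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=sink.getvalue().to_pybytes()
    )
    return key

# Example micro-batch (placeholder bucket name).
batch = [
    {"timestamp": "2024-06-01T12:00:00Z", "level": "INFO",  "message": "request ok"},
    {"timestamp": "2024-06-01T12:00:01Z", "level": "ERROR", "message": "timeout"},
]
print(convert_and_upload(batch, "example-log-bucket", "payments-api", "2024-06-01"))
```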

---

Performance & Scale Metrics (Mid-2024)

  • 1 trillion logs/day ingested.
  • 1 petabyte/day processed.
  • 45-day retention → ~45 PB stored.
  • 15,000 queries/day scanning ~150 PB of data.
  • 50% lower costs compared to vendor solution.
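
To put the daily figures in perspective, here is a quick back-of-envelope conversion of the published numbers into per-second rates (simple arithmetic on the figures above, not additional published data):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

logs_per_day = 1_000_000_000_000   # 1 trillion logs/day
bytes_per_day = 1 * 1024**5        # ~1 PB/day processed

print(f"{logs_per_day / SECONDS_PER_DAY:,.0f} logs/second")            # ~11.6 million
print(f"{bytes_per_day / SECONDS_PER_DAY / 1024**3:,.1f} GiB/second")  # ~12 GiB/s
print(f"{1 * 45} PB stored at 45-day retention")                        # matches ~45 PB
```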

---

Key Takeaways

Nubank’s success came from:

  • Decoupling ingestion from querying.
  • Micro-batching for resilience.
  • Using Parquet + AWS S3 for efficient, scalable storage.
  • Leveraging Trino for fast distributed queries.
  • Building in-house services for full operational control.

This approach gives Nubank predictable costs, high scalability, and strong visibility into its logging infrastructure.

---


Sponsorship Opportunity

Reach 1M+ tech professionals — including senior engineers and decision-makers.

Reserve your space today:

Email `sponsorship@bytebytego.com` (slots sell out ~4 weeks in advance).


---

