How Nubank Built an Internal Logging Platform to Process 1 Trillion Logs per Day
Crane Worldwide Logistics Joins Streamfest 2025 — Hear Their Story (Sponsored)
Read the full announcement here

---
Event Overview
Join us on November 5–6 for Redpanda Streamfest, a two-day online event exploring the future of streaming data and real-time AI.
Featured Speaker:
Jared Noynaert, VP of Engineering at Crane Worldwide Logistics, will discuss:
- Fundamentals of modern data infrastructure.
- Key topics: isolation, auto-scaling, branching, and serverless models.
- How Redpanda and the broader Kafka ecosystem fit into next-gen architectures.
Event Highlights:
- Forward-looking keynotes.
- Live demos.
- Real-world case studies.
---
> Disclaimer: The technical analysis in this article is based on publicly shared information from the Nubank Engineering Team. All credit goes to them. If you spot inaccuracies, please leave a comment so we can correct them.
---
Why Logging Infrastructure Struggles During Rapid Growth
When companies scale quickly, systems often hit operational limits.
Nubank — one of the largest digital banks in the world — faced exactly that scenario with its logging platform.
Logging Challenges at Nubank
- Relied on an external vendor for log ingestion and storage.
- Limited visibility into how logs were collected and stored.
- Rising costs with unpredictable future spending.
- Alerting and dashboards tied tightly to the vendor’s ecosystem.
- Spikes in log ingestion slowed query performance, impacting incident response.
These factors pushed Nubank to build an in-house logging platform to regain control, cut costs, and improve reliability.
---
The Initial Logging Architecture
Initially, every application sent logs directly to the vendor’s API or forwarder.

Problems:
- No filtering or routing → unnecessary low-value logs increased processing costs.
- Blind spots → limited troubleshooting visibility when vendor processes failed.
- Rapid cost growth → every increase in log volume meant a larger vendor bill.
- Vendor lock-in → difficult to change ingestion, storage, or querying.
---
Nubank’s Two-Phase Platform Strategy
Instead of an all-at-once rebuild, Nubank opted for a two-phase approach:
1. Observability Stream (Ingestion Pipeline)
   - Focused on collecting, buffering, and processing logs.
   - Enabled filtering, transformations, and metrics collection.
2. Query & Storage Platform
   - Designed for fast searches across petabytes of logs.
   - Optimized for cost-effective, scalable storage.
Guiding Principles:
- Reliability — handle spikes without failure.
- Scalability — support bursts and sustained growth.
- Cost efficiency — cheaper than vendor solutions with full transparency.
This strategy reflects a well-established pattern in large-scale data systems: decoupling ingestion from querying so that each layer can scale independently.
---
Phase One: Ingestion Pipeline

Components:
- Fluent Bit (Open Source) — lightweight, configurable log forwarder.
- Data Buffer Service (In-House) — micro-batches logs to absorb spikes.
- Filter & Process Service (In-House) — scalable, extensible layer for data enrichment and health metrics.
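To make the buffering and filtering steps concrete, here is a minimal sketch (an assumption-laden illustration, not Nubank's actual code) of how a micro-batching buffer could feed a filter-and-enrich step. The batch size, flush interval, and the rule that drops DEBUG-level logs are all illustrative:

```python
import time

# Minimal sketch of a micro-batching buffer feeding a filter/enrichment step.
# Thresholds and the "drop DEBUG logs" rule are illustrative assumptions.
MAX_BATCH_SIZE = 5_000        # flush once this many records are buffered
MAX_BATCH_AGE_SECONDS = 2.0   # ...or once the oldest buffered record is this old

def filter_and_enrich(batch):
    """Drop low-value records and attach pipeline metadata before storage."""
    kept = [rec for rec in batch if rec.get("level") != "DEBUG"]
    for rec in kept:
        rec["ingested_at"] = time.time()
    return kept

class MicroBatchBuffer:
    """Absorbs log spikes by grouping records into small batches."""

    def __init__(self, sink):
        self.sink = sink          # e.g., the filter/process stage
        self.batch = []
        self.first_seen = None

    def append(self, record):
        if self.first_seen is None:
            self.first_seen = time.monotonic()
        self.batch.append(record)
        too_big = len(self.batch) >= MAX_BATCH_SIZE
        too_old = time.monotonic() - self.first_seen >= MAX_BATCH_AGE_SECONDS
        if too_big or too_old:
            self.flush()

    def flush(self):
        if self.batch:
            self.sink(self.batch)
            self.batch, self.first_seen = [], None

# Usage: records forwarded by Fluent Bit would be appended here and flushed
# downstream in micro-batches.
buffer = MicroBatchBuffer(sink=lambda batch: print(filter_and_enrich(batch)))
buffer.append({"level": "INFO", "message": "request handled", "service": "payments"})
buffer.flush()
```

In a real deployment, the sink would forward each batch over the network to the Filter & Process Service rather than calling it in-process.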
Benefits:
- Decouples ingestion from querying.
- Handles surges gracefully.
- Adds system visibility through real-time metrics.
---
Phase Two: Query & Storage Platform
Query Engine — Trino
- Distributed SQL engine.
- Supports partitioning → faster queries by scanning only relevant data.
- Flexible integrations with multiple backends.
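The partitioning benefit is easiest to see in a query. Below is a hypothetical example using the Trino Python client against a date-partitioned log table; the host, catalog, schema, table, and column names are assumptions for illustration only:

```python
from trino.dbapi import connect  # pip install trino

# Hypothetical query against a date-partitioned log table. The connection
# details and all identifiers below are illustrative assumptions.
conn = connect(
    host="trino.internal.example.com",
    port=8080,
    user="observability",
    catalog="hive",
    schema="logs",
)

cur = conn.cursor()
cur.execute(
    """
    SELECT service, count(*) AS error_count
    FROM application_logs
    WHERE log_date = DATE '2024-06-01'   -- partition column: other days are pruned
      AND level = 'ERROR'
    GROUP BY service
    ORDER BY error_count DESC
    LIMIT 20
    """
)
for service, error_count in cur.fetchall():
    print(service, error_count)
```

Because `log_date` is assumed to be a partition column, the engine reads only the files under that day's partition instead of scanning the full retention window.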
Storage Layer — AWS S3
- High durability and availability.
- Petabyte-scale capacity.
- Cost-effective for long-term retention.

---
Data Format — Parquet
- Columnar storage for efficient querying.
- ~95% compaction rate → lower storage requirements.
- Excellent scan performance with compression benefits.
Parquet Generator (In-House)

- High-throughput transformation of batches into Parquet.
- Fully controlled for cost optimization.
- Scalable and extensible for future needs.
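As a rough illustration of that transformation step, the sketch below writes a processed batch of log records to a compressed, Hive-style date-partitioned Parquet file. The schema, the zstd codec, and the local `log_date=` directory (standing in for an S3 prefix) are assumptions:

```python
import os
import pyarrow as pa            # pip install pyarrow
import pyarrow.parquet as pq

# Hypothetical Parquet generation step: convert a processed batch of log
# records into a compressed Parquet file under a Hive-style date partition.
def write_batch_as_parquet(batch, output_dir, log_date):
    table = pa.Table.from_pylist(batch)               # row dicts -> columnar table
    partition_dir = os.path.join(output_dir, f"log_date={log_date}")
    os.makedirs(partition_dir, exist_ok=True)
    path = os.path.join(partition_dir, "part-000.parquet")
    pq.write_table(table, path, compression="zstd")   # columnar + compressed
    return path

batch = [
    {"service": "payments", "level": "ERROR", "message": "timeout", "ts": 1717200000},
    {"service": "payments", "level": "INFO",  "message": "retry ok", "ts": 1717200001},
]
print(write_batch_as_parquet(batch, "/tmp/logs", "2024-06-01"))
```

In production, files like this would presumably land under a partitioned S3 prefix registered in the query layer's catalog, which is what allows Trino to prune partitions as in the earlier example.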
---
Performance & Scale Metrics (Mid-2024)
- 1 trillion logs/day ingested.
- 1 petabyte/day processed.
- 45-day retention → ~45 PB stored.
- 15,000 queries/day scanning ~150 PB of data.
- 50% lower costs compared to the vendor solution.
---
Key Takeaways
Nubank’s success came from:
- Decoupling ingestion from querying.
- Micro-batching for resilience.
- Using Parquet + AWS S3 for efficient, scalable storage.
- Leveraging Trino for fast distributed queries.
- Building in-house services for full operational control.
This approach gives Nubank predictable costs, high scalability, and strong visibility across its entire logging pipeline.
---
Sponsorship Opportunity
Reach 1M+ tech professionals — including senior engineers and decision-makers.
Reserve your space today:
Email `sponsorship@bytebytego.com` (slots sell out ~4 weeks in advance).
---