Refactoring a Class Saves 2.9 GB of JVM Memory?

Data Structure Refactoring That Saved Nearly 3 GB of JVM Heap Memory in Production

This article explains how we refactored a core class’s data structure from

`HashMap<Long, Set<String>>` → FastUtil `Long2ObjectOpenHashMap<int[]>`.

By leveraging the business data's characteristics (tags are small integers, sparsely distributed), we replaced each `HashSet<String>` with a sorted `int[]` plus binary-search lookups.

Key Results

  • Heap drop: ~3.2 GB → 211 MB (↓ 94% memory usage)
  • Validated in production with dual-write checks before rollout
  • No performance trade-off in lookups; in many cases, lookups improved

---


Optimization in Numbers


Refactoring one class dropped JVM heap by ~3 GB. This was measured in production, not a test lab:

Verification steps:

  • Dual-write the old and new data structures
  • Compare memory metrics in real time
  • Switch fully to the new model after validation

Takeaway: General-purpose containers (`HashMap`, `HashSet`) and boxing can be surprisingly costly at scale.

---

1. Background


In our business pipeline, an interface filters product tags per city with complex logic:

  • Multi-condition `AND`/`OR` nesting
  • Multi-level expression trees
  • Runtime-only computations for some rules

Because these filters couldn’t be expressed efficiently in our search backend, we cached full product-tag data in memory keyed by city ID for real-time filtering.

⚠️ Without this cache, each request rebuilt the structure → massive temporary memory spikes.

---

2. Legacy Structure

public class RecallPlatformTagsResp extends BasePageResponse {
    private static final long serialVersionUID = 8030307250681300454L;
    private Map<Long, Set<String>> resultMap;
}

Example:

{
    1: ["2246", "2288", "3388", "1022"],
    2: ["12246", "12288"]
}

Problem: At scale, nested `HashMap` + `HashSet` incur huge overhead (quantified in the sketch after this list) from:

  • Object headers
  • Boxing/unboxing
  • Hash table arrays
  • Node pointers
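
To make this overhead concrete, here is a minimal measurement sketch using OpenJDK's JOL tool (`org.openjdk.jol:jol-core` on the classpath). The tag values and class name are illustrative, not production data, and the exact byte counts depend on the JVM and compressed-oops settings.

import java.util.HashSet;
import java.util.Set;

import org.openjdk.jol.info.GraphLayout;

public class EntryFootprint {
    public static void main(String[] args) {
        // Legacy value: tag IDs stored as Strings inside a HashSet.
        Set<String> legacyTags = new HashSet<>(Set.of("2246", "2288", "3388", "1022"));

        // Compact value: the same tags as a sorted primitive array.
        int[] compactTags = {1022, 2246, 2288, 3388};

        // GraphLayout walks the whole object graph (set, nodes, Strings, backing arrays).
        System.out.println("HashSet<String> retained bytes: "
                + GraphLayout.parseInstance(legacyTags).totalSize());
        System.out.println("sorted int[] retained bytes:    "
                + GraphLayout.parseInstance(compactTags).totalSize());
    }
}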

---

3. Tag Distribution Analysis


From 1 million sample items:

  • 80% ≤ 16 tags
  • 90% ≤ 32 tags
  • 99% ≤ 64 tags
  • Max tags < 128

Implications:

  • Tags are bounded and small → arrays suffice
  • Data is immutable after init → binary search works
  • Tags are integers → `int[]` avoids `String` + object overhead

---

[Figure: visual memory impact, before vs. after the refactoring]

---

4. New Structure Choices


Alternatives tried:

  • FastUtil primitive maps (`Long2ObjectOpenHashMap`)
  • Sorted `int[]` + binary search
  • RoaringBitmap
  • Custom compact object pool

Test metrics: memory footprint, lookup speed, init cost, code complexity

---

Best Combo:

  • Key: FastUtil `Long2ObjectOpenHashMap` → no boxing, tighter layout
  • Value: sorted `int[]` → tiny footprint, fast binary search

Memory Effect:

  • Replacing the values (`HashSet<String>` → sorted `int[]`) while keeping the JDK `HashMap` cut the structure from 1.3 GB to 64 MB
  • Swapping the JDK `HashMap` for FastUtil `Long2ObjectOpenHashMap` then brought it down to 25 MB

A build-and-convert sketch of this combination follows.
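
The sketch below shows one way to build the compact layout from the legacy map; it is a minimal illustration, and the class and method names (`TagIndexBuilder.build`) are ours, not from the original code. It assumes every tag string parses cleanly as an int, as the analysis above indicates.

import java.util.Arrays;
import java.util.Map;
import java.util.Set;

import it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap;

public final class TagIndexBuilder {

    /** Converts the legacy Map<Long, Set<String>> into a Long2ObjectOpenHashMap<int[]>. */
    public static Long2ObjectOpenHashMap<int[]> build(Map<Long, Set<String>> legacyMap) {
        // Pre-size to the expected entry count to avoid rehashing during the build.
        Long2ObjectOpenHashMap<int[]> compact =
                new Long2ObjectOpenHashMap<>(legacyMap.size());

        legacyMap.forEach((id, tags) -> {
            int[] arr = tags.stream().mapToInt(Integer::parseInt).toArray();
            Arrays.sort(arr);                    // sorted once at init; read-only afterwards
            compact.put(id.longValue(), arr);    // primitive-key put, no boxing
        });
        return compact;
    }
}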

---

Why `int[]` over `HashSet`:

| Approach | Lookup complexity | Memory | Notes |
|--------------------------------|---------------|--------|-----------------------------|
| `HashSet.contains()`           | O(1) average  | High   | Hash collisions possible    |
| Sorted `int[]` + binary search | O(log k)      | Low    | Cache-friendly; predictable |

With k ≤ 128 (see the lookup sketch after this list):

  • ≤ 7 comparisons worst-case
  • 90% of lookups ≤ 5 comparisons
  • Negligible latency impact; massive memory savings
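
A minimal lookup sketch under the same assumptions; the `hasTag` method name is illustrative rather than taken from the original code. With at most 128 sorted ints per item, `Arrays.binarySearch` stays within the comparison counts above.

import java.util.Arrays;

import it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap;

public final class TagLookup {

    private final Long2ObjectOpenHashMap<int[]> itemTags;

    public TagLookup(Long2ObjectOpenHashMap<int[]> itemTags) {
        this.itemTags = itemTags;
    }

    /** True if the item carries the given tag; O(log k) with k ≤ 128. */
    public boolean hasTag(long itemId, int tag) {
        int[] tags = itemTags.get(itemId);   // primitive-key lookup, no boxing
        return tags != null && Arrays.binarySearch(tags, tag) >= 0;
    }
}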

---

5. Test Harness

JProfiler memory dumps compared:

  • `Map<Long, Set<String>>`
  • `HashMap`
  • FastUtil `Long2ObjectOpenHashMap`
  • `BitSet`
  • RoaringBitmap
import it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap;
...

(full Java test code omitted here; a representative sketch follows)
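
The following is a hedged sketch of what such a harness can look like: it populates the legacy and compact structures with synthetic data and then parks, so a profiler (JProfiler or a heap dump) can compare retained sizes. The item count and tag ranges are illustrative, not the original test parameters.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Random;
import java.util.Set;

import it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap;

public class MemoryHarness {
    public static void main(String[] args) throws Exception {
        Random rnd = new Random(42);
        int items = 1_000_000;

        Map<Long, Set<String>> legacy = new HashMap<>(items);
        Long2ObjectOpenHashMap<int[]> compact = new Long2ObjectOpenHashMap<>(items);

        for (long id = 0; id < items; id++) {
            int k = 1 + rnd.nextInt(16);   // most items carry only a handful of tags
            int[] tags = rnd.ints(k, 0, 20_000).distinct().sorted().toArray();

            Set<String> asStrings = new HashSet<>();
            for (int t : tags) asStrings.add(String.valueOf(t));

            legacy.put(id, asStrings);
            compact.put(id, tags);
        }

        // Keep both structures alive, then take a heap dump or attach a profiler
        // to compare their retained sizes.
        System.out.println("populated; attach profiler now");
        Thread.sleep(Long.MAX_VALUE);
    }
}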

---

Result: Long2ObjectOpenHashMap is optimal

public class RecallPlatformTagsResp extends BasePageResponse {
    ...
    private Map<Long, Set<String>> resultMap; // legacy structure, kept for dual-write validation
    private Long2ObjectOpenHashMap<int[]> itemTags = new Long2ObjectOpenHashMap<>(1_600_000);
}
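
During the dual-write phase both fields are populated, so their contents can be compared before traffic is switched. The check below is a minimal sketch of such a comparison; `verifyConsistency` is an illustrative name, not the original method.

import java.util.Arrays;
import java.util.Map;
import java.util.Set;

import it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap;

final class DualWriteCheck {

    /** Returns true if the compact structure mirrors the legacy one exactly. */
    static boolean verifyConsistency(Map<Long, Set<String>> legacy,
                                     Long2ObjectOpenHashMap<int[]> compact) {
        if (legacy.size() != compact.size()) {
            return false;
        }
        for (Map.Entry<Long, Set<String>> e : legacy.entrySet()) {
            int[] expected = e.getValue().stream()
                    .mapToInt(Integer::parseInt).sorted().toArray();
            int[] actual = compact.get(e.getKey().longValue());
            if (!Arrays.equals(expected, actual)) {
                return false;
            }
        }
        return true;
    }
}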

---

6. Why FastUtil `Long2ObjectOpenHashMap`


Advantages

  • Primitive keys — no boxing (`long` direct)
  • Open addressing — faster, cache-friendly
  • Compact arrays — no linked lists or tree nodes

Internals:

long[] key;      // primitive keys, no boxing
Object[] value;  // parallel value array
int mask = capacity - 1; // capacity is a power of two

Hash: conceptually `(int)(key ^ (key >>> 32)) & mask` (fastutil actually scrambles the key via `HashCommon.mix` before masking, to reduce clustering)

---

Collision Handling


Linear probing: a colliding key goes into the next free slot.

get(): probe forward from the home slot until the key matches or an empty slot is reached.

remove(): backward-shift deletion slides later entries of the probe chain back to fill the gap, so probe chains are never broken.
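
The sketch below is an educational reconstruction of linear probing with backward-shift deletion, not fastutil's actual source. It assumes a power-of-two capacity, reserves key 0 as the empty-slot marker, and omits resizing and the special handling the real implementation gives to the key 0.

// Simplified open-addressing map with long keys; for illustration only.
final class ProbingSketch<V> {

    private final long[] key;       // 0 means "empty slot"
    private final Object[] value;
    private final int mask;

    ProbingSketch(int capacityPowerOfTwo) {
        key = new long[capacityPowerOfTwo];
        value = new Object[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    private int slot(long k) {
        // Simple fold for illustration; fastutil mixes the key more thoroughly.
        return (int) (k ^ (k >>> 32)) & mask;
    }

    void put(long k, V v) {
        int pos = slot(k);
        while (key[pos] != 0 && key[pos] != k) {   // probe until a free slot or the same key
            pos = (pos + 1) & mask;
        }
        key[pos] = k;
        value[pos] = v;
    }

    @SuppressWarnings("unchecked")
    V get(long k) {
        int pos = slot(k);
        while (key[pos] != 0) {                    // stop at the first empty slot
            if (key[pos] == k) return (V) value[pos];
            pos = (pos + 1) & mask;
        }
        return null;
    }

    /** Backward shift: pull later entries of the probe chain back so lookups stay correct. */
    void remove(long k) {
        int pos = slot(k);
        while (key[pos] != 0 && key[pos] != k) {
            pos = (pos + 1) & mask;
        }
        if (key[pos] == 0) return;                 // key not present
        int gap = pos;
        int cur = (gap + 1) & mask;
        while (key[cur] != 0) {
            int home = slot(key[cur]);
            // Move the entry back only if the gap lies between its home slot and cur.
            boolean movable = (gap <= cur)
                    ? (home <= gap || home > cur)
                    : (home <= gap && home > cur);
            if (movable) {
                key[gap] = key[cur];
                value[gap] = value[cur];
                gap = cur;
            }
            cur = (cur + 1) & mask;
        }
        key[gap] = 0;
        value[gap] = null;
    }
}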

---

Performance Notes

  • Load factor ≤ 0.7 recommended
  • Avoid sequential keys that cause clustering
  • Initialize with an estimated capacity, e.g. `new Long2ObjectOpenHashMap<>(16384)` (sizing sketch below)
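
A small sizing sketch using fastutil's two-argument constructor, which takes an expected element count and a load factor; the numbers here are illustrative.

import it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap;

public class Sizing {
    public static void main(String[] args) {
        int expectedEntries = 1_600_000;   // illustrative; match the real data volume
        float loadFactor = 0.7f;           // keeps probe chains short

        // fastutil sizes the backing arrays so expectedEntries fit
        // at the given load factor without rehashing.
        Long2ObjectOpenHashMap<int[]> itemTags =
                new Long2ObjectOpenHashMap<>(expectedEntries, loadFactor);

        itemTags.put(42L, new int[] {1022, 2246});
        System.out.println(itemTags.get(42L).length);
    }
}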

---

Suitable Use Cases

✅ High-frequency `long → Object` lookups, GC-sensitive systems

❌ Concurrent writes without external sync; highly skewed keys

---


Final Summary

Before:

HashMap<Long, Set<String>>

After:

FastUtil Long2ObjectOpenHashMap<int[]> (sorted int[] values)

Outcome:

  • Memory ↓ 94%
  • Speed maintained
  • GC pressure reduced

---

Key Takeaways:

  • Generic containers can hide massive overhead at scale
  • Let data characteristics drive structure choice
  • Simple primitives + targeted algorithms beat general-purpose designs for bounded, read-mostly datasets
