When HashMap starts killing production: the engineering story of ConcurrentHashMap

Imagine a typical production service.

  • 32 CPUs
  • hundreds of threads
  • configuration / session / rate limits cache
  • tens of thousands of operations per second

And somewhere inside — a regular Map.

At first, everything looks harmless.

Map<String, Session> sessions = new HashMap<>();

Then a couple of threads appear.

Then dozens.

Then production load begins.

And suddenly:

  • latency spikes
  • CPU goes to 100%
  • strange freezes appear
  • sometimes — infinite loops in HashMap

This is not a hypothesis. These are real production incidents that occurred in large Java systems before the advent of proper concurrent structures.

It is from such stories that ConcurrentHashMap was born.

But to understand why it is designed this way, one must go through the entire evolution:

HashMap
   ↓
synchronized Map
   ↓
Segmented ConcurrentHashMap (JDK7)
   ↓
CAS + bin locking (JDK8)
   ↓
tree bins + lock striping + resize migration

And this is one of the most beautiful engineering evolutions in the JVM.

Tree bins are a mechanism in HashMap and ConcurrentHashMap (since Java 8) whereby a bucket with many collisions is automatically converted from a linked list into a balanced red-black tree. This prevents performance degradation when many elements share the same hash: instead of linear O(n) complexity for search, insertion, and deletion, those operations run in O(log n). The conversion happens when the number of elements in a bucket exceeds a threshold (TREEIFY_THRESHOLD, 8) and the table holds at least 64 buckets; otherwise the table is resized instead. This serves as protection against both a bad hashCode and deliberate hash-collision attacks.
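This protection is easy to reproduce. A minimal sketch (the BadKey and TreeBinDemo names are illustrative, not from the JDK): a key whose hashCode always returns the same value forces every entry into one bucket, yet lookups stay fast because the bin becomes a tree.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type whose hashCode always collides,
// so every entry lands in the same bucket.
final class BadKey implements Comparable<BadKey> {
    final int id;
    BadKey(int id) { this.id = id; }
    @Override public int hashCode() { return 42; }   // worst-case hash
    @Override public boolean equals(Object o) {
        return o instanceof BadKey && ((BadKey) o).id == id;
    }
    // Comparable lets the tree bin order keys instead of
    // falling back to identity-hash tie-breaking
    @Override public int compareTo(BadKey o) { return Integer.compare(id, o.id); }
}

public class TreeBinDemo {
    public static void main(String[] args) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 10_000; i++) map.put(new BadKey(i), i);
        // Without tree bins every get() would scan a 10_000-node list;
        // with them, the lookup is O(log n)
        System.out.println(map.get(new BadKey(9_999)));  // prints 9999
    }
}
```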

First naive idea: just synchronized

The most obvious protection for HashMap:

Map<K, V> map = Collections.synchronizedMap(new HashMap<>());

Or:

public synchronized V get(K key) { ... }
public synchronized V put(K key, V value) { ... }

This works.

But only until real concurrency appears.

What happens under load

32 threads
200k ops/sec

Each call:

lock(map)
   → read/write
unlock(map)

We get a global lock.

Diagram

Thread1 ─┐
Thread2 ─┤
Thread3 ─┤
Thread4 ─┤ → LOCK(map) → operation → unlock
Thread5 ─┤
Thread6 ─┘

All threads line up in a queue.

parameter     effect
throughput    falls
latency       increases
CPU           idles
contention    huge

Even get() blocks all other get().

This is a fundamental problem.
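This single mutex is easy to observe. A small illustrative sketch: the wrapper returned by Collections.synchronizedMap locks on the map object itself, so holding that monitor blocks even a pure reader.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class GlobalLockDemo {
    public static void main(String[] args) throws Exception {
        Map<String, String> map = Collections.synchronizedMap(new HashMap<>());
        map.put("k", "v");

        // A reader thread that only calls get()
        Thread reader = new Thread(() -> System.out.println(map.get("k")));

        // The wrapper synchronizes every call on the returned map object,
        // so holding that monitor stalls the reader too
        synchronized (map) {
            reader.start();
            Thread.sleep(200);                     // give the reader time to hit the lock
            System.out.println(reader.getState()); // prints BLOCKED
        }
        reader.join();                             // now it completes and prints v
    }
}
```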


Second attempt: lock striping

The Java engineers decided:

"What if we divide the map into several independent parts?"

Thus, ConcurrentHashMap JDK7 was born.

Segments

ConcurrentHashMap
 ├─ Segment 0
 ├─ Segment 1
 ├─ Segment 2
 └─ Segment 3

Each segment is a separate mini HashMap with its own lock.

segment = hash(key) % segments

Now:

Thread1 → Segment1 → lock
Thread2 → Segment2 → lock
Thread3 → Segment3 → lock

And they do not interfere with each other.

Operation code

Segment<K,V> s = segments[(hash >>> segmentShift) & segmentMask];
s.lock();

try {
    s.put(key, value);
} finally {
    s.unlock();
}

Performance improvement

threads    synchronizedMap    segmented CHM
1          1x                 1x
8          0.7x               5x
32         0.3x               18x

But new problems arose.


Problem #1 — Fixed Parallelism

The number of segments is fixed.

16 segments

But the server:

64 threads

As a result again:

Thread1 → Segment3
Thread2 → Segment3
Thread3 → Segment3

Contention returns.


Problem #2 — resize

Resize — the most difficult part of any hash map.

capacity = capacity * 2

In the segmented model:

resize → lock entire segment

This is expensive.


Problem #3 — memory overhead

Each Segment contains:

ReentrantLock
table
metadata

On large maps, this turns into memory waste.

The JVM engineers decided:

We need a structure without segments.

Thus, ConcurrentHashMap JDK8 was born.


Architecture of ConcurrentHashMap (JDK8)

Node<K,V>[] table

Each bucket is managed separately.

table
 ├─ bin0
 ├─ bin1
 ├─ bin2
 └─ bin3

Main idea

operation       mechanism
empty bucket    CAS
collision       synchronized
resize          cooperative

End-to-End put operation

map.put("user42", session);

Step 1 — hash

static final int spread(int h) {
    return (h ^ (h >>> 16)) & HASH_BITS;
}

This protects against bad hashCode.

Step 2 — index

index = (n - 1) & hash

Bit mask is faster than modulo.
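Both steps can be sketched in plain Java. The spread method below mirrors the JDK source; the rest is illustrative.

```java
public class HashDemo {
    static final int HASH_BITS = 0x7fffffff;

    // Same mixing as ConcurrentHashMap.spread(): fold the high bits
    // into the low bits and drop the sign bit
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    public static void main(String[] args) {
        int n = 16;  // table length is always a power of two
        int h = spread("user42".hashCode());

        // For power-of-two n, (n - 1) & h equals h % n,
        // but without a division instruction
        System.out.println(((n - 1) & h) == (h % n));  // prints true
        System.out.println((n - 1) & h);               // bucket index in [0, 15]
    }
}
```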

Step 3 — CAS insertion

if (casTabAt(tab, i, null, new Node<>(hash, key, value)))
    break;

CAS = Compare And Swap.

if (table[i] == null)
    table[i] = node

But atomically at CPU level.

CAS is performed through Unsafe.compareAndSwapObject (a VarHandle in newer JDKs) and maps to a native compare-and-exchange instruction, such as cmpxchg on x86.

Why this is important

CAS avoids lock.

Thread1 → CAS success
Thread2 → CAS fail → retry

No kernel switch.
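The same success/fail/retry pattern can be sketched with AtomicReferenceArray, a public stand-in for the Unsafe/VarHandle access ConcurrentHashMap uses internally:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

public class CasDemo {
    public static void main(String[] args) {
        // Stand-in for the bucket table (real CHM works on a Node[] directly)
        AtomicReferenceArray<String> table = new AtomicReferenceArray<>(16);

        // Slot 3 is null, so the first swap succeeds: no lock taken
        boolean first = table.compareAndSet(3, null, "node-A");

        // Slot 3 is no longer null, so this CAS fails; in CHM the thread
        // would retry and then fall back to synchronized on the head node
        boolean second = table.compareAndSet(3, null, "node-B");

        System.out.println(first);        // prints true
        System.out.println(second);       // prints false
        System.out.println(table.get(3)); // prints node-A
    }
}
```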


When CAS Does Not Help

If the bucket is not empty:

bin0 → node → node → node

Local lock is used:

synchronized (f) {
    // modify chain
}

Lock is only for one bucket.


When a bucket turns into a tree

If:

bucket size > 8 (and table capacity >= 64; otherwise the table is resized instead)

List turns into a Red-Black Tree.

O(log n)

Diagram

before

bucket
  ↓
  N → N → N → N → N


after

        N
      /   \
     N     N
    / \   / \

Resize — the most difficult part

ConcurrentHashMap performs resize without global pause.

Cooperative migration

old table
 ├─ bin0
 ├─ bin1
 ├─ bin2
 └─ bin3
Thread1 → bin0
Thread2 → bin1
Thread3 → bin2
Thread4 → bin3

Threads help move buckets.

ForwardingNode

This is a signal:

"this bucket has already moved"

Java Memory Model

volatile Node<K,V>[] table;

This ensures visibility between threads.

put → happens-before → get
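A small illustrative sketch of this guarantee: a reader thread that spins on get() is certain to see the fully constructed value once put() completes, because the volatile access inside the map establishes happens-before.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class VisibilityDemo {
    public static void main(String[] args) throws Exception {
        Map<String, int[]> map = new ConcurrentHashMap<>();

        // The reader spins until the entry appears; the volatile read
        // inside get() then guarantees the array contents are visible
        Thread reader = new Thread(() -> {
            int[] cfg;
            while ((cfg = map.get("config")) == null) {
                Thread.onSpinWait();
            }
            System.out.println(cfg[2]);  // prints 3
        });
        reader.start();

        // Everything done before put() is visible to the get() that sees it
        map.put("config", new int[]{1, 2, 3});
        reader.join();
    }
}
```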

CPU Cache Effects

CAS invokes:

invalidate cache line

If many threads hit one bucket:

cache ping-pong

Performance drops.


Impact on GC

Node
 key
 value
 next

Millions of Node → load on GC scanning.


Real production case

Map<IP, Counter>

load:

150k req/sec

Replacement:

synchronized HashMap
↓
ConcurrentHashMap

metric        before     after
latency p99   120ms      8ms
CPU           90%        40%
throughput    30k rps    150k rps
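A sketch of how such a counter map is usually written (the RateCounter class and IP literals are illustrative, not from the incident): computeIfAbsent is atomic per bucket, and LongAdder keeps hot keys from becoming CAS retry storms by striping the counter internally.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class RateCounter {
    // Hypothetical per-IP request counter, similar to the case above
    private final ConcurrentHashMap<String, LongAdder> counts =
            new ConcurrentHashMap<>();

    void record(String ip) {
        // computeIfAbsent locks at most one bin; increment() is lock-free
        counts.computeIfAbsent(ip, k -> new LongAdder()).increment();
    }

    long count(String ip) {
        LongAdder a = counts.get(ip);
        return a == null ? 0 : a.sum();
    }

    public static void main(String[] args) {
        RateCounter c = new RateCounter();
        for (int i = 0; i < 1000; i++) c.record("10.0.0.1");
        System.out.println(c.count("10.0.0.1"));  // prints 1000
    }
}
```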

Main trade-offs

solution            plus                      minus
CAS                 no lock                   retry loops
bucket lock         simple                    contention
tree bins           hash attack protection    memory
cooperative resize  no stop-the-world         complexity

The most important engineering idea

ConcurrentHashMap is a combination of three strategies:

optimistic (CAS)
pessimistic (locks)
structural (tree bins)

Each is used only when needed.


Engineering Intuition

If:

high concurrency
frequent reads
rare writes

ConcurrentHashMap is ideal.

But if:

writes dominate

CAS can turn into a retry storm.


Main conclusion

ConcurrentHashMap works quickly not because it is a completely lock-free structure. In fact, it actively uses locks inside. Its performance is achieved through another, much more important engineering idea — minimizing contention between threads.

Instead of one global lock that turns all operations into a sequential queue, ConcurrentHashMap breaks the work of the structure into many small independent sections. Each thread tries to work with the most isolated part of the data structure, which dramatically reduces the likelihood that two threads will block each other.

For this, several different synchronization strategies are combined inside the map:

  • CAS (Compare-And-Swap) — used for fast optimistic operations, for example, for inserting an element into an empty bucket without using a lock. If the operation succeeds, the thread does not enter the critical section at all.
  • Fine-grained locks — when collisions do occur, the lock is applied not to the entire map, but only to a specific bucket. This dramatically reduces the scope of the lock and allows other threads to continue working with the remaining parts of the structure.
  • Tree bins — if too many elements accumulate in one bucket, the linked list automatically transforms into a balanced red-black tree. This protects the structure from performance degradation when there are many collisions and prevents operations from turning into a linear search.
  • Cooperative resize — when increasing the table, the work of moving elements is distributed among threads. Several threads help move buckets into the new table, instead of stopping the entire data structure during the resize.

Thanks to the combination of these mechanisms, ConcurrentHashMap can operate efficiently at a very high degree of parallelism: dozens and hundreds of threads can perform read and write operations practically independently of each other.

This makes ConcurrentHashMap one of the best examples of evolutionary engineering in the JVM — a structure that gradually evolved from simply locking the entire map to a complex system of optimistic operations, local locks, and adaptive data structures.
