With MongoDB 8.0, the database engine takes another step forward in performance optimization, particularly in how it manages memory. One of the most impactful changes under the hood is the updated version of TCMalloc (Thread-Caching Malloc), which affects how the server allocates, caches, and reuses memory blocks.
For workloads with high concurrency, long-running queries, or mixed read/write patterns, the new TCMalloc can deliver noticeable performance gains.
This article explains what TCMalloc is, how it influences performance and memory fragmentation, and what differences you can expect before and after upgrading to MongoDB 8.0.
TCMalloc (Thread-Caching Malloc) is a memory allocator originally developed by Google. It replaces the standard malloc() and free() calls used by applications written in C/C++ with a faster, multithread-optimized alternative.
In simple terms, TCMalloc handles memory requests more efficiently by caching allocations per thread or per-CPU (default), avoiding the contention that can happen when multiple threads try to allocate or free memory at the same time.
TCMalloc can operate in one of two modes: with per-thread caches or with per-CPU caches (the default).
In both cases, the caches let TCMalloc avoid taking locks for most memory allocations and deallocations. The result is low memory fragmentation and fewer system calls, which in most cases translates into better performance.
MongoDB has long used TCMalloc as its default allocator, but version 8.0 includes a major upgrade to a newer implementation, aligned with upstream Google TCMalloc, that uses per-CPU caches instead of per-thread caches.
This brings improved multithreaded scalability, better release of memory back to the OS, and a more predictable RSS (Resident Set Size) under heavy workloads.
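The lock-avoidance idea can be illustrated with a toy model. The sketch below is not TCMalloc's implementation: the class names `CentralPool` and `ThreadCachedAllocator` and the batch size are hypothetical, and it models only the per-thread flavor. It shows a shared pool guarded by a lock, and a private cache per thread that visits the pool only in batches.

```python
import threading


class CentralPool:
    """Shared free list. Every access takes a lock (the contended slow path)."""

    def __init__(self, blocks):
        self._free = list(range(blocks))  # block IDs stand in for memory blocks
        self._lock = threading.Lock()
        self.lock_acquisitions = 0

    def take(self, n):
        with self._lock:
            self.lock_acquisitions += 1
            batch, self._free = self._free[:n], self._free[n:]
            return batch

    def give(self, blocks):
        with self._lock:
            self.lock_acquisitions += 1
            self._free.extend(blocks)


class ThreadCachedAllocator:
    """Each thread keeps a private cache and touches the pool only in batches."""

    def __init__(self, pool, batch=32):
        self._pool = pool
        self._batch = batch
        self._local = threading.local()

    def _cache(self):
        if not hasattr(self._local, "blocks"):
            self._local.blocks = []
        return self._local.blocks

    def alloc(self):
        cache = self._cache()
        if not cache:
            # Slow path: refill the private cache with one locked call.
            cache.extend(self._pool.take(self._batch))
            if not cache:
                raise MemoryError("central pool exhausted")
        return cache.pop()  # fast path: no lock involved

    def free(self, block):
        cache = self._cache()
        cache.append(block)  # fast path: no lock involved
        if len(cache) > 2 * self._batch:
            # Slow path: hand the surplus back so idle memory is not hoarded.
            surplus = cache[self._batch:]
            del cache[self._batch:]
            self._pool.give(surplus)
```

In this model, 64 allocations from a single thread with a batch size of 32 acquire the central lock only twice. A per-CPU design applies the same idea with one cache per CPU core instead of one per thread.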
The upgrade particularly benefits deployments where:
Thanks in part to this under-the-hood change, MongoDB 8.0 is reported to be faster than the previous version, 7.0, in many use cases.
The official documentation says that MongoDB 8.0 introduces significant performance improvements from MongoDB 7.0, including, but not limited to:
TCMalloc is probably not the only source of these improvements, but it could be the main contributor.
If you are a long-time MongoDB user, you probably know that one of the most common OS tuning best practices was to disable Transparent Huge Pages (THP). Starting with MongoDB 8.0, the recommendation is exactly the opposite: to benefit from the new TCMalloc, THP must now be enabled.
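The current THP mode can be read from sysfs on Linux. The following is a minimal sketch, assuming the standard kernel path `/sys/kernel/mm/transparent_hugepage/enabled`, whose content marks the active mode with square brackets (for example `always [madvise] never`). The function names are illustrative, and which mode your deployment needs should be confirmed against the MongoDB documentation.

```python
from pathlib import Path

# Standard Linux sysfs location for the THP setting (assumed to exist on the host).
THP_ENABLED_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the active mode from a line such as 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def read_thp_mode(path=THP_ENABLED_PATH):
    """Read and parse the THP mode of the current host (Linux only)."""
    return parse_thp_mode(path.read_text())
```

Calling `read_thp_mode()` on the database host returns the active mode as a string, so it can be dropped into a pre-upgrade checklist script.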
The following conditions must be checked to ensure TCMalloc can really use the new per-CPU caches:
A few details about rseq (restartable sequences). Rseq lets user-space code execute small critical sections that are guaranteed to run atomically on the same CPU, without using locks or system calls in the fast path. This matters for operations that are extremely common and performance-critical, such as updating per-CPU counters, accessing per-CPU data structures, and the fast paths of memory allocators and schedulers. To benefit from it, TCMalloc must be the component that registers the rseq structure.
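Two host-level prerequisites for rseq can be checked with a short script. This is a sketch under assumptions that do not come from this article: the rseq system call is available from Linux kernel 4.18, and recent glibc versions register rseq themselves unless the `glibc.pthread.rseq=0` tunable is set in `GLIBC_TUNABLES`. The function names are hypothetical.

```python
import os
import platform
import re


def kernel_supports_rseq(release):
    """Compare the leading 'major.minor' of a kernel release with 4.18."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= (4, 18)


def glibc_rseq_disabled(tunables):
    """True when a colon-separated GLIBC_TUNABLES value turns glibc's rseq off."""
    return "glibc.pthread.rseq=0" in tunables.split(":")


def check_host():
    """Inspect the current process environment (meaningful on Linux only)."""
    return {
        "kernel_supports_rseq": kernel_supports_rseq(platform.release()),
        "glibc_rseq_disabled": glibc_rseq_disabled(
            os.environ.get("GLIBC_TUNABLES", "")
        ),
    }
```

Note that `check_host()` reads the environment of the process that runs it. What matters is the environment of the `mongod` process, so the variable has to be set where the service is started.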
To verify that TCMalloc is running with per-CPU caches, ensure the following from the serverStatus:
Look at the following page for more details:
https://www.mongodb.com/docs/v8.0/administration/tcmalloc-performance/
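The serverStatus check can also be scripted. The sketch below assumes the relevant fields are `tcmalloc.usingPerCPUCaches` and `tcmalloc.tcmalloc.cpu_free`, as described on the page linked above. Treat these names as assumptions and compare them with the output of your own server.

```python
def per_cpu_caches_active(server_status):
    """Return True if a serverStatus document suggests per-CPU caches are in use.

    Assumed fields (verify against your server's output):
      tcmalloc.usingPerCPUCaches  -> should be True
      tcmalloc.tcmalloc.cpu_free  -> should be greater than 0
    """
    tcmalloc = server_status.get("tcmalloc", {})
    using_per_cpu = tcmalloc.get("usingPerCPUCaches") is True
    cpu_free = tcmalloc.get("tcmalloc", {}).get("cpu_free", 0)
    return using_per_cpu and cpu_free > 0
```

With PyMongo, the document can be obtained with `MongoClient(uri).admin.command("serverStatus")` and passed directly to this function.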
Let’s now do some tests running the same kind of workloads and compare MongoDB 7.0 vs MongoDB 8.0.
The servers used for the tests had the following specifications:
POCDriver was used to generate the workloads. Every test ran for 10 minutes on both servers using 4 parallel threads.
The two versions compared were Percona Server for MongoDB 7.0.26-14 and Percona Server for MongoDB 8.0.16-5.
Here are the results of the tests. Higher is better.
avg ops per sec
| Operation | PSMDB 7.0 | PSMDB 8.0 | % improvement |
| INSERTS | 55,784 | 71,752 | +28.62% |
| _id LOOKUPS | 1,883 | 2,529 | +34.31% |
| UPDATES | 17,178 | 17,963 | +4.57% |
| RANGE QUERIES | 753 | 874 | +16.07% |

avg ops per sec
| Operation | PSMDB 7.0 | PSMDB 8.0 | % improvement |
| INSERTS | 0 | 0 | – |
| _id LOOKUPS | 0 | 0 | – |
| UPDATES | 64,091 | 78,568 | +22.59% |
| RANGE QUERIES | 411 | 565 | +37.47% |

avg ops per sec
| Operation | PSMDB 7.0 | PSMDB 8.0 | % improvement |
| INSERTS | 0 | 0 | – |
| _id LOOKUPS | 10,647 | 13,279 | +24.72% |
| UPDATES | 1,408 | 1,647 | +16.97% |
| RANGE QUERIES | 307 | 339 | +10.42% |

avg ops per sec
| Operation | PSMDB 7.0 | PSMDB 8.0 | % improvement |
| INSERTS | 0 | 0 | – |
| _id LOOKUPS | 0 | 0 | – |
| UPDATES | 1,372 | 1,615 | +17.71% |
| RANGE QUERIES | 7,779 | 8,307 | +6.79% |
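
The "% improvement" column in the tables is the relative change in average operations per second between the two versions. A small sketch of that arithmetic, with a hypothetical helper name, makes it easy to recheck a row or apply the same metric to your own test runs:

```python
def pct_improvement(before, after):
    """Relative change from `before` to `after`, expressed as a percentage."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (after - before) / before * 100.0


# INSERTS row of the first table: 55,784 -> 71,752 ops/sec
inserts_gain = round(pct_improvement(55_784, 71_752), 2)

# RANGE QUERIES row of the last table: 7,779 -> 8,307 ops/sec
range_gain = round(pct_improvement(7_779, 8_307), 2)
```

Rows where both versions report 0 operations are excluded, since the relative change is undefined for a zero baseline.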

As promised by the official documentation, MongoDB 8.0 was faster than MongoDB 7.0 in our tests, and the results confirm the declared benefits. The actual gains depend on multiple factors, such as custom tuning, different hardware, and the workload itself, so your specific scenario may not show the same improvements we measured. For this reason, we always recommend testing a new version before moving it to production. Still, we are confident that the new TCMalloc with per-CPU caches delivers impressive benefits.