Have you guys seen mimalloc? https://github.com/microsoft/mimalloc has some interesting benchmark results for tbbmalloc for peak working set (last set of graphs on that page). When we first started using jemalloc on Linux and tbbmalloc on Windows, our experience was that the peak working set with tbbmalloc was much worse, and we attributed this to the fact that we allocate on one thread and free on another. To ameliorate this, we resorted to calling scalable_allocation_command(TBBMALLOC_CLEAN_ALL_BUFFERS) after every simulation time step. The peak working set benchmarks in mimalloc's readme.md seem to suggest that tbbmalloc actually holds its own against jemalloc for workloads that do this (see the larsonN and mstressN results). However, the redis benchmark shows tbbmalloc being much worse than jemalloc. It might be worth investigating the behaviour here.
PS. The benchmarks are in a separate repo: https://github.com/daanx/mimalloc-bench
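For anyone trying to reproduce the pattern described above, here is a rough sketch of one thread allocating, another thread freeing, and a cleanup request after every time step. It is not our actual simulation code: `sim_alloc`, `sim_free`, `sim_trim`, `run_step`, `run_simulation` and the `USE_TBBMALLOC` macro are made-up names for illustration. Note that the real `scalable_allocation_command` takes a second parameter, which must be null. Without `USE_TBBMALLOC` defined, the sketch falls back to plain `malloc`/`free` so it still builds on a machine without TBB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

#ifdef USE_TBBMALLOC
#include <tbb/scalable_allocator.h>  // link against tbbmalloc
static void* sim_alloc(std::size_t n) { return scalable_malloc(n); }
static void  sim_free(void* p)        { scalable_free(p); }
// Ask tbbmalloc to release its internal buffers.
// Returns true only if the allocator reports TBBMALLOC_OK.
static bool sim_trim() {
    return scalable_allocation_command(TBBMALLOC_CLEAN_ALL_BUFFERS, nullptr)
           == TBBMALLOC_OK;
}
#else
// Fallback (assumption for illustration): system allocator, no-op trim.
static void* sim_alloc(std::size_t n) { return std::malloc(n); }
static void  sim_free(void* p)        { std::free(p); }
static bool  sim_trim()               { return false; }
#endif

// One time step: a producer thread allocates `blocks` buffers of `bytes`
// each, then a different consumer thread frees them.
// Returns the number of buffers the consumer freed.
std::size_t run_step(std::size_t blocks, std::size_t bytes) {
    std::vector<void*> handoff(blocks, nullptr);

    std::thread producer([&] {
        for (void*& p : handoff) p = sim_alloc(bytes);
    });
    producer.join();  // hand-off point: all buffers now exist

    std::size_t freed = 0;
    std::thread consumer([&] {
        for (void* p : handoff) {
            if (p != nullptr) { sim_free(p); ++freed; }
        }
    });
    consumer.join();

    assert(freed <= blocks);
    return freed;
}

// Runs `steps` time steps and requests a cleanup after each one.
// Returns the total number of buffers freed across all steps.
std::size_t run_simulation(int steps, std::size_t blocks, std::size_t bytes) {
    std::size_t total = 0;
    for (int s = 0; s < steps; ++s) {
        total += run_step(blocks, bytes);
        sim_trim();  // the per-step cleanup workaround
    }
    return total;
}
```

Calling something like `run_simulation(100, 10000, 256)` with and without the `sim_trim()` line, while watching the peak working set, should give a rough idea of how much the workaround matters for a given tbbmalloc version.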
Hi! We hadn't seen this project. Thanks for sharing a set of benchmarks to look at, we will investigate!
Also, I left some comments regarding allocator cleanup routines some time ago: https://github.com/intel/tbb/issues/172. Maybe they will be helpful. We also improved memory cleanup operations and solved several memory consumption problems in the TBB 2019 Update 6 (U6) release.