Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

TBB Memory Allocator vs. Hoard Memory Allocator

csv610
Beginner

Hello,

Has anyone benchmarked the TBB memory allocators against the Hoard memory allocator with real applications? If so, I would like to see the results.

Do the concurrent containers in TBB use the scalable memory allocators by default?

Thanks.

With regards
2 Replies
RafSchietekat
Valued Contributor III
In Intel Technical Journal, Volume 11, Issue 4, 2007, the Larson benchmark as obtained from http://www.hoard.org (the site dedicated to the Hoard memory allocator) shows the TBB allocator beating all the others at any thread count greater than one, including Hoard, which came out second best at any thread count greater than four. That seems rather reassuring. Similar results are obtained for another allocator's own benchmark and for a non-demo benchmark (TBB allocator best throughout, Hoard fourth to last of six, on both benchmarks). Only the false-sharing micro-benchmark, which compares each allocator against its own one-thread result, shows one or two other allocators scaling somewhat better than TBB, one of them Hoard (by at most 10%). But Hoard consistently starts out as the worst on one thread on two of the other benchmarks, with TBB the best, and if false sharing is your concern then TBB also has a cache_aligned_allocator. I'm not worried, but it might be nice to have those results confirmed by independent sources (I'm sure they put in some pause instructions for their competitors in the demo benchmarks they obtained elsewhere :-)) or for "real" applications, and maybe things have changed since five months ago.
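
To make the false-sharing point concrete, here is a minimal sketch of what a cache-aligned allocator does. It is not TBB's cache_aligned_allocator, only a standard C++17 stand-in. The names CacheAlignedAllocator, kCacheLine and is_cache_aligned are made up for illustration, and the 64-byte line size is an assumption that does not hold on every machine.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Assumed cache-line size; 64 bytes is common on x86 but not universal.
constexpr std::size_t kCacheLine = 64;

// Hypothetical stand-in for a cache-aligned allocator (requires C++17).
template <typename T>
struct CacheAlignedAllocator {
    using value_type = T;
    static_assert(alignof(T) <= kCacheLine, "sketch assumes T fits cache-line alignment");

    CacheAlignedAllocator() = default;
    template <typename U>
    CacheAlignedAllocator(const CacheAlignedAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // Round the request up to whole cache lines, so the block starts
        // and ends on a line boundary and does not share a line with
        // another block handed out by this allocator.
        const std::size_t bytes =
            (n * sizeof(T) + kCacheLine - 1) / kCacheLine * kCacheLine;
        return static_cast<T*>(::operator new(bytes, std::align_val_t{kCacheLine}));
    }

    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(p, std::align_val_t{kCacheLine});
    }
};

// The allocator is stateless, so any two instances are interchangeable.
template <typename T, typename U>
bool operator==(const CacheAlignedAllocator<T>&, const CacheAlignedAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const CacheAlignedAllocator<T>&, const CacheAlignedAllocator<U>&) noexcept {
    return false;
}

// Helper: true if p starts on a cache-line boundary.
inline bool is_cache_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kCacheLine == 0;
}
```

With this, a std::vector<int, CacheAlignedAllocator<int>> keeps its buffer on a line boundary, so two such vectors written by different threads should not fight over the same cache line. The price is memory: every allocation is padded to a multiple of the line size.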

If you still want to plug in your own allocator, you can most easily do that starting with the 2008-04-02 development release, with TBB's own scalable allocator the default (although there are places where cache_aligned_allocator is hard-coded for some internal allocations, if I can generalise from concurrent_hash_map).

Alexey-Kukanov
Employee

Raf_Schietekat:
... (I'm sure they put in some pause instructions for their competitors in the demo benchmarks they obtained elsewhere :-))

I hoped you were joking, Raf:)

csv610:
Has anyone benchmarked the TBB memory allocators against the Hoard memory allocator with real applications? If so, I would like to see the results.

Do the concurrent containers in TBB use the scalable memory allocators by default?

In fact, real applications differ considerably in allocation behavior from any microbenchmark, and the benefits or losses from using a particular allocator vary between applications. I believe no benchmarking data other than measurements of your own application can tell you which allocator is better for you. While the TBB allocator outperforms Hoard in pure allocation speed for small objects, as far as I recall Hoard has a slightly better scalability trend. And allocation speed is not necessarily the determining factor for the performance of your application; e.g. data layout in memory can have a bigger impact.
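
As a starting point for such measurements, here is a rough sketch of a timing harness that is templated on the allocator. The names WorkloadResult and time_small_blocks are hypothetical, and the allocate-everything-then-free-everything loop is only a placeholder: it is exactly the kind of microbenchmark that should be replaced with the allocation pattern of your own application.

```cpp
#include <chrono>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical result record for one timed run.
struct WorkloadResult {
    std::chrono::nanoseconds elapsed;  // wall-clock time for the whole run
    std::size_t blocks;                // number of blocks allocated and freed
};

// Times `count` allocations of `block_bytes` bytes each (block_bytes > 0),
// followed by freeing all of them, using whatever allocator is passed in.
// Single-threaded; a real comparison would also run this from several threads.
template <typename Alloc>
WorkloadResult time_small_blocks(Alloc alloc, std::size_t count, std::size_t block_bytes) {
    using Traits = std::allocator_traits<Alloc>;
    static_assert(std::is_same<typename Traits::value_type, char>::value,
                  "sketch assumes a byte allocator such as std::allocator<char>");

    std::vector<typename Traits::pointer> blocks;
    blocks.reserve(count);  // keep the bookkeeping vector out of the timed region

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        typename Traits::pointer p = Traits::allocate(alloc, block_bytes);
        p[0] = 1;  // touch the block so the memory is actually used
        blocks.push_back(p);
    }
    for (typename Traits::pointer p : blocks) {
        Traits::deallocate(alloc, p, block_bytes);
    }
    const auto stop = std::chrono::steady_clock::now();

    return {std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start),
            blocks.size()};
}
```

Calling time_small_blocks(std::allocator<char>(), 100000, 32) gives a baseline. If TBB is installed, passing tbb::scalable_allocator<char>() from tbb/scalable_allocator.h instead should give a number to compare against, with the caveat above that such numbers say little about a real application.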

As for the containers: yes, TBB containers use the TBB allocator by default (provided the tbbmalloc shared library is available; otherwise "regular" malloc is used). The ability to use any C++ compliant allocator might also be useful, including for experiments.
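
For the experiments part, here is a sketch of a C++ compliant allocator that counts what a container asks for. It is shown with std::list so it builds without TBB. CountingAllocator, AllocStats and count_list_allocations are illustrative names, and the counters are not thread-safe, so they would need to be atomic before being used with a concurrent container.

```cpp
#include <cstddef>
#include <list>
#include <memory>

// Hypothetical counters shared by all copies of one allocator.
struct AllocStats {
    std::size_t allocations = 0;
    std::size_t bytes = 0;
};

// Minimal allocator that forwards to std::allocator and records each request.
template <typename T>
class CountingAllocator {
public:
    using value_type = T;

    explicit CountingAllocator(AllocStats* stats) noexcept : stats_(stats) {}

    // Containers rebind the allocator to their internal node type;
    // the rebound copy keeps pointing at the same counters.
    template <typename U>
    CountingAllocator(const CountingAllocator<U>& other) noexcept
        : stats_(other.stats()) {}

    T* allocate(std::size_t n) {
        ++stats_->allocations;
        stats_->bytes += n * sizeof(T);
        return std::allocator<T>().allocate(n);
    }

    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>().deallocate(p, n);
    }

    AllocStats* stats() const noexcept { return stats_; }

private:
    AllocStats* stats_;
};

// Two allocators are interchangeable when they share the same counters.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>& a, const CountingAllocator<U>& b) noexcept {
    return a.stats() == b.stats();
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>& a, const CountingAllocator<U>& b) noexcept {
    return !(a == b);
}

// Builds a list of n ints and reports how many allocations it made.
// A list allocates at least one node per element.
inline std::size_t count_list_allocations(std::size_t n) {
    AllocStats stats;
    {
        CountingAllocator<int> alloc(&stats);
        std::list<int, CountingAllocator<int>> values(alloc);
        for (std::size_t i = 0; i < n; ++i) {
            values.push_back(static_cast<int>(i));
        }
    }  // the list is destroyed here, before stats goes out of scope
    return stats.allocations;
}
```

The same allocator template parameter is the hook that the TBB containers expose, so an allocator written this way is the sort of thing that can be swapped in for a comparison.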
