Intel® oneAPI Threading Building Blocks

Do scalable allocators work like a pool...how deep?

BRIAN_R_Intel
Employee

Does scalable_realloc draw from a pool, or does it grab only the small chunk it needs on each call?

My application scales poorly, though each thread is completely independent and the work/overhead ratio is significant. It happens to allocate 250K tiny vectors (about 5 pointers in each) in a little under 2 seconds. 16 threads are doing this simultaneously. My test application that performs similar "work" but does arithmetic in place of all these allocations scales perfectly.

Does scalable_realloc etc. grab significant-size chunks and dole them out as needed (à la a pool), or are all these little allocations between threads possibly competing via the OS? I am tempted to make my own pool, but I would lose lots of other TBB benefits. I do not see an appreciable change when switching from the standard allocators to the scalable ones. Does that mean my scaling issues are elsewhere?

Thread Profiler seems to think all threads are nearly 100% utilized. Could it perhaps be misinterpreting waiting on memory as "work"?

Thanks,

Brian Rundle

Dmitry_Vyukov
Valued Contributor I
brundle:

Thread Profiler seems to think all threads are nearly 100% utilized. Could it perhaps be misinterpreting waiting on memory as "work"?



Yes, it could. Accesses to main memory, cache-line transfers between cores (i.e. sharing or false sharing), pipeline stalls, etc. are all counted as useful CPU work. So 100% CPU utilization doesn't mean 100% efficiency. A cache-line transfer can take up to 300 cycles, so in the limit (one useful cycle per transfer) efficiency can be only about 0.3%.
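To make the false-sharing point concrete, here is a minimal sketch in standard C++17 (no TBB needed). It is not taken from the original poster's code: the names `PaddedCounter`, `PackedCounter` and `run_counters` are made up for illustration, and the 64-byte cache-line size is an assumption that holds on typical x86 parts but is not guaranteed. Each thread only touches its own counter, yet with the packed layout neighbouring counters can share a cache line, so the line may bounce between cores while a profiler still reports the threads as busy.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Assumption: 64-byte cache lines (typical on x86, not guaranteed).
constexpr std::size_t kCacheLine = 64;

// Packed layout: adjacent counters may land on the same cache line,
// so independent threads can still invalidate each other's line.
struct PackedCounter {
    std::atomic<long> value{0};
};

// Padded layout: alignas forces each counter onto its own cache line.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

// Starts `threads` workers; worker t increments only counters[t], `iters` times.
// Returns the sum of all counters (always threads * iters).
template <typename Counter>
long run_counters(unsigned threads, long iters) {
    assert(threads > 0);
    std::vector<Counter> counters(threads);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&counters, t, iters] {
            for (long i = 0; i < iters; ++i)
                counters[t].value.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& w : workers) w.join();

    long total = 0;
    for (auto& c : counters) total += c.value.load();
    return total;
}
```

Timing `run_counters<PackedCounter>` against `run_counters<PaddedCounter>` with the same arguments (for example with `std::chrono::steady_clock`) is one way to see whether cache-line traffic matters on a given machine. Both variants return the same total; only the memory layout differs.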

As for your main question, I believe it must work like a pool, but I don't know for sure right now.

RafSchietekat
Valued Contributor III
Allocations up to 8-something kB (currently) work like a pool; bigger ones go straight to malloc() by default.
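For readers unsure what "work like a pool" means in practice, here is a rough sketch of the general idea in standard C++. It is emphatically not how the TBB scalable allocator is implemented: `BlockPool` and `local_pool` are hypothetical names, the block size of five pointers is just borrowed from the question, and the slab size of 1024 blocks is arbitrary. The pool grabs one large slab from `malloc()` and hands out fixed-size blocks from a free list, so most allocations never reach the underlying allocator; one pool per thread means no locking between threads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical fixed-size block pool, for illustration only.
class BlockPool {
public:
    BlockPool(std::size_t block_size, std::size_t blocks_per_slab)
        : block_size_(round_up(block_size)), blocks_per_slab_(blocks_per_slab) {
        assert(blocks_per_slab_ > 0);
    }
    BlockPool(const BlockPool&) = delete;
    BlockPool& operator=(const BlockPool&) = delete;
    ~BlockPool() {
        for (void* slab : slabs_) std::free(slab);
    }

    // Pops a block off the free list, fetching a new slab only when empty.
    void* allocate() {
        if (free_list_ == nullptr) refill();
        Node* n = free_list_;
        free_list_ = n->next;
        return n;
    }

    // Pushes the block back onto the free list; nothing returns to malloc
    // until the pool itself is destroyed.
    void deallocate(void* p) {
        free_list_ = new (p) Node{free_list_};
    }

    std::size_t slab_count() const { return slabs_.size(); }

private:
    struct Node { Node* next; };

    // Blocks must hold a Node and stay suitably aligned for any object.
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + a - 1) / a * a;
    }

    // One large malloc call, carved into blocks_per_slab_ small blocks.
    void refill() {
        char* slab = static_cast<char*>(std::malloc(block_size_ * blocks_per_slab_));
        if (slab == nullptr) throw std::bad_alloc();
        slabs_.push_back(slab);
        // Push in reverse so blocks are handed out in ascending address order.
        for (std::size_t i = blocks_per_slab_; i-- > 0;)
            deallocate(slab + i * block_size_);
    }

    std::size_t block_size_;
    std::size_t blocks_per_slab_;
    Node* free_list_ = nullptr;
    std::vector<void*> slabs_;
};

// One pool per thread, so threads never contend on the free list.
// Caveat: a block must be freed on the thread that allocated it, and
// the pool's memory is released when that thread exits.
inline BlockPool& local_pool() {
    thread_local BlockPool pool(5 * sizeof(void*), 1024);
    return pool;
}
```

A thread would call `local_pool().allocate()` and `local_pool().deallocate(p)` in place of malloc/free for its tiny vectors. A real allocator also has to handle multiple size classes and blocks freed by a different thread, which this sketch ignores.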

ARCH_R_Intel
Employee

One thing to try would be to take a flat VTune profile and see where it says the time is being spent.
