Scalable cache_aligned_allocator?

AJ13 · 12-07-2007

Hi,

Today I saved a lot of time by learning that scalable_allocator already pools memory. It's nice to find that a library already does what you hoped it would. I'm somewhat surprised that I missed that fact, but I'm glad to have learned it now rather than later.

The next question, however, was whether or not the cache_aligned_allocator implements pooling. I examined the source code, and found that malloc() was called directly rather than scalable_malloc(). I would take this to imply that in fact the cache_aligned_allocator does not operate from a memory pool.

What is the reason for this? Would it make sense to create yet another allocator, which does pool memory as well as allocate memory on cache-line boundaries? Perhaps this is computationally too difficult to do... finding chunks of memory in the pool that are the right size might be too much computation to do in an allocation call?

Thanks.

Alexey-Kukanov · 12-10-2007

aj.guillon:
Hi,
The next question, however, was whether or not the cache_aligned_allocator implements pooling. I examined the source code, and found that malloc() was called directly rather than scalable_malloc(). I would take this to imply that in fact the cache_aligned_allocator does not operate from a memory pool.
What is the reason for this?

It's probably not obvious from the code, but cache_aligned_allocator uses scalable_malloc if it is available; otherwise (and in some other cases) it falls back to malloc. The reason is the same as I explained here: making sure TBB is able to work even if the scalable allocator library is absent. You shouldn't just take my word for it :) - look for MallocHandler and see how it is used.
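For readers who want to picture the fallback idea before opening the source, here is a minimal sketch of a handler that is resolved once and falls back to malloc. It is not TBB's code: every name in it (AllocHandlers, find_scalable_allocator, resolve_handlers, handled_malloc, handled_free) is hypothetical, and the library lookup is stubbed out to report "not found" so the example runs anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// All names here are hypothetical; this is not TBB's source.
using malloc_fn = void* (*)(std::size_t);
using free_fn   = void  (*)(void*);

struct AllocHandlers {
    malloc_fn allocate;
    free_fn   deallocate;
};

// Plain wrappers around the C runtime, used when nothing better is found.
void* fallback_malloc(std::size_t n) { return std::malloc(n); }
void  fallback_free(void* p)         { std::free(p); }

// Stand-in for a run-time lookup of the scalable allocator library.
// It always reports "not found" so the sketch runs anywhere.
AllocHandlers find_scalable_allocator() { return {nullptr, nullptr}; }

// Use the looked-up pair only if BOTH entry points were found, so a block
// is never released by a different allocator than the one that produced it.
AllocHandlers resolve_handlers(AllocHandlers found) {
    if (found.allocate && found.deallocate) return found;
    return {&fallback_malloc, &fallback_free};
}

// The handlers are resolved once, on first use.
const AllocHandlers& handlers() {
    static const AllocHandlers h = resolve_handlers(find_scalable_allocator());
    assert(h.allocate && h.deallocate);
    return h;
}

void* handled_malloc(std::size_t n) { return handlers().allocate(n); }
void  handled_free(void* p)         { handlers().deallocate(p); }
```

Callers only ever see handled_malloc and handled_free, so a direct call to malloc somewhere in the file does not by itself tell you which allocator serves a request.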

aj.guillon:
Perhaps this is computationally too difficult to do... finding chunks of memory in the pool that are the right size might be too much computation to do in an allocation call?

We plan improvements to the TBB allocators, and support for aligned allocation in scalable_malloc is among them. And yes, it won't come for free: either it will be slower or (more likely) it will pad the memory block. As you might guess, cache_aligned_allocator also adds some padding, so it will need to be customized to avoid excessive padding when used together with scalable_malloc.
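To make the padding cost concrete, here is a sketch of the generic over-allocate-and-round-up technique for getting aligned memory out of an allocator that does not support alignment. It is an illustration under assumptions, not the TBB implementation: the 64-byte line size and the names kCacheLine, padded_aligned_alloc and padded_aligned_free are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumed cache line size; real hardware varies.
constexpr std::size_t kCacheLine = 64;

// Hypothetical helper: returns memory aligned to `alignment` by asking the
// underlying allocator for `alignment` extra bytes and rounding up.
void* padded_aligned_alloc(std::size_t size, std::size_t alignment = kCacheLine) {
    // alignment must be a power of two and large enough to hold a pointer
    assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);
    if (size > SIZE_MAX - alignment) return nullptr;  // size + alignment would overflow

    void* raw = std::malloc(size + alignment);         // the padding cost
    if (!raw) return nullptr;

    std::uintptr_t base = reinterpret_cast<std::uintptr_t>(raw);
    // Round up to the next multiple of alignment strictly above base, which
    // always leaves room in front of the result for one stored pointer.
    std::uintptr_t aligned =
        (base + alignment) & ~(static_cast<std::uintptr_t>(alignment) - 1);

    // Remember the original pointer just before the aligned block.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

void padded_aligned_free(void* p) {
    if (p) std::free(static_cast<void**>(p)[-1]);      // recover the raw pointer
}
```

Every request pays up to one extra cache line here. If the underlying allocator also pads to provide alignment, the two paddings stack, which is the excess the post says would need to be customized away.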

By the way, the scalable allocator takes its own measures to reduce false sharing: different threads cannot allocate from the same cache line. The same thread, however, can allocate several small objects from the same cache line. This differs from the behaviour of cache_aligned_allocator, which is thread-oblivious and treats each allocation as deserving its own cache line(s).
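A rough way to see what the two policies mean for memory footprint is to count cache lines for one thread's small objects. The model below is deliberately simplified and is an assumption for illustration only: it uses a 64-byte line, ignores allocator headers and size-class rounding, and the function names are hypothetical.

```cpp
#include <cstddef>

// Assumed cache line size; real hardware varies.
constexpr std::size_t kLine = 64;

// Thread-oblivious policy: every allocation is rounded up to whole cache
// lines, so no two objects ever share a line.
constexpr std::size_t lines_one_per_allocation(std::size_t count, std::size_t size) {
    return count * ((size + kLine - 1) / kLine);
}

// Per-thread packing policy: one thread's objects are laid out back to back
// from a line-aligned start, and lines are never shared across threads.
constexpr std::size_t lines_packed_per_thread(std::size_t count, std::size_t size) {
    return (count * size + kLine - 1) / kLine;
}
```

Under this model, eight 8-byte objects allocated by one thread occupy eight lines with the first policy and one line with the second. Both avoid false sharing between threads; the difference is how much memory is spent to get there.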