Intel® oneAPI Threading Building Blocks

cache_aligned_allocator and small allocations

I've got a cache_aligned_allocator which I use to allocate matrices that are at most 16 bytes.

Now I'm unsure how cache_aligned_allocator works with allocations smaller than a cache line.

Does it allocate 16 bytes and then add unusable padding?

Or does it allocate as many matrices as there is room for in the cache line and pad the rest, so that when I allocate another matrix it actually uses one of these "padding matrices"?

Basically, my problem is that I would be wasting too much memory and efficiency if it simply adds padding and I end up with a single matrix per cache line.

Does cache_aligned_allocator solve this, or do I have to allocate several matrices at a time manually?
1 Reply

cache_aligned_allocator will pad each matrix to fill out a cache line, so every 16-byte matrix occupies a whole line by itself. You'll have to decide whether this is a waste of space or a bargain for avoiding false sharing.
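To put a rough number on that trade-off, here is a small sketch in plain C++ (no TBB needed) that compares the two layouts. The 64-byte line size, the `Matrix2x2` type and the helper names are all assumptions made for illustration. `alignas` is only used to mimic the one-object-per-line layout; it is not how cache_aligned_allocator is implemented.

```cpp
#include <cassert>
#include <cstddef>

// Assumed cache line size; 64 bytes is common on x86 but not universal.
constexpr std::size_t kCacheLine = 64;

// Hypothetical 16-byte matrix (2x2 floats), standing in for the poster's type.
struct Matrix2x2 {
    float m[4];
};

// One matrix per cache line: the layout you get when every object is padded
// out to a full line.
struct alignas(kCacheLine) PaddedMatrix {
    Matrix2x2 value;
};

// Bytes needed for n matrices when they are packed back to back.
constexpr std::size_t packed_bytes(std::size_t n) {
    return n * sizeof(Matrix2x2);
}

// Bytes needed for n matrices when each one owns a whole cache line.
constexpr std::size_t padded_bytes(std::size_t n) {
    return n * sizeof(PaddedMatrix);
}

// Sanity check of the assumptions above (hypothetical helper).
inline void check_padding_cost() {
    assert(sizeof(Matrix2x2) == 16);
    assert(sizeof(PaddedMatrix) == kCacheLine);
    // With 16-byte objects and 64-byte lines, padding costs 4x the memory.
    assert(padded_bytes(1000) == 4 * packed_bytes(1000));
}
```

Under these assumptions, one matrix per line uses four times the memory of a packed layout, and a single line fetch brings in one matrix instead of four.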

tbb::scalable_allocator might be the right thing to use. If each thread uses tbb::scalable_allocator to allocate 16-byte objects, each thread's objects will be allocated consecutively with no extra padding, and different threads will allocate on different cache lines. The same holds for any object size that is a power of 2 between 8 bytes and the cache line size.
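If you'd rather batch the matrices by hand, as the question suggests, the layout described above can be sketched in plain C++. This is not tbb::scalable_allocator's implementation, just an illustration of the same idea: each thread packs its matrices back to back inside cache lines that it owns outright. The 64-byte line size and every name here (`Matrix2x2`, `MatrixLine`, `ThreadLocalMatrices`, `line_of`, `on_same_line`) are assumptions for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size; 64 bytes is common on x86 but not universal.
constexpr std::size_t kCacheLine = 64;

// Hypothetical 16-byte matrix (2x2 floats).
struct Matrix2x2 {
    float m[4];
};

// Number of matrices that fit in one cache line (4 under these assumptions).
constexpr std::size_t kPerLine = kCacheLine / sizeof(Matrix2x2);

// One full cache line of matrices, packed with no padding between them.
struct alignas(kCacheLine) MatrixLine {
    Matrix2x2 slots[kPerLine];
};

// Hypothetical per-thread storage: a thread owns whole lines, so none of its
// matrices can share a line with another thread's matrices.
struct ThreadLocalMatrices {
    MatrixLine lines[2];

    Matrix2x2& at(std::size_t i) {
        return lines[i / kPerLine].slots[i % kPerLine];
    }
};

// Index of the cache line that contains address p.
inline std::uintptr_t line_of(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) / kCacheLine;
}

// True if the two addresses fall on the same cache line.
inline bool on_same_line(const void* p, const void* q) {
    return line_of(p) == line_of(q);
}

// Checks that two owners never share a line (hypothetical helper).
inline void check_no_shared_lines(ThreadLocalMatrices& a, ThreadLocalMatrices& b) {
    for (std::size_t i = 0; i < 2 * kPerLine; ++i)
        for (std::size_t j = 0; j < 2 * kPerLine; ++j)
            assert(!on_same_line(&a.at(i), &b.at(j)));
}
```

Giving each thread its own `ThreadLocalMatrices` keeps the memory cost at 16 bytes per matrix while still avoiding false sharing between threads. The price is that you manage the slots yourself instead of letting an allocator do it.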
