Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

Memory consumption using scalable_* and multiple threads

AndrewC

I am seeing excessive memory consumption when using the scalable_malloc/scalable_free "C" routines and TBB 4.1 (as part of Parallel Studio) that I do not see when using malloc()/free() or the MKL memory allocation routines.

In a loop, I create and destroy threads that make many calls into scalable_malloc and scalable_free. There are no scalable_* calls "across threads" or from the main thread. These calls are all balanced, so no allocated memory is being left dangling.

Each time through the loop, memory consumption seems to increase, as if some thread-specific buffers are not being returned when the threads are destroyed.

MKL has a function MKL_Free_Thread_Buffers that I can call at the end of a thread, just before it dies. Does TBB need a similar call?

Vladimir_P_1234567890

Hello,

Intel TBB 4.2 introduced the scalable_allocation_command() function, which cleans up either the calling thread's buffers or all buffers.

More details are here: http://software.intel.com/en-us/node/468118

--Vladimir

Alexandr_K_Intel1

As Vladimir mentioned, there is a call similar to MKL_Free_Thread_Buffers(), but there is no need for it at thread’s termination time, as all per-thread buffers have to be released automatically. Is the sequence of allocations different between iterations of your outer loop? (We need to understand whether this is memory fragmentation or a memory leak.) How big is the regression in memory consumption compared to the system allocator?

I’d love to see a reproducer if the regression is big.

AndrewC

I am seeing many megabytes of extra memory usage when using the scalable allocators.

I can try to come up with a reproducer. But it seems 4.2 addresses this issue.

AndrewC

"...but there is no need for it at thread’s termination time, as all per-thread buffers have to be released automatically."

How is that possible? How can the TBB memory allocators "know" that a particular thread has died and that its buffers can be released? I am using a non-TBB threading library (boost::threads) on Windows.

Interestingly, as a side note, I was using OMP threading before and this was not an issue. That's because OMP starts up a thread pool and reuses the same threads throughout program execution, so threads are not being repeatedly created and destroyed...

Alexandr_K_Intel1

"How is that possible? How can TBB memory allocators "know" a particular thread has died and that particular thread's buffers can be released?"

Under Windows, DllMain is called with the DLL_THREAD_DETACH argument on thread termination for each DLL.

Your observation about OpenMP is important. It is interesting that there have been no known issues (and therefore no fixes) related to memory leaks during thread termination.
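
DllMain is Windows-specific, so as a portable illustration of the same idea (a hook that runs when a thread exits), here is a small sketch using a C++11 thread_local object whose destructor fires at thread termination. This is only an analogy and not TBB's actual implementation. ThreadCache, use_allocator_on_this_thread, and spawn_and_count are hypothetical names.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

// Counts how many per-thread caches have been torn down (for illustration only).
std::atomic<int> g_released_caches{0};

// Hypothetical stand-in for an allocator's per-thread buffer cache.
struct ThreadCache {
    std::size_t cached_bytes = 0;
    void touch() { cached_bytes += 64; }  // pretend a freed block was cached
    // Runs automatically when the owning thread exits; this is where a real
    // allocator would hand its cached buffers back.
    ~ThreadCache() { g_released_caches.fetch_add(1); }
};

// Any thread that "uses the allocator" gets its own cache on first use.
void use_allocator_on_this_thread() {
    thread_local ThreadCache cache;
    cache.touch();
}

// Creates and destroys `threads` threads one after another and returns
// how many thread-exit cleanups were observed.
int spawn_and_count(int threads) {
    const int before = g_released_caches.load();
    for (int i = 0; i < threads; ++i) {
        std::thread t(use_allocator_on_this_thread);
        t.join();  // the thread_local destructor has run by the time join() returns
    }
    return g_released_caches.load() - before;
}
```

The point is that the threading library does not need to cooperate: the cleanup hook is registered by the code that owns the per-thread state, which is why threads created through boost are covered as well.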

jimdempseyatthecove

Can you encapsulate your use of boost create thread/exit thread such that it uses a pool?

YourCreateThread :: if(ThreadAvailableInPool) takeFromPool else createThread

YourEndThread :: returnThreadContextToYourPool
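
A minimal sketch of that idea, assuming C++11 std::thread as a stand-in for boost threads. SimplePool and run_jobs_on_pool are hypothetical names, and the pool is deliberately bare (fixed size, no futures, no exception handling). Because the same worker threads are reused for every job, any per-thread allocator buffers are reused as well instead of being created and torn down on each iteration.

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: workers are created once and reused for every job.
class SimplePool {
public:
    explicit SimplePool(unsigned workers) {
        for (unsigned i = 0; i < workers; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Lets the workers finish all queued jobs, then joins them.
    ~SimplePool() {
        { std::lock_guard<std::mutex> lk(m_); stopping_ = true; }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }

    // Plays the role of "YourCreateThread": hand the work to an existing thread.
    void submit(std::function<void()> job) {
        { std::lock_guard<std::mutex> lk(m_); jobs_.push(std::move(job)); }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // "YourEndThread": when the job returns, the worker goes back to waiting
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Hypothetical helper: runs `jobs` trivial jobs on a pool and returns how many completed.
int run_jobs_on_pool(unsigned workers, int jobs) {
    std::atomic<int> done{0};
    {
        SimplePool pool(workers);
        for (int i = 0; i < jobs; ++i)
            pool.submit([&done] { done.fetch_add(1); });
    }  // the pool destructor drains the queue and joins the workers
    return done.load();
}
```

In the original scenario, each job would contain the scalable_malloc/scalable_free work that currently runs in a freshly created thread.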

Jim Dempsey
