Hi,
Currently, whenever I want my threads to use memory, I typically allocate the necessary memory outside the TBB calls (parallel_for, say) and provide the thread-called functors with the appropriate information, effectively creating a (very) primitive heap private to my threads.
(Typically this matters when I use the single-threaded MKL linear equation solver, which requests a number of auxiliary buffers for the solution.)
I was wondering whether the TBB allocators already do this for me automatically (and efficiently), so that I do not need this whole maneuver (honestly, no attempt here to "fish" for what the allocators do).
If not, how much of a penalty is incurred by using the TBB allocators? I know they are supposed to be faster than the OS-provided ones.
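Schematically, my current approach looks something like this (a simplified sketch; solve_one stands in for the single-threaded MKL solve and the workspace size is just a placeholder):

```cpp
#include <tbb/parallel_for.h>
#include <vector>

// Hypothetical wrapper around the single-threaded MKL solve, using `work` as scratch.
void solve_one(int i, std::vector<double>& work);

void solve_all(int n_systems, std::size_t workspace_size) {
    // Allocate all auxiliary buffers up front, outside the TBB call,
    // so no allocation happens inside the worker threads.
    std::vector<std::vector<double>> workspaces(
        n_systems, std::vector<double>(workspace_size));

    tbb::parallel_for(0, n_systems, [&](int i) {
        solve_one(i, workspaces[i]);  // each iteration uses its own pre-allocated buffer
    });
}
```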
Thank you in advance for your help,
Petros
The scalable allocator does in fact have something like one heap per thread. Memory is allocated efficiently from the local "heap", and deallocating local memory is very fast compared to deallocating nonlocal memory.
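For example, it can be used either through the C-style calls or as a drop-in STL allocator (a minimal sketch):

```cpp
#include <tbb/scalable_allocator.h>
#include <vector>

int main() {
    // C-style interface: the request is served from the calling thread's local pool.
    void* p = scalable_malloc(1024);
    scalable_free(p);

    // Or as an STL allocator, so containers draw from the same per-thread pools.
    std::vector<double, tbb::scalable_allocator<double>> v(1000);
    return 0;
}
```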
The TBB scalable allocator will generally be much faster than using the standard heap and near the performance of your own targeted memory pool. The potential downsides (caveats) are:
Memory consumption will increase (pool retention by thread).
There may be allocation/deallocation patterns in your application that aggravate the memory consumption.
Potential issues when static objects allocate and deallocate memory (before main).
Potential issues where a DLL is using a different thread pool (e.g. OpenMP) and your app is using TBB with overloaded new/delete.
Don't let these caveats interfere with your experimentation with the TBB scalable allocator. Much thought has gone into the allocators to yield a superior solution (for most programming situations).
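If you do route new/delete through the scalable allocator yourself, the usual pattern is roughly the following (a sketch only; linking the tbbmalloc_proxy library achieves the same replacement without code changes):

```cpp
#include <tbb/scalable_allocator.h>
#include <new>

// Route the global new/delete through the TBB scalable allocator.
void* operator new(std::size_t size) {
    if (void* p = scalable_malloc(size ? size : 1))
        return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept {
    if (p) scalable_free(p);
}

void* operator new[](std::size_t size) {
    return operator new(size);
}

void operator delete[](void* p) noexcept {
    operator delete(p);
}
```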
Jim Dempsey
Raf, Jim,
Thank you very much for your responses. Knowing this lifts a big weight off my shoulders.
One final clarification: do your comments on the TBB scalable allocator carry over to the cache-aligned one as well?
Thank you for all your help,
Petros
Unless I missed something, the cache-aligned allocator is (still) based on the same code, so yes.
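Both are used the same way as container allocators; the cache-aligned one additionally aligns each allocation on a cache-line boundary to avoid false sharing (sketch):

```cpp
#include <tbb/cache_aligned_allocator.h>
#include <tbb/scalable_allocator.h>
#include <vector>

void example() {
    // Same usage pattern for both; cache_aligned_allocator pays some padding
    // in exchange for cache-line alignment of each allocation.
    std::vector<double, tbb::scalable_allocator<double>>      a(1000);
    std::vector<double, tbb::cache_aligned_allocator<double>> b(1000);
}
```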
Hi Raf,
Thank you for the confirmation.
Petros