I am currently using scalable_malloc instead of classic malloc to allocate the memory I need for my program. It works nicely; I get a very interesting speedup because of it. But I don't think I understand what scalable memory allocation really is — what does it mean?
I've read the explanation in the book, but I'm not quite sure I understand how it works. The book says the scalable allocator allocates and frees memory in a way that scales with the number of processors. I'm not sure I understand — is it about the processors' caches?
Thank you for your enlightenment.
scalable_malloc, without getting too technical, is like each thread having a private heap. When an allocation occurs and the available memory is in the (thread-)private heap, the allocation completes without a critical section. Should the private heap have insufficient resources, the standard heap is called (with a critical section) to add another hunk of memory to the private heap. This technique reduces the number of times the application's allocations/deallocations pass through the critical section (it permits parallel allocations).
An optimization of the private heap is to maintain pools of similar-sized allocations (generally in 16/32/64-byte increments).
scalable_malloc is not a "free lunch". The cost is a larger memory requirement.
Jim Dempsey
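The scheme Jim describes can be sketched in a few lines of C++. This is a toy illustration of the idea (one size class, a thread-private free list, a locked "slow path" stood in for by plain `::operator new`), not TBB's actual implementation:

```cpp
// Minimal sketch of a per-thread pool allocator (not TBB's real design).
// Fast path: pop from a thread-private free list, no locking.
// Slow path: fetch a whole slab from the shared heap, amortizing the lock.
#include <cassert>
#include <cstddef>
#include <vector>

struct Pool {
    static constexpr std::size_t kBlock = 64;  // one size class, for brevity
    static constexpr std::size_t kSlab  = 16;  // blocks fetched per refill
    std::vector<void*> free_list;              // thread-private: no locking

    void* allocate() {
        if (free_list.empty())
            refill();                          // slow path: shared heap
        void* p = free_list.back();
        free_list.pop_back();
        return p;
    }
    void deallocate(void* p) { free_list.push_back(p); }  // lock-free return

private:
    void refill() {
        // In a real scalable allocator this is the only place that takes a
        // lock; a slab is carved into many blocks so the lock is amortized.
        for (std::size_t i = 0; i < kSlab; ++i)
            free_list.push_back(::operator new(kBlock));
    }
};

thread_local Pool tls_pool;  // one pool per thread => parallel allocation
```

Note the trade-off Jim mentions: blocks returned to the private list are never handed back to the shared heap in this sketch, so the footprint can be larger than with plain malloc.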
That's a plus.
#1 "scalable_malloc is not a 'free lunch'. The cost is a larger memory requirement."
Often true, but maybe not in general.
Often true, but maybe not in general.
On a 64-bit platform there is generally no issue (of a larger memory requirement) with using scalable_malloc.
On a 32-bit platform (or for 32-bit applications run on a 64-bit platform), it may be advisable NOT to overload new/delete, and to instead selectively use the scalable_malloc/scalable_free routines for the few high-frequency malloc/free objects. (On Windows, you may also want to enable the Low Fragmentation Heap feature.)
There is an additional issue where an object is scalably allocated by one thread and deallocated by a different thread. This may cause either memory-consumption issues or additional latencies. It is not a problem where allocations/deallocations follow a call stack (e.g. ctor/dtor of stack-frame objects), but it can be a problem when an arbitrary thread can delete an object pointed to/referenced by a concurrent queue of object pointers/references.
Jim Dempsey
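The selective approach Jim describes — overloading new/delete only for a few high-frequency types rather than globally — looks roughly like this. `Message` is a made-up example class, and the scalable routines are stubbed with plain malloc/free so the sketch stands alone (with TBB you would include `tbb/scalable_allocator.h` and call the real `scalable_malloc`/`scalable_free`):

```cpp
// Class-scoped operator new/delete: only this type's allocations go through
// the (stubbed) scalable allocator; global new/delete are left untouched.
#include <cassert>
#include <cstdlib>
#include <new>

static void* scalable_malloc_stub(std::size_t n) { return std::malloc(n); }
static void  scalable_free_stub(void* p)         { std::free(p); }

struct Message {                       // hypothetical high-frequency object
    char payload[256];

    static void* operator new(std::size_t n) {
        if (void* p = scalable_malloc_stub(n)) return p;
        throw std::bad_alloc();        // mirror the standard contract
    }
    static void operator delete(void* p) noexcept {
        scalable_free_stub(p);
    }
};
```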
I don't really know how much impact inter-thread (de)allocation patterns have, but maybe somebody could advise/remind us whether the issue has been solved where memory could be exhausted by a thread that only deallocates memory allocated by other threads (if that was indeed the situation). It didn't seem impossible to solve, some time has passed since I first remember it being discussed, and there have been some changes in the meantime (perhaps it was fixed in 4.0?).
Did I misinterpret or overlook anything?
scalable_malloc and malloc.
Quoting jimdempseyatthecove
Often true, but maybe not in general.
...
On 32-bit platform (or 32-bit applications run on 64-bit platform), it may be advisable to NOT overload new/delete...
Overloaded new/delete operators are needed for C++ classes, and the most common use is built-in memory-leak detection.
Best regards,
Sergey
I'm sure that Jim didn't mean that C++ new/delete should never be redirected, but rather questioned the usefulness of redirecting all C++ (de)allocation requests to the TBB scalable allocator regardless of size. I think that we only differ on how cautious you need to be (opt-in vs. opt-out, so to speak), but, again, my intuition would easily yield to hard data.
Correct. You might find it convenient to experiment with overloading new/delete and then see if your (test) application fails on memory allocations, then back off use of the overloaded new/delete as warranted.
*** Keep in mind that the developer of a program is not necessarily the user of the program, and the user (some user) of the program is likely to use it in a manner the developer overlooks (or feels is stupid). This has to be factored into the trade-off evaluation of performance vs. memory requirements.
In the QuickThread scalable allocator, you can overload new/delete (malloc/free) and then optionally use a global flag to disable/enable the scalable allocator. This means the programmer can:
a) For allocations that occur once (via a static object ctor) or once/infrequently, start with the scalable-allocator flag disabled. The advantage is that scalable allocators (generally) use thread-by-thread pools of similar-sized nodes. When a pool is empty (or on first use), a slab of memory is allocated and carved into a new pool of similar-sized nodes. When these static/once/infrequent allocations are of sizes not generally used elsewhere in the application, disabling the scalable allocator for them conserves the memory that would otherwise have sat unused in the pool(s) had they been scalably allocated.
b) For long-lived objects (where critical-section contention is a very small fraction of 1% of overhead) and where the object size may constitute a pool not used elsewhere, you can elect to temporarily disable the scalable allocator.
c) You can add a feature, say enabled via an environment variable, which the user (or your customer-service rep) can use to disable scalable allocation on those systems (or applications) that experience memory-allocation failures. IOW, for situations where an application's memory requirement is too large when using the scalable allocator, you can disable the scalable-allocation feature and possibly still manage to run.
Jim Dempsey
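A minimal sketch of the global enable/disable flag described in (a)–(c). The names here are illustrative, not QuickThread's actual API, and both allocation paths are stubbed with plain malloc so the sketch is self-contained:

```cpp
// Sketch of a runtime switch between a scalable allocator and the plain
// heap (illustrative names, not a real library API).
#include <atomic>
#include <cassert>
#include <cstdlib>

// Global switch; point (a)/(b): the programmer can toggle it around
// once-only or long-lived allocations.
std::atomic<bool> g_use_scalable{true};

// Point (c): let a user disable scalable allocation via an environment
// variable (hypothetical name) on systems where memory is tight.
void init_allocator_from_env() {
    if (std::getenv("MYAPP_DISABLE_SCALABLE"))
        g_use_scalable.store(false, std::memory_order_relaxed);
}

// Stand-ins for the two paths; with TBB or QuickThread the first would
// call the library's scalable_malloc. A real implementation must also
// route each free back to the allocator that produced the pointer.
void* scalable_path(std::size_t n) { return std::malloc(n); }
void* standard_path(std::size_t n) { return std::malloc(n); }

void* app_malloc(std::size_t n) {
    return g_use_scalable.load(std::memory_order_relaxed)
               ? scalable_path(n)
               : standard_path(n);
}
```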
