Hi,
In my AI project I use a concurrent hash table with almost 20 000 000 items. Insert and find operations work without problems, but destroying all the elements takes a long time.
This is my code:
typedef tbb::concurrent_unordered_set <Item*, Item::Hash, Item::Equality> ConcurrentHashTable;
ConcurrentHashTable m_explored;
// Fill the hash table with 20 000 000 items
...
// Release memory before destroying the hash table
for (auto& item : m_explored) { delete item; }
// Destroy the hash table
m_explored.clear(); // performance issue here..
m_explored = ConcurrentHashTable(); // same performance issue here..
It takes about 10 seconds to clear the entire hash table.
With std::unordered_set, it takes only 2 seconds.
How can I fix this? (I am on Windows 10.)
Note that I use the latest version of oneTBB: 2021.5
Thanks
Hi,
Thanks for reaching out to us.
The performance decrease you are observing with tbb::concurrent_unordered_set might be because tbb::concurrent_unordered_set does not support concurrent erasure.
You can try using tbb::concurrent_hash_map as it supports concurrent insertion, lookup, and erasure.
Please refer to the below link for more details:
Thanks & Regards,
Noorjahan.
I don’t think it is related to concurrent erasure, since there is no concurrent erasure in the example.
I think it is due to the inherent differences between serial and concurrent containers. There is sometimes a performance penalty for using a concurrent container instead of a sequential one. The concurrent containers are designed to scale when multiple threads access them concurrently. To support concurrent access safely, their internal structures are also different and need additional memory, and this can lead to a penalty even for seemingly simple, non-concurrent operations such as clear.
Even so, a 5x slowdown is unexpectedly large! In previous cases, such as this one, we have found that using the tbb::scalable_allocator class with the container can reduce some of these overheads. You can find a deeper analysis of the possible causes in that other case.
When I ran your test case on my system, I saw a slowdown, but not of the same magnitude as yours. Perhaps you can check whether you are using the scalable_allocator and, if not, see whether it helps.
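As a rough illustration of that suggestion: the allocator is the fourth template parameter of tbb::concurrent_unordered_set, so switching to tbb::scalable_allocator only changes the typedef. The sketch below assumes a hypothetical Item with a single `state` field (the real Item is not shown in this thread), and it guards the TBB includes so the file still compiles where TBB is not installed. When the TBB branch is used, the program has to be linked against the tbbmalloc library.

```cpp
#include <cstddef>
#include <functional>
#include <type_traits>
#include <unordered_set>

// Only pull in TBB when its headers can be found.
#if __has_include(<tbb/concurrent_unordered_set.h>) && __has_include(<tbb/scalable_allocator.h>)
#include <tbb/concurrent_unordered_set.h>
#include <tbb/scalable_allocator.h>
#define EXAMPLE_HAS_TBB 1
#else
#define EXAMPLE_HAS_TBB 0
#endif

// Hypothetical stand-in for the Item class from the question.
struct Item {
    int state;

    struct Hash {
        std::size_t operator()(const Item* item) const {
            return std::hash<int>{}(item->state);
        }
    };
    struct Equality {
        bool operator()(const Item* a, const Item* b) const {
            return a->state == b->state;
        }
    };
};

#if EXAMPLE_HAS_TBB
// Same container as in the question, with the allocator spelled out.
// Without the fourth argument the container uses tbb::tbb_allocator<Item*>.
using ConcurrentHashTable =
    tbb::concurrent_unordered_set<Item*, Item::Hash, Item::Equality,
                                  tbb::scalable_allocator<Item*>>;
#else
// Fallback so the sketch compiles without TBB; this is NOT a concurrent container.
using ConcurrentHashTable =
    std::unordered_set<Item*, Item::Hash, Item::Equality>;
#endif

static_assert(std::is_same<ConcurrentHashTable::key_type, Item*>::value,
              "the set stores raw Item pointers");
```

Note that this only changes where the container's internal nodes are allocated. The Item objects created with new still go through the default operator new.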
Hi Michael,
Indeed it is not a problem with erasure.
I am not sure I can use scalable_allocator because, according to the documentation:
The scalable_allocator requires the memory allocator library. If the library is missing, calls to the scalable allocator fail. In contrast to scalable_allocator, if the memory allocator library is not available, tbb_allocator falls back on std::malloc and std::free.
I am not sure I have the "memory allocator library". Do I need another DLL?
And what exactly is the scalable allocator? I don't understand its purpose from reading the official documentation:
Note that your two links point to the same resource.
Thanks,
Hi,
>>I am not sure to have the "memory allocator library". Do I need another DLL?
The scalable_allocator allocator template requires that the TBBmalloc library be available. This class is defined with #include <tbb/scalable_allocator.h>.
Please refer to the below link for more details:
https://oneapi-src.github.io/oneTBB/main/tbb_userguide/Scalable_Memory_Allocator.html
>>what is exactly the scalable allocator, I don't understand the purpose of it
Please refer to the link below for a better understanding of the scalable allocator, and get back to us if you face any issues.
https://link.springer.com/chapter/10.1007/978-1-4842-4398-5_7
Thanks & Regards,
Noorjahan.
Hi,
We haven't heard back from you. Could you please provide an update on your issue?
Thanks & Regards,
Noorjahan.
Hi,
I see the same issue with the scalable allocator. Note that I also have a third DLL, libtbbmalloc_proxy. What is it for?
In the end, I no longer use the TBB unordered_set because of its performance with millions of entries. Maybe something could be improved in its implementation?
Thanks,
Hello,
The tbbmalloc proxy redirects ALL of the application's memory allocation calls to scalable_malloc. You can also write your own new/delete operators to control what goes to the scalable allocator.
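To make the second option concrete: a class can define its own operator new and operator delete, so that only objects of that class are routed to a particular allocator while the rest of the application is left alone. In the sketch below, Item, its `state` field, the `live_allocations` counter, and the helper exercise_item_allocator are all hypothetical. std::malloc and std::free stand in for scalable_malloc and scalable_free (declared in tbb/scalable_allocator.h) so that the example builds without TBB.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical Item class; only its allocation path matters here.
struct Item {
    int state;

    explicit Item(int s) : state(s) {}

    // Used for every `new Item(...)`.
    static void* operator new(std::size_t size) {
        // Stand-in: with TBB this line would call scalable_malloc(size).
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        ++live_allocations;
        return p;
    }

    // Used for every `delete item`.
    static void operator delete(void* p) noexcept {
        if (p == nullptr) return;
        --live_allocations;
        // Stand-in: with TBB this line would call scalable_free(p).
        std::free(p);
    }

    // Counter used only to show that the overloads are actually hit.
    static std::atomic<int> live_allocations;
};

std::atomic<int> Item::live_allocations{0};

// Allocates and frees `count` items; returns how many were alive at the peak.
int exercise_item_allocator(int count) {
    std::vector<Item*> items;                 // the vector itself uses std::allocator
    for (int i = 0; i < count; ++i) {
        items.push_back(new Item(i));         // goes through Item::operator new
    }
    int peak = Item::live_allocations.load();
    for (Item* item : items) {
        delete item;                          // goes through Item::operator delete
    }
    return peak;
}
```

Compared with the proxy library, this keeps the change local to one class, at the cost of having to repeat it for every class that should use the scalable allocator.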
This is a good article describing the TBB proxy approach:
One more thing: could you provide a complete reproducer (source) with the scalable allocator and the unordered set so I can file an internal ticket? If you could also consolidate all the information about the TBB version, OS version, build command, run command, etc., that would be greatly appreciated.
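A reproducer of the kind requested could be structured roughly as follows. This is a sketch under assumptions, not the original poster's code: Item, its `state` field, and the helper names fill_set and time_clear_ms are made up for illustration. The helpers take the container type as a template parameter, so the same code can be instantiated with std::unordered_set or with tbb::concurrent_unordered_set (with or without tbb::scalable_allocator).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in for the Item class from the question.
struct Item {
    int state;

    struct Hash {
        std::size_t operator()(const Item* item) const {
            return std::hash<int>{}(item->state);
        }
    };
    struct Equality {
        bool operator()(const Item* a, const Item* b) const {
            return a->state == b->state;
        }
    };
};

// Inserts `count` freshly allocated items with distinct states.
template <typename Set>
void fill_set(Set& set, int count) {
    for (int i = 0; i < count; ++i) {
        set.insert(new Item{i});
    }
}

// Deletes the pointed-to items, then times clear() on its own.
// Returns the elapsed time of clear() in milliseconds.
template <typename Set>
double time_clear_ms(Set& set) {
    // Release the Items first, as in the question, so that only the
    // container's own teardown is measured below.
    for (Item* item : set) {
        delete item;
    }
    const auto start = std::chrono::steady_clock::now();
    set.clear();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Baseline container for comparison with the TBB typedef.
using SerialHashTable = std::unordered_set<Item*, Item::Hash, Item::Equality>;
```

For example, declaring a SerialHashTable, calling fill_set on it with 20000000, and then calling time_clear_ms gives the baseline. Repeating the same two calls with the TBB typedef gives the number to compare against. Reporting the build configuration (debug or release) next to the timings would help whoever triages the ticket.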
Hello,
If there is no response within 5 days, this ticket will no longer be supported by Intel.
