I am using a tbb::concurrent_unordered_map in a local scope, so it is constructed and destroyed frequently.
Performance measurements show that my overhead comes from its destruction.
As a serial test, I replaced it with std::unordered_map, and the TBB container appears to be slower than the serial STL one.
If you know how to speed up destruction of the container, or know a countermeasure (e.g. a memory pool, a fast allocator, or the flyweight pattern),
I would appreciate it if you could explain it to me.
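
To illustrate the kind of memory-pool countermeasure meant here, below is a minimal sketch using only the C++17 standard library on the serial std::unordered_map case. The function name count_distinct and the 64 KiB buffer size are hypothetical, chosen for illustration. Whether the same allocator approach carries over to tbb::concurrent_unordered_map is an assumption I have not verified, and std::pmr::monotonic_buffer_resource is not thread-safe, so it would not be suitable for concurrent insertion as written.

```cpp
#include <array>
#include <cstddef>
#include <memory_resource>
#include <unordered_map>
#include <vector>

// Hypothetical helper: counts distinct keys with a map that lives only for
// the duration of the call, i.e. a short-lived local container.
std::size_t count_distinct(const std::vector<int>& keys) {
    // Stack buffer used as the pool. If it runs out, the resource falls back
    // to the default heap resource for further allocations.
    std::array<std::byte, 64 * 1024> buffer;
    std::pmr::monotonic_buffer_resource pool(buffer.data(), buffer.size());

    // All nodes and buckets of the map are carved out of the pool.
    std::pmr::unordered_map<int, int> counts(&pool);
    for (int k : keys) {
        ++counts[k];
    }
    return counts.size();
    // On return, counts is destroyed first. It still visits every node, but
    // each deallocate() is a no-op on a monotonic resource. The pool then
    // releases any memory it obtained in one step.
}
```

Calling count_distinct({1, 2, 2, 3}) builds and tears down the map once per call. The destructor still walks the nodes, so this only removes the per-node cost of returning memory to the heap, not the traversal itself.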