Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

Using TBB with maps

khlupin
Beginner

I would appreciate any comments on whether my conceptual thinking is correct.

My code now has a large number of objects that "live" in std::map. The function that I would like to parallelize reads and updates one of the values in the object. Which objects and in which order are processed is not known in advance, so the map cannot be split into blocks for parallel processing. Obviously, I do not want to block access to the whole map for each transaction.

What do you think about creating spin_mutexes within each of the individual objects to control access to them?

Thanks a lot.

5 Replies
RafSchietekat
Valued Contributor III
Use tbb::concurrent_hash_map instead of std::map; an accessor is like a tbb::spin_rw_mutex (disregarding delete issues).
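A minimal sketch of that approach, assuming a hypothetical Object value type and int keys (the names are illustrative, not from the original post):

#include <tbb/concurrent_hash_map.h>

// Hypothetical value type standing in for the poster's objects.
struct Object {
    double value;
};

typedef tbb::concurrent_hash_map<int, Object> ObjectMap;

void update(ObjectMap& objects, int key) {
    ObjectMap::accessor acc;          // acts as a write lock on this element only
    if (objects.find(acc, key))
        acc->second.value += 1.0;     // read and update under the per-element lock
}                                     // lock released when the accessor is destroyed

For read-only access, a const_accessor takes the reader side of the lock, which is what makes the accessor behave like a tbb::spin_rw_mutex on the element.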
Dmitry_Vyukov
Valued Contributor I
khlupin:

What do you think about creating spin_mutexes within each of the individual objects to control access to them?


If you don't modify the map itself (from your description it seems that's the case), then it's perfectly OK to protect only the individual objects. I think a spin_mutex per object is a good choice; it's a kind of fine-grained locking.
You can also consider aligning and padding the objects so that each one sits on a separate cache line. This eliminates false sharing and can greatly improve performance. (For modern Intel processors the cache-line size is 64 bytes.)
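A minimal sketch of this per-object locking, assuming a hypothetical Object value type and int keys (illustrative names, not from the thread):

#include <tbb/spin_mutex.h>
#include <map>

// Hypothetical value type: each object carries its own lock, so an update
// blocks only that object, never the whole map.
struct Object {
    tbb::spin_mutex mutex;
    double value;
};

void update(std::map<int, Object>& objects, int key) {
    // Concurrent lookups in std::map are safe as long as no thread inserts or erases.
    std::map<int, Object>::iterator it = objects.find(key);
    if (it != objects.end()) {
        tbb::spin_mutex::scoped_lock lock(it->second.mutex);  // lock only this object
        it->second.value += 1.0;                              // read-modify-write under the lock
    }
}

Because tbb::spin_mutex is not copyable, the objects have to be constructed in place when the map is filled (e.g. via map::emplace or operator[] in C++11).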


khlupin
Beginner

Thank you for the advice. You are correct - the map will not be modified concurrently. Could you point me in the right direction to read up on aligning and padding the objects?

Regards,

Roman

Dmitry_Vyukov
Valued Contributor I
khlupin:

Thank you for the advice. You are correct - the map will not be modified concurrently. Could you point me in the right direction to read up on aligning and padding the objects?

I think you can read "Intel 64 and IA-32 Architectures Optimization Reference Manual":
http://www.intel.com/products/processor/manuals/

8.4.5 Prevent Sharing of Modified Data and False-Sharing
8.4.6 Placement of Shared Synchronization Variable
8.6.2.1 Minimize Sharing of Data between Physical Processors

The main point is that each object must be aligned to the cache-line size (i.e. the object's address must start at a cache-line boundary), and no two objects may reside in the same cache line. The cache-line size is usually 64 bytes, but it is processor dependent.

Here is a quick example:

#include <malloc.h>  // MSVC: _aligned_malloc / _aligned_free

const size_t cacheline_size = 64;  // typical for modern Intel processors

struct X {
    int useful_data1;
    int useful_data2;
    char pad[cacheline_size - 2 * sizeof(int)];  // pad the object to a full cache line
};

// _aligned_malloc is an MSVC-specific function; release the memory with _aligned_free.
// count_of_elements is the number of objects you need.
X* array = (X*)_aligned_malloc(sizeof(X) * count_of_elements, cacheline_size);

This way every element resides in its own cache line.
Alexey-Kukanov
Employee
In TBB, cache_aligned_allocator is intended to do exactly that: no two objects allocated by it can ever share a cache line.
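A minimal sketch of using it for the per-object case (the Value type is illustrative, not from the thread):

#include <tbb/cache_aligned_allocator.h>
#include <new>

struct Value {
    int counter;
};

int main() {
    tbb::cache_aligned_allocator<Value> alloc;

    // Each allocate() call returns cache-line-aligned memory, so separately
    // allocated objects never share a cache line (no false sharing).
    Value* v = alloc.allocate(1);
    new (v) Value();            // construct in the aligned storage

    // ... updates to *v from different threads will not falsely share ...

    v->~Value();
    alloc.deallocate(v, 1);
    return 0;
}

It can also be plugged into STL containers as their allocator; note that in a contiguous container only the whole buffer is aligned, so per-element padding is still needed there.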