khlupin
Beginner

Using TBB with maps

I would appreciate any comments on whether my conceptual thinking is correct.

My code currently has a large number of objects that "live" in a std::map. The function I would like to parallelize reads and updates one of the values in an object. Which objects are processed, and in which order, is not known in advance, so the map cannot be split into blocks for parallel processing. Obviously, I do not want to block access to the whole map for each transaction.

What do you think about creating spin_mutexes within each of the individual objects to control access to them?

Thanks a lot.

RafSchietekat
Black Belt

Use tbb::concurrent_hash_map instead of std::map; an accessor is like a tbb::spin_rw_mutex (disregarding delete issues).
Dmitry_Vyukov
Valued Contributor I

khlupin:

What do you think about creating spin_mutexes within each of the individual objects to control access to them?


If you don't modify the map itself (from your description it seems that this is the case), then it's perfectly OK to protect only the individual objects. I think a spin_mutex per object is a good choice; it's a kind of fine-grained locking.
You can also consider aligning and padding the objects so that each individual object sits on a separate cache line. This eliminates false sharing and can greatly improve performance. (On modern Intel processors the cache-line size is 64 bytes.)


khlupin
Beginner

Thank you for the advice. You are correct - the map will not be modified concurrently. Could you point me in the right direction to read up on aligning and padding the objects?

Regards,

Roman

Dmitry_Vyukov
Valued Contributor I

khlupin:

Thank you for the advice. You are correct - the map will not be modified concurrently. Could you point me in the right direction to read up on aligning and padding the objects?

I think you can read "Intel 64 and IA-32 Architectures Optimization Reference Manual":
http://www.intel.com/products/processor/manuals/

8.4.5 Prevent Sharing of Modified Data and False-Sharing
8.4.6 Placement of Shared Synchronization Variable
8.6.2.1 Minimize Sharing of Data between Physical Processors

The main point is that an object must be aligned to the cache-line size (i.e. the object's address must start on a cache-line boundary), and no two objects may reside in the same cache line. The cache-line size is usually 64 bytes, but it is processor dependent.

Here is a quick example:

size_t const cacheline_size = 64; // typical for modern Intel processors

struct X {
 int useful_data1;
 int useful_data2;
 // pad the struct out to a full cache line
 char pad [cacheline_size - 2 * sizeof(int)];
};

// _aligned_malloc is an MSVC-specific function; the returned block
// starts on a cache-line boundary
X* array = (X*)_aligned_malloc(sizeof(X) * count_of_elements, cacheline_size);

This way every element resides in its own cache line.
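On compilers with C++11 support, a portable alternative to the manual padding above is `alignas` (a sketch; the constant name and struct contents are illustrative). The compiler both aligns the struct to the requested boundary and pads `sizeof(X)` up to a multiple of it, so array elements never share a cache line.

```cpp
#include <cstddef>

// 64 bytes is typical for modern Intel processors, but processor dependent.
constexpr std::size_t cacheline_size = 64;

// alignas aligns each object to a cache-line boundary; sizeof(X) is
// automatically padded to a multiple of that alignment.
struct alignas(cacheline_size) X {
    int useful_data1;
    int useful_data2;
};

static_assert(sizeof(X) % cacheline_size == 0, "X occupies whole cache lines");
static_assert(alignof(X) == cacheline_size, "X starts on a cache-line boundary");
```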
Alexey_K_Intel3
Employee

In TBB, cache_aligned_allocator is intended to do exactly that: no two objects allocated by it can ever share a cache line.