I came across the following bug in TBB (well, at least I think it is a bug). It is a very rare case but it did actually happen to me :(.
Here is the simplest version of it I could come up with:
Suppose two threads T1 and T2:
T1:
- acquires an accessor on bucket b1 in chain c1
- tries to insert a new element (not b1) into chain c1 (i.e., an element that hashes to the same chain as b1):
+ It first takes a read lock on chain c1 and then tries to upgrade it to a writer lock in order to insert the element
T2:
- tries to get an accessor to bucket b1
+ It gets a read lock on the chain c1
+ It tries to get a write lock on b1
This will deadlock because:
- T2 can't make progress because T1 is already holding b1 (that is expected behavior)
- T1 can't make any progress because T2 is still holding the chain lock to c1 in read mode which prevents T1 from upgrading it to a writer lock. This is the part that I find to be questionable.
What I did to fix it was simply to insert a release of the chain lock before trying to acquire the bucket lock. This *may* cause problems (in particular if the chain changes or the bucket is erased), but I am not sure yet. Anyway, right now it works for me :)
Romain
I find no flaw in your analysis other than what you find questionable: instead, T1 must not try to do anything to the map while holding the accessor it already has (although it would help if this were documented explicitly: more can block in this implementation than just operations on a key already being accessed, e.g., ultimately all inserts on the same segment will block once they determine they need to grow the array).
A fix that "may" cause problems is no fix at all, and you are quite right that it will fail if another thread intervenes and erases the item (other changes to the chain don't matter); if your program can exclude this failure mode you may get away with it on your own responsibility, but it is not generally applicable. The change in tbb20_20080402oss_src (kept in the current stable version tbb20_20080408oss_src) will not help you further along.
The change that I proposed in the thread "concurrent_hash_map::erase" would not deadlock but, while working, is currently still significantly slower (though I'm still hopeful about that).
In general I concur with Raf that you had better not insert another item into the hash map while already holding an accessor to one of its items.
The golden rule of multithreaded programming is to hold a lock for as short a time as possible, just long enough to preserve the invariants and consistency of the observable program state. In particular, a good piece of advice is to never call a third-party function from a locked code section unless you are sure (e.g. from source code inspection or documentation) that this function does not use any locks. Departing from this advice is fraught with deadlocks and/or performance problems. concurrent_hash_map accessors are locks and should be regarded as such.
We will try to improve the locking behavior to prevent, if possible, situations like the one you described; in fact there are also performance considerations involved, so it is worth a try. I would not promise, however, that it will be fixed; if not, we will need to document that inserting an element into the map while holding an accessor to another of its elements may cause undefined behavior.
Thank you all for your replies. They were most helpful.
Alexey:
Yes, I agree that it is not a good idea to hold multiple locks because of potential deadlock scenarios. As you mention, TBB accessors are locks and should be treated as such. In my case, I was actually using them as such. Here is what I am actually doing:
I have a concurrent_hash_map that stores references to hierarchical objects (in my case, they are thread objects as I am trying to do something akin to TLS for my research). The objects have parents and children and as such have a partial order on themselves. To remove the possibility of deadlocks, I enforce the policy: "if a thread must hold multiple accessors, it must acquire them in hierarchical order". Furthermore, the particular example I was giving was in the case of creating a new child object. T1 was the thread trying to create a new child object Ch1 and as such held a lock on the parent object Pa1. T2 was also an existing child Ch2 of Pa1 and was just trying to get access to it.
I don't know whether what I described is a very common paradigm, and there may be better ways of doing it. It would be easy to change the behavior by releasing the accessor on Pa1 when inserting Ch1 and then reacquiring the lock on Pa1 after successful insertion (and release of Ch1), but that would incur a performance hit as well (two lock acquisitions).
Raf:
Definitely, putting an extra lock in the structure would have a big performance overhead.
Ah yes, node did replace the 'bucket' term :). I had started looking at the code before it did and I guess my mind was still on the old association :)
Thanks again for all your help,
Romain
I've fixed the issue, and it should be available in upcoming packages. But it would be better to avoid such scenarios anyway, to reduce contention.
I think I've found another deadlock in concurrent_hash_map::insert().
This manifested in two ways:
1. If insert fails (because the item was already inserted), the accessor would still hold a lock on the item until it went out of scope and was destructed.
2. Concurrent insert() of the same key would deadlock in upgrade_to_writer().
I tracked both down to this piece of code in insert(), which always locks the node regardless of the result:

done:
    result->my_lock.acquire( b->mutex, write );
    result->my_node = b;
    return return_value;

Changing this to:

done:
    if( return_value ) {
        result->my_lock.acquire( b->mutex, write );
        result->my_node = b;
    }
    return return_value;

appears to work. Is this correct?
regards
Simon Cope
1. An "unsuccessful" insert() and a successful find() are indistinguishable (except for their opposite return value).
2. A concurrent insert() of the same key would not reach upgrade_to_writer() because it would also "fail", and this would be so whether executed from the same thread (not a good idea) or from a different one.
What deadlock condition did I fail to see? (Notice the formal difference with the first posting in this thread, which shows exactly where two threads cannot make progress.)
The piece of code (from lookup(), not insert()) is executed for either insert() (whether it inserted a new node or not) or successful find(). Making the suggested code change would leave the accessor empty on an insert() for an item that already exists, so many/most uses of insert() would need to be replaced with loops that alternatively insert() and find() an item until one of the two succeeds, which seems like a waste of performance even if the first find() succeeds. If you want insert() not to leave anything else blocked, merely release the accessor (implicitly or explicitly) right after insert() returns (an accessor should be released right after its business is complete, and if there is no business that would be immediately).
The only "problem" I can see here is that a concurrent set is not as cheap as it could be, implemented on top of concurrent_hash_map (there is always a mapped value that is then ignored, and an insertion will involve a redundant lock cycle).
Raf_Schietekat: "Making the suggested code change would leave the accessor empty on an insert() for an item that already exists, so many/most uses of insert() would need to be replaced with loops that alternately insert() and find() an item until one of the two succeeds, which seems like a waste of performance even if the first find() succeeds. If you want insert() not to leave anything else blocked, merely release the accessor (implicitly or explicitly) right after insert() returns (an accessor should be released right after its business is complete, and if there is no business that would be immediately)."
hi Raf,
I think this may come down to interpretation of the documentation.
For insert() it says:
"Search table for pair with given key. If not present, insert new pair into table. The
new pair is initialized with pair(key,T()). Set result to provide write access to the
matching pair."
I've read that to mean that the accessor is only set when the insert is successful, but from the behaviour I'm seeing I guess that is incorrect. It seems a bit odd that a function does something critical even though it "fails".
As for having to do a find/insert loop, I think that happens more often than you are assuming. For instance, all through my code I need a ::const_accessor to an item, and if it doesn't exist in the map I insert it and set up its values (so a write ::accessor for the insert, but still returning a ::const_accessor). So you have to do:
while( !find(const_accessor, key) ) {
    if( insert(accessor, key) ) {
        // setup item
        accessor->second.blah = blah;
        accessor.release();
    }
}
return true;
Otherwise you are holding write ::accessors to everything, which impedes concurrency. The insert(const_accessor, key) method seems pointless since you have no way to safely initialise the node. If there were a mechanism to upgrade/downgrade between accessor and const_accessor this could be avoided (unless I'm missing something).
Also my #2 case vanished after some code changes, so I think it may have been bogus.
thanks
Simon Cope
(Added) What is the reason for using a const_accessor instead of a separate copy: merely avoiding the overhead of copying (I would instead favour something like a new copy-on-write snapshot type myself, or a homegrown equivalent), keeping the item locked inside the map (but unless there's a user uprising, concurrent_hash_map might not bring back that guarantee, which it removed in tbb20_20080402oss_src), or both? Note that either a const_accessor or an accessor should be as short-lived as possible, unless you make certain that you only open const_accessors on the item, take responsibility for assuming that this will continue to be costless, and don't tell anyone I told you that (I think).
As for converting between accessor and const_accessor, how about just making a good case for any proposal like that: TBB is still a work in progress (other changes have already been officially announced), and IMHO you can either wait for whatever will happen, or give input.
simon.cope@lggi.com: "As for having to do a find/insert loop, I think that happens more often than you are assuming. For instance, all through my code I need a ::const_accessor to an item, and if it doesn't exist in the map I insert it and set up its values (so a write ::accessor for the insert, but still returning a ::const_accessor). ... Otherwise you are holding write ::accessors to everything, which impedes concurrency. The insert(const_accessor, key) method seems pointless since you have no way to safely initialise the node. If there were a mechanism to upgrade/downgrade between accessor and const_accessor this could be avoided (unless I'm missing something)."
I think concurrent_hash_map will soon have a solution for your problem. An additional insert method is currently being implemented which allows insertion of a whole pair:

while( !find(const_accessor, key) )
    insert(std::make_pair(key, value));

and you might even be able to replace the while with an if, in case you know the item can't be immediately deleted by another thread.
hash_map.insert( const_accessor, std::make_pair(key, value) );

This inserts the pair unless it already exists, and acquires a read lock on the new or existing item.