Intel® oneAPI Threading Building Blocks

Possible deadlock in TBB concurrent_hash_map

romain_c
Beginner
Hello,

I came across the following bug in TBB (well, at least I think it is a bug). It is a very rare case but it did actually happen to me :(.

Here is the simplest version of it I could come up with:

Suppose two threads T1 and T2:

T1:
- acquires an accessor on bucket b1 in chain c1
- tries to insert a new element (not b1) into chain c1 (same hashing as for b1):
+ It first gets a read lock on the chain c1 and then tries to upgrade to a writer lock to insert the element

T2:
- tries to get an accessor to bucket b1
+ It gets a read lock on the chain c1
+ It tries to get a write lock on b1

This will deadlock because:
- T2 can't make progress because T1 is already holding b1 (that is expected behavior)
- T1 can't make any progress because T2 is still holding the chain lock on c1 in read mode, which prevents T1 from upgrading it to a writer lock. This is the part that I find questionable.
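
Here is a rough sketch of the pattern, in case it helps to reproduce it. SameChainHash is a made-up hash (in the TBB 2.0 HashCompare style) that forces every key into the same chain, and std::thread merely stands in for whatever threads the real program uses; whether it actually hangs depends on the interleaving:

#include <tbb/concurrent_hash_map.h>
#include <thread>

// Made-up hash: every key lands in the same chain, guaranteeing the
// "same hashing as for b1" condition above.
struct SameChainHash {
    static size_t hash( const int& ) { return 0; }
    static bool equal( const int& a, const int& b ) { return a == b; }
};

typedef tbb::concurrent_hash_map<int, int, SameChainHash> Map;
Map table;

void thread1() {
    Map::accessor a1;
    table.insert( a1, 1 );   // T1 holds a write lock on b1 (key 1)...
    Map::accessor a2;
    table.insert( a2, 2 );   // ...then read-locks chain c1 and tries to upgrade
}

void thread2() {
    Map::accessor a;
    table.find( a, 1 );      // T2 read-locks chain c1, then blocks on b1's lock
}

int main() {
    std::thread t1( thread1 ), t2( thread2 );
    t1.join();
    t2.join();               // hangs only under the unlucky interleaving
}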

What I did to fix it was simply insert a release statement on the chain lock before trying to acquire the bucket lock. This *may* cause problems (in particular if the chain changes or the bucket is erased) but I am not sure yet. Anyways, right now it works for me :)

Romain
RafSchietekat
Valued Contributor III
It always helps to specify a version (there was quite a change in tbb20_20080402oss_src); because of the names that you use, this would be before that version (which also corrected some of the terminology), so I will assume tbb20_20080319oss_src.

I find no flaw in your analysis other than what you find questionable: instead, T1 must not try to do anything to the map while holding the accessor it already has. (It would help if this were documented explicitly: more can block in this implementation than just operations on a key already being accessed; e.g., ultimately all inserts on the same segment will block once they determine they need to grow the array.)

A fix that "may" cause problems is no fix at all, and you are quite right that it will fail if another thread intervenes and erases the item (other changes to the chain don't matter); if your program can exclude this failure mode you may get away with it on your own responsibility, but it is not generally applicable. The change in tbb20_20080402oss_src (kept in the current stable version tbb20_20080408oss_src) will not help you further along.

The change that I proposed in the thread "concurrent_hash_map::erase" would not deadlock but, while working, is currently still significantly slower (though I'm still hopeful about that).

romain_c
Beginner
Hello,

Yes, sorry, I did not specify the version. I am actually using the latest (20080408 if I am not mistaken). I will look at the change you propose in the thread you cited to see if I can use it.

The reason I was trying to access multiple elements within the map from the same thread is that I am using accessors as implicit locks around the mapped values in the map. In the example I gave, T1 needs to maintain a hold on b1 (to prevent concurrent access) while doing the second operation (the insertion). I am a total newbie at TBB (just started playing around with it) and I probably didn't use it in the intended manner :). I guess I can correct this by putting explicit locks inside each of the mapped values and grabbing those, but for now it works :) (I do guarantee that no other thread can destroy the element while it is being held, at least I try to).

Thanks for the clarification,

Romain

PS: Out of curiosity, what terminology changed?
Alexey-Kukanov
Employee

In general I concur with Raf that you should not insert another item into the hash map while already holding an accessor to one of its items.

The golden rule of multithreaded programming is to hold a lock for the minimal time needed to preserve the invariants and consistency of the observable program state. In particular, good advice is to never call a third-party function from a locked code section unless you are sure (e.g., from source code inspection or documentation) that this function does not take any locks. Departing from this advice is fraught with deadlocks and/or performance problems. concurrent_hash_map accessors are locks and should be regarded as such.
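
For illustration only, a sketch of the shape such code should have; Map stands for any concurrent_hash_map instantiation (e.g., the one sketched earlier in the thread), and notify_observers() is a placeholder for an arbitrary third-party call:

void update( Map& table, int key, int delta ) {
    int snapshot;
    {
        Map::accessor a;               // the accessor is a lock on the item
        table.insert( a, key );
        a->second += delta;            // only the work that needs the lock
        snapshot = a->second;
    }                                  // lock released at end of scope
    notify_observers( key, snapshot ); // foreign code runs with no locks held
}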

We will try to improve the locking behavior to prevent situations like the one you described; in fact there are also performance considerations involved, so it is worth a try. I cannot promise, however, that it will be fixed; if not, we will need to document that inserting an element into the map while holding an accessor to another of its elements may cause undefined behavior.

RafSchietekat
Valued Contributor III
I have not published those changes (nor will I until the performance issue has been exhausted), only written about them.

I do not have enough insight into what you want to achieve, but taking a lock inside the mapped values does not seem likely to be a solution unless you override the const protection on a const_accessor and take the performance hit of a double lock (maybe I'll use that as an alternative implementation).

The word "bucket" does not occur anywhere in tbb20_20080408oss_src/include/tbb/concurrent_hash_map.h; it is now a "node", and only the single-letter variable name b remains.
romain_c
Beginner
Hello,

Thank you all for your replies. They were most helpful.

Alexey:
Yes, I agree that it is not a good idea to hold multiple locks because of potential deadlock scenarios. As you mention, TBB accessors are locks and should be treated as such. In my case, I was actually using them as such. Here is what I am actually doing:

I have a concurrent_hash_map that stores references to hierarchical objects (in my case, they are thread objects as I am trying to do something akin to TLS for my research). The objects have parents and children and as such have a partial order on themselves. To remove the possibility of deadlocks, I enforce the policy: "if a thread must hold multiple accessors, it must acquire them in hierarchical order". Furthermore, the particular example I was giving was in the case of creating a new child object. T1 was the thread trying to create a new child object Ch1 and as such held a lock on the parent object Pa1. T2 was also an existing child Ch2 of Pa1 and was just trying to get access to it.
I don't know if what I described is a very common paradigm, and there may be ways of doing it better. It would be easy to change the behavior by releasing the accessor on Pa1 before inserting Ch1 and then reacquiring the lock on Pa1 after a successful insertion (and the release of Ch1), as sketched below, but that would have a performance hit as well (two lock acquisitions).
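
Roughly (a sketch only, reusing the Map typedef from my earlier sketch, with integer keys standing in for the real object references):

void create_child( Map& table, int pa1, int ch1 ) {
    Map::accessor parent;
    table.find( parent, pa1 );   // hold Pa1 while deciding to create Ch1
    parent.release();            // give up Pa1 before touching the map again
    {
        Map::accessor child;
        table.insert( child, ch1 );
        // ... initialize Ch1 ...
    }                            // Ch1 released at end of scope
    table.find( parent, pa1 );   // reacquire Pa1: the second acquisition
}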

Raf:

Definitely, putting an extra lock in the structure would have a big performance overhead.
Ah yes, "node" did replace the "bucket" term :). I had started looking at the code before that change and I guess my mind was still on the old association :)

Thanks again for all your help,

Romain

Anton_M_Intel
Employee
Romain,

I've fixed the issue, and the fix should be available in upcoming packages. But it'd be better to avoid such scenarios anyway, to reduce contention.

Intel_C_8
Beginner

I think I've found another deadlock in concurrent_hash_map::insert().

This manifested in two ways:

1. If insert fails (because the item was already inserted), the accessor would still hold a lock on the item until it went out of scope and was destructed.

2. A concurrent insert() of the same key would deadlock in upgrade_to_writer().

I tracked both down to this piece of code in insert(), which always locks the node regardless of the result:

done:
    result->my_lock.acquire( b->mutex, write );
    result->my_node = b;
    return return_value;

Changing this to:

done:
    if( return_value ) {
        result->my_lock.acquire( b->mutex, write );
        result->my_node = b;
    }
    return return_value;

appears to work. Is this correct?

regards

Simon Cope

RafSchietekat
Valued Contributor III
It always helps to specify a version, but from the pieces of code you mention I will assume this is about tbb20_20080408oss_src, the most recent release (promoted from development to stable).

1. An "unsuccessful" insert() and a successful find() are indistinguishable (except for their opposite return value).
2. A concurrent insert() of the same key would not reach upgrade_to_writer() because it would also "fail", and this would be so whether executed from the same thread (not a good idea) or from a different one.

What deadlock condition did I fail to see? (Notice the formal difference with the first posting in this thread, which shows exactly where two threads cannot make progress.)

The piece of code (from lookup(), not insert()) is executed for either insert() (whether it inserted a new node or not) or a successful find(). Making the suggested code change would leave the accessor empty on an insert() of an item that already exists, so many/most uses of insert() would need to be replaced with loops that alternately insert() and find() an item until one of the two succeeds, which seems like a waste of performance even when the first find() succeeds. If you want insert() not to leave anything else blocked, merely release the accessor (implicitly or explicitly) right after insert() returns: an accessor should be released right after its business is complete, and if there is no business, that would be immediately.
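
That is, something like this (a sketch only, reusing the Map typedef from earlier in the thread):

bool insert_only( Map& table, int key ) {
    Map::accessor a;
    bool inserted = table.insert( a, key ); // locks the item whether new or found
    a.release();                            // no business with the item, so release at once
    return inserted;
}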

The only "problem" I can see here is that a concurrent set is not as cheap as it could be, implemented on top of concurrent_hash_map (there is always a mapped value that is then ignored, and an insertion will involve a redundant lock cycle).

Intel_C_8
Beginner

Raf_Schietekat:
Making the suggested code change would leave the accessor empty on an insert() of an item that already exists, so many/most uses of insert() would need to be replaced with loops that alternately insert() and find() an item until one of the two succeeds, which seems like a waste of performance even when the first find() succeeds. If you want insert() not to leave anything else blocked, merely release the accessor (implicitly or explicitly) right after insert() returns: an accessor should be released right after its business is complete, and if there is no business, that would be immediately.

hi Raf,

I think this may be down to interpretation of the docs.

For insert() it says:

"Search table for pair with given key. If not present, insert new pair into table. The

new pair is initialized with pair(key,T()). Set result to provide write access to the

matching pair."

I've read that to mean that the accessor is only set when the insert is successful, but from the behaviour I'm seeing I guess that is incorrect. It seems a bit odd that a function does something critical even though it "fails".

As to having to do a find/insert loop, I think that happens more often than you are assuming. For instance, all through my code I need a const_accessor to an item, and if it doesn't exist in the map I insert it and set up the values (so a write accessor for the insert, but still returning a const_accessor). So you have to do:

while( !find( const_accessor, key ) ) {
    if( insert( accessor, key ) ) {
        // set up the item while holding the write lock
        accessor->second.blah = blah;
        accessor.release();
    }
}
return true;

Otherwise you are holding write accessors to everything, which impedes concurrency. The insert(const_accessor, key) method seems pointless since you have no way to safely initialise the node. If there were a mechanism to upgrade/downgrade between accessor and const_accessor, this could be avoided (unless I'm missing something).

Also my #2 case vanished after some code changes, so I think it may have been bogus.

thanks

Simon Cope

RafSchietekat
Valued Contributor III
With std::map you also get an iterator if nothing is inserted.

(Added) What is the reason for using a const_accessor instead of a separate copy: merely avoiding the overhead of copying (I would instead favour something like a new copy-on-write snapshot type myself, or a homegrown equivalent), keeping the item locked inside the map (but unless there's a user uprising, concurrent_hash_map might not bring back that guarantee, which it removed in tbb20_20080402oss_src), or both? Note that either const_accessor or accessor should be as short-lived as possible, unless you make certain that you only open const_accessors on the item, take the responsibility of assuming that this will continue to be costless, and don't tell anyone I told you that (I think).

As for converting between accessor and const_accessor, how about just making a good case for any proposal like that: TBB is still a work in progress (other changes have already been officially announced), and IMHO you can either wait for whatever will happen, or give input.

Alexey-Kukanov
Employee

simon.cope@lggi.com:
As to having to do a find/insert loop, I think that happens more often than you are assuming. For instance, all through my code I need a const_accessor to an item, and if it doesn't exist in the map I insert it and set up the values (so a write accessor for the insert, but still returning a const_accessor). ... Otherwise you are holding write accessors to everything, which impedes concurrency. The insert(const_accessor, key) method seems pointless since you have no way to safely initialise the node. If there were a mechanism to upgrade/downgrade between accessor and const_accessor, this could be avoided (unless I'm missing something).

I think concurrent_hash_map will have a solution for your problem soon. An additional insert method is currently being implemented which allows insertion of a pair and does not return an accessor. The code on your side would probably look like the following sketch:

while( !find( const_accessor, key ) )
    insert( std::make_pair( key, value ) );

and you might even be able to replace the while with an if, in case you know the item can't be immediately deleted by another thread.

Anton_M_Intel
Employee
A couple of new overloaded insert() methods are going to be introduced soon. So, you can write just:

hash_map.insert(const_accessor, std::make_pair(key, value) );

to insert the pair if it does not already exist, and acquire a read lock on the new or existing item.
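
A sketch of how that could look on the calling side (do_something_with() is a placeholder for whatever read-only use you have, and Map stands for your concurrent_hash_map instantiation):

void read_or_create( Map& table, int key, int initial ) {
    Map::const_accessor ca;
    table.insert( ca, std::make_pair( key, initial ) ); // inserts if absent; read-locks either way
    do_something_with( ca->second );                    // other readers are not blocked
}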
