bug in concurrent_bounded_queue?

pp0 · ‎04-28-2010

Hi there,

The snippet below will crash after a while with an access violation somewhere below tbb_debug.dll!concurrent_queue_base_v3::internal_push() when running 1 tbb thread for consume() and 10 for produce(). This is reproducible on my Vista 64 Box (snippet is compiled to 32 bits/debug). I am using Studio 2008 SP1. Any ideas?

struct QueuedCall
{
void unblock() { myQueue.push(1); }
void block() {
int dummy;
myQueue.pop(dummy);
}

tbb::concurrent_bounded_queue myQueue;
};

tbb::concurrent_queue myPendingCalls;

void consume() {
do {
QueuedCall *call;
if (myPendingCalls.try_pop(call))
call->unblock();
} while (true);
}

void produce() {
for (;;) {
QueuedCall *c = new QueuedCall();
myPendingCalls.push(c);
c->block();
delete c;
}
}

Dmitry_Vyukov · ‎04-28-2010

Quoting pp0

The snippet below will crash after a while with an access violation somewhere below tbb_debug.dll!concurrent_queue_base_v3::internal_push() when running 1 tbb thread for consume() and 10 for produce(). This is reproducible on my Vista 64 Box (snippet is compiled to 32 bits/debug). I am using Studio 2008 SP1. Any ideas?

That's not surprising since you delete an item straight after enqueue, so the queue basically transfers dangling pointers between threads.

pp0 · ‎04-28-2010

How so? I am synchronizing by blocking on the producer and unblocking on the consumer.

Dmitry_Vyukov · ‎04-28-2010

Ah, I see. I missed that.

Then perhaps the problem is that an item (hence the myQueue queue) is deleted between an item become available for consuming and return from push(). You must guarantee the myQueue's life-time till return from push().

Consider schematic push() implementation:

... push(...)
{
...
make an item available for consuming
...
// at this point consumer pops the item and deletes the queue (this)
..
some more operations with this' members, which result in paging fault because 'this' is already deleted
}

Dmitry_Vyukov · ‎04-28-2010

Naive mutex-based queue will not have such a limitation, because actual enqueue and return from push() are atomic. Which may be not true for non-naive and/or non-mutex-based queue.

mwhenson · ‎04-28-2010

Quoting Dmitriy Vyukov

Then perhaps the problem is that an item (hence the myQueue queue) is deleted between an item become available for consuming and return from push(). You must guarantee the myQueue's life-time till return from push().

If that's true, the concurrent_queue is completely pointless for any heap pointer since a consumer could never be guaranteed to be able to call delete w/o some other form of synchronization with the producer.

Mike

pp0 · ‎04-28-2010

That is a great suggestion. It would be a weird implementation though. A cursory review of the code shows the last step in push() to be:

r.items_avail.notify( predicate_leq(k) );

Which seems to indicate that the push() would not be visible until it actually terminates. The code is not particularly easy to read though so I could be wrong.

-pp

Dmitry_Vyukov · ‎04-28-2010

> If that's true, the concurrent_queue is completely pointless for any heap pointer since a consumer could never be guaranteed to be able to call delete w/o some other form of synchronization with the producer.

Why you made such conclusion? You can indeed pass pointers and delete them on consumer side.
However, you must not delete the queue itself while there are threads still executing it's methods. That's the basic safety rule equally applicable to all types out there.

mwhenson · ‎04-28-2010

Quoting Dmitriy Vyukov

Why you made such conclusion? You can indeed pass pointers and delete them on consumer side.
However, you must not delete the queue itself while there are threads still executing it's methods. That's the basic safety rule equally applicable to all types out there.

You're right, I misunderstood what you were saying. I guess I expected (perhaps wrongly so) a bit more protection from thread issues in such a concurrency oriented product. All push / pop signals being set signifies to me that it should be safe to call the destructor.

Mike

Dmitry_Vyukov · ‎04-30-2010

Quoting mwhenson

You're right, I misunderstood what you were saying. I guess I expected (perhaps wrongly so) a bit more protection from thread issues in such a concurrency oriented product. All push / pop signals being set signifies to me that it should be safe to call the destructor.

I still can't make up my mind on this. Whether queues must support such usage patterns or not. But I am inclining to the answer No, they do not have to.

There is a lot of "strange" usage patterns a user can think up. For example consider:

queue* q = new queue;

q->enqueue(1);

// thread 1

if (false == q->dequeue())

delete q;

// thread 2

if (false == q->dequeue())

delete q;

So enqueue() must be strictly lineariazable too to support such pattern.

Then, it seems that for each "strange" use pattern there is a "normal" implementation that achieves the same.

Then, support for such "strange" use patterns can penalize performance for "normal" use patterns.

So I guess it's Ok to just prohibit such "strange" use patterns.

Dmitry_Vyukov · ‎04-30-2010

Humm... I've found an example which is both reasonable/realistic and not easy way to rewrite correctly:

[cpp]struct tcp_connection
{
  queue_t q;
  ...
};

void connection_thread(tcp_connection* c)
{
  for (;;)
  {
    msg_t* m = c->dequeue();
	if (m->type == MSG_DISCONNECT)
	{
	  delete c;
	  return;
	}
	process(m);
  }
}

void io_dispatch_thread()
{
  for (;;)
  {
    select(...);
	for_each(SOCKET s in read_set)
	{
	  recv(s, buf, size);
	  tcp_connection* c = find_connection(s);
	  c->enqueue(msg_t(MSG_PACKET, data, size));
	}
	for_each(SOCKET s in exception_set)
	{
	  tcp_connection* c = find_connection(s);
	  c->enqueue(msg_t(MSG_DISCONNECT));
	}
  }
}


[/cpp]

Dmitry_Vyukov · ‎04-30-2010

I am curious as to what TBB team thinks on this.

Alexey-Kukanov · ‎04-30-2010

Are not smart pointers the usual solution for this type of problems?

Dmitry_Vyukov · ‎04-30-2010

Quoting Alexey Kukanov (Intel)

Are not smart pointers the usual solution for this type of problems?

What type of problem?

The problem is that while one uses mutex-based queue he has no problems, and once he switches to TBB he starts getting chaotic memory corruptions and crashes.

mwhenson · ‎04-30-2010

Quoting Dmitriy Vyukov

Humm... I've found an example which is both reasonable/realistic and not easy way to rewrite correctly:

Correct me if I'm wrong, but the difference between this and the original example is that this would also be wrong w a mutex based queue.

Mike

Dmitry_Vyukov · ‎04-30-2010

Quoting mwhenson

Quoting Dmitriy Vyukov

Humm... I've found an example which is both reasonable/realistic and not easy way to rewrite correctly:
Correct me if I'm wrong, but the difference between this and the original example is that this would also be wrong w a mutex based queue.

Please, elaborate. I do not see why my example won't work with a mutex-based queue.

mwhenson · ‎04-30-2010

Quoting Dmitriy Vyukov

Please, elaborate. I do not see why my example won't work with a mutex-based queue.

Maybe I didn't see the correct problem then; can you explain how your example could fail?

Mike

Dmitry_Vyukov · ‎04-30-2010

dispatch thread posts disconnect message:
c->enqueue(msg_t(MSG_DISCONNECT));
but not yet return from enqueue()

Connection thread receives the message:
msg_t*m=c->dequeue();
if(m->type==MSG_DISCONNECT)

and deletes the queue:
deletec;

Now dispatch thread crashes in enqueue().

RafSchietekat · ‎04-30-2010

And now how it would still fail even with a smart pointer (with both producer and consumer holding a reference to the queue)? :-)

Dmitry_Vyukov · ‎04-30-2010

I guess it won't. But it won't make the queue linearizable either.

mwhenson · ‎04-30-2010

Quoting Raf Schietekat

And now how it would still fail even with a smart pointer (with both producer and consumer holding a reference to the queue)? :-)

It shouldn't, but that ignores the point that many people, especially to those with experience with a mutex based queue, will probably find this behavior initially surprising and not initially intuitive.

Mike