<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Bug in concurrent_queue in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Bug-in-concurrent-queue/m-p/889061#M3683</link>
    <description>Oh, sorry, this one must go to TBB forum.&lt;BR /&gt;&lt;BR /&gt;</description>
    <pubDate>Tue, 09 Sep 2008 17:44:53 GMT</pubDate>
    <dc:creator>Dmitry_Vyukov</dc:creator>
    <dc:date>2008-09-09T17:44:53Z</dc:date>
    <item>
      <title>Bug in concurrent_queue</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Bug-in-concurrent-queue/m-p/889060#M3682</link>
      <description>TBB 2.1 (20080825)&lt;BR /&gt;&lt;BR /&gt;Consider following code from concurrent_queue.cpp&lt;BR /&gt;&lt;BR /&gt;void concurrent_queue_base_v3::internal_push( const void* src ) {&lt;BR /&gt; [...]&lt;BR /&gt; // it's sequentially consistent atomic RMW,&lt;BR /&gt; // so we can consider that it's followed by #StoreLoad style memory fence&lt;BR /&gt; ticket k = r.tail_counter++;&lt;BR /&gt; [...]&lt;BR /&gt; // it's relaxed load&lt;BR /&gt; if( r.n_waiting_consumers&amp;gt;0 ) {&lt;BR /&gt; EnterCriticalSection( &amp;amp;r.mtx_items_avail );&lt;BR /&gt; // signal consumers&lt;BR /&gt; [...]&lt;BR /&gt; LeaveCriticalSection( &amp;amp;r.mtx_items_avail );&lt;BR /&gt; }&lt;BR /&gt; [...]&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;void concurrent_queue_base_v3::internal_pop( void* dst ) {&lt;BR /&gt; [...]&lt;BR /&gt; // it is really empty.. go to sleep FOREVER!&lt;BR /&gt; EnterCriticalSection( &amp;amp;r.mtx_items_avail );&lt;BR /&gt; // it's relaxed load, followed by relaxed store (or relaxed pseudo RMW)&lt;BR /&gt; r.n_waiting_consumers++;&lt;BR /&gt; // it's load-acquire&lt;BR /&gt; while( r.tail_counter&amp;lt;=k ) {&lt;BR /&gt; // block thread&lt;BR /&gt; [...]&lt;BR /&gt; }&lt;BR /&gt; LeaveCriticalSection( &amp;amp;r.mtx_items_avail );&lt;BR /&gt; [...]&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Here is a bunch of problems.&lt;BR /&gt;I. Most serious problem. This logic can lead to DEADLOCK.&lt;BR /&gt;Consider following execution sequence:&lt;BR /&gt;1. Producer increments tail_counter&lt;BR /&gt;2. Producer loads n_waiting_consumers, n_waiting_consumers==0, so it doesn't signal consumers&lt;BR /&gt;3. Consumer increments n_waiting_consumers&lt;BR /&gt;4. Consumer loads tail_counter, load returns stale value, consumer goes to sleep forever&lt;BR /&gt;To fix this you must issue #StoreLoad style memory fence after incrementing n_waiting_consumers.&lt;BR /&gt;&lt;BR /&gt;II. Because n_waiting_consumers participates in lock-free algorithm, i.e. not always protected by mutex, it's better to be declared volatile, in order to calm down compiler.&lt;BR /&gt;&lt;BR /&gt;III. n_waiting_consumers and n_consumers_to_wakeup declared as uint16_t, i.e. maximum number of waiting threads is 65k. Well, it's enough for any sane situation for a couple of years. But this can be not enough even today for all those insane situations in which insane user will decide to use TBB. On my 32-bit system with 1GB of physical memory I'm able to create 30719 threads.&lt;BR /&gt;I don't see reasons why n_waiting_consumers and n_consumers_to_wakeup declared as uint16_t, and not 
as uint32_t, they doesn't used with CAS, they doesn't affect critical data layout.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;And exactly the same situation with reverse signaling, i.e. when consumer signals producer about decreasing of queue's size.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Btw, Relacy Race Detector can easily reveal such errors (item I). Moreover unit-test can be as simple as: 2 threads, thread 1 enqueues single item, thread 2 dequeues single item. On this simple unit-test Relacy will be able to detect deadlock in a second. (SPIN/Promela won't detect, because it doesn't support relaxed memory models)&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 09 Sep 2008 17:39:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Bug-in-concurrent-queue/m-p/889060#M3682</guid>
      <dc:creator>Dmitry_Vyukov</dc:creator>
      <dc:date>2008-09-09T17:39:30Z</dc:date>
    </item>
    <item>
      <title>Re: Bug in concurrent_queue</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Bug-in-concurrent-queue/m-p/889061#M3683</link>
      <description>Oh, sorry, this one must go to TBB forum.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 09 Sep 2008 17:44:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Bug-in-concurrent-queue/m-p/889061#M3683</guid>
      <dc:creator>Dmitry_Vyukov</dc:creator>
      <dc:date>2008-09-09T17:44:53Z</dc:date>
    </item>
  </channel>
</rss>

