<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Problems with parallel_sort in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Problems-with-parallel-sort/m-p/792441#M398</link>
    <description>What is the reason for using tbb::concurrent_vector?&lt;BR /&gt;</description>
    <pubDate>Sun, 07 Nov 2010 11:13:47 GMT</pubDate>
    <dc:creator>Dmitry_Vyukov</dc:creator>
    <dc:date>2010-11-07T11:13:47Z</dc:date>
    <item>
      <title>Problems with parallel_sort</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Problems-with-parallel-sort/m-p/792439#M396</link>
      <description>Good day, colleagues!&lt;BR /&gt;&lt;BR /&gt;I'm a newbie in concurrent programming and I've encountered a problem with parallel_sort. I'm currently writing a small program that has to sort big binary files with a limited amount of memory.&lt;BR /&gt;&lt;BR /&gt;In the first step, I read the file to be sorted, split it into chunks (for example, 10 MB each), and sort each chunk. The problem is that when I apply parallel_sort to a chunk, it runs more than 3 times slower than std::sort. Could you advise me on what I'm doing wrong? Thank you in advance.&lt;BR /&gt;&lt;BR /&gt;Code is attached.&lt;BR /&gt;My machine is a Core i7 860; the compiler is Visual Studio 2010.&lt;BR /&gt;</description>
      <pubDate>Sat, 06 Nov 2010 20:03:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Problems-with-parallel-sort/m-p/792439#M396</guid>
      <dc:creator>usamytch</dc:creator>
      <dc:date>2010-11-06T20:03:13Z</dc:date>
    </item>
    <item>
      <title>Problems with parallel_sort</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Problems-with-parallel-sort/m-p/792440#M397</link>
      <description>Colleagues,&lt;BR /&gt;&lt;BR /&gt;maybe my code was too tangled. I've written a new, simple program that just creates a vector and a concurrent_vector (both 1M integers) and sorts them with std::sort and tbb::parallel_sort respectively. The running times are 1500 and 8000 clock() ticks respectively, so std::sort is about 5 times faster.&lt;BR /&gt;&lt;BR /&gt;What is the problem in my code?&lt;BR /&gt;&lt;BR /&gt;#include &amp;lt;vector&amp;gt;&lt;BR /&gt;#include &amp;lt;algorithm&amp;gt;&lt;BR /&gt;#include &amp;lt;iostream&amp;gt;&lt;BR /&gt;#include &amp;lt;cstdlib&amp;gt;&lt;BR /&gt;#include "tbb/parallel_sort.h"&lt;BR /&gt;#include "tbb/concurrent_vector.h"&lt;BR /&gt;#include &amp;lt;time.h&amp;gt;&lt;BR /&gt;&lt;BR /&gt;using std::vector;&lt;BR /&gt;using tbb::concurrent_vector;&lt;BR /&gt;using tbb::parallel_sort;&lt;BR /&gt;&lt;BR /&gt;const int SIZE = 1000000;&lt;BR /&gt;&lt;BR /&gt;void Generate_Vector (int size, vector&amp;lt;int&amp;gt; * target) {&lt;BR /&gt; target-&amp;gt;resize(size);&lt;BR /&gt; for (int index = 0; index &amp;lt; size; ++index) {&lt;BR /&gt;  target-&amp;gt;at(index) = rand();&lt;BR /&gt; }&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;int main () {&lt;BR /&gt; srand (300);&lt;BR /&gt; vector&amp;lt;int&amp;gt; serial;&lt;BR /&gt; Generate_Vector(SIZE, &amp;amp;serial);&lt;BR /&gt; concurrent_vector&amp;lt;int&amp;gt; parallel (serial.begin(), serial.end());&lt;BR /&gt;&lt;BR /&gt; clock_t start, finish;&lt;BR /&gt; &lt;BR /&gt; start = clock();&lt;BR /&gt; std::sort(serial.begin(), serial.end());&lt;BR /&gt; finish = clock();&lt;BR /&gt; &lt;BR /&gt; std::cout &amp;lt;&amp;lt; "std::sort time is " &amp;lt;&amp;lt; finish - start &amp;lt;&amp;lt; std::endl;&lt;BR /&gt;&lt;BR /&gt; start = clock();&lt;BR /&gt; tbb::parallel_sort (parallel.begin(), parallel.end());&lt;BR /&gt; finish = clock();&lt;BR /&gt; &lt;BR /&gt; std::cout &amp;lt;&amp;lt; "parallel sort time is " &amp;lt;&amp;lt; finish - start &amp;lt;&amp;lt; std::endl;&lt;BR /&gt; &lt;BR /&gt; return 0;&lt;BR /&gt;}</description>
      <pubDate>Sun, 07 Nov 2010 08:17:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Problems-with-parallel-sort/m-p/792440#M397</guid>
      <dc:creator>usamytch</dc:creator>
      <dc:date>2010-11-07T08:17:47Z</dc:date>
    </item>
    <item>
      <title>Problems with parallel_sort</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Problems-with-parallel-sort/m-p/792441#M398</link>
      <description>What is the reason for using tbb::concurrent_vector?&lt;BR /&gt;</description>
      <pubDate>Sun, 07 Nov 2010 11:13:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Problems-with-parallel-sort/m-p/792441#M398</guid>
      <dc:creator>Dmitry_Vyukov</dc:creator>
      <dc:date>2010-11-07T11:13:47Z</dc:date>
    </item>
    <item>
      <title>Problems with parallel_sort</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Problems-with-parallel-sort/m-p/792442#M399</link>
      <description>First I tried std::vector, but it ran even slower, and the CPU load was only 20-40%, while with concurrent_vector it was 100%.&lt;BR /&gt;&lt;BR /&gt;Important update: all the results above came from the Debug configuration. When I switched to Release and used std::vector, everything became OK: the CPU times were 78 for std::sort and 26 for tbb::parallel_sort.</description>
      <pubDate>Sun, 07 Nov 2010 11:28:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Problems-with-parallel-sort/m-p/792442#M399</guid>
      <dc:creator>usamytch</dc:creator>
      <dc:date>2010-11-07T11:28:29Z</dc:date>
    </item>
    <item>
      <title>Problems with parallel_sort</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Problems-with-parallel-sort/m-p/792443#M400</link>
      <description>Debug versions of the STL have a LOT of additional non-scalable checks. For example, an STL container can keep a mutex-protected sub-container of all the iterators into it; because it is mutex-protected, it is non-scalable.&lt;BR /&gt;If you are using MSVC, try defining:&lt;BR /&gt;#define _SECURE_SCL 0&lt;BR /&gt;#define _HAS_ITERATOR_DEBUGGING 0&lt;BR /&gt;#define _ITERATOR_DEBUG_LEVEL 0&lt;BR /&gt;</description>
      <pubDate>Sun, 07 Nov 2010 12:06:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Problems-with-parallel-sort/m-p/792443#M400</guid>
      <dc:creator>Dmitry_Vyukov</dc:creator>
      <dc:date>2010-11-07T12:06:44Z</dc:date>
    </item>
  </channel>
</rss>

