<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: QuickSort OpenMP Parallel in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857535#M2166</link>
    <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
clock(), on many platforms, adds up time spent in all threads, so the usual objective for threaded parallel is to reduce elapsed time at the cost of some increase in the time measured by clock(). Your OpenMP should make a suitable choice of timing function in omp_get_wtime() (typically, gettimeofday()), or, for finer resolution, you can use rdtsc counter with precautions against comparing different timers on a multiple socket platform.&lt;BR /&gt;</description>
    <pubDate>Thu, 16 Apr 2009 18:57:57 GMT</pubDate>
    <dc:creator>TimP</dc:creator>
    <dc:date>2009-04-16T18:57:57Z</dc:date>
    <item>
      <title>QuickSort OpenMP Parallel</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857534#M2165</link>
      <description>why QuickSort OpenMP Parallel&lt;BR /&gt;int _tmain(int argc, _TCHAR* argv[])&lt;BR /&gt;{&lt;BR /&gt; &lt;BR /&gt;&lt;BR /&gt; begin=clock();///CLK_TCK;&lt;BR /&gt;&lt;BR /&gt; quicksort(R,0,N-1);&lt;BR /&gt; &lt;BR /&gt; printf("");&lt;BR /&gt; for(i=0;i&lt;N&gt;&lt;/N&gt; printf("%d ",R&lt;I&gt;);&lt;BR /&gt; printf("\n");&lt;BR /&gt; end=clock();//CLK_TCK;&lt;BR /&gt; printf("time=%lf\n",end-begin);&lt;BR /&gt; &lt;BR /&gt; &lt;BR /&gt; return 0;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;int partition(int R[],int l,int h)&lt;BR /&gt;{&lt;BR /&gt; int i,j,temp;&lt;BR /&gt; i=l;j=h;temp=R&lt;I&gt;;&lt;BR /&gt; do&lt;BR /&gt; {&lt;BR /&gt; for(;(R&lt;J&gt;&amp;gt;=temp)&amp;amp;&amp;amp;(i&lt;J&gt;&lt;/J&gt; if(i&lt;J&gt;;&lt;BR /&gt; for(;(R&lt;I&gt;&amp;lt;=temp)&amp;amp;&amp;amp;(i&lt;J&gt;&lt;/J&gt; if(i&lt;J&gt;;&lt;BR /&gt; }while(i!=j);&lt;BR /&gt; R&lt;I&gt;=temp;&lt;BR /&gt; return i;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;void quicksort(int R[],int s1,int t1)&lt;BR /&gt;{&lt;BR /&gt; int i;&lt;BR /&gt; if(s1&lt;T1&gt;&lt;/T1&gt; {&lt;BR /&gt; i=partition(R,s1,t1);&lt;BR /&gt; //#pragma omp parallel sections&lt;BR /&gt; {&lt;BR /&gt; #pragma omp section&lt;BR /&gt; quicksort(R,s1,i-1);&lt;BR /&gt; #pragma omp section&lt;BR /&gt; quicksort(R,i+1,t1);&lt;BR /&gt; }&lt;BR /&gt; }&lt;BR /&gt;}&lt;BR /&gt; why parallel time is slower than serial time ?  please tell me whether there are better time funtions than clock() or not ? Thank you&lt;/I&gt;&lt;/J&gt;&lt;/I&gt;&lt;/J&gt;&lt;/J&gt;&lt;/I&gt;&lt;/I&gt;</description>
      <pubDate>Thu, 16 Apr 2009 17:35:02 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857534#M2165</guid>
      <dc:creator>zhangzhe65</dc:creator>
      <dc:date>2009-04-16T17:35:02Z</dc:date>
    </item>
    <item>
      <title>Re: QuickSort OpenMP Parallel</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857535#M2166</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
clock(), on many platforms, adds up time spent in all threads, so the usual objective for threaded parallel is to reduce elapsed time at the cost of some increase in the time measured by clock(). Your OpenMP should make a suitable choice of timing function in omp_get_wtime() (typically, gettimeofday()), or, for finer resolution, you can use rdtsc counter with precautions against comparing different timers on a multiple socket platform.&lt;BR /&gt;</description>
      <pubDate>Thu, 16 Apr 2009 18:57:57 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857535#M2166</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-04-16T18:57:57Z</dc:date>
    </item>
    <item>
      <title>Re: QuickSort OpenMP Parallel</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857536#M2167</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/367365"&gt;tim18&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;clock(), on many platforms, adds up time spent in all threads, so the usual objective for threaded parallel is to reduce elapsed time at the cost of some increase in the time measured by clock(). Your OpenMP should make a suitable choice of timing function in omp_get_wtime() (typically, gettimeofday()), or, for finer resolution, you can use rdtsc counter with precautions against comparing different timers on a multiple socket platform.&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;On Windows I tend to use QueryPerformanceCounter() with QueryPerformanceFrequency() for fine resolution timings. RDTSC is nice, but you need to use asm to access it, and as Tim mentioned, you need to make sure all calls to it come from the same socket.&lt;BR /&gt;&lt;BR /&gt;Tim, I have been assuming that multiple cores on a single die share the same RDTSC timer. Do you know if this is correct?&lt;BR /&gt;&lt;BR /&gt;Peter&lt;BR /&gt;</description>
      <pubDate>Fri, 17 Apr 2009 10:51:02 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857536#M2167</guid>
      <dc:creator>pvonkaenel</dc:creator>
      <dc:date>2009-04-17T10:51:02Z</dc:date>
    </item>
    <item>
      <title>Re: QuickSort OpenMP Parallel</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857537#M2168</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
Microsoft and Intel C/C++ compilers for x86/x64 provide an intrinsic __rdtsc() so it is not necessariy to know the asm sequences. I don't know if the SSE4.2 option might in the future switch it over to the new equivalent instruction, or whether the old rdtsc will continue indefinitely to give equivalent results. There is a single timer shared by multiple cores on a CPU.&lt;BR /&gt;On Intel 64-bit CPUs (from Prescott on) rdtsc actually counts FSB ticks, and on "Intel microarchitecture code-named Nehalem," QPI ticks, with the reported result scaled in accordance with the nominal core speed. For example, the rdtsc rate doesn't speed up under Turbo mode.&lt;BR /&gt;Power-on BIOS logic synchronizes the timers on a multiple socket platform, so they shouldn't differ by more than a few FSB ticks, as long as they share the same FSB clock. So, on ordinary dual CPU platforms, the discrepancy introduced by accidental comparison of the 2 different timers would be within the time delay of switching threads.&lt;BR /&gt;</description>
      <pubDate>Fri, 17 Apr 2009 13:12:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857537#M2168</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-04-17T13:12:30Z</dc:date>
    </item>
    <item>
      <title>Re: QuickSort OpenMP Parallel</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857538#M2169</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;Concerning the question, why the serial version is faster than the parallel, you might need to have a closer look at the overhead. The first time the parallel section is executed, the OpenMP run-time environment starts the thread pool, which needs a considerable amount of time. For time measurements, it is therefore advisable to havea warm-up call to the parallel routine &lt;EM&gt;before &lt;/EM&gt;the time measurement.&lt;/P&gt;
&lt;P&gt;Additionally, you should take into account that parallelization is only beneficial for large data sets. If you sort only 100 integers, you are probably better off using a serial algorithm. If you haven't done so already, you should choose a big data set to time your sort routine. Additionally, you shouldswitch tothe serial variant in your recursion, if the data sets become too small.&lt;BR /&gt;&lt;BR /&gt;You should also be aware that the maximal possible speed-up of a parallel quicksort algorithm is limited by the first partitioning, which is serial. Amdahl's law might bite you, especially if you are running on a machine with a lot of cores.&lt;BR /&gt;&lt;BR /&gt;If you are completely stuck, the Intel Thread Profiler can help you in visualizing the thread behavior.&lt;BR /&gt;&lt;BR /&gt;Kind regards&lt;BR /&gt;Thomas&lt;/P&gt;</description>
      <pubDate>Fri, 17 Apr 2009 19:45:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857538#M2169</guid>
      <dc:creator>Thomas_W_Intel</dc:creator>
      <dc:date>2009-04-17T19:45:25Z</dc:date>
    </item>
    <item>
      <title>Re: QuickSort OpenMP Parallel</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857539#M2170</link>
      <description>&lt;DIV style="margin:0px;"&gt;working on a quick sort parallel prgm with increased efficiency...once completed, i'll put it on ISN,...&lt;/DIV&gt;
&lt;BR /&gt;</description>
      <pubDate>Wed, 22 Apr 2009 14:56:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857539#M2170</guid>
      <dc:creator>pramodblackbird</dc:creator>
      <dc:date>2009-04-22T14:56:08Z</dc:date>
    </item>
    <item>
      <title>Re: QuickSort OpenMP Parallel</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857540#M2171</link>
      <description>&lt;DIV&gt;&lt;/DIV&gt;
When I do timing tests I always try to keep the I/O (printf's in this case) out of the area that is timed. They can take a (relatively) large amount of time, and the time they take can be somewhat arbitrary (based on how full buffers are, the status of the console, etc). It looks like you're actually trying to time the quicksort() call, so just put your timer around that.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 22 Apr 2009 15:03:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857540#M2171</guid>
      <dc:creator>teeks99</dc:creator>
      <dc:date>2009-04-22T15:03:14Z</dc:date>
    </item>
    <item>
      <title>Re: QuickSort OpenMP Parallel</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857541#M2172</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/243211"&gt;pramodblackbird&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;working on a quick sort parallel prgm with increased efficiency...once completed, i'll put it on ISN,...&lt;BR /&gt;&lt;/DIV&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;Are you aware that Intel already provides a parallel quick sort as part of the Open Source/commercial Intel Threading Building Blocks? The parallel_sort function recursively partitions the sort range and then spawns work on those partitions as separate tasks, scheduled in the Intel TBB thread pool.</description>
      <pubDate>Wed, 22 Apr 2009 15:38:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857541#M2172</guid>
      <dc:creator>robert-reed</dc:creator>
      <dc:date>2009-04-22T15:38:35Z</dc:date>
    </item>
    <item>
      <title>Re: QuickSort OpenMP Parallel</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857542#M2173</link>
      <description>&lt;DIV style="margin:0px;"&gt;thnx a lot...&lt;/DIV&gt;
&lt;BR /&gt;</description>
      <pubDate>Wed, 22 Apr 2009 17:30:31 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/QuickSort-OpenMP-Parallel/m-p/857542#M2173</guid>
      <dc:creator>pramodblackbird</dc:creator>
      <dc:date>2009-04-22T17:30:31Z</dc:date>
    </item>
  </channel>
</rss>

