<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic In case you're going about in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/omp-get-thread-num-returns-random-values-in-the-parallel-region/m-p/952541#M5210</link>
    <description>&lt;P&gt;In case you're going about this by trial and error (understandable, given how scattered the OpenMP documentation is), you should compare what you got with saner variants such as&lt;/P&gt;
&lt;P&gt;#pragma omp parallel num_threads(nThread)&lt;BR /&gt;{&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;#pragma omp single&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;{&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;int myID=omp_get_thread_num ();&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;printf("Thread's ID %d \n", myID);&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;}&lt;/P&gt;
&lt;P&gt;}&lt;/P&gt;
&lt;P&gt;I would agree that the IBM doc (but not the Microsoft one) would appear to justify your way of doing it.&lt;/P&gt;</description>
    <pubDate>Thu, 10 Oct 2013 12:35:19 GMT</pubDate>
    <dc:creator>TimP</dc:creator>
    <dc:date>2013-10-10T12:35:19Z</dc:date>
    <item>
      <title>omp_get_thread_num() returns random values in the parallel region</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/omp-get-thread-num-returns-random-values-in-the-parallel-region/m-p/952540#M5209</link>
      <description>&lt;P&gt;Hello OpenMP professionals,&lt;/P&gt;
&lt;P&gt;I'm working in a parallel region with OpenMP, and the thread IDs come out in random order. For example, with 4 threads I get Thread's ID = 1, Thread's ID = 3, Thread's ID = 2, Thread's ID = 0, and on another run I get a different order. How can I get the IDs in the order 0, 1, 2, 3? Any satisfactory answer would be welcome.&lt;/P&gt;
&lt;P&gt;Here is my code:&lt;/P&gt;
&lt;P&gt;int nThread = omp_get_max_threads ();&lt;/P&gt;
&lt;P&gt;#pragma omp parallel num_threads(nThread)&lt;BR /&gt;{&lt;BR /&gt;int myID=omp_get_thread_num ();&lt;/P&gt;
&lt;P&gt;printf("Thread's ID %d \n", myID);&lt;/P&gt;
&lt;P&gt;}&lt;/P&gt;
&lt;P&gt;Thanks for your reply&lt;/P&gt;</description>
      <pubDate>Thu, 10 Oct 2013 04:59:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/omp-get-thread-num-returns-random-values-in-the-parallel-region/m-p/952540#M5209</guid>
      <dc:creator>MooN_K_</dc:creator>
      <dc:date>2013-10-10T04:59:17Z</dc:date>
    </item>
    <item>
      <title>In case you're going about</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/omp-get-thread-num-returns-random-values-in-the-parallel-region/m-p/952541#M5210</link>
      <description>&lt;P&gt;In case you're going about this by trial and error (understandable, given how scattered the OpenMP documentation is), you should compare what you got with saner variants such as&lt;/P&gt;
&lt;P&gt;#pragma omp parallel num_threads(nThread)&lt;BR /&gt;{&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;#pragma omp single&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;{&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;int myID=omp_get_thread_num ();&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;printf("Thread's ID %d \n", myID);&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;}&lt;/P&gt;
&lt;P&gt;}&lt;/P&gt;
&lt;P&gt;I would agree that the IBM doc (but not the Microsoft one) would appear to justify your way of doing it.&lt;/P&gt;</description>
      <pubDate>Thu, 10 Oct 2013 12:35:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/omp-get-thread-num-returns-random-values-in-the-parallel-region/m-p/952541#M5210</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2013-10-10T12:35:19Z</dc:date>
    </item>
    <item>
      <title>Moreover, you can get mixed</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/omp-get-thread-num-returns-random-values-in-the-parallel-region/m-p/952542#M5211</link>
      <description>&lt;P&gt;Moreover, without this pragma (#pragma omp single) you can get interleaved output like the following, since the threads print concurrently:&lt;/P&gt;
&lt;P&gt;Thread's IDThread's ID 1&amp;nbsp;0&lt;BR /&gt;ThreThread's ID 2&lt;BR /&gt;ad's ID 3&amp;nbsp;&lt;/P&gt;
&lt;P&gt;--Vladimir&lt;/P&gt;</description>
      <pubDate>Thu, 10 Oct 2013 12:53:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/omp-get-thread-num-returns-random-values-in-the-parallel-region/m-p/952542#M5211</guid>
      <dc:creator>Vladimir_P_1234567890</dc:creator>
      <dc:date>2013-10-10T12:53:00Z</dc:date>
    </item>
    <item>
      <title>When you start a parallel</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/omp-get-thread-num-returns-random-values-in-the-parallel-region/m-p/952543#M5212</link>
      <description>&lt;P&gt;When you start a parallel region, it is like a horse race. In other words, the horses (threads) can run in any order. Thus you may see&lt;/P&gt;
&lt;P&gt;0,1,2,3&lt;BR /&gt;3,2,1,0&lt;BR /&gt;... (any permutation)&lt;/P&gt;
&lt;P&gt;This is the whole idea of running in parallel (which does not mean lock-step).&lt;/P&gt;
&lt;P&gt;If you require sections of code to run in sequential order (e.g. writing results to an output file), consider having the parallel region store its results in internal storage, then follow the parallel region with a sequential section that writes the results out. You can also do more complicated things:&lt;/P&gt;
&lt;P&gt;[cpp]&lt;/P&gt;
&lt;P&gt;int nThread = omp_get_max_threads ();&lt;BR /&gt;ASSERT(nThread &amp;lt; MAX_THREADS_YOU_SPECIFY);&lt;BR /&gt;volatile int doneFlags[MAX_THREADS_YOU_SPECIFY];&lt;/P&gt;
&lt;P&gt;for(int i=0; i &amp;lt; nThread; ++i)&lt;BR /&gt;&amp;nbsp;doneFlags[i] = 0;&lt;/P&gt;
&lt;P&gt;#pragma omp parallel num_threads(nThread)&lt;BR /&gt;{&lt;BR /&gt;&amp;nbsp; int myID=omp_get_thread_num ();&lt;BR /&gt;&amp;nbsp; // partition work nThread-ways&lt;BR /&gt;&amp;nbsp; // assign me to myID partition&lt;BR /&gt;&amp;nbsp; doComputeWorkHere(nThread, myID); // in any order&lt;BR /&gt;&amp;nbsp; doneFlags[myID] = 1; // indicate myID is done&lt;BR /&gt;&amp;nbsp; if(myID == 0)&lt;BR /&gt;&amp;nbsp; {&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; for(int i=0; i &amp;lt; nThread; ++i)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; while(doneFlags[i] == 0) // wait until thread i is done&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _mm_pause();&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; printf("Outputting Thread's ID %d data\n", i);&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; outputDataHere(i);&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; } // for&lt;BR /&gt;&amp;nbsp; } // if(myID == 0)&lt;BR /&gt;} // omp parallel&lt;BR /&gt;[/cpp]&lt;/P&gt;
&lt;P&gt;Note that the above code example is not normal programming practice. Normal programming practice would divide the work evenly, then place the output section after the parallel region. This would also eliminate the need for the doneFlags array and the code that initializes, sets, and tests it.&lt;/P&gt;
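<![CDATA[<P>A minimal sketch of that normal practice, under stated assumptions: compute_result, compute_all, print_in_order and MAX_THREADS are hypothetical names that do not appear in the original post, and the serial fallback is only there so the sketch also builds when OpenMP is not enabled. Each thread stores its result in its own slot, and the printing happens after the parallel region, so the output order is always 0, 1, 2, ...</P>

```c
#include <assert.h>
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback (assumption): lets the sketch build without OpenMP. */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

#define MAX_THREADS 256 /* hypothetical upper bound on the team size */

/* Hypothetical stand-in for the real per-thread computation. */
static int compute_result(int id) { return id * id; }

/*
 * Runs the computation in parallel; each thread writes only to its own slot
 * of results[]. Returns the number of threads that actually ran.
 */
static int compute_all(int results[], int capacity)
{
    int used = 0;
    assert(capacity >= 1 && capacity <= MAX_THREADS);

    #pragma omp parallel num_threads(capacity)
    {
        int myID = omp_get_thread_num();      /* threads run in any order */
        results[myID] = compute_result(myID); /* no shared slot, no race */

        #pragma omp single
        used = omp_get_num_threads();         /* recorded by one thread */
    }
    return used;
}

/* Sequential section: runs after the parallel region, in ID order. */
static void print_in_order(const int results[], int n)
{
    for (int i = 0; i < n; ++i)
        printf("Thread's ID %d result %d\n", i, results[i]);
}
```

<P>A caller would declare an array of MAX_THREADS ints, call compute_all on it, and pass the returned count to print_in_order. No flags or spin-waits are needed, because the end of the parallel region is already a barrier.</P>]]>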
&lt;P&gt;Now, you might ask why you would ever want to program in the doneFlags manner shown earlier, and that would be a good question.&lt;/P&gt;
&lt;P&gt;a) You design your partitioning so that each thread (myID) has a different amount of work. The work is proportional to the myID number:&lt;/P&gt;
&lt;P&gt;work = X + Y * myID&lt;/P&gt;
&lt;P&gt;Thus when ID 0 finishes, it writes its data to file. While it is writing, IDs 1, 2, and 3 continue working. If you set Y correctly, ID 1 will finish its work at the moment the write for ID 0 completes, ID 2 at the moment the write for ID 1 completes, and ID 3 at the moment the write for ID 2 completes. In this way you can overlap the latency of writing sections 0, 1, and 2 with computation (for a 4-thread asymmetric workload).&lt;/P&gt;
&lt;P&gt;b) You can place a loop in the parallel region. Then do something like this:&lt;/P&gt;
&lt;P&gt;[cpp]&lt;/P&gt;
&lt;P&gt;int nThread = omp_get_max_threads ();&lt;BR /&gt;volatile int writerFlag = 0; // ID of the thread whose turn it is to write&lt;/P&gt;
&lt;P&gt;#pragma omp parallel num_threads(nThread)&lt;BR /&gt;{&lt;BR /&gt;&amp;nbsp; int myID=omp_get_thread_num ();&lt;BR /&gt;&amp;nbsp; for(int chunk = 0; chunk &amp;lt; nChunks; ++chunk)&lt;BR /&gt;&amp;nbsp; {&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; // partition work nThread-ways&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; // assign me to myID partition&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; doComputeWorkHere(nThread, myID, chunk); // in any order&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; while(writerFlag != myID) // wait for my turn to write&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _mm_pause(); // or Sleep(0)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; printf("Outputting Thread's ID %d data\n", myID);&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; outputDataHere(myID);&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; writerFlag = (writerFlag + 1) % nThread; // pass the turn to the next thread&lt;BR /&gt;&amp;nbsp; } // for&lt;BR /&gt;} // omp parallel&lt;BR /&gt;[/cpp]&lt;/P&gt;
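<![CDATA[<P>For comparison, the same "compute in any order, output in turn" idea can be sketched with OpenMP's standard ordered construct instead of a hand-written turn flag. This is a swapped-in technique, not the code from the post above, and compute_chunk and compute_in_order are hypothetical names. The work before the ordered block runs concurrently, while the ordered block itself runs one iteration at a time, in loop order.</P>

```c
#include <assert.h>

/* Hypothetical stand-in for the real per-chunk computation. */
static int compute_chunk(int chunk) { return 10 * chunk; }

/*
 * Computes nChunks results in parallel but appends them to out[] strictly in
 * iteration order. Returns the number of results written.
 */
static int compute_in_order(int out[], int nChunks)
{
    int written = 0; /* shared, but only touched inside the ordered block */
    assert(nChunks >= 0);

    #pragma omp parallel for ordered schedule(static, 1)
    for (int chunk = 0; chunk < nChunks; ++chunk) {
        int result = compute_chunk(chunk); /* runs in any order */

        #pragma omp ordered
        {
            /* One thread at a time gets here, in loop order. */
            out[written++] = result;
        }
    }
    return written;
}
```

<P>The schedule(static, 1) clause hands out iterations round-robin, which loosely mirrors the rotating writer turn in the loop above. Whether this performs better than the spin-wait version would need to be measured.</P>]]>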
&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Thu, 10 Oct 2013 16:02:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/omp-get-thread-num-returns-random-values-in-the-parallel-region/m-p/952543#M5212</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2013-10-10T16:02:00Z</dc:date>
    </item>
  </channel>
</rss>

