<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Threaded Programs with Built-in Load-Imbalance in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Threaded-Programs-with-Built-in-Load-Imbalance/m-p/995799#M6360</link>
    <description>&lt;DIV&gt;Hi Avsha,&lt;/DIV&gt;
&lt;DIV&gt;I don't know of any specific programs or benchmarks that contain the kind of load imbalance that you need. You might try &lt;A href="http://www.spec.org/omp/" target="_blank"&gt;SPEC OMP&lt;/A&gt; but I don't know if any of the applications in this benchmark have a load imbalance.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;It shouldn't be too difficult to write simple benchmarks that do what you need. As you already mentioned, in a functional decomposition, the following parallel region contains an inherent load imbalance:&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt;#pragma omp parallel sections&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt;{&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; #pragma omp section&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; small_func ();&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; #pragma omp section&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; big_func ();&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt;}&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;The following parallel loop can also defeat dynamic scheduling:&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt;#pragma omp parallel for&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; for (i = 0; i &amp;lt; N; i++)&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; {&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; small_func ();&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; if (i== (N-1)) big_func ();&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; }&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;What are you trying to demonstrate?&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Best regards,&lt;/DIV&gt;
&lt;DIV&gt;Henry&lt;/DIV&gt;</description>
    <pubDate>Wed, 23 Feb 2005 06:53:12 GMT</pubDate>
    <dc:creator>Henry_G_Intel</dc:creator>
    <dc:date>2005-02-23T06:53:12Z</dc:date>
    <item>
      <title>Threaded Programs with Built-in Load-Imbalance</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Threaded-Programs-with-Built-in-Load-Imbalance/m-p/995798#M6359</link>
      <description>&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Hi,&lt;/DIV&gt;
&lt;DIV&gt;I am looking for cases where there is built-in load imbalance between threads (unsolvable by such techniques as dynamic allocation).&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Can anyone point me to programs/benchmarks that have this type of behavior?&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV&gt;(Functional or data-pipelined decompositions could provide examples of this. In these decompositions, each thread does different work in certain parallel regions, so one thread typically finishes its work for that region before the other.)&lt;/DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Thanks,&lt;/DIV&gt;
&lt;DIV&gt;Avsha.&lt;/DIV&gt;</description>
      <pubDate>Thu, 10 Feb 2005 18:51:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Threaded-Programs-with-Built-in-Load-Imbalance/m-p/995798#M6359</guid>
      <dc:creator>avsha</dc:creator>
      <dc:date>2005-02-10T18:51:37Z</dc:date>
    </item>
    <item>
      <title>Re: Threaded Programs with Built-in Load-Imbalance</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Threaded-Programs-with-Built-in-Load-Imbalance/m-p/995799#M6360</link>
      <description>&lt;DIV&gt;Hi Avsha,&lt;/DIV&gt;
&lt;DIV&gt;I don't know of any specific programs or benchmarks that contain the kind of load imbalance that you need. You might try &lt;A href="http://www.spec.org/omp/" target="_blank"&gt;SPEC OMP&lt;/A&gt; but I don't know if any of the applications in this benchmark have a load imbalance.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;It shouldn't be too difficult to write simple benchmarks that do what you need. As you already mentioned, in a functional decomposition, the following parallel region contains an inherent load imbalance:&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt;#pragma omp parallel sections&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt;{&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; #pragma omp section&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; small_func ();&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; #pragma omp section&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; big_func ();&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt;}&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;The following parallel loop can also defeat dynamic scheduling:&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt;#pragma omp parallel for&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; for (i = 0; i &amp;lt; N; i++)&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; {&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; small_func ();&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; if (i== (N-1)) big_func ();&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT face="Courier New"&gt; }&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
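&lt;DIV&gt;One way to put a number on this, as a back-of-the-envelope sketch rather than a measurement: no schedule can finish the loop above sooner than its single longest iteration, or sooner than an even split of the total work. The helper names best_parallel_time and best_speedup and the cost parameters t_small and t_big are hypothetical, and the model ignores scheduling overhead:&lt;/DIV&gt;
&lt;PRE&gt;
```c
/* Lower bound on the parallel run time of the loop above, in the same
   (arbitrary) time units as t_small and t_big. Both costs are assumed
   inputs, not measurements. */
double best_parallel_time(long n, double t_small, double t_big, int threads)
{
    double total = n * t_small + t_big;   /* serial time for all n iterations */
    double longest = t_small + t_big;     /* iteration n-1 cannot be split */
    double even = total / threads;        /* perfectly balanced share */
    return longest > even ? longest : even;
}

/* Best speedup any schedule can reach under this simple model. */
double best_speedup(long n, double t_small, double t_big, int threads)
{
    double total = n * t_small + t_big;
    return total / best_parallel_time(n, t_small, t_big, threads);
}
```
&lt;/PRE&gt;
&lt;DIV&gt;For example, with n = 8, t_small = 1, t_big = 100 and 4 threads, best_parallel_time returns 101 against a serial time of 108, so the best possible speedup is about 1.07 no matter how the iterations are handed out.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;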
&lt;DIV&gt;What are you trying to demonstrate?&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Best regards,&lt;/DIV&gt;
&lt;DIV&gt;Henry&lt;/DIV&gt;</description>
      <pubDate>Wed, 23 Feb 2005 06:53:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Threaded-Programs-with-Built-in-Load-Imbalance/m-p/995799#M6360</guid>
      <dc:creator>Henry_G_Intel</dc:creator>
      <dc:date>2005-02-23T06:53:12Z</dc:date>
    </item>
    <item>
      <title>Re: Threaded Programs with Built-in Load-Imbalance</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Threaded-Programs-with-Built-in-Load-Imbalance/m-p/995800#M6361</link>
      <description>&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Avsha -&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Anything that uses OpenMP sections would be a good candidate for what you're looking for. For example, assume there are several computations (function calls) that have some order-of-execution dependencies, but the entire sequence can be divided into two or more independent segments. Then the set of independent segments can be parallelized with a sections pragma, but the work load between segments may be unbalanced.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I've recently looked at a game code that had this feature. Six function calls update the positions of objects in motion, work out how those new positions affect the environment, and determine what is drawn in the next frame. The first four functions could be done in one thread, while the sixth function relied on the results of the fifth, so the last two were put into a section together. Performance analysis via Thread Profiler showed that the final two functions took almost 30X longer than the first four function calls combined. There was no easy way to redistribute the work here for a better load balance.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
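&lt;DIV&gt;As a rough sketch of that structure, the two sections might look like the following. The function names update and run_frame, the work-unit counters, and the cost values are all made up for illustration and are not taken from the game code; the costs are simply chosen so that the second section does 30 times the work of the first:&lt;/DIV&gt;
&lt;PRE&gt;
```c
/* Work units done by each section: work[0] for the first four calls,
   work[1] for the dependent fifth and sixth calls. */
static long work[2];

/* Hypothetical stand-in for one of the six update functions: it only
   records an assumed cost against the section that ran it. */
static void update(int section, long cost)
{
    work[section] += cost;
}

/* Runs one frame and returns how many times more work the second
   section did than the first. */
long run_frame(void)
{
    work[0] = 0;
    work[1] = 0;

    #pragma omp parallel sections
    {
        #pragma omp section
        {
            /* Functions one to four: cheap, kept together in one thread. */
            update(0, 1);
            update(0, 1);
            update(0, 1);
            update(0, 1);
        }
        #pragma omp section
        {
            /* Function six needs the result of function five, so the
               pair has to stay in the same section, in this order. */
            update(1, 60);
            update(1, 60);
        }
    }

    return work[1] / work[0];
}
```
&lt;/PRE&gt;
&lt;DIV&gt;With these assumed costs, run_frame() returns 30: the thread that runs the first section finishes after 4 units and then waits at the end of the sections construct while the other thread works through 120.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;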
&lt;DIV&gt;--clay&lt;/DIV&gt;</description>
      <pubDate>Fri, 25 Feb 2005 23:41:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Threaded-Programs-with-Built-in-Load-Imbalance/m-p/995800#M6361</guid>
      <dc:creator>ClayB</dc:creator>
      <dc:date>2005-02-25T23:41:08Z</dc:date>
    </item>
  </channel>
</rss>

