<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How can I parallelize implicit loop ? in Software Archive</title>
    <link>https://community.intel.com/t5/Software-Archive/How-can-I-parallelize-implicit-loop/m-p/979407#M25696</link>
    <description>How can I parallelize implicit loop ?</description>
    <pubDate>Sun, 09 Feb 2014 12:50:37 GMT</pubDate>
    <dc:creator>Zvi_D_Intel</dc:creator>
    <dc:date>2014-02-09T12:50:37Z</dc:date>
    <item>
      <title>How can I parallelize implicit loop ?</title>
      <link>https://community.intel.com/t5/Software-Archive/How-can-I-parallelize-implicit-loop/m-p/979407#M25696</link>
      <description>&lt;PRE class="brush:cpp;"&gt;
I have a loop whose body calls a function, passing one array element (selected by the loop index) as the argument and getting a single value back.
I can parallelize this loop by using cilk_for() instead of a regular for(); that is simple and works well. This is explicit parallelization.
Instead of an explicit loop I can use an Array Notation construction (as shown below), which is an implicit loop.
My routine is relatively long and complex and contains Array Notation constructions itself, so it cannot be declared as a vector (elemental) function.
When I use the implicit loop, it is not parallelized and the run time increases substantially.
&lt;!--break--&gt; 
float foo(float f_in)
{
&amp;nbsp;float f_result;
&amp;nbsp;// LONG computation containing CILK+ Array Notation operations

&amp;nbsp;/////////////////////////////////////////////////////////
&amp;nbsp;return f_result;
}

int main()
{
&amp;nbsp;float af_in[N], af_out[N];

// Explicit parallelized loop
&amp;nbsp;cilk_for(int i=0; i&amp;lt;N; i++)
&amp;nbsp; af_out[i] =&amp;nbsp; foo(af_in[i]);

// Implicit non-parallelized loop
&amp;nbsp;af_out[:] =&amp;nbsp; foo(af_in[:]);
}

My question is: does anybody know whether there is a way to tell the compiler (with a pragma or something else) that the steps of my implicit loop (the Array Notation assignment) are independent and that it should be parallelized?
&lt;/PRE&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 09 Feb 2014 12:50:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/How-can-I-parallelize-implicit-loop/m-p/979407#M25696</guid>
      <dc:creator>Zvi_D_Intel</dc:creator>
      <dc:date>2014-02-09T12:50:37Z</dc:date>
    </item>
    <item>
      <title>Have you tried #pragma simd?</title>
      <link>https://community.intel.com/t5/Software-Archive/How-can-I-parallelize-implicit-loop/m-p/979408#M25697</link>
      <description>&lt;P&gt;Have you tried &lt;A href="https://www.cilkplus.org/tutorial-pragma-simd" target="_blank"&gt;#pragma simd&lt;/A&gt;? Essentially, it tells the compiler that the loop should be vectorized even if auto-vectorization fails.&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp;- Barry&lt;/P&gt;</description>
      <pubDate>Tue, 11 Feb 2014 14:28:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/How-can-I-parallelize-implicit-loop/m-p/979408#M25697</guid>
      <dc:creator>Barry_T_Intel</dc:creator>
      <dc:date>2014-02-11T14:28:55Z</dc:date>
    </item>
  </channel>
</rss>

