<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Note to self: this question in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Vectorization-with-SIMD-enabled-functions-works-from-functions/m-p/1047551#M6786</link>
    <description>&lt;P&gt;Note to self: this question was answered in the Intel C++ Compiler forum: &lt;A href="https://software.intel.com/en-us/forums/intel-c-compiler/topic/599705" target="_blank"&gt;https://software.intel.com/en-us/forums/intel-c-compiler/topic/599705&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Bottom line: it is a compiler bug, and a work-around is available.&lt;/P&gt;</description>
    <pubDate>Wed, 11 Nov 2015 19:42:15 GMT</pubDate>
    <dc:creator>Andrey_Vladimirov</dc:creator>
    <dc:date>2015-11-11T19:42:15Z</dc:date>
    <item>
      <title>Vectorization with SIMD-enabled functions works from functions, not from main()</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Vectorization-with-SIMD-enabled-functions-works-from-functions/m-p/1047550#M6785</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I have run into a situation that I cannot explain. I have a loop that calls a SIMD-enabled function, and I put #pragma simd before the loop. The loop vectorizes when it is placed in a separate function, but not when it is inside main(). I am using Intel C++ compiler 16.0.0.109. The code and vectorization reports are below. Can anyone explain what is happening and whether there is a way to work around it?&lt;/P&gt;

&lt;P&gt;This is loop-in-main.cc:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;__attribute__((vector)) void SimdEnabledFunction(double);

int main() {
  int n = 10000;
  double a[n];
#pragma simd
  for(int i = 0 ; i &amp;lt; n ; i++)
      SimdEnabledFunction(a[i]);
}&lt;/PRE&gt;

&lt;P&gt;This is the optimization report for it (loop does not vectorize):&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;[avladim@cfx-0 ~]$ icpc -qopenmp -c -qopt-report -qopt-report-stdout loop-in-main.cc
Intel(R) Advisor can now assist with vectorization and show optimization
  report messages with your source code.
See "https://software.intel.com/en-us/intel-advisor-xe" for details.


    Report from: Interprocedural optimizations [ipo]

INLINING OPTION VALUES:
  -inline-factor: 100
  -inline-min-size: 30
  -inline-max-size: 230
  -inline-max-total-size: 2000
  -inline-max-per-routine: 10000
  -inline-max-per-compile: 500000


Begin optimization report for: main()

    Report from: Interprocedural optimizations [ipo]

INLINE REPORT: (main()) [1] loop-in-main.cc(3,12)

loop-in-main.cc(7): (col. 3) warning #13379: loop was not vectorized with "simd"

    Report from: Loop nest, Vector &amp;amp; Auto-parallelization optimizations [loop, vec, par]


LOOP BEGIN at loop-in-main.cc(7,3)
   remark #15520: simd loop was not vectorized: loop with multiple exits cannot be vectorized unless it meets search loop idiom criteria
   remark #13379: loop was not vectorized with "simd"
LOOP END
===========================================================================
[avladim@cfx-0 ~]$ 
&lt;/PRE&gt;

&lt;P&gt;This is the other code, loop-in-func.cc, where the loop is in a separate function:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;__attribute__((vector)) void SimdEnabledFunction(double);

void UserFunction(int n, double* a) {
#pragma simd
  for(int i = 0 ; i &amp;lt; n ; i++)
      SimdEnabledFunction(a[i]);
}

int main() {
  int n = 10000;
  double a[n];
  UserFunction(n, a);
}
&lt;/PRE&gt;

&lt;P&gt;This is the optimization report for it (the loop in UserFunction reports SIMD LOOP WAS VECTORIZED, while the copy inlined into main() still does not vectorize):&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;[avladim@cfx-0 ~]$ icpc -qopenmp -c -qopt-report -qopt-report-stdout loop-in-func.cc
Intel(R) Advisor can now assist with vectorization and show optimization
  report messages with your source code.
See "https://software.intel.com/en-us/intel-advisor-xe" for details.


    Report from: Interprocedural optimizations [ipo]

INLINING OPTION VALUES:
  -inline-factor: 100
  -inline-min-size: 30
  -inline-max-size: 230
  -inline-max-total-size: 2000
  -inline-max-per-routine: 10000
  -inline-max-per-compile: 500000


Begin optimization report for: main()

    Report from: Interprocedural optimizations [ipo]

INLINE REPORT: (main()) [1] loop-in-func.cc(9,12)
  -&amp;gt; INLINE: (12,3) UserFunction(int, double *)

loop-in-func.cc(5): (col. 3) warning #13379: loop was not vectorized with "simd"

    Report from: Loop nest, Vector &amp;amp; Auto-parallelization optimizations [loop, vec, par]


LOOP BEGIN at loop-in-func.cc(5,3) inlined into loop-in-func.cc(12,3)
   remark #15520: simd loop was not vectorized: loop with multiple exits cannot be vectorized unless it meets search loop idiom criteria
   remark #13379: loop was not vectorized with "simd"
LOOP END
===========================================================================

Begin optimization report for: UserFunction(int, double *)

    Report from: Interprocedural optimizations [ipo]

INLINE REPORT: (UserFunction(int, double *)) [2] loop-in-func.cc(3,37)


    Report from: Loop nest, Vector &amp;amp; Auto-parallelization optimizations [loop, vec, par]


LOOP BEGIN at loop-in-func.cc(5,3)
&amp;lt;Peeled loop for vectorization&amp;gt;
LOOP END

LOOP BEGIN at loop-in-func.cc(5,3)
   remark #15301: SIMD LOOP WAS VECTORIZED
LOOP END

LOOP BEGIN at loop-in-func.cc(5,3)
&amp;lt;Remainder loop for vectorization&amp;gt;
   remark #15335: remainder loop was not vectorized: vectorization possible but seems inefficient. Use vector always directive or -vec-threshold0 to override 
LOOP END
===========================================================================
[avladim@cfx-0 ~]$&lt;/PRE&gt;

&lt;P&gt;Andrey&lt;/P&gt;</description>
      <pubDate>Wed, 04 Nov 2015 19:29:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Vectorization-with-SIMD-enabled-functions-works-from-functions/m-p/1047550#M6785</guid>
      <dc:creator>Andrey_Vladimirov</dc:creator>
      <dc:date>2015-11-04T19:29:46Z</dc:date>
    </item>
    <item>
      <title>Note to self: this question</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Vectorization-with-SIMD-enabled-functions-works-from-functions/m-p/1047551#M6786</link>
      <description>&lt;P&gt;Note to self: this question was answered in the Intel C++ Compiler forum: &lt;A href="https://software.intel.com/en-us/forums/intel-c-compiler/topic/599705" target="_blank"&gt;https://software.intel.com/en-us/forums/intel-c-compiler/topic/599705&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Bottom line: it is a compiler bug, and a work-around is available.&lt;/P&gt;</description>
      <pubDate>Wed, 11 Nov 2015 19:42:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Vectorization-with-SIMD-enabled-functions-works-from-functions/m-p/1047551#M6786</guid>
      <dc:creator>Andrey_Vladimirov</dc:creator>
      <dc:date>2015-11-11T19:42:15Z</dc:date>
    </item>
  </channel>
</rss>

