Hello,
I have run into a situation that I cannot explain. I have a loop that calls a SIMD-enabled function, and I put #pragma simd before the loop. The loop vectorizes if it is placed in a separate function, but does not vectorize if it is inside main(). I am using Intel C++ compiler 16.0.0.109. Please see the code and vectorization reports below. Can anyone explain what is happening, and whether there is a way to work around it?
This is loop-in-main.cc:
__attribute__((vector)) void SimdEnabledFunction(double);

int main() {
  int n = 10000;
  double a;
#pragma simd
  for(int i = 0 ; i < n ; i++)
    SimdEnabledFunction(a);
}
This is the optimization report for it (loop does not vectorize):
[avladim@cfx-0 ~]$ icpc -qopenmp -c -qopt-report -qopt-report-stdout loop-in-main.cc
Intel(R) Advisor can now assist with vectorization and show optimization report messages with your source code.
See "https://software.intel.com/en-us/intel-advisor-xe" for details.

    Report from: Interprocedural optimizations [ipo]

INLINING OPTION VALUES:
  -inline-factor: 100
  -inline-min-size: 30
  -inline-max-size: 230
  -inline-max-total-size: 2000
  -inline-max-per-routine: 10000
  -inline-max-per-compile: 500000

Begin optimization report for: main()

    Report from: Interprocedural optimizations [ipo]

INLINE REPORT: (main()) [1] loop-in-main.cc(3,12)

loop-in-main.cc(7): (col. 3) warning #13379: loop was not vectorized with "simd"

    Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]

LOOP BEGIN at loop-in-main.cc(7,3)
   remark #15520: simd loop was not vectorized: loop with multiple exits cannot be vectorized unless it meets search loop idiom criteria
   remark #13379: loop was not vectorized with "simd"
LOOP END
===========================================================================
[avladim@cfx-0 ~]$
This is the other code, loop-in-func.cc, where the loop is in a separate function:
__attribute__((vector)) void SimdEnabledFunction(double);

void UserFunction(int n, double* a) {
#pragma simd
  for(int i = 0 ; i < n ; i++)
    SimdEnabledFunction(a[i]);
}

int main() {
  int n = 10000;
  double a[10000];
  UserFunction(n, a);
}
This is the optimization report for it (SIMD LOOP WAS VECTORIZED in the standalone UserFunction, but the copy inlined into main() still was not):
[avladim@cfx-0 ~]$ icpc -qopenmp -c -qopt-report -qopt-report-stdout loop-in-func.cc
Intel(R) Advisor can now assist with vectorization and show optimization report messages with your source code.
See "https://software.intel.com/en-us/intel-advisor-xe" for details.

    Report from: Interprocedural optimizations [ipo]

INLINING OPTION VALUES:
  -inline-factor: 100
  -inline-min-size: 30
  -inline-max-size: 230
  -inline-max-total-size: 2000
  -inline-max-per-routine: 10000
  -inline-max-per-compile: 500000

Begin optimization report for: main()

    Report from: Interprocedural optimizations [ipo]

INLINE REPORT: (main()) [1] loop-in-func.cc(9,12)
  -> INLINE: (12,3) UserFunction(int, double *)

loop-in-func.cc(5): (col. 3) warning #13379: loop was not vectorized with "simd"

    Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]

LOOP BEGIN at loop-in-func.cc(5,3) inlined into loop-in-func.cc(12,3)
   remark #15520: simd loop was not vectorized: loop with multiple exits cannot be vectorized unless it meets search loop idiom criteria
   remark #13379: loop was not vectorized with "simd"
LOOP END
===========================================================================

Begin optimization report for: UserFunction(int, double *)

    Report from: Interprocedural optimizations [ipo]

INLINE REPORT: (UserFunction(int, double *)) [2] loop-in-func.cc(3,37)

    Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]

LOOP BEGIN at loop-in-func.cc(5,3)
<Peeled loop for vectorization>
LOOP END

LOOP BEGIN at loop-in-func.cc(5,3)
   remark #15301: SIMD LOOP WAS VECTORIZED
LOOP END

LOOP BEGIN at loop-in-func.cc(5,3)
<Remainder loop for vectorization>
   remark #15335: remainder loop was not vectorized: vectorization possible but seems inefficient. Use vector always directive or -vec-threshold0 to override
LOOP END
===========================================================================
[avladim@cfx-0 ~]$
Andrey
- Tags:
- Parallel Computing
Note to self: this question was answered in the Intel C++ Compiler forum: https://software.intel.com/en-us/forums/intel-c-compiler/topic/599705
Bottom line: it is a compiler bug, and a work-around is available.
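The work-around from the linked thread is not reproduced here. As a rough sketch of one thing the reports above suggest trying: only the standalone copy of the loop vectorized, while the copy inlined into main() did not, so one could keep the loop in its own function and ask the compiler not to inline it. Everything below is an assumption for illustration and has not been checked against 16.0.0.109: the noinline attribute, the OpenMP 4.0 spellings (#pragma omp declare simd and #pragma omp simd, used in place of __attribute__((vector)) and #pragma simd so that other compilers accept the code), the body of SimdEnabledFunction, and the extra out parameter.

```cpp
// Hypothetical stand-in for the SIMD-enabled function from the question.
// The original is declared with __attribute__((vector)) and returns void;
// the OpenMP spelling, the body, and the return value are assumptions made
// so that the sketch is complete and has a visible result.
#pragma omp declare simd
double SimdEnabledFunction(double x) {
  return 2.0 * x;
}

// Keep the loop in its own function and request that it not be inlined,
// so that main() calls the standalone copy rather than an inlined one.
// noinline is a GCC-style attribute; whether it avoids the problem with
// a given compiler version is not verified here.
__attribute__((noinline))
void UserFunction(int n, const double* a, double* out) {
#pragma omp simd
  for (int i = 0; i < n; i++)
    out[i] = SimdEnabledFunction(a[i]);
}
```

To try it, call UserFunction(n, a, out) from main(), compile with the same -qopenmp -qopt-report -qopt-report-stdout flags as above, and check whether the report for main() still shows warning #13379.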