Hello,
I have run into a situation that I cannot explain. I have a loop that calls a SIMD-enabled function, with #pragma simd placed before the loop. The loop vectorizes when it is placed in a separate function, but not when it is inside main(). I am using Intel C++ compiler 16.0.0.109. Please see the code and vectorization reports below. Can anyone explain what is happening, and whether there is a way to work around it?
This is loop-in-main.cc:
__attribute__((vector)) void SimdEnabledFunction(double);

int main() {
  int n = 10000;
  double a;
  #pragma simd
  for(int i = 0 ; i < n ; i++)
    SimdEnabledFunction(a);
}
This is the optimization report for it (loop does not vectorize):
[avladim@cfx-0 ~]$ icpc -qopenmp -c -qopt-report -qopt-report-stdout loop-in-main.cc
Intel(R) Advisor can now assist with vectorization and show optimization
report messages with your source code.
See "https://software.intel.com/en-us/intel-advisor-xe" for details.
Report from: Interprocedural optimizations [ipo]
INLINING OPTION VALUES:
-inline-factor: 100
-inline-min-size: 30
-inline-max-size: 230
-inline-max-total-size: 2000
-inline-max-per-routine: 10000
-inline-max-per-compile: 500000
Begin optimization report for: main()
Report from: Interprocedural optimizations [ipo]
INLINE REPORT: (main()) [1] loop-in-main.cc(3,12)
loop-in-main.cc(7): (col. 3) warning #13379: loop was not vectorized with "simd"
Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]
LOOP BEGIN at loop-in-main.cc(7,3)
remark #15520: simd loop was not vectorized: loop with multiple exits cannot be vectorized unless it meets search loop idiom criteria
remark #13379: loop was not vectorized with "simd"
LOOP END
===========================================================================
[avladim@cfx-0 ~]$
This is the other code, loop-in-func.cc, where the loop is in a separate function:
__attribute__((vector)) void SimdEnabledFunction(double);

void UserFunction(int n, double* a) {
  #pragma simd
  for(int i = 0 ; i < n ; i++)
    SimdEnabledFunction(a[i]);
}

int main() {
  const int n = 10000;
  double a[n] = {};
  UserFunction(n, a);
}
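This is the optimization report for it (the loop in the standalone UserFunction reports SIMD LOOP WAS VECTORIZED, while the copy inlined into main() is still not vectorized):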
[avladim@cfx-0 ~]$ icpc -qopenmp -c -qopt-report -qopt-report-stdout loop-in-func.cc
Intel(R) Advisor can now assist with vectorization and show optimization
report messages with your source code.
See "https://software.intel.com/en-us/intel-advisor-xe" for details.
Report from: Interprocedural optimizations [ipo]
INLINING OPTION VALUES:
-inline-factor: 100
-inline-min-size: 30
-inline-max-size: 230
-inline-max-total-size: 2000
-inline-max-per-routine: 10000
-inline-max-per-compile: 500000
Begin optimization report for: main()
Report from: Interprocedural optimizations [ipo]
INLINE REPORT: (main()) [1] loop-in-func.cc(9,12)
-> INLINE: (12,3) UserFunction(int, double *)
loop-in-func.cc(5): (col. 3) warning #13379: loop was not vectorized with "simd"
Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]
LOOP BEGIN at loop-in-func.cc(5,3) inlined into loop-in-func.cc(12,3)
remark #15520: simd loop was not vectorized: loop with multiple exits cannot be vectorized unless it meets search loop idiom criteria
remark #13379: loop was not vectorized with "simd"
LOOP END
===========================================================================
Begin optimization report for: UserFunction(int, double *)
Report from: Interprocedural optimizations [ipo]
INLINE REPORT: (UserFunction(int, double *)) [2] loop-in-func.cc(3,37)
Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]
LOOP BEGIN at loop-in-func.cc(5,3)
<Peeled loop for vectorization>
LOOP END
LOOP BEGIN at loop-in-func.cc(5,3)
remark #15301: SIMD LOOP WAS VECTORIZED
LOOP END
LOOP BEGIN at loop-in-func.cc(5,3)
<Remainder loop for vectorization>
remark #15335: remainder loop was not vectorized: vectorization possible but seems inefficient. Use vector always directive or -vec-threshold0 to override
LOOP END
===========================================================================
[avladim@cfx-0 ~]$
Andrey
- Tags:
- Parallel Computing
Note to self: this question was answered in the Intel C++ Compiler forum: https://software.intel.com/en-us/forums/intel-c-compiler/topic/599705
Bottom line: it is a compiler bug, and a work-around is available.