Vectorization reporting bug / regression

Brian1 · ‎08-26-2013

There is a substantial change in vectorization reporting with ifort 13.1 vs 12.1 on the two operating systems I have access to, Linux and Mac OS X. I assume this is a bug / regression, rather than the intended behavior.

With 12.1:

[plain]

host% ifort --version
ifort (IFORT) 12.1.3 20120212
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.

host% ifort -O3 -xAVX -vec-report1 vec_report_bug.F90 -c
vec_report_bug.F90(31): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(32): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(33): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(34): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(35): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(37): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(38): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(39): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(40): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(42): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(44): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(45): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(46): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(47): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(49): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(51): (col. 1) remark: LOOP WAS VECTORIZED.

[/plain]

With 13.1:

[plain]

host% ifort --version
ifort (IFORT) 13.1.3 20130607
Copyright (C) 1985-2013 Intel Corporation. All rights reserved.

host% ifort -O3 -xAVX -vec-report1 vec_report_bug.F90 -c
vec_report_bug.F90(31): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(31): (col. 1) remark: LOOP WAS VECTORIZED.

[/plain]

Examination of the assembly code shows that both 12.1 and 13.1 are generating vectorized code for all of the lines reported by 12.1.

Steven_L_Intel1 · ‎08-26-2013

Thanks - we'll take a look. Escalated as issue DPD200247457.

Steven_L_Intel1 · ‎08-26-2013

The developers tell me that this is not a bug. What has happened instead is that the 13.1 compiler aggressively fuses the multiple loops, whereas the 12.1 compiler didn't. If you ask for an optimization report, you see something like this:

[plain]
High Level Optimizer Report (_WENO6N)

Fusion loop partitions: (loop line numbers)

Fused Loops: ( 49 51 )
Fused Loops: ( 47 49 )
Fused Loops: ( 46 47 )
Fused Loops: ( 45 46 )
Fused Loops: ( 44 45 )
Fused Loops: ( 42 44 )
Fused Loops: ( 40 42 )
Fused Loops: ( 39 40 )
Fused Loops: ( 38 39 )
Fused Loops: ( 37 38 )
Fused Loops: ( 35 37 )
Fused Loops: ( 34 35 )
Fused Loops: ( 33 34 )
Fused Loops: ( 32 33 )
Fused Loops: ( 31 32 )
[/plain]

and then a "loop distribution" optimization splits it up. In a future release we have plans to better integrate the vectorization report to make this more understandable.
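
For readers unfamiliar with loop fusion, here is a minimal sketch of the idea. It is not taken from vec_report_bug.F90, whose source is not shown in this thread; the program name, array names, and sizes are all hypothetical. Each array statement is an implicit loop, and the explicit DO loop shows roughly what the fused form might look like. Once several source lines are merged into one loop, a report may attribute the result to a single line, which would be consistent with the 13.1 output above.

```fortran
! Illustrative sketch only: names and sizes are hypothetical,
! not taken from vec_report_bug.F90.
program fusion_sketch
  implicit none
  integer, parameter :: n = 1024
  real :: a(n), b(n), c(n), d(n), c2(n), d2(n)
  integer :: i

  call random_number(a)
  call random_number(b)

  ! Two array statements on consecutive source lines. Each one is an
  ! implicit loop over 1..n, so a compiler that does not fuse them
  ! may report each line as a separately vectorized loop.
  c = a + b
  d = a * b

  ! Roughly what loop fusion produces: a single loop that performs
  ! both statements in one pass over the data.
  do i = 1, n
     c2(i) = a(i) + b(i)
     d2(i) = a(i) * b(i)
  end do

  ! Both forms compute the same values.
  if (any(c /= c2) .or. any(d /= d2)) error stop "fused and unfused results differ"
  print *, "fused and unfused results match"
end program fusion_sketch
```

Compiling this with the same options as above (-O3 -xAVX -vec-report1) under each compiler version is one way to compare how the loops get reported, although the exact remarks will depend on the compiler's fusion and distribution decisions.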

TimP · ‎08-26-2013

If you wish to prevent fusion at some loop boundaries, you can use the !DIR$ NOFUSION directive. By fusing some loops, the compiler should be able to approach full performance at smaller loop counts with less unrolling, provided there is no store-to-reload misalignment.
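
A minimal sketch of how the directive might be placed, assuming it goes immediately before the DO loop that should stay unfused. The program and array names are hypothetical, and I have not checked how the directive interacts with array-syntax statements, so explicit DO loops are used here. Compilers that do not recognize the directive treat it as an ordinary comment.

```fortran
! Illustrative sketch only: names and sizes are hypothetical.
program nofusion_sketch
  implicit none
  integer, parameter :: n = 1024
  real :: a(n), b(n), c(n), d(n)
  integer :: i

  call random_number(a)
  call random_number(b)

  ! Ask the compiler not to fuse the following loop with its neighbors.
  ! Assumption: the directive must directly precede the DO statement.
!DIR$ NOFUSION
  do i = 1, n
     c(i) = a(i) + b(i)
  end do

  ! Without the directive above, this loop would be a natural fusion
  ! candidate: same trip count, and it consumes c(i) element by element.
  do i = 1, n
     d(i) = c(i) * b(i)
  end do

  ! Check against the equivalent array expression.
  if (any(d /= (a + b) * b)) error stop "unexpected result"
  print *, "sum(d) =", sum(d)
end program nofusion_sketch
```

Whether the loops were actually kept separate can be checked in the fusion section of the optimization report, as in the High Level Optimizer Report shown earlier.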