Vectorization reporting bug / regression

Brian1
Beginner

Vectorization reporting changes substantially between ifort 12.1 and 13.1 on both operating systems I have access to, Linux and Mac OS X. I assume this is a bug or regression rather than the intended behavior.

With 12.1:

[plain]

host% ifort --version
ifort (IFORT) 12.1.3 20120212
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.

host% ifort -O3 -xAVX -vec-report1 vec_report_bug.F90 -c
vec_report_bug.F90(31): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(32): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(33): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(34): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(35): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(37): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(38): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(39): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(40): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(42): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(44): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(45): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(46): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(47): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(49): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(51): (col. 1) remark: LOOP WAS VECTORIZED.

[/plain]

With 13.1:

[plain]

host% ifort --version
ifort (IFORT) 13.1.3 20130607
Copyright (C) 1985-2013 Intel Corporation. All rights reserved.

host% ifort -O3 -xAVX -vec-report1 vec_report_bug.F90 -c
vec_report_bug.F90(31): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(31): (col. 1) remark: LOOP WAS VECTORIZED.

[/plain]

Examination of the assembly code shows that both 12.1 and 13.1 are generating vectorized code for all of the lines reported by 12.1.

Steven_L_Intel1
Employee

Thanks - we'll take a look. Escalated as issue DPD200247457.

Steven_L_Intel1
Employee

The developers tell me that this is not a bug. What has happened instead is that the 13.1 compiler aggressively fuses the multiple loops, where the 12.1 compiler didn't. If you ask for an optimization report, you see something like this:

[plain]
High Level Optimizer Report (_WENO6N)

Fusion loop partitions: (loop line numbers)

Fused Loops: ( 49 51 )
Fused Loops: ( 47 49 )
Fused Loops: ( 46 47 )
Fused Loops: ( 45 46 )
Fused Loops: ( 44 45 )
Fused Loops: ( 42 44 )
Fused Loops: ( 40 42 )
Fused Loops: ( 39 40 )
Fused Loops: ( 38 39 )
Fused Loops: ( 37 38 )
Fused Loops: ( 35 37 )
Fused Loops: ( 34 35 )
Fused Loops: ( 33 34 )
Fused Loops: ( 32 33 )
Fused Loops: ( 31 32 )
[/plain]

and then a "loop distribution" optimization splits the fused loop up again, which would account for the two remarks reported at line 31. In a future release we plan to better integrate the vectorization report to make this more understandable.
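For readers unfamiliar with loop fusion, here is a minimal sketch of the idea. It is not taken from vec_report_bug.F90, which is not shown in this thread; the program name, array names, and size are all hypothetical. It assumes the reported lines are simple array assignments, each of which the compiler treats as its own loop.

```fortran
program fusion_sketch
  implicit none
  integer, parameter :: n = 1000   ! hypothetical array size
  real :: a(n), b(n)
  real :: c_sep(n), d_sep(n)       ! results from separate array statements
  real :: c_fus(n), d_fus(n)       ! results from the hand-fused loop
  integer :: i

  call random_number(a)
  call random_number(b)

  ! Two array assignments: conceptually two loops, one per source line.
  c_sep = a + b
  d_sep = a * b

  ! Hand-fused equivalent: a single loop doing both updates per iteration.
  do i = 1, n
     c_fus(i) = a(i) + b(i)
     d_fus(i) = a(i) * b(i)
  end do

  ! Both forms should agree; print the largest difference as a sanity check.
  print *, maxval(abs(c_sep - c_fus)), maxval(abs(d_sep - d_fus))
end program fusion_sketch
```

If the compiler performs the equivalent of the second form on its own, a per-line report has only one loop left to attribute the vectorization remark to, which would be consistent with the shorter 13.1 output.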

TimP
Honored Contributor III

If you wish to prevent fusion at some loop boundaries, you can set !dir$ nofusion. By fusing some loops, the compiler should be able to approach full performance at smaller loop counts with less unrolling, provided there is no store-to-reload misalignment.
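A minimal sketch of how that directive might be placed, assuming the Intel spelling `!DIR$ NOFUSION` immediately before the DO statement it applies to. The program, arrays, and loop bodies are hypothetical and only illustrate placement; whether the directive also affects array assignments is not something this example demonstrates.

```fortran
program nofusion_sketch
  implicit none
  integer, parameter :: n = 1000   ! hypothetical array size
  real :: a(n), b(n), c(n), d(n)
  integer :: i

  call random_number(a)
  call random_number(b)

  do i = 1, n
     c(i) = 2.0 * a(i)
  end do

  ! Intel-specific directive asking the compiler not to fuse the next loop
  ! with its neighbors; other compilers treat this line as a comment.
  !DIR$ NOFUSION
  do i = 1, n
     d(i) = b(i) + 1.0
  end do

  print *, sum(c), sum(d)
end program nofusion_sketch
```

Compiling this with the same options as in the original post and comparing the vectorization report with and without the directive line should show whether the two loops are still reported separately.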
