Intel® Fortran Compiler

Unrolling problem in Intel Fortran v14.0.2.144 (x86_64/EM64T)

Victor_V_
Beginner

Hi,

 

I would like to report a problem with the unrolling of a very simple loop:

 

      DO 200 I = 1, NDIM
         VEC(I) = A(I*(I+1)/2)
  200 CONTINUE

 

Something goes wrong when the 'A' and 'VEC' arrays/pointers refer to the same memory location and the memory address itself is not aligned on a 16-byte boundary. When this piece of code is compiled with '-O2', we always get wrong results for elements [2:k] of VEC, where k depends on the value of 'NDIM'. However, we always get correct results if unrolling is completely disabled, either by passing the '-unroll=0' option to the compiler or by using the DEC directive:

 

cDEC$ NOUNROLL
      DO 200 I = 1, NDIM
         VEC(I) = A(I*(I+1)/2)
  200 CONTINUE
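
To make the dependence explicit: when 'A' and 'VEC' share storage, iteration I writes element I but reads element I*(I+1)/2, which some later iteration will overwrite. The following is only a sketch of that index pattern (free-form Fortran; the program name and the small NDIM are made up for illustration). It does not reproduce the compiler behaviour, it just lists which elements are both read and later overwritten, so that a load moved after the corresponding store would pick up the wrong value.

```fortran
program overlap_demo
  implicit none
  integer, parameter :: ndim = 6   ! small illustrative size, not from the real test
  integer :: i, k

  ! With VEC and A sharing storage, iteration i reads element i*(i+1)/2
  ! and writes element i.
  print '(a)', '     i reads writes'
  do i = 1, ndim
     k = i*(i+1)/2
     print '(3i6)', i, k, i
  end do

  ! Elements that are read at iteration i and overwritten at the later
  ! iteration k = i*(i+1)/2. In sequential order the read comes first.
  do i = 2, ndim
     k = i*(i+1)/2
     if (k <= ndim) print '(a,i0,a,i0,a,i0)', 'element ', k, &
          ' is read at i=', i, ' and overwritten at i=', k
  end do
end program overlap_demo
```

In strict sequential order every read happens before the overlapping write, which is consistent with the non-unrolled loop giving the expected values. A compiler that assumes the two arguments do not overlap is free to reorder those loads and stores.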

 

Just to show the problem, here is the output from our real test:

 

#1
COPDIA NDIM: 67
ADRESSES :38897040 38897040
ADDRESSES ARE NOT ALIGNED ON 8-BYTE BOUNDARY
2.258586282289821E-004 2.258586282289821E-004 T
1.129034497633417E-003 1.790368778501193E-003 F
1.790368778501193E-003 2.945284711568669E-003 F
1.873467396476006E-003 2.331244172747492E-002 F
2.618478092336402E-003 4.152520189742853E-002 F
2.945284711568669E-003 6.745867362432861E-002 F
1.335999185554419E-002 0.114438101565396      F
1.345512252979789E-002 0.294939942339324      F
1.676828907801860E-002 0.644754785440931      F
2.331244172747492E-002 1.15038004433901       F
3.037574582317423E-002 1.84400534002079       F
3.208721233146836E-002 3.208721233146836E-002 T
....

#2
COPDIA NDIM: 27
ADRESSES :37716080 37716080
ADDRESSES ARE NOT ALIGNED ON 8-BYTE BOUNDARY
3.769405624512559E-002 3.769405624512559E-002 T
8.894699416983025E-002 9.866743377462182E-002 F
9.866743377462182E-002 0.292273738386099      F
0.231468604273682      0.384206422411758      F
0.247144738198662      0.566242246806149      F
0.292273738386099      1.42326482050238       F
0.303953189347391      0.303953189347391      T
....

 

Here we compare the 'VEC' arrays produced by the optimized (unrolled) and modified (non-unrolled) versions of the loop; the values after 'ADRESSES' are the memory locations of the 'A' and 'VEC' arrays.

 

Enclosed please find the code snippet used for our testing, as well as the output from a real test. Since our project is quite big, we cannot provide the whole source code.

 

With best regards,

Victor.

2 Replies
TimP
Honored Contributor III

As your source code violates the Fortran standard, you must set -assume dummy_aliases if you hope to have this work in any reasonable way.  As your attached source code doesn't compile, and you didn't mention this point, I can't verify whether that makes the difference.

Victor_V_
Beginner

Hi Tim,

thank you very much for your expertise and the hint! Indeed, passing the '-assume dummy_aliases' option to the compiler solves the problem. Allocating and using a temporary array also works. I will just mention that Intel v13, as well as other Fortran compilers, produces correct results regardless of argument aliasing.
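
A minimal sketch of the temporary-array variant, for anyone who finds this thread later. It is written in free-form Fortran rather than the fixed form of our real code. The routine name COPDIA is taken from the test output above, but the argument list, the helper array 'tmp' and the small NDIM are assumptions made for illustration only. The idea is that the diagonal is first extracted into separate storage, so the two dummy arguments never refer to the same memory, and only afterwards copied back.

```fortran
program copdia_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: ndim = 5            ! illustrative size
  real(dp) :: a(ndim*(ndim+1)/2)            ! packed triangular storage
  real(dp), allocatable :: tmp(:)           ! hypothetical temporary
  integer :: i

  ! Fill the packed triangle with recognizable values: a(k) = k.
  do i = 1, size(a)
     a(i) = real(i, dp)
  end do

  ! Extract the diagonal into separate storage, so that the dummy
  ! arguments of copdia do not overlap.
  allocate(tmp(ndim))
  call copdia(a, tmp, ndim)

  ! Only now overwrite the leading part of a with the diagonal.
  a(1:ndim) = tmp

  ! Diagonal positions in packed storage are 1, 3, 6, 10, 15.
  if (any(a(1:ndim) /= real([1, 3, 6, 10, 15], dp))) error stop 'unexpected diagonal'
  print '(5f6.1)', a(1:ndim)

contains

  ! Assumed interface: copies element i*(i+1)/2 of a into vec(i).
  subroutine copdia(a, vec, ndim)
    integer, intent(in) :: ndim
    real(dp), intent(in) :: a(*)
    real(dp), intent(out) :: vec(ndim)
    integer :: i
    do i = 1, ndim
       vec(i) = a(i*(i+1)/2)
    end do
  end subroutine copdia

end program copdia_demo
```

The cost is one extra array of length NDIM and one extra copy, but the call no longer depends on '-assume dummy_aliases' or on the unrolling settings.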

With best regards,

Victor.
