Intel® Fortran Compiler

Vectorize Fortran code(s)

happyIntelCamper
Beginner
How would you get code like the following to vectorize? The problem is the indirect memory
references.

subroutine rayfund(n,inp,out,center,disp,amp)
  integer n          ! number of elements
  real inp(n)        ! input array
  real out(n)        ! output array
  integer center(n)  ! center point mapping input to output
  integer disp(n)    ! triangle displacement at a point
  real amp(n)        ! amp weighting
  integer ind,indp,indn

  do i = 1, n
    ind = center(i)
    indp = ind-disp(i)
    indn = ind+disp(i)
    out(i) = out(i)- amp(i)*(inp(indp)-2*inp(ind)+inp(indn))
  enddo

end

6 Replies
jimdempseyatthecove
Honored Contributor III

Try removing the temps
[cpp]do i = 1, n
  out(i) = out(i) - amp(i)*(inp(center(i)-disp(i)) - 2*inp(center(i)) + inp(center(i)+disp(i)))
enddo
[/cpp]

Jim
happyIntelCamper
Beginner

Tried it, and compiling the code fragment still gives:
ifort -O3 -c -vec_report3 fund.F90
fund.F90(12): (col. 29) remark: loop was not vectorized: subscript too complex.
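
For readers stuck on a compiler that reports this remark, one workaround that is sometimes tried (it is not proposed anywhere in this thread) is to split the loop: gather the indexed values into contiguous work arrays first, then do the arithmetic in a second loop that touches only unit-stride data. The sketch below assumes that approach; the name rayfund_split and the work arrays lo, mid and hi are hypothetical and do not come from the original post. Whether it actually runs faster depends on the compiler and the data, so it should be measured rather than assumed.

```fortran
! Sketch only: rayfund_split and the work arrays lo/mid/hi are
! hypothetical names, not part of the original post.
subroutine rayfund_split(n, inp, out, center, disp, amp)
  implicit none
  integer, intent(in)    :: n          ! number of elements
  real,    intent(in)    :: inp(n)     ! input array
  real,    intent(inout) :: out(n)     ! output array, updated in place
  integer, intent(in)    :: center(n)  ! center point mapping input to output
  integer, intent(in)    :: disp(n)    ! triangle displacement at a point
  real,    intent(in)    :: amp(n)     ! amp weighting
  real    :: lo(n), mid(n), hi(n)      ! automatic work arrays
  integer :: i

  ! Gather loop: every index-array lookup happens here.
  ! The compiler may leave this loop scalar.
  do i = 1, n
    mid(i) = inp(center(i))
    lo(i)  = inp(center(i) - disp(i))
    hi(i)  = inp(center(i) + disp(i))
  end do

  ! Compute loop: unit-stride accesses only, so it is a
  ! straightforward candidate for vectorization.
  do i = 1, n
    out(i) = out(i) - amp(i)*(lo(i) - 2*mid(i) + hi(i))
  end do
end subroutine rayfund_split
```

The three work arrays are automatic arrays sized by n, so for very large n they may need to be replaced by allocatable arrays or passed in by the caller.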
TimP
Honored Contributor III

ifort -O3 -c -vec_report3 fund.F90

If you're running on an SSE4 CPU, you should set -xSSE4.1 or -xhost. Without one of those options, you have in effect instructed the compiler not to use the instructions that perform scalar loads directly to a vector register, so you must expect no "vectorization."
Those aren't indirect memory references; they are simple non-unity strided references.
happyIntelCamper
Beginner
Tried -xS, which generates SSE4 code; still the same issue.
Kevin_D_Intel
Employee

Just guessing, but perhaps you are using a pre-11.1 compiler.

Both 11.0 and latest 11.1 vectorize as Tim suggested:

$ ifort -V -c -vec_report3 -xSSE4.1 u66952.f90
Intel Fortran Intel 64 Compiler Professional for applications running on Intel 64, Version 11.1 Build 20090511 Package ID: l_cprof_p_11.1.038
Copyright (C) 1985-2009 Intel Corporation. All rights reserved.

Intel Fortran 11.1-2492
u66952.f90(16): (col. 9) remark: LOOP WAS VECTORIZED.
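
To try this on your own compiler, a small self-contained test case along the following lines may help. It is only a sketch: the program name test_rayfund, the array size and the test values are made up for illustration, and the subroutine is the original one with declarations tightened up. The input is chosen so that the expected result is easy to check by hand.

```fortran
! Hypothetical driver for the rayfund subroutine from the original post.
! The array size and the test values are made up for illustration.
program test_rayfund
  implicit none
  integer, parameter :: n = 16
  real    :: inp(n), out(n), amp(n)
  integer :: center(n), disp(n)
  integer :: i

  do i = 1, n
    inp(i)    = real(i*i)            ! quadratic input
    out(i)    = 0.0
    amp(i)    = 1.0
    center(i) = min(max(i, 2), n-1)  ! keep center-disp and center+disp in 1..n
    disp(i)   = 1
  end do

  call rayfund(n, inp, out, center, disp, amp)

  ! For inp(k) = k**2 and disp = 1, inp(k-1) - 2*inp(k) + inp(k+1) = 2,
  ! so every element of out should be 0 - 1*2 = -2.
  if (any(abs(out + 2.0) > 1.0e-6)) stop 1
  print *, 'ok'
end program test_rayfund

subroutine rayfund(n, inp, out, center, disp, amp)
  implicit none
  integer :: n          ! number of elements
  real    :: inp(n)     ! input array
  real    :: out(n)     ! output array
  integer :: center(n)  ! center point mapping input to output
  integer :: disp(n)    ! triangle displacement at a point
  real    :: amp(n)     ! amp weighting
  integer :: i, ind, indp, indn

  do i = 1, n
    ind  = center(i)
    indp = ind - disp(i)
    indn = ind + disp(i)
    out(i) = out(i) - amp(i)*(inp(indp) - 2*inp(ind) + inp(indn))
  end do
end subroutine rayfund
```

Compiling it with the options shown above (-vec_report3 -xSSE4.1) should report on the loop inside rayfund. Those option spellings are the ones used in this thread for the 11.x compilers; other releases may spell the report option differently.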
happyIntelCamper
Beginner

Yes that works!! Much THANKS!!!