Intel® Fortran Compiler

Vectorize Fortran code(s)

happyIntelCamper
Beginner
764 views
How would you get code like the following to vectorize? The problem is the indirect memory references.

subroutine rayfund(n,inp,out,center,disp,amp)
  integer n           ! number of elements
  real inp(n)         ! input array
  real out(n)         ! output array
  integer center(n)   ! center point mapping input to output
  integer disp(n)     ! triangle displacement at a point
  real amp(n)         ! amp weighting
  integer ind,indp,indn

  do i = 1, n
    ind = center(i)
    indp = ind-disp(i)
    indn = ind+disp(i)
    out(i) = out(i) - amp(i)*(inp(indp)-2*inp(ind)+inp(indn))
  enddo

end

0 points
6 Replies
jimdempseyatthecove
Honored Contributor III

Try removing the temporaries:
[cpp]do i = 1, n
  out(i) = out(i) - amp(i)*(inp(center(i)-disp(i))-2*inp(center(i))+inp(center(i)+disp(i)))
enddo
[/cpp]


Jim

happyIntelCamper
Beginner

I tried it and still get the following when compiling the code fragment:
ifort -O3 -c -vec_report3 fund.F90
fund.F90(12): (col. 29) remark: loop was not vectorized: subscript too complex.

TimP
Honored Contributor III

Quoting - happyIntelCamper
ifort -O3 -c -vec_report3 fund.F90

If you're running on an SSE4 CPU, you should have set -xSSE4.1 or -xhost. You have instructed the compiler not to use the instructions which perform scalar loads directly to a vector register, so you must expect no "vectorization."
Those aren't indirect memory references; they are simple non-unity strided references.
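
If a particular compiler version or target still reports "subscript too complex", one workaround that may be worth trying is to split the loop so that the index-array lookups happen in their own gather loop and the arithmetic runs over unit-stride temporaries. The following is only a minimal sketch under that assumption: the names rayfund_sketch, rayfund_gather, lo, mid and hi are illustrative and not from the original post, and whether this is faster than the single loop has not been measured here.

```fortran
! Hypothetical variant of rayfund: gather first, then compute with unit stride.
! All names introduced here (rayfund_sketch, rayfund_gather, lo, mid, hi)
! are illustrative assumptions, not part of the original code.
module rayfund_sketch
  implicit none
contains
  subroutine rayfund_gather(n, inp, out, center, disp, amp)
    integer, intent(in)    :: n          ! number of elements
    real,    intent(in)    :: inp(n)     ! input array
    real,    intent(inout) :: out(n)     ! output array, updated in place
    integer, intent(in)    :: center(n)  ! center point mapping input to output
    integer, intent(in)    :: disp(n)    ! triangle displacement at a point
    real,    intent(in)    :: amp(n)     ! amp weighting
    real    :: lo(n), mid(n), hi(n)      ! gathered operands (automatic arrays)
    integer :: i

    ! Gather loop: the only place where center/disp appear in a subscript.
    do i = 1, n
      lo(i)  = inp(center(i) - disp(i))
      mid(i) = inp(center(i))
      hi(i)  = inp(center(i) + disp(i))
    end do

    ! Compute loop: every reference is unit stride in i.
    do i = 1, n
      out(i) = out(i) - amp(i) * (lo(i) - 2.0 * mid(i) + hi(i))
    end do
  end subroutine rayfund_gather
end module rayfund_sketch
```

The trade-off is three extra temporary arrays of length n and one more pass over memory, so it is only likely to help when the compute loop vectorizes and the original single loop does not.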

happyIntelCamper
Beginner
I tried -xS, which generates SSE4 code, and still see the same issue.

Kevin_D_Intel
Employee

Just guessing, but perhaps you are using a pre-11.1 compiler.

Both 11.0 and the latest 11.1 vectorize the loop with the option Tim suggested:

$ ifort -V -c -vec_report3 -xSSE4.1 u66952.f90
Intel Fortran Intel 64 Compiler Professional for applications running on Intel 64, Version 11.1 Build 20090511 Package ID: l_cprof_p_11.1.038
Copyright (C) 1985-2009 Intel Corporation. All rights reserved.

Intel Fortran 11.1-2492
u66952.f90(16): (col. 9) remark: LOOP WAS VECTORIZED.
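
After changing the -x option, it can be reassuring to confirm that the vectorized loop still produces the expected numbers. Below is a minimal sketch of such a check, assuming the test input inp(k) = k**2, for which the expression inp(c-d) - 2*inp(c) + inp(c+d) equals exactly 2*d**2. The module name rayfund_check and the function rayfund_max_error are hypothetical helpers, and rayfund is copied from the question with only a declaration for i added.

```fortran
! Hypothetical self-check for rayfund; rayfund_check and rayfund_max_error
! are illustrative names, not part of the original post.
module rayfund_check
  implicit none
contains
  ! rayfund as posted, with i declared because implicit none is in effect.
  subroutine rayfund(n, inp, out, center, disp, amp)
    integer n           ! number of elements
    real inp(n)         ! input array
    real out(n)         ! output array
    integer center(n)   ! center point mapping input to output
    integer disp(n)     ! triangle displacement at a point
    real amp(n)         ! amp weighting
    integer i, ind, indp, indn
    do i = 1, n
      ind  = center(i)
      indp = ind - disp(i)
      indn = ind + disp(i)
      out(i) = out(i) - amp(i)*(inp(indp) - 2*inp(ind) + inp(indn))
    enddo
  end subroutine rayfund

  ! Returns the largest absolute deviation from the analytic answer.
  function rayfund_max_error() result(max_err)
    integer, parameter :: n = 16
    real    :: max_err
    real    :: inp(n), out(n), amp(n), expected(n)
    integer :: center(n), disp(n), i

    do i = 1, n
      inp(i)      = real(i)**2            ! inp(k) = k**2
      center(i)   = 3 + mod(i, 10)        ! stays within 3..12
      disp(i)     = mod(i, 3)             ! 0, 1 or 2, so center +/- disp is within 1..14
      expected(i) = -2.0 * real(disp(i)**2)
    end do
    out = 0.0
    amp = 1.0

    call rayfund(n, inp, out, center, disp, amp)
    max_err = maxval(abs(out - expected))
  end function rayfund_max_error
end module rayfund_check
```

Calling rayfund_max_error() from a small main program, built once with and once without the -x option, should report zero deviation in both cases, since every value involved is a small integer that single precision represents exactly.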

happyIntelCamper
Beginner

Yes, that works!! Many thanks!!!