Loop Vectorization: "unsupported data type"

saratoga · 05-01-2010

Hi.

A while ago I asked about vectorizing with Intel Fortran and was told that it did not support complex variables. I've decided to take another stab at it and have rewritten a simple loop from my code without using any COMPLEX types.

My core loop looks like this:

      DO 400 N=NMIN,NMAX
         DV1N=M*DV1(N)
         DV2N=DV2(N)
         D11=DV1N*DV1NN
         D12=DV1N*DV2NN
         D21=DV2N*DV1NN
         D22=DV2N*DV2NN
         VVR=VVR+(
     &        (TR11(M1,N,NN)*D11 + TR21(M1,N,NN)*D21
     &        +TR12(M1,N,NN)*D12 + TR22(M1,N,NN)*D22)
     &       )
  400 CONTINUE

But I get:

ampl.f(253): (col. 9) remark: vector dependence: assumed ANTI dependence between vvr line 253 and vvr line 253.

ampl.f(253): (col. 9) remark: vector dependence: assumed FLOW dependence between vvr line 253 and vvr line 253.

ampl.f(253): (col. 9) remark: vector dependence: assumed ANTI dependence between vvr line 253 and vvr line 253.

Also, TR11, TR12, TR21, TR22, and VVR are all REAL*4, while the rest are all REAL*8 (but that could probably be changed if needed).

jimdempseyatthecove · 05-02-2010

The blend of real*4 and real*8 will affect the vectorization to some extent.
However, the loop control variable N is not the index that addresses adjacent data (the data that can be vectorized). That would be the leftmost index. Try to rework the code so that the innermost loop operates on the leftmost index.

Either redefine the arrays to TRnn(N,M1,NN)
or
arrange your DO loops to nest on NN, N and M1
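
A minimal sketch of the first option, assuming the arrays can be redeclared with N as the leftmost index. The array extents, the fill values, and the fixed M, M1 and NN are hypothetical, chosen only so the program runs on its own; it is written in free-form Fortran rather than the fixed form of the original.

```fortran
! Sketch only: sizes, fill values, and the fixed M, M1, NN are assumptions,
! not taken from the original code. REAL(4)/REAL(8) mirror REAL*4/REAL*8
! as accepted by Intel Fortran and gfortran.
program stride_one_sketch
  implicit none
  integer, parameter :: nmin = 1, nmax = 8     ! assumed loop bounds
  integer, parameter :: m1max = 2, nnmax = 2   ! assumed array extents
  ! N is the leftmost index, so consecutive N are adjacent in memory.
  real(4) :: tr11(nmax, m1max, nnmax), tr12(nmax, m1max, nnmax)
  real(4) :: tr21(nmax, m1max, nnmax), tr22(nmax, m1max, nnmax)
  real(8) :: dv1(nmax), dv2(nmax)
  real(8) :: dv1n, dv2n, dv1nn, dv2nn, d11, d12, d21, d22
  real(4) :: vvr
  integer :: n, m, m1, nn

  m = 1
  m1 = 1
  nn = 1
  tr11 = 1.0
  tr12 = 2.0
  tr21 = 3.0
  tr22 = 4.0
  dv1 = 1.0d0
  dv2 = 1.0d0
  dv1nn = 1.0d0
  dv2nn = 1.0d0
  vvr = 0.0

  do n = nmin, nmax
    dv1n = m * dv1(n)
    dv2n = dv2(n)
    d11 = dv1n * dv1nn
    d12 = dv1n * dv2nn
    d21 = dv2n * dv1nn
    d22 = dv2n * dv2nn
    ! Stride-1 access: only the leftmost subscript changes inside the loop.
    vvr = vvr + (tr11(n, m1, nn) * d11 + tr21(n, m1, nn) * d21 &
               + tr12(n, m1, nn) * d12 + tr22(n, m1, nn) * d22)
  end do

  ! With these fill values each iteration adds 10, so the sum is 80.
  print *, vvr
end program stride_one_sketch
```

The sketch keeps the mixed REAL*4 and REAL*8 types of the original, so it addresses only the memory layout, not the data type issue.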

Jim Dempsey

TimP · 05-02-2010

Some cases of complex vectorization are supported with -msse3 (and Intel-only options which include SSE3).
As Jim indicated, you would have to arrange your data for stride 1 to support vectorization, at least with options short of SSE4, and you would need to avoid the mixed data types. Better support of vectorization with combined single and double precision is under development, but that will not eliminate the problem you have here. Also, the otherwise desirable option -fp-model source will suppress vectorization of sum reduction, in case you happen to be setting it. The message about dependencies isn't very helpful; it seems most likely to apply in the latter case.
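
A rough sketch of what avoiding the mixed data types could look like, assuming everything in the loop can be held in double precision and that the TR values for a fixed M1 and NN are available as contiguous one-dimensional slices. The names `vvr_dp` and `dp`, the sizes, and the fill values are hypothetical and not from the original code.

```fortran
! Sketch only: every operand in the loop is double precision, and the sum
! is converted to single precision once, after the loop. Names, sizes and
! fill values are assumptions for illustration.
program uniform_precision_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nmin = 1, nmax = 8     ! assumed loop bounds
  ! Assumed contiguous slices of TR11..TR22 for one fixed M1 and NN.
  real(dp) :: tr11(nmax), tr12(nmax), tr21(nmax), tr22(nmax)
  real(dp) :: dv1(nmax), dv2(nmax)
  real(dp) :: dv1n, dv2n, dv1nn, dv2nn
  real(dp) :: vvr_dp                           ! double precision accumulator
  real :: vvr                                  ! single precision result
  integer :: n, m

  m = 1
  tr11 = 1.0_dp
  tr12 = 2.0_dp
  tr21 = 3.0_dp
  tr22 = 4.0_dp
  dv1 = 1.0_dp
  dv2 = 1.0_dp
  dv1nn = 1.0_dp
  dv2nn = 1.0_dp
  vvr_dp = 0.0_dp

  do n = nmin, nmax
    dv1n = m * dv1(n)
    dv2n = dv2(n)
    ! Sum reduction over stride-1 data with a single data type.
    vvr_dp = vvr_dp + (tr11(n) * dv1n * dv1nn + tr21(n) * dv2n * dv1nn &
                     + tr12(n) * dv1n * dv2nn + tr22(n) * dv2n * dv2nn)
  end do

  ! Narrow to single precision once, outside the loop.
  vvr = real(vvr_dp, kind(1.0))
  print *, vvr
end program uniform_precision_sketch
```

Whether the compiler actually vectorizes such a reduction still depends on the options in effect, as noted above for -fp-model source.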