Accumulating sum in a FORALL

Dishaw__Jim · ‎12-04-2006

I have the following piece of code that I would like to use (which does work but generates a compiler warning):

FORALL (ix=1:nCells)
FORALL (n=1:nMax) phiA(ix) = phiA(ix) + w(n) * psi(n,ix)
END FORALL

The warning is

All active combinations of index-names are not used within the variable being defined
(i.e. leftside) of this assignment-stmt. [PHIA]

The inner FORALL breaks the "No element of an array can be assigned a value more than once" rule. From what I understand, I must use DO loops in order to do the accumulating sum properly.

In almost all cases, nCells > nMax (the only exceptions occur with fake debugging cases). The value of nMax is typically less than 100 and nCells can be in the thousands. What is the "best" way to code an accumulating sum that will enable the optimizer to maximize performance?

Steven_L_Intel1 · ‎12-04-2006

FORALL can be thought of, conceptually, as a "parallel DO". The rules are set up so that there is no interaction between executions of the FORALL body. The way you coded it, there is a dependency among all the invocations in the inner FORALL.

Your best bet is to code it simply in a DO loop. This will give the compiler the best opportunity to vectorize and otherwise optimize it.
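
A minimal sketch of the DO-loop form, for illustration only. The array names follow the question; the array sizes, the test values, and the zero initialization of phiA are assumptions made so the example runs on its own.

```fortran
program accumulate_do
  implicit none
  ! Illustrative sizes (assumed); the question mentions nMax < 100
  ! and nCells in the thousands.
  integer, parameter :: nMax = 4, nCells = 6
  real :: w(nMax), psi(nMax, nCells), phiA(nCells)
  integer :: n, ix

  ! Hypothetical test data.
  do n = 1, nMax
     w(n) = real(n)
  end do
  do ix = 1, nCells
     do n = 1, nMax
        psi(n, ix) = real(n + ix)
     end do
  end do

  ! Assumed starting value; the original code accumulates into
  ! whatever phiA already holds.
  phiA = 0.0

  do ix = 1, nCells
     ! The inner loop walks the first index of psi, which is
     ! contiguous in memory in Fortran's column-major layout.
     do n = 1, nMax
        phiA(ix) = phiA(ix) + w(n) * psi(n, ix)
     end do
  end do

  print *, phiA
end program accumulate_do
```

Each phiA(ix) is updated nMax times, which is legal in a DO loop because the iterations execute in order; the same repeated assignment is what FORALL prohibits.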

TimP · ‎12-05-2006

It looks as if you could replace the inner loop with DOT_PRODUCT(). Then, it may make little difference whether the outer loop is DO or FORALL, except that OpenMP parallel do would be available for use with DO. Why not use MATMUL, or ?GEMM from one of the optimized BLAS libraries such as MKL?
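
A sketch of this suggestion, under the same kind of assumptions (hypothetical sizes and test data, phiA starting at zero). It computes the sum once with DOT_PRODUCT inside a DO loop and once with MATMUL, then checks that the two agree. Because w is a vector, the BLAS counterpart would be the matrix-vector routine ?GEMV rather than ?GEMM; that call appears only as a comment, since it requires linking a BLAS library.

```fortran
program accumulate_intrinsics
  implicit none
  ! Illustrative sizes and data (assumed, not from the thread).
  integer, parameter :: nMax = 4, nCells = 6
  real :: w(nMax), psi(nMax, nCells)
  real :: phiDot(nCells), phiMat(nCells)
  integer :: n, ix

  do n = 1, nMax
     w(n) = real(n)
  end do
  do ix = 1, nCells
     do n = 1, nMax
        psi(n, ix) = real(n + ix)
     end do
  end do

  ! Variant 1: DOT_PRODUCT replaces the inner loop.
  phiDot = 0.0
  do ix = 1, nCells
     phiDot(ix) = phiDot(ix) + dot_product(w, psi(:, ix))
  end do

  ! Variant 2: MATMUL with a rank-1 first argument of shape (nMax)
  ! and a rank-2 second argument of shape (nMax, nCells) returns a
  ! rank-1 result of shape (nCells).
  phiMat = 0.0
  phiMat = phiMat + matmul(w, psi)

  ! BLAS equivalent of the accumulation (single precision, needs a
  ! BLAS library at link time), computing phiMat = psi^T * w + phiMat:
  !   call sgemv('T', nMax, nCells, 1.0, psi, nMax, w, 1, 1.0, phiMat, 1)

  ! Both variants should agree to within rounding.
  if (any(abs(phiDot - phiMat) > 1.0e-5)) error stop "results differ"
  print *, phiMat
end program accumulate_intrinsics
```

Which form is fastest depends on the compiler, the array sizes, and the BLAS library in use, so it is worth timing them on representative data.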

Dishaw__Jim · ‎12-05-2006

I didn't even think about the BLAS call--thanks for the reminder.