Intel® Fortran Compiler

Accumulating sum in a FORALL

Dishaw__Jim
Beginner
1,945 Views
I have the following piece of code that I would like to use (which does work but generates a compiler warning):

FORALL (ix=1:nCells)
FORALL (n=1:nMax) phiA(ix) = phiA(ix) + w(n) * psi(n,ix)
END FORALL

The warning is

All active combinations of index-names are not used within the variable being defined
(i.e. leftside) of this assignment-stmt. [PHIA]

The inner FORALL breaks the "No element of an array can be assigned a value more than once" rule. From what I understand, I must use DO loops in order to do the accumulating sum properly.

In almost all cases, nCells > nMax (the only exceptions occur with fake debugging cases). The value of nMax is typically less than 100 and nCells can be in the thousands. What is the "best" way to code an accumulating sum that will enable the optimizer to maximize performance?

3 Replies
Steven_L_Intel1
Employee

FORALL can be thought of, conceptually, as a "parallel DO". The rules are set up so that there is no interaction between executions of the FORALL body. The way you coded it, there is a dependency among all the invocations in the inner FORALL.

Your best bet is to code it simply in a DO loop. This will give the compiler the best opportunity to vectorize and otherwise optimize it.
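
As a rough sketch of the plain DO-loop form, the nested FORALL from the question could be written as below. The array sizes, the REAL type, and the fill values are assumptions made only for illustration; the original post does not give declarations. The inner loop runs over n, the first (contiguous) index of psi, which is the access pattern a vectorizer generally prefers.

```fortran
program accumulate_do
  implicit none
  ! Small illustrative sizes and REAL type are assumptions, not from the post.
  integer, parameter :: nMax = 3, nCells = 4
  real :: w(nMax), psi(nMax, nCells), phiA(nCells)
  integer :: ix, n

  w    = [1.0, 2.0, 3.0]   ! hypothetical weights
  psi  = 1.0               ! hypothetical values
  phiA = 0.0               ! the accumulator must be initialized before summing

  do ix = 1, nCells
    ! Inner loop walks down a column of psi (stride-1 in memory).
    do n = 1, nMax
      phiA(ix) = phiA(ix) + w(n) * psi(n, ix)
    end do
  end do

  ! With the values above, every element is 1 + 2 + 3 = 6.
  print *, phiA
end program accumulate_do
```

Unlike the FORALL version, each phiA(ix) is updated sequentially here, so the repeated assignment to the same element is well defined.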

TimP
Honored Contributor III
It looks as if you could replace the inner loop with DOT_PRODUCT(). Then, it may make little difference whether the outer loop is DO or FORALL, except that OpenMP parallel do would be available for use with DO. Why not use MATMUL, or ?GEMM from one of the optimized BLAS libraries such as MKL?
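
A minimal sketch of the intrinsic-based alternatives, assuming the same shapes as in the question (w of length nMax, psi of shape (nMax, nCells)); the sizes, REAL type, and fill values are illustrative assumptions. Since the result is a vector, the matching BLAS routine would be ?GEMV rather than ?GEMM; it is shown only as a comment because it needs a BLAS library such as MKL at link time.

```fortran
program accumulate_intrinsics
  implicit none
  ! Sizes, type, and values are assumptions chosen for illustration.
  integer, parameter :: nMax = 3, nCells = 4
  real :: w(nMax), psi(nMax, nCells)
  real :: phiA(nCells), phiB(nCells)
  integer :: ix

  w    = [1.0, 2.0, 3.0]
  psi  = 1.0
  phiA = 0.0
  phiB = 0.0

  ! Variant 1: DOT_PRODUCT replaces the inner loop; the outer loop stays a DO
  ! so that an OpenMP "parallel do" directive could be added to it.
  do ix = 1, nCells
    phiA(ix) = phiA(ix) + dot_product(w, psi(:, ix))
  end do

  ! Variant 2: MATMUL of a rank-1 vector (nMax) with a rank-2 array
  ! (nMax, nCells) yields a rank-1 result of length nCells.
  phiB = phiB + matmul(w, psi)

  ! Variant 3 (needs a BLAS library, so left as a comment):
  !   call sgemv('T', nMax, nCells, 1.0, psi, nMax, w, 1, 1.0, phiB, 1)
  ! computes phiB = 1.0 * transpose(psi) * w + 1.0 * phiB.

  print *, phiA
  print *, phiB
end program accumulate_intrinsics
```

Which variant is fastest is likely to depend on the compiler, the options used, and the problem size, so it is worth timing them on representative nMax and nCells.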
Dishaw__Jim
Beginner
I didn't even think about the BLAS call. Thanks for the reminder.