Whole array operations and OpenMP

grs2103
Beginner
Has anyone experimented with the OpenMP WORKSHARE directive on whole-array operations with ifort? I tried it a bit with gfortran, but found that it gave me no speedup, and that I really need to resort to parallel do/end do loops to get a boost out of OpenMP.
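For anyone who wants to reproduce the comparison, here is a minimal sketch of the two variants being discussed. The program name, array names, and problem size are made up for illustration, and a single timed run like this is only a rough indication (the first region also pays for thread startup).

```fortran
program workshare_vs_do
  use omp_lib, only: omp_get_wtime
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 5000000        ! hypothetical problem size
  real(dp), allocatable :: a(:), b(:), c(:)
  real(dp) :: t0, t1
  integer :: i

  allocate(a(n), b(n), c(n))
  a = 1.0_dp
  b = 2.0_dp

  ! Variant 1: whole-array assignment inside WORKSHARE.
  ! Whether this is actually divided among threads is up to the compiler.
  t0 = omp_get_wtime()
  !$omp parallel workshare
  c = a + 2.0_dp*b
  !$omp end parallel workshare
  t1 = omp_get_wtime()
  print *, 'workshare  : c(n) =', c(n), ' seconds =', t1 - t0

  ! Variant 2: the same operation written as an explicit parallel do loop.
  t0 = omp_get_wtime()
  !$omp parallel do
  do i = 1, n
     c(i) = a(i) + 2.0_dp*b(i)
  end do
  !$omp end parallel do
  t1 = omp_get_wtime()
  print *, 'parallel do: c(n) =', c(n), ' seconds =', t1 - t0
end program workshare_vs_do
```

This should build with `gfortran -fopenmp`, or with ifort using `-qopenmp` (`-openmp` on older releases). Comparing the two timings at different thread counts shows whether the WORKSHARE version is being threaded at all.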
TimP
Honored Contributor III
Yes, that's what I found, with both gfortran and ifort. A more common situation where OpenMP helps is an outer parallel do loop containing whole array operations. The -parallel option of ifort may be able to thread a whole array operation, but it doesn't judge well how to benefit from the combination of vectorization and parallelization.
You might note that you need a recent gfortran with options -mtune=barcelona -msse4 to get full benefit of vectorization on recent CPUs. On some of the older CPUs, there is less benefit from combining vectorization and OpenMP, but a more common reason why people forgo vectorization is to be able to start from a lower base when bragging about parallel performance scaling.
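As a rough illustration of the outer-loop pattern mentioned above (array names and sizes are hypothetical), the threads split the columns among themselves, and each whole-column array operation inside the loop is left for the compiler to vectorize:

```fortran
program outer_parallel_do
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2000, m = 2000   ! hypothetical sizes
  real(dp), allocatable :: a(:,:), b(:,:), c(:,:)
  integer :: j

  allocate(a(n,m), b(n,m), c(n,m))
  a = 1.0_dp
  b = 2.0_dp

  ! Outer loop over columns is shared among threads; j is private
  ! because it is the loop variable of the parallel do.
  !$omp parallel do
  do j = 1, m
     ! Whole-array (column) operation on contiguous memory,
     ! a candidate for vectorization within each thread.
     c(:,j) = a(:,j) + 2.0_dp*b(:,j)
  end do
  !$omp end parallel do

  print *, 'c(1,1) =', c(1,1), ' c(n,m) =', c(n,m)
end program outer_parallel_do
```

The column-wise slicing is a deliberate choice here: Fortran stores arrays column-major, so `c(:,j)` is contiguous and each thread works on its own block of memory.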
pbkenned1
Employee
Currently with Intel Fortran, the OpenMP WORKSHARE directive is implemented with a SINGLE construct, and so no parallel code is generated. We are considering a threaded implementation limited to parallelizing a single FORALL construct, a single WHERE construct, or a single block of F90 array assignments.
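To make the three cases concrete, here is a small sketch with one WORKSHARE region per form: a block of array assignments, a WHERE construct, and a FORALL. The names and sizes are hypothetical, and whether any of these regions actually runs threaded depends on the compiler and version; with a SINGLE-based implementation the results are the same but one thread does all the work.

```fortran
program workshare_forms
  implicit none
  integer, parameter :: n = 1000000     ! hypothetical problem size
  real, allocatable :: x(:), y(:), z(:)
  integer :: i

  allocate(x(n), y(n), z(n))
  x = 1.0
  y = -1.0
  z = 0.0

  !$omp parallel

  ! Form 1: a block of array assignments.
  !$omp workshare
  z = x + y
  y = 2.0*x
  !$omp end workshare

  ! Form 2: a single WHERE construct.
  !$omp workshare
  where (y > 0.0)
     z = sqrt(y)
  elsewhere
     z = 0.0
  end where
  !$omp end workshare

  ! Form 3: a single FORALL statement.
  !$omp workshare
  forall (i = 1:n) x(i) = z(i) + real(i)
  !$omp end workshare

  !$omp end parallel

  print *, 'x(1) =', x(1), ' x(n) =', x(n)
end program workshare_forms
```

Each `end workshare` carries an implicit barrier, so the three regions run one after another even when they share a single parallel region.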

For reference, we are tracking this as compiler feature request DPD200045053.

Patrick Kennedy
Intel Compiler Lab
pbkenned1
Employee
Certain OpenMP* WORKSHARE constructs now parallelize with Intel® Fortran Compiler 15.0. Our implementation is described here.

Patrick
