Intel® Fortran Compiler

whole array operations and openmp

grs2103
Beginner
Has anyone experimented with the OpenMP WORKSHARE directive on whole-array operations with ifort? I played with it a bit in gfortran but found it gave me no speedup; I have to resort to parallel do/end do loops to get a boost out of OpenMP.
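
For readers who want to reproduce the comparison, here is a minimal sketch of the two forms in question. The array size, variable names, and the expression are illustrative assumptions, not taken from the original post, and timings will vary by compiler and machine.

```fortran
program workshare_vs_do
  use omp_lib, only: omp_get_wtime
  implicit none
  integer, parameter :: n = 5000000   ! illustrative size (assumption)
  real(8), allocatable :: a(:), b(:), c(:)
  real(8) :: t0, t1, t2
  integer :: i

  allocate(a(n), b(n), c(n))
  a = 1.0d0
  b = 2.0d0

  ! Form 1: whole-array assignment inside WORKSHARE
  t0 = omp_get_wtime()
  !$omp parallel workshare
  c = a + 2.0d0*b
  !$omp end parallel workshare
  t1 = omp_get_wtime()

  ! Form 2: the same computation as an explicit PARALLEL DO loop
  !$omp parallel do
  do i = 1, n
     c(i) = a(i) + 2.0d0*b(i)
  end do
  !$omp end parallel do
  t2 = omp_get_wtime()

  print '(a,f10.6)', 'workshare   (s): ', t1 - t0
  print '(a,f10.6)', 'parallel do (s): ', t2 - t1
end program workshare_vs_do
```

This needs OpenMP enabled at compile time, for example `gfortran -fopenmp`; ifort used `-openmp` at the time and `-qopenmp` in later releases.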
3 Replies
TimP
Honored Contributor III
Yes, that's what I found, with both gfortran and ifort. OpenMP more commonly pays off when an outer parallel do loop contains whole-array operations. The -parallel option of ifort may be able to thread a whole-array operation, but it doesn't judge well how to benefit from combining vectorization with threading.
You might note that you need a recent gfortran with the options -mtune=barcelona -msse4 to get the full benefit of vectorization on recent CPUs. On some older CPUs there is less benefit from combining vectorization and OpenMP, but a more common reason people forgo vectorization is to start from a lower baseline when bragging about parallel performance scaling.
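
As a rough illustration of the pattern described above (an outer parallel do loop containing whole-array operations), something like the following could be tried. The array shapes and the expression are hypothetical; the idea is that threads split the outer loop over columns while the compiler is free to vectorize each column-wise array assignment.

```fortran
program outer_parallel_do
  implicit none
  integer, parameter :: n = 1000, m = 200   ! illustrative sizes (assumption)
  real(8), allocatable :: a(:,:), b(:,:)
  integer :: j

  allocate(a(n,m), b(n,m))
  b = 1.0d0

  ! Threads divide the columns; each iteration is a whole-array
  ! (array section) operation on one contiguous column.
  !$omp parallel do
  do j = 1, m
     a(:,j) = 2.0d0*b(:,j) + real(j, 8)
  end do
  !$omp end parallel do

  print *, a(1,1), a(n,m)
end program outer_parallel_do
```

Looping over the last index keeps each thread on contiguous memory, since Fortran stores arrays in column-major order.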
pbkenned1
Employee

Currently with Intel Fortran, the OpenMP WORKSHARE directive is implemented with a SINGLE construct, so no parallel code is generated. We are considering a threaded implementation limited to parallelizing a single FORALL construct, a single WHERE construct, or a single block of F90 array assignments.
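
To make the consequence concrete, here is a small sketch (array names and size are illustrative assumptions). Under an implementation like the one described, the first region would behave roughly like the second: one thread performs the whole assignment while the others wait at the end of the construct.

```fortran
program workshare_as_single
  implicit none
  integer, parameter :: n = 1000   ! illustrative size (assumption)
  real(8) :: a(n), b(n)

  b = 1.0d0

  ! What the user writes:
  !$omp parallel
  !$omp workshare
  a = 2.0d0*b
  !$omp end workshare
  !$omp end parallel

  ! Roughly what runs when WORKSHARE is treated as SINGLE:
  ! only one thread executes the array assignment.
  !$omp parallel
  !$omp single
  a = 2.0d0*b
  !$omp end single
  !$omp end parallel

  print *, sum(a)
end program workshare_as_single
```

Both regions give the same result; the difference is only in how the work is distributed, which is consistent with seeing no speedup from WORKSHARE.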

For reference, we are tracking this as compiler feature request DPD200045053.

Patrick Kennedy
Intel Compiler Lab
pbkenned1
Employee

Certain OpenMP* WORKSHARE constructs now parallelize with Intel® Fortran Compiler 15.0. Our implementation is described here.

Patrick
