Intel® Fortran Compiler

Help a newbie with DO CONCURRENT: how to convert OpenMP code

Matt_Thompson
Novice

All,

I have some standard code that I OMPized (and MPIized, and GPUized and MICized and...) and figured I should try DO CONCURRENT as well. Now, my first naive attempt was to replace:

!$omp parallel do default(private) &
!$omp shared(m,np,ict,icb,nb,overcast) &
...
!$omp shared(caib, caif)

      RUN_LOOP: do i=1,m

with:

      RUN_LOOP: do concurrent (i=1:m)

Now, in doing so, the code does run, but it isn't parallel at all. I can setenv OMP_NUM_THREADS to 4 or 28 and no difference in speed.

This was compiled with -qopenmp. In my desire to see some effect, I tried using -qopenmp -parallel. Now, this definitely spawned threads, but it did so in a bad way: OMP_NUM_THREADS=1 took ~5 seconds, OMP_NUM_THREADS=4 took ~12 seconds.

So, is there a nice standard treatise/tutorial on how to take a code that works with OpenMP and convert to use DO CONCURRENT?

Matt_Thompson
Novice

Note: I should add that I did have to assure the compiler that all my subroutines were pure. Which I'm fairly certain they are (all nicely INTENTed and everything), but they are big subroutines...
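
For readers following along, here is a minimal sketch of what that purity requirement looks like in practice. Any procedure referenced inside a DO CONCURRENT body must be PURE. The module name kernels, the routine column_update, and the arithmetic are hypothetical stand-ins, not the routines from the original code.

```fortran
! Hypothetical stand-in for one of the "big subroutines" in the post.
module kernels
  implicit none
contains
  ! PURE promises no side effects; every dummy argument carries an INTENT.
  pure subroutine column_update(x, y)
    real, intent(in)  :: x
    real, intent(out) :: y
    y = 2.0 * x + 1.0
  end subroutine column_update
end module kernels

program dc_pure
  use kernels, only: column_update
  implicit none
  integer, parameter :: m = 8
  integer :: i
  real :: x(m), y(m)

  x = [(real(i), i = 1, m)]

  ! Each iteration touches only its own elements x(i) and y(i).
  do concurrent (i = 1:m)
    call column_update(x(i), y(i))
  end do

  ! Self-check against the same formula applied elementwise.
  if (any(y /= 2.0 * x + 1.0)) error stop "unexpected result"
  print *, y
end program dc_pure
```

Dropping the pure prefix from column_update should make a conforming compiler reject the call inside the loop.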

Steve_Lionel
Black Belt Retired Employee

In Intel's compiler, DO CONCURRENT does not parallelize unless -parallel is set. But there's no guarantee of parallelization even so - it depends on whether the compiler thinks it is safe and effective. Unlike with OpenMP, there is not (yet - coming in Fortran 2018) syntax to specify the locality of variables within the loop. That you have a number of shared variables makes me suspect that the compiler did not think parallelization was safe.
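
Until locality specifiers are available, one standard-conforming way to get a per-iteration temporary is to declare it in a BLOCK construct (Fortran 2008) inside the loop body. The sketch below is illustrative only: the names a, b, and tmp are made up, and nothing here guarantees that a given compiler will actually parallelize the loop.

```fortran
program dc_block
  implicit none
  integer, parameter :: m = 8
  integer :: i
  real :: a(m), b(m)

  b = [(real(i), i = 1, m)]

  do concurrent (i = 1:m)
    block
      ! tmp is a construct entity of the BLOCK, so each iteration
      ! gets its own copy, much like an OpenMP private variable.
      real :: tmp
      tmp = b(i) * b(i)
      a(i) = tmp + b(i)
    end block
  end do

  ! Self-check against the same formula applied elementwise.
  if (any(a /= b * b + b)) error stop "unexpected result"
  print *, a
end program dc_block
```

Arrays such as a and b, where each iteration references only its own element, play the role of the OpenMP shared variables.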

Matt_Thompson
Novice

Steve,

Thanks. That was our thought. We tried having fun with '-par-threshold=0' and other options, but it just never worked. I mean, it changed the optrpt, but nothing else.

As an aside, when 2018 is supported, what exactly will the spec look like? A colleague and I tried to parse the standard and we think:

DO CONCURRENT (i=1:m) LOCAL(x,y,z) LOCAL_INIT(q,r,s) SHARED(g,h,i)

but we aren't sure. I guess I need a tldr (https://github.com/tldr-pages/tldr) for the man page that is the Standard...which I guess are the Metcalf or Brainerd books. :)

Steve_Lionel
Black Belt Retired Employee

Yes, that's the correct syntax.
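
As a rough illustration of that syntax in a complete program, here is a minimal sketch. It assumes a compiler that already implements the Fortran 2018 locality specifiers, and the variable names (a, b, tmp, scale) are invented for the example. The loop index is not listed in any of the specifiers.

```fortran
program dc_locality
  implicit none
  integer, parameter :: m = 8
  integer :: i
  real :: a(m), b(m), tmp, scale

  scale = 2.0
  b = [(real(i), i = 1, m)]

  ! local(tmp):        each iteration has its own, initially undefined, tmp
  ! local_init(scale): each iteration has its own scale, initialized from
  !                    the value the outer variable had before the loop
  ! shared(a, b):      all iterations refer to the same a and b
  do concurrent (i = 1:m) local(tmp) local_init(scale) shared(a, b)
    tmp = b(i) * scale
    a(i) = tmp + 1.0
  end do

  ! Self-check against the same formula applied elementwise.
  if (any(a /= 2.0 * b + 1.0)) error stop "unexpected result"
  print *, a
end program dc_locality
```

Fortran 2018 also defines DEFAULT(NONE), which requires every outer variable used in the loop body to appear in one of the lists, similar in spirit to default(none) in OpenMP.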
