Intel® Fortran Compiler

Help a newbie with DO CONCURRENT: how to convert OpenMP code

Matt_Thompson
Novice

All,

I have some standard code that I OMPized (and MPIized, and GPUized and MICized and...) and figured I should try DO CONCURRENT as well. Now, my first naive attempt was to replace:

!$omp parallel do default(private) &
!$omp shared(m,np,ict,icb,nb,overcast) &
...
!$omp shared(caib, caif)

      RUN_LOOP: do i=1,m

with:

      RUN_LOOP: do concurrent (i=1:m)

Now, in doing so, the code does run, but it isn't parallel at all. I can setenv OMP_NUM_THREADS to 4 or 28 and see no difference in speed.

This was compiling with -qopenmp. In my desire to make some effect, I tried using -qopenmp -parallel. Now, this definitely spawned threads, but it did so in a bad way: OMP_NUM_THREADS=1 took ~5 seconds, OMP_NUM_THREADS=4 took ~12 seconds. 

So, is there a nice standard treatise/tutorial on how to take a code that works with OpenMP and convert to use DO CONCURRENT?

4 Replies
Matt_Thompson
Novice

Note: I should add that I did have to assure the compiler that all my subroutines were pure. I'm fairly certain they are (all nicely INTENTed and everything), but they are big subroutines...
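
For anyone following along, here is a minimal sketch of what "pure" means in this context. The module and routine names (column_physics, process_column) are hypothetical stand-ins for the real, much larger subroutines. The only point is that a procedure called inside DO CONCURRENT must be declared PURE, with explicit INTENTs on its dummy arguments.

```fortran
module column_physics
  implicit none
contains
  ! Hypothetical stand-in for a large per-column routine.
  ! PURE plus explicit INTENTs is what allows the call inside DO CONCURRENT.
  pure subroutine process_column(t_in, flux_out)
    real, intent(in)  :: t_in
    real, intent(out) :: flux_out
    flux_out = 2.0 * t_in + 1.0
  end subroutine process_column
end module column_physics

program pure_in_do_concurrent
  use column_physics, only: process_column
  implicit none
  integer, parameter :: m = 8
  real :: t(m), flux(m)
  integer :: i

  t = [(real(i), i = 1, m)]

  ! Each iteration reads and writes only its own elements,
  ! so the iterations may execute in any order.
  do concurrent (i = 1:m)
    call process_column(t(i), flux(i))
  end do

  ! Self-check against the same formula evaluated on whole arrays.
  if (any(abs(flux - (2.0 * t + 1.0)) > 1.0e-6)) error stop "unexpected result"
  print *, "ok"
end program pure_in_do_concurrent
```

Declaring the routine PURE only makes the loop legal. It does not by itself make the compiler run the iterations in parallel.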

Steve_Lionel
Honored Contributor III

In Intel's compiler, DO CONCURRENT does not parallelize unless -parallel is set. But there's no guarantee of parallelization even so - it depends on whether the compiler thinks it is safe and effective. Unlike with OpenMP, there is not (yet - coming in Fortran 2018) syntax to specify the locality of variables within the loop. That you have a number of shared variables makes me suspect that the compiler did not think parallelization was safe.
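
Until locality syntax is available, one way to get something like an OpenMP private variable is to declare the temporary inside a BLOCK construct within the loop body. The sketch below is only an illustration with made-up names (a, b, tmp), not the original code, and whether the compiler then decides to parallelize the loop is still up to the compiler.

```fortran
program block_locals
  implicit none
  integer, parameter :: m = 100
  real :: a(m), b(m)
  integer :: i

  a = [(real(i), i = 1, m)]

  do concurrent (i = 1:m)
    block
      ! tmp is declared inside the BLOCK, so every iteration gets its own
      ! copy, roughly what private(tmp) expresses in the OpenMP version.
      real :: tmp
      tmp = a(i) * a(i)
      b(i) = tmp + 1.0
    end block
  end do

  ! Self-check against the same formula evaluated on whole arrays.
  if (any(abs(b - (a * a + 1.0)) > 1.0e-3)) error stop "unexpected result"
  print *, "ok"
end program block_locals
```

The arrays a and b play the role of the shared variables: each iteration touches only element i, so no two iterations write the same location.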

Matt_Thompson
Novice

Steve,

Thanks. That was our thought. We tried having fun with '-par-threshold=0' and other options, but it just never worked. I mean, it changed the optimization report (optrpt), but nothing else.

As an aside, when 2018 is supported, what exactly will the spec look like? A colleague and I tried to parse the standard and we think:

DO CONCURRENT (i=1:m) LOCAL(x,y,z) LOCAL_INIT(q,r,s) SHARED(g,h,i)

but we aren't sure. I guess I need a tldr (https://github.com/tldr-pages/tldr) for the man page that is the Standard... which I guess is what the Metcalf or Brainerd books are. :)

Steve_Lionel
Honored Contributor III

Yes, that's the correct syntax.
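
As a small self-contained sketch of that syntax in use, assuming a compiler that implements the Fortran 2018 locality specifiers (the variable names here are invented for illustration, and the loop index is not listed in any locality specifier):

```fortran
program locality_specs
  implicit none
  integer, parameter :: m = 100
  real :: a(m), b(m)
  real :: tmp, scale
  integer :: i

  a = [(real(i), i = 1, m)]
  scale = 2.0
  tmp = 0.0

  ! LOCAL:      tmp is a separate, initially undefined variable per iteration
  ! LOCAL_INIT: scale is per iteration, starting from the outer value (2.0)
  ! SHARED:     a and b refer to the same arrays in every iteration
  do concurrent (i = 1:m) local(tmp) local_init(scale) shared(a, b)
    tmp = a(i) * scale
    b(i) = tmp + scale
  end do

  ! Self-check: every element should equal 2*a(i) + 2.
  if (any(abs(b - (2.0 * a + 2.0)) > 1.0e-4)) error stop "unexpected result"
  print *, "ok"
end program locality_specs
```

The mapping from OpenMP clauses is roughly private to LOCAL, firstprivate to LOCAL_INIT, and shared to SHARED, though the two models are not guaranteed to behave identically in every case.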
