Hi,
Why isn't the compiler associating the following cycle statement with the appropriate do concurrent loop?
    program test
      implicit none
      integer :: i,j,k
      do concurrent (i=1:10000)
        k=-10
        do j=1,i
          k=k+k**j
        end do
        if (mod(i,3).eq.0) then
          k=int(sqrt(real(k)))
          write(*,*)'For i = ',i,', k set to ',k,', cycling'
          cycle
        end if
        write(*,*)'Block ended with i = ',i
      end do
    end program
I get the error:
    cycletest.f90(13): error #6602: A CYCLE or EXIT statement must not be used with a non-DO block.
                cycle
    ------------^
    compilation aborted for cycletest.f90 (code 1)
It works when I name the loop and cycle it with its name (tested with ifort 17.0).
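For reference, the named-loop workaround looks roughly like this. It is only a minimal sketch with a simplified loop body; the construct name `outer` and the array `squares` are made up for illustration and are not part of my real code:

```fortran
program cycle_named
  implicit none
  integer :: i
  integer :: squares(9)   ! hypothetical result array, for illustration only

  squares = 0

  ! Naming the construct lets CYCLE refer to the DO CONCURRENT loop explicitly,
  ! even though the statement sits inside an IF block.
  outer: do concurrent (i = 1:9)
    if (mod(i, 3) == 0) then
      cycle outer          ! skip the rest of this iteration
    end if
    squares(i) = i*i       ! each iteration writes only its own element
  end do outer

  ! Elements 3, 6 and 9 keep their initial value of 0.
  print '(9(i0,1x))', squares
end program cycle_named
```

Per the standard, an unnamed cycle belongs to the innermost enclosing DO construct, so the name should not be needed here.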
Thanks for any help!
I think this is a compiler bug. It is probably not looking past the IF to see the enclosing DO construct.
This was found internally and is fixed in the next major release due out later this year.
Ok, thanks for the info.
I'm just a bit surprised that such an error wasn't found earlier. That would imply that only a few people are actually using the do concurrent language feature. Also, I've already heard some people complain that do concurrent is in some ways a misfeature, at least as far as the implicit private/shared rules are concerned.
I personally would like to rely heavily on the do concurrent feature in the future and would prefer it over OpenMP because:
- it's a native language feature
- it's probably the most minimal way to tell the compiler that there aren't any loop-carried dependencies
- the programmer does not have to distinguish between vectorization and parallelization (afaik)
- the compiler has every freedom to decide whether it's worthwhile to vectorize/parallelize (so the programmer does not have to know, and the code becomes more hardware-independent)
So, for now, my node-level programming paradigms would be:
- use array (:) syntax whenever possible
- when that is not possible, use short do concurrent loops (to replace obsolete forall constructs, trying to use array syntax internally)
- use do concurrent also for more complex loops, whenever I know that there aren't any loop-carried dependencies, relying on the implicit shared/private rules (also for arrays, types, objects, etc.)
- try not to use blocks, to keep the difference from serially executing code as small as possible
- use OpenMP only when I explicitly know it's worthwhile and/or I need features like reductions, special scheduling, etc.
As one can see, my paradigms are only as good as the do concurrent implementation in the compiler. Hence my (open) questions:
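To make the list above concrete, here is a small sketch of the first, second and last items applied to one toy problem. The array names and sizes are made up for illustration, and the OpenMP directives are treated as plain comments unless OpenMP is enabled at compile time:

```fortran
program paradigms
  implicit none
  integer, parameter :: n = 8       ! hypothetical problem size
  real    :: a(n), b(n), c(n), total
  integer :: i

  a = [(real(i), i = 1, n)]
  b = 2.0

  ! 1. Whole-array syntax whenever possible.
  c = a + b

  ! 2. DO CONCURRENT when an explicit loop is needed and the iterations
  !    are independent (each one touches only its own element of c).
  do concurrent (i = 1:n)
    c(i) = a(i) + b(i)
  end do

  ! 3. OpenMP where a feature such as a reduction is wanted.
  total = 0.0
  !$omp parallel do reduction(+:total)
  do i = 1, n
    total = total + c(i)
  end do
  !$omp end parallel do

  print *, 'sum of c = ', total
end program paradigms
```

For a reduction this simple the intrinsic `sum(c)` would of course do; the loop is there only to show where I would still reach for OpenMP.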
- Do you think my paradigms are basically well thought out?
- Do you think the do concurrent feature has good prospects, and that companies such as Intel are putting (or have already put) great effort into optimizing its execution as much as possible (also with regard to future architectures like MIC)?
Thanks for any advice!
If you intend for DO CONCURRENT to actually be concurrent (parallel), then you (effectively) limit the parallelization of the application to just the DO CONCURRENT section. While you can mix auto-parallelization (DO CONCURRENT) with OpenMP, it is typically counter-productive (performance-wise) to do so. It used to be that the auto-parallelization thread pool and the OpenMP thread pool were separate pools. This may no longer be the case. In any event, the choice of separate pools or a common pool is an implementation detail (which you seem to be averse to). OpenMP is a well-established standard and is supported by all major compiler vendors.
In your case, auto-parallelization may be sufficient. If you need, or want, more utilization of CPU resources, then go with directive parallelization.
Jim Dempsey