Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

OpenMP: Sun's and Ifort

jdrodrig
Beginner
1,878 Views
Is there a reason why ifort's OpenMP would run into a race condition on an OpenMP DO loop when Sun's f95 OpenMP works fine?

I have been using Sun's f95 to do a value function iteration (an economics problem): V_new(x) = max_{y \in B(x)} [ f(x,y) + V_old(y) ], where the maximization is over the y feasible from x. The point is to iterate until V_new = V_old.

I split the range of x's into eight parts and let OpenMP distribute the maximization problem over 8 cores.
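For context, here is a minimal, hypothetical sketch of one such parallel sweep (not my actual code; the names sweep, n_x, f, B, V_old and V_new are placeholders). The OpenMP-relevant point is that iy and best must be PRIVATE, otherwise the threads race on them:

subroutine sweep(n_x, f, B, V_old, V_new)
  implicit none
  integer, intent(in)  :: n_x
  real,    intent(in)  :: f(n_x, n_x)     ! f(x,y): current-period payoff
  logical, intent(in)  :: B(n_x, n_x)     ! B(x,y) = .true. if y is feasible from x
  real,    intent(in)  :: V_old(n_x)
  real,    intent(out) :: V_new(n_x)
  integer :: ix, iy
  real    :: best
  ! Each outer iteration writes only its own V_new(ix), so iterations are
  ! independent; iy and best must be PRIVATE to each thread.
  !$OMP PARALLEL DO PRIVATE(iy, best)
  do ix = 1, n_x
     best = -huge(best)
     do iy = 1, n_x
        if (B(ix, iy)) best = max(best, f(ix, iy) + V_old(iy))
     end do
     V_new(ix) = best
  end do
  !$OMP END PARALLEL DO
end subroutine sweep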

I have now been evaluating ifort in VS 2008 on Vista 32-bit, and where my code always converged (and still converges) using Sun's f95 OpenMP directives, the ifort executable seems to linger forever.

Could you suggest which documentation should be at the top of my reading list?

0 Kudos
28 Replies
jimdempseyatthecove
Honored Contributor III
393 Views
Quoting - tim18
Steve's primary suggestion was to add the standard RECURSIVE keyword, once per subroutine, rather than the non-standard AUTOMATIC to each array. Either method would protect against mistakenly allowing subroutines to be compiled with options which aren't compatible with OpenMP. It's true that the ifort default gives SAVE status to local arrays in a subroutine not marked as RECURSIVE, and that won't work in a parallel region. When everything is compiled with /Qopenmp, or any other option supporting recursion or parallel, this can't happen; if it did, it would be a serious reportable bug.
I would hope that adding RECURSIVE to all subroutines which are called in a parallel region would cause the compiler to flag any explicit use of SAVE or DATA, as well as over-riding any implicit SAVE.

Although the RECURSIVE keyword will correct the anomaly, these subroutines are NOT being called recursively. The proper term is REENTRANT(ly), or alternately you could describe the routine as being required to run CONCURRENT(ly).

Regardless of whether the routine is RECURSIVE, REENTRANT, or CONCURRENT, there is no explicit statement or attribute (other than AUTOMATIC) that says this array (or array descriptor) must reside on the stack. SAVE means static; absence of SAVE means, well, the intentions are inconclusive.

The use of RECURSIVE, when not required, also has some unintended consequences (overhead).

Per IVF documentation

An automatic array (the array is a local variable)

However elsewhere

Automatic Arrays

An automatic array is an explicit-shape array that is a local variable. Automatic arrays are only allowed in function and subroutine subprograms, and are declared in the specification part of the subprogram. At least one bound of an automatic array must be a nonconstant specification expression. The bounds are determined when the subprogram is called.

However, in practice, IVF permits all bounds of automatic arrays to be constant. (hooray for IVF)

AUTOMATIC explicitly states and requires the array is a local variable.
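To make the distinctions concrete, here is a small illustrative sketch (hypothetical names, not production code); the comments restate the storage rules discussed above:

subroutine work(n)
  implicit none
  integer, intent(in) :: n
  real :: a(n)               ! automatic array: a non-constant bound, so it lives on the stack
  real :: b(100)             ! constant bounds: ifort gives this SAVE (static) status by default
                             ! unless the routine is RECURSIVE or compiled with /Qopenmp or /recursive
  real, automatic :: c(100)  ! AUTOMATIC attribute (non-standard): explicitly placed on the stack
  real, save :: d(100)       ! SAVE: explicitly static; one copy shared by every thread
  a = 0.0; b = 0.0; c = 0.0; d = 0.0
end subroutine work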

My 2 cents worth.

Jim Dempsey
0 Kudos
jdrodrig
Beginner
393 Views
Quoting - tim18
Steve's primary suggestion was to add the standard RECURSIVE keyword, once per subroutine, rather than the non-standard AUTOMATIC to each array. Either method would protect against mistakenly allowing subroutines to be compiled with options which aren't compatible with OpenMP. It's true that the ifort default gives SAVE status to local arrays in a subroutine not marked as RECURSIVE, and that won't work in a parallel region. When everything is compiled with /Qopenmp, or any other option supporting recursion or parallel, this can't happen; if it did, it would be a serious reportable bug.
I would hope that adding RECURSIVE to all subroutines which are called in a parallel region would cause the compiler to flag any explicit use of SAVE or DATA, as well as over-riding any implicit SAVE.

Although the RECURSIVE keyword will correct the anomaly, these subroutines are NOT being called recursively. The proper term is REENTRANT(ly), or alternately you could describe the routine as being required to run CONCURRENT(ly).

Regardless of whether the routine is RECURSIVE, REENTRANT, or CONCURRENT, there is no explicit statement or attribute (other than AUTOMATIC) that says this array (or array descriptor) must reside on the stack. SAVE means static; absence of SAVE means, well, the intentions are inconclusive.

The use of RECURSIVE, when not required, also has some unintended consequences (overhead).

Per IVF documentation

An automatic array (the array is a local variable)

However elsewhere

Automatic Arrays

An automatic array is an explicit-shape array that is a local variable. Automatic arrays are only allowed in function and subroutine subprograms, and are declared in the specification part of the subprogram. At least one bound of an automatic array must be a nonconstant specification expression. The bounds are determined when the subprogram is called.

However, in practice, IVF permits all bounds of automatic arrays to be constant. (hooray for IVF)

AUTOMATIC explicitly states and requires the array is a local variable.

My 2 cents worth.

Jim Dempsey

Before breaking out for dinner, I just wanted to add that I put RECURSIVE in front of all subroutines in my code and the problem *persists*. That is, ifort Linux 11.082 and Sun's f95 generate executables that converge to the same solution, while Windows ifort's diverges.

I will try the automatic array idea next.

PS: I have learned more Fortran in the last 24 hours than in the last year... thanks again.
0 Kudos
jdrodrig
Beginner
393 Views

Although the RECURSIVE keyword will correct the anomaly, these subroutines are NOT being called recursively. The proper term is REENTRANT(ly), or alternately you could describe the routine as being required to run CONCURRENT(ly).

Regardless of whether the routine is RECURSIVE, REENTRANT, or CONCURRENT, there is no explicit statement or attribute (other than AUTOMATIC) that says this array (or array descriptor) must reside on the stack. SAVE means static; absence of SAVE means, well, the intentions are inconclusive.

The use of RECURSIVE, when not required, also has some unintended consequences (overhead).

Per IVF documentation

An automatic array (the array is a local variable)

However elsewhere

Automatic Arrays

An automatic array is an explicit-shape array that is a local variable. Automatic arrays are only allowed in function and subroutine subprograms, and are declared in the specification part of the subprogram. At least one bound of an automatic array must be a nonconstant specification expression. The bounds are determined when the subprogram is called.

However, in practice, IVF permits all bounds of automatic arrays to be constant. (hooray for IVF)

AUTOMATIC explicitly states and requires the array is a local variable.

My 2 cents worth.

Jim Dempsey

As a follow-up:

Bad news.

I added the AUTOMATIC attribute to all three local arrays within the procedure being used concurrently. No luck.

I added RECURSIVE to all procedures first; no luck. Then, on top of that, I added the flag -recursive at compile time. No luck.

Noob question: is the same OpenMP API being used by Intel Fortran on Linux and on Windows? I seem to remember Sun's f95 uses OpenMP 2.5. Is the Intel Visual Fortran compiler using 3.0? Could that affect my code?
0 Kudos
Steven_L_Intel1
Employee
393 Views
The OpenMP API, in terms of what you put in your source code and what it does, is an industry standard. There is no standard regarding implementation or the underlying library API. OpenMP 2.5 vs. 3.0 is just an added feature or two.

Have you tried Intel Thread Checker?
0 Kudos
jdrodrig
Beginner
393 Views
The OpenMP API, in terms of what you put in your source code and what it does, is an industry standard. There is no standard regarding implementation or the underlying library API. OpenMP 2.5 vs. 3.0 is just an added feature or two.

Have you tried Intel Thread Checker?

Thanks, I will look into it.

Something I am still puzzled about is why the "same" compiler, Intel's, generates one solution under Windows and another if I use Intel's compiler for Linux.

Could you tell me what would be the most likely factor (or at least one that comes to mind) behind this difference in behavior?
0 Kudos
Steven_L_Intel1
Employee
393 Views
The most likely things that come to mind are uninitialized variables or non-thread-safe behavior. This is why I suggested trying Intel Thread Checker - there's a 30-day free trial. You could also try a build configuration with the Static Verifier feature enabled - see the compiler documentation for details. There will likely be a lot of noise but sometimes it points out something useful.
0 Kudos
TimP
Honored Contributor III
393 Views
The most likely things that come to mind are uninitialized variables or non-thread-safe behavior. This is why I suggested trying Intel Thread Checker - there's a 30-day free trial. You could also try a build configuration with the Static Verifier feature enabled - see the compiler documentation for details. There will likely be a lot of noise but sometimes it points out something useful.
Given that the posts 2 days ago quote the ifort static verifier as reporting 10 uninitialized variables (some also reported by gfortran), along with all the other junk, and Sun f95 as reporting data races due to shared variables, I'm not certain that it's time to try Thread Checker yet. I'm also thinking that there is no willingness here to take advantage of such reports. The question seems to be how the clearly diagnosed problems can produce different results on different compilers.

If you can accept that Fortran is as vulnerable to races as C++, here is a good article:
http://developers.sun.com/solaris/articles/cpp_race.html



0 Kudos
jimdempseyatthecove
Honored Contributor III
393 Views


Race conditions are generally not a compiler's fault, but rather a programming fault: uninitialized variables, shared variables that are written without atomicity, shared variables that are read before being written, etc.
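For example, the "shared written variable" case in its simplest form, plus one standard fix (a hypothetical sketch, not taken from the poster's code):

program race_demo
  implicit none
  real :: x(1000), total
  integer :: i
  x = 1.0

  ! The race: every thread updates the shared variable 'total' without protection.
  total = 0.0
  !$OMP PARALLEL DO
  do i = 1, 1000
     total = total + x(i)
  end do
  !$OMP END PARALLEL DO

  ! One fix: REDUCTION gives each thread a private partial sum and combines them at the end.
  total = 0.0
  !$OMP PARALLEL DO REDUCTION(+:total)
  do i = 1, 1000
     total = total + x(i)
  end do
  !$OMP END PARALLEL DO

  print *, total
end program race_demo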

Thread Checker can find some of these, as can adding uninitialized-variable runtime checking. Sometimes you may need to fall back on adding debug traps into your code (conditionally compiled).

Assume you have an array in which portions are to be written exclusively by separate threads, and then read some time later after being written:

real, allocatable:: foo(:,:,:)

Consider adding (conditionally compiled):

integer, allocatable :: fooDEBUG(:,:,:)
...
fooDEBUG = -1

Here -1 in a cell indicates it has never been written, and n means it was written by OpenMP team member number n.

In the sections of code where you write to a cell, verify ownership (inline or via a small subroutine):

if (fooDEBUG(i,j,k) .ne. myThreadNum) then
  if (fooDEBUG(i,j,k) .ne. -1) then
    write(*,*) 'bug'   ! break here: cell already claimed by another thread
  else
    ! claim the cell atomically; InterlockedExchange is the Win32 atomic swap
    ! (available in IVF through the IFWIN/KERNEL32 modules)
    iTemp = InterlockedExchange(fooDEBUG(i,j,k), myThreadNum)
    if (iTemp .ne. -1) then
      write(*,*) 'bug' ! break here: another thread claimed the cell first
    endif
  endif
endif

In places where you read a cell under the assumption that it was written by another thread:

if (fooDEBUG(i,j,k) .eq. -1) then
  write(*,*) 'bug' ! break here
endif

Jim Dempsey

0 Kudos
Reply