Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

A warning when I used OpenMP directives

Pemg-Yu_C_
Beginner

Dear all:

Recently, I have been trying to use OpenMP through Intel Visual Fortran.

Because I am a beginner at parallel computing, I decided to speed up my do-loops by using multiple processors.

I have read a lot of information about OpenMP, but I get this warning when compiling:

warning #10247: explicit static allocation of locals specified, overriding OpenMP*'s implicit auto allocation

I really don't know where the problem is.

If there are any suggestions, please help me.

Thank you very much

--------------------------------part of my program------------------------------------------------

!$omp parallel do default(shared) private(k,d_epsc,cgmci,Eta1)
      do k = 1, nsteel(i)       ! the half of the section
          d_epsc = dd_defN(m) - zz_steel(k,i)*dd_defMy(m) +
     +             yy_steel(k,i)*dd_defMz(m)

          call FrontSteel(i,Fiber,Fbmat,d_epsc,cgmci,k,m,nsm,
     +         Eta1,time,repet,kfc,istep,nskip,secfail,ff_change)

!         computing axial force and bending moment of each fiber of the section
          f11(k) = cgmci*aa_steel(k,i)
          fmy(k) = -cgmci*zz_steel(k,i)*aa_steel(k,i)
          fmz(k) = cgmci*yy_steel(k,i)*aa_steel(k,i)
!         summation of axial force and bending moment of all fibers of the section
          ss_sum11 = ss_sum11 + f11(k)
          ss_sum21 = ss_sum21 + fmy(k)
          ss_sum31 = ss_sum31 + fmz(k)

          f11(k) = 0.
          fmy(k) = 0.
          fmz(k) = 0.
          d_epsc = 0.
      end do
!$omp end parallel do

-----------------------------------------------------------------------------------------------------------------------

 

1 Solution
jimdempseyatthecove
Honored Contributor III

I am guessing you are not using IMPLICIT NONE; if not, please add it at the top of your subroutine (and then fix the declarations).

A second issue is that your summations are going to have data races:

!$omp parallel do default(shared) private(k,d_epsc,cgmci,Eta1) reduction(+:ss_sum11, ss_sum21, ss_sum31)

The reduction clause for operator + provides a private copy of each named variable, zero-initialized, for use within the parallel region. On exit from the parallel region, the + operator is applied in a thread-safe manner to the shared variable in the scope outside the parallel region.
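As an illustrative sketch (a standalone toy program, not the poster's code), the same pattern looks like this; without the reduction clause the sum update would be a data race:

```fortran
program reduction_demo
    implicit none
    integer :: k
    real :: f11(100), ss_sum11

    do k = 1, 100
        f11(k) = real(k)
    end do

    ss_sum11 = 0.0
    ! reduction(+:...) gives each thread a private, zero-initialized
    ! copy of ss_sum11 and combines the copies thread-safely at region exit
!$omp parallel do reduction(+:ss_sum11)
    do k = 1, 100
        ss_sum11 = ss_sum11 + f11(k)
    end do
!$omp end parallel do

    print *, ss_sum11   ! 5050.0
end program reduction_demo
```

Compile with /Qopenmp (ifort) or -fopenmp (gfortran); without an OpenMP flag the !$omp lines are treated as comments and the program still runs serially with the same result.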

Jim Dempsey

TimP
Honored Contributor III

I'm not certain this part of your source code helps to explain the points you asked about.

This does raise some concerns:

1) Why did the compiler report "OpenMP DEFINED LOOP WAS PARALLELIZED" in spite of the syntax errors at that point? If you made those arrays private, in spite of the errors, that would probably require increasing OMP_STACKSIZE as well as setting /link /stack: to a suitable value. Note that beginners often set OMP_STACKSIZE so large that at most 1 or 2 threads can run.

As you apparently perform sum reductions, the reduction clause is mandatory. You probably need Intel Inspector to have such errors pointed out automatically.

If you intended private arrays to be initialized to values inherited from outside the parallel region, you would need explicit copyin or firstprivate.  I don't think any syntax checker would show you that.
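For example (an illustrative toy; `scale` is a made-up variable, not from the poster's code):

```fortran
program firstprivate_demo
    implicit none
    integer :: k
    real :: scale, y(10)

    scale = 2.0
    ! firstprivate: each thread's private copy of scale is
    ! initialized from the value set before the region;
    ! plain private would leave it undefined inside the loop
!$omp parallel do firstprivate(scale)
    do k = 1, 10
        y(k) = scale * real(k)
    end do
!$omp end parallel do

    print *, y(10)   ! 20.0
end program firstprivate_demo
```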

2) It looks difficult to debug this combination of non-standard IBM 360-era style with Fortran 90 and OpenMP usage. In particular, why are you using the 40-year-old style of assumed-size declarations (and which ifort option would raise warnings about it?).

jimdempseyatthecove
Honored Contributor III

>>As about the data races condition, i am afraid that i didn't notice this problem.

With data races you usually won't notice the problem (there is no program crash), other than through screwy results. Also note that some race conditions are not detected as you develop the code; rather, the race is not discovered until the code is in production.

An alternative to using reduction is !$OMP ATOMIC or !$OMP CRITICAL (youNameItHere). When used in a loop, these are generally much slower than the reduction clause: reduction performs only as many atomic/critical operations as there are threads, whereas the other two perform one per loop iteration (possibly divided by the SIMD width). The other two are useful in situations where you want the other threads to be notified immediately (e.g., a parallel search).
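For comparison, here is the same toy sum written with ATOMIC instead of reduction (correct, but it pays for one synchronized update per iteration rather than one combine per thread):

```fortran
program atomic_demo
    implicit none
    integer :: k
    real :: f11(100), ss_sum11

    do k = 1, 100
        f11(k) = real(k)
    end do

    ss_sum11 = 0.0
!$omp parallel do
    do k = 1, 100
        ! every iteration performs one atomic update,
        ! versus one combine per thread with reduction
!$omp atomic
        ss_sum11 = ss_sum11 + f11(k)
    end do
!$omp end parallel do

    print *, ss_sum11
end program atomic_demo
```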

Jim Dempsey

Pemg-Yu_C_
Beginner

Dear all

Thanks for your suggestions; I am facing another problem now.

Once I increased the stack size, it showed the messages below.

[Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]

     kernel32.dll!7561338a()  

     ntdll.dll!77b89f72()     

     ntdll.dll!77b89f45()     

Without the !$omp directives, there was no problem.

How can I fix it?

Pemg-Yu Chen

 

jimdempseyatthecove
Honored Contributor III

When you get those error messages, look at the Call Stack. Usually this is shown in a tab at the bottom of the Visual Studio IDE. If you do not see it, click Debug > Windows > Call Stack.

Look in the Call Stack, reading from the bottom up, until you hit the first line that looks like part of your program. Double-click on it and the source line should appear in the source code window.

Jim Dempsey

Pemg-Yu_C_
Beginner

Dear Mr. Jim,

Thanks for your suggestion; it seems the problem is due to a subroutine call inside the !$omp parallel region.

What might cause these messages?

[Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]

     kernel32.dll!7561338a()  

     ntdll.dll!77b89f72()     

     ntdll.dll!77b89f45()

Thank you very much.

Pemg-Yu Chen

jimdempseyatthecove
Honored Contributor III

Those frames refer to the call stack from the start of the thread (see the attached CallStack.png).

The above is for a main thread; yours may be rooted in the startup of an OpenMP thread-pool thread. In the call stack, the bold entries are generally those of your application, and the greyed ones are typically those used to get the thread going.

If you do not see any portion of your program at the top of the stack, then something occurred during thread startup. This could be something like an undefined reference in a private, copyin, firstprivate, or reduction clause (among others).

Jim Dempsey

Pemg-Yu_C_
Beginner

Dear Mr Jim:

I declared my variables after changing "implicit real" to "implicit none", following your suggestion at the beginning.

If I set "Local Variable Storage" to "All Variables SAVE", the program can run, but the result is wrong.

At the same time, it gives the warning "explicit static allocation of locals specified, overriding OpenMP*'s implicit auto allocation".

If I set "Local Variable Storage" to "Local Variables AUTOMATIC", the program cannot run, and the error appears as before.

What does this setting mean, which one is right, and how can I modify my program?

Thank you very much.

 

Pemg-Yu Chen

TimP
Honored Contributor III

Procedures called in a parallel region must have their local variables and arrays automatic: scalars will be automatic unless you set /Qsave, while arrays will not be automatic unless you set /Qopenmp or /Qauto, or declare the procedure RECURSIVE. Of course, these settings could expose problems with undefined variables. It's advisable to check that your program runs correctly with /Qauto /Qsave- before attempting OpenMP.
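One way to get thread-safe locals at the source level, independent of the project settings, is to declare the called procedure RECURSIVE (the name and arguments below are made up for illustration):

```fortran
! RECURSIVE forces the locals onto the stack, so each thread
! calling this from a parallel region gets its own copies
recursive subroutine thread_safe_work(x, y)
    implicit none
    real, intent(in)  :: x
    real, intent(out) :: y
    real :: tmp          ! stack (automatic) local, one per call
    tmp = 2.0 * x
    y   = tmp + 1.0
end subroutine thread_safe_work
```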

jimdempseyatthecove
Honored Contributor III

Dear Mr. Chen

Taking a quick look at your program, you have:

ss_sum11 = ss_sum11 + f11(k)
ss_sum21 = ss_sum21 + fmy(k)
ss_sum31 = ss_sum31 + fmz(k)

This will present a race condition when multiple threads attempt to update the sum variables at the same time.

To correct this, you will need to add the "reduction(+:ss_sum11, ss_sum21, ss_sum31)" clause to your !$omp statement.

*** You also have other summation variables not listed above; those will have to be added to the reduction clause too.

I have not looked at subroutine FrontSteel. It too may have a conflicting use of variables. If it has SAVE'd local variables that carry values from call to call, it too will be in error.
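For example, this kind of pattern inside a called subroutine is unsafe when the subroutine is called from a parallel region (made-up code, only to show the hazard Jim describes):

```fortran
subroutine bad_counter(n)
    implicit none
    integer, intent(out) :: n
    ! a SAVE'd local (or one with an initializer, which
    ! implies SAVE) is a single copy shared by every thread
    integer, save :: calls = 0
    calls = calls + 1   ! data race when called from a parallel region
    n = calls
end subroutine bad_counter
```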

Jim Dempsey

TimP
Honored Contributor III

If you're trying to parallelize a program by trial and error, the Intel Parallel Studio tools Advisor and Inspector ought to be helpful.
