Intel® Fortran Compiler

executable hanging in OpenMP region

Michael_Barkhudarov

We updated from version 13.1 of the compiler to 15.0.2.179 for a large CFD FORTRAN code and started having cases of the solver hanging: sitting in memory with 0% CPU used. Here are the observations so far:

1. Happens only on Windows.

2. Running on a single thread works fine.

3. The executable runs for a while before it hangs. Restarting the solver from an intermediate time runs fine.

4. For the cases we have so far, it hangs in different parts of the code, but for a given case it always hangs at the same location.

Here is a code snippet where it happens in one of the cases. This is a sub-section of a large subroutine. This code is invoked several million times before the solver hangs:

c Here we are adding a column to the Hessenberg matrix (hmatrix)
        do kk=1,igfy
          hmatrix(kk,igfy)=zero
          do n=1,numthrds
            flp_lcl(1,n)=hmatrix(kk,igfy)
          enddo
c
          do nbl=1,nblcks
! divide the loop between threads
            call load_bal(0,ijklim(nbl,1),nijkpr(nbl)) ! NBL is a global variable, comes from a module
!$omp parallel do schedule(static,1)
!$omp& private(n,nn,ijk)
            do n=1,numthrds ! loop over threads
              do nn=klo(n),khi(n)
                ijk=ijkpr(nn)
                flp_lcl(1,n)=flp_lcl(1,n)+vvect(ijk,igfyp1)*vvect(ijk,kk)
              enddo

! PUTTING A PRINT STATEMENT HERE PRINTS ALL THREADS.
            enddo ! THIS IS THE LINE WHERE IT HANGS.
          enddo
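
For anyone trying to reproduce this outside the full solver, the pattern in the snippet can be sketched as a stand-alone program. This is only an illustration, not the actual solver code and not Intel's reproducer. The program name, the array size, the arrays v1 and v2, and the even block split standing in for load_bal are all assumptions; only flp_lcl, klo, khi and numthrds follow the snippet. It is written in free-form Fortran and needs OpenMP enabled at compile time (/Qopenmp with ifort on Windows).

```fortran
! Hypothetical stand-alone sketch of the per-thread partial-sum pattern.
! Names flp_lcl, klo, khi, numthrds follow the snippet; everything else
! (sizes, v1, v2, the block split) is assumed for illustration.
program partial_dot
  use omp_lib
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: ncells = 100000
  integer :: numthrds, n, nn, chunk
  integer, allocatable :: klo(:), khi(:)
  real(dp), allocatable :: v1(:), v2(:), flp_lcl(:,:)
  real(dp) :: total

  numthrds = omp_get_max_threads()
  allocate(klo(numthrds), khi(numthrds), flp_lcl(1,numthrds))
  allocate(v1(ncells), v2(ncells))
  v1 = 1.0_dp
  v2 = 2.0_dp

  ! Stand-in for load_bal: split 1..ncells into contiguous index ranges,
  ! one range per thread. Trailing ranges may be empty.
  chunk = (ncells + numthrds - 1) / numthrds
  do n = 1, numthrds
    klo(n) = (n - 1)*chunk + 1
    khi(n) = min(n*chunk, ncells)
  end do

  ! Each iteration of the outer loop accumulates into its own slot of
  ! flp_lcl, so no two threads write to the same element.
  flp_lcl = 0.0_dp
  !$omp parallel do schedule(static,1) private(n,nn)
  do n = 1, numthrds
    do nn = klo(n), khi(n)
      flp_lcl(1,n) = flp_lcl(1,n) + v1(nn)*v2(nn)
    end do
  end do
  !$omp end parallel do

  ! Combine the per-thread partial sums serially.
  ! Every term is 1*2, so the expected result is 2*ncells.
  total = sum(flp_lcl(1,:))
  print *, 'dot product =', total
end program partial_dot
```

Calling the parallel loop several million times in an outer loop, as the solver does, would be the natural next step when checking whether a given runtime library hangs.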

Could this have been resolved in version 15.4 or 16.0 of the compiler?

Thank you for any help you can provide.

Michael

29 Replies
Michael_Barkhudarov

Martyn, the test ran successfully with the 13.1 and 16 compilers but hung with 15 on the Windows machine. I compiled it with the 15 compiler for all three runs but ran it in different environments, linking to the respective dynamic libraries. So something certainly did get fixed in libiomp5md.dll.

Where do we go from here? Could there be another problem? Could setting KMP_BLOCKTIME to a number larger than 200 help?
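
For reference, the block time can also be inspected and changed from inside the program rather than through the environment variable. The following is a minimal sketch, assuming the Intel OpenMP runtime extensions kmp_get_blocktime and kmp_set_blocktime are exposed through omp_lib. They are not part of the OpenMP standard, and the program name and the value 1000 are only illustrative.

```fortran
! Minimal sketch: query and set the thread block time (in milliseconds)
! using Intel-specific extension routines (assumed available in omp_lib).
program blocktime_demo
  use omp_lib
  implicit none

  ! Value currently in effect for the calling thread.
  print *, 'block time before (ms):', kmp_get_blocktime()

  ! Ask threads to spin for 1000 ms after a parallel region before sleeping.
  call kmp_set_blocktime(1000)

  print *, 'block time after  (ms):', kmp_get_blocktime()
end program blocktime_demo
```

The call would need to be made before the parallel regions of interest, since it affects the calling thread and the teams it forms afterwards.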

Martyn_C_Intel
Employee

Michael,

Yes, this might be another problem, in either the OpenMP RTL or the system libraries. Ideally, to make progress, we'd like to be able to reproduce the problem ourselves. I understand that yours is a large application. Would you be willing to send us that source file, prsitc_gmres_p.F, along with the source for load_bal()? It could be via a channel other than this forum, if you prefer. We'd also like to see your command line. I don't need to be able to compile the file directly, but I need to know the important data types: real(4), real(8) or complex; integer(4) or integer(8); important derived types, if any; and whether arrays such as flp_lcl, vvect or ijkpr are allocatable, pointers or dummy arguments. If we're lucky, that may be enough to construct a reproducer.

KMP_BLOCKTIME wouldn't help if the issue is similar to the previous OpenMP RTL issue, though it might if it relates to the problem described by Jim.

Michael_Barkhudarov

Martyn,

A couple of updates. KMP_BLOCKTIME=1000 did not help; your FORTRAN test case hung at exactly the same time. Also, I ran our code in the version 13.1 compiler environment and it ran successfully.

Yes, I can send you parts of the source code. Please let me know the channel.

Martyn_C_Intel
Employee

Michael,

KMP_BLOCKTIME is not expected to help my Fortran test case compiled with 15.0; that is a library bug. I think you said that this test case worked for you with 16.0 on Windows (as it does for me). As I understand it, your main code is still failing with 16.0.

The best way to send source code is to attach it to a new issue in Intel Premier Support at https://premier.intel.com, if you or a colleague have an account, and mention my name. Otherwise, you can attach it to an email if security is not a concern, or use Intel's anonymous (and unencrypted) FTP server and inform me by email (files may only persist for 24 hours). There are also secure transfer methods, but these require some organizational setup, so it's simpler to create an Intel Premier Support account.

Michael_Barkhudarov

Martyn,

You are correct on both statements for the 16.0 compiler. I will use an account in Premier Support to send you the files.

Thank you!

Michael_Barkhudarov

Just created an issue in Premier Support, Martyn. The contact name there is John Ditter.

Thank you for your help!

pbkenned1
Employee

I'm helping Martyn with this case. First, thanks Michael for creating the IPS ticket and for providing the source code. Martyn created a small reproducer, and I confirmed the hang on our dual-socket Intel Xeon E5-2680 (Sandy Bridge EP/EX) system, with two hyper-threaded 8-core processors (32 hardware threads), running Windows Server 2008 R2, Visual Studio 2013, and ifort Version 16.0.0.110.

John has been informed via the IPS ticket.  We'll keep this thread updated with any developments.  Tracking ID:  DPD200376593

Patrick
john-l-ditter
Beginner

Thank you for the update, Patrick. This is great news!

Michael

Martyn_C_Intel
Employee

The test case allowed us to identify a 32/64-bit issue in the run-time library. It has been fixed in the latest version of the compiler, 16.0 update 1 (16.0.1.146). This compiler is available for download from the Intel Registration Center as part of Intel Parallel Studio XE 2016 update 1, posted 12 Nov 2015. We believe this resolves the problem, but please let us know if you see any further issues.

Martyn
