AndrewC
New Contributor I
104 Views

A problem with MKL 11.2 Update 3 and ddot_direct

Our software is failing QA on an 8-core/thread system: it hangs in the main thread. The culprit seems to be ddot_direct. No other user threads are running. This is a new problem introduced with the new <D,Z,S,C>dot_direct calls.

     ntdll.dll!NtWaitForSingleObject()  + 0xa bytes    
     KernelBase.dll!WaitForSingleObjectEx()  + 0x9c bytes    
     libiomp5md.dll!__kmp_suspend_64()  + 0x1c0 bytes    
     libiomp5md.dll!__kmp_barrier()  + 0x32d0 bytes    
     libiomp5md.dll!__kmp_join_barrier()  + 0x5fe bytes    
     libiomp5md.dll!__kmp_join_call()  + 0xf1 bytes    
     libiomp5md.dll!__kmpc_fork_call()  + 0x76 bytes    
     mkl_intel_thread.dll!000007fed54250c3()     
     [Frames below may be incorrect and/or missing, no symbols loaded for mkl_intel_thread.dll]    
     mkl_intel_thread.dll!000007fed539a8ba()     
     ddot_direct()  + 0x74 bytes    

 

9 Replies
Gennady_F_Intel
Moderator
104 Views

Vasci, could you give more details about the parameters passed to this function?

How do you link in this case?

Is there any specific CPU on which the problem occurs?

 

AndrewC
New Contributor I
104 Views

 

Parameters are as below (n=2576, incx=1, incy=1):

-        n    0x000000000012ccf0    const int *            2576    const int

-        x    0x000000ef58eca600    const double *  2.1437157556647435e-005    const double

-        incx    0x000000000012cd00    const int *  1    const int
-        y    0x0000000045184380    const double *  4.4600692790355519e-016    const double
-        incy    0x000000000012cd0c    const int *  1    const int
        ret    0.00000000000000000    double

It is linked against parallel (threaded) MKL on 64-bit Windows in Visual Studio.

The CPU is an Intel Xeon at 3.6 GHz; the problem also happens on other Xeon machines.

SSE    :Y
SSE2   :Y
SSE3   :Y
SSSE3  :Y
SSE41  :Y
SSE42  :Y
AVX    :Y
AVX2   :N
----------

OS Enabled AVX :Y
AES            :Y
CLMUL          :Y
RDRAND         :Y
F16C           :Y
Maximum number of OpenMP threads:8
MKL Version:Intel(R) Math Kernel Library Version 11.2.3 Product Build 20150413 for Intel(R) 64 architecture applications

It fails at this call to ddot_direct:

/* {S,D}DOT_DIRECT */
static __inline double mkl_dc_ddot_convert(const MKL_INT *n, const double* x, const MKL_INT *incx, const double *y, const MKL_INT *incy) {
    double ret = 0.0;
    if (MKL_DC_DDOT_CHECKSIZE(n)) {
        ret = mkl_dc_ddot((n), (x), (incx), (y), (incy));
    } else {
        ret = ddot_direct((n), (x), (incx), (y), (incy));
    }
    return ret;
}

As I said, there are no other user threads running at the time; this is being called from the main thread.

 

 

AndrewC
New Contributor I
104 Views

I have removed the 'direct' calls so that the 'regular' DDOT is called. Interestingly, the problem persists.

The issue is 100% reproducible and can only be worked around by setting OMP_NUM_THREADS=1.

 

AndrewC
New Contributor I
104 Views

FYI, the code hangs at:

     ntdll.dll!NtWaitForSingleObject()  + 0xa bytes    
     KernelBase.dll!WaitForSingleObjectEx()  + 0x9c bytes    
     libiomp5md.dll!__kmp_suspend_64()  + 0x1c0 bytes    
     libiomp5md.dll!__kmp_barrier()  + 0x32d0 bytes    
     libiomp5md.dll!__kmp_join_barrier()  + 0x5fe bytes    
     libiomp5md.dll!__kmp_join_call()  + 0xf1 bytes    
     libiomp5md.dll!__kmpc_fork_call()  + 0x76 bytes    
     mkl_intel_thread.dll!000007fedf2450c3()     
     [Frames below may be incorrect and/or missing, no symbols loaded for mkl_intel_thread.dll]    
     mkl_intel_thread.dll!000007fedf1ba8ba()     
     ddot()  + 0x83 bytes    

 

AndrewC
New Contributor I
104 Views

Further analysis

  • Replacing DDOT with my 'own' naive DDOT causes the issue to go away - as expected
  • A simple test program with the same  input parameters does not reproduce the problem
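
For readers who want to repeat the substitution experiment, here is a minimal sketch of what a naive replacement could look like. It is not the poster's actual code: the name naive_ddot is hypothetical, and the only thing taken from the thread is the Fortran-style calling convention (every argument passed by pointer). Because it is a plain sequential loop, it never enters the OpenMP runtime, so it can sidestep the hang but does nothing to explain it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ddot: a sequential dot product that uses the
   Fortran-style BLAS calling convention (all arguments passed by pointer).
   It creates no threads, so it never reaches the OpenMP join barrier. */
double naive_ddot(const int *n, const double *x, const int *incx,
                  const double *y, const int *incy)
{
    assert(n && x && incx && y && incy);
    if (*n <= 0)
        return 0.0;

    /* BLAS convention: with a negative increment, traversal starts at the
       far end of the vector and walks backwards. */
    ptrdiff_t ix = (*incx < 0) ? (ptrdiff_t)(1 - *n) * *incx : 0;
    ptrdiff_t iy = (*incy < 0) ? (ptrdiff_t)(1 - *n) * *incy : 0;

    double sum = 0.0;
    for (int i = 0; i < *n; i++) {
        sum += x[ix] * y[iy];
        ix += *incx;
        iy += *incy;
    }
    return sum;
}
```

Called as naive_ddot(&n, x, &incx, y, &incy), it takes the same arguments as the ddot call in the stack trace above, so swapping it in is a one-line change at the call site.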

 

AndrewC
New Contributor I
104 Views

Just FYI,

This program crashes instantly on the call to MKL_Thread_Free_Buffers(). I know it's a bit perverse, but this behaviour is new in the latest update.

 

 

#include <stdlib.h>
#include <tchar.h>
#include <mkl.h>

int _tmain(int argc, _TCHAR* argv[])
{
    int n=2576;
    double *x=(double *)malloc(sizeof(double) * n);
    double *y=(double *)malloc(sizeof(double) * n);
    int incx=1;
    int incy=1;
    for(int i=0;i<n;i++){
        x[i]=i;
        y[i]=i*2;
    }
    /* Crashes here: no MKL routine has been called yet. */
    MKL_Thread_Free_Buffers();
    for(int j=0;j<10000000;j++){
        double res=ddot(&n, x, &incx, y, &incy);
    }
}

 

Sarah_K_Intel
Employee
104 Views

Thank you for your detailed analysis.  I could reproduce the crash with Intel MKL 11.2.3 on Windows when using dynamic linking (but not with static linking).  We are looking into the issue in more detail.

As a potential workaround, inserting a call to ddot (or probably any MKL call large enough for thread initialization to occur) before the MKL_Thread_Free_Buffers() call made the crash go away in my tests.  Can you please check whether this workaround works for you?

Explicitly, I modified your reproducer to be:

#include <stdlib.h>
#include <tchar.h>
#include <mkl.h>

int _tmain(int argc, _TCHAR* argv[])
{
    int n=2576;
    double *x=(double *)malloc(sizeof(double) * n);
    double *y=(double *)malloc(sizeof(double) * n);
    int incx=1;
    int incy=1;
    for(int i=0;i<n;i++){
        x[i]=i;
        y[i]=i*2;
    }
    /* Workaround: make one MKL call first so thread initialization occurs. */
    ddot(&n, x, &incx, y, &incy);
    MKL_Thread_Free_Buffers();
    for(int j=0;j<10000000;j++){
        double res=ddot(&n, x, &incx, y, &incy);
    }
}

AndrewC
New Contributor I
104 Views

Hi, I had already implemented your suggested workaround - that is, do not call MKL_Thread_Free_Buffers() until some calls into MKL have been made to initialise the buffers.
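
One way such a guard might be structured is sketched below. This is an illustration only, not the code actually used: buffer_guard, guard_note_call and guard_free_buffers are hypothetical names, and fake_free_buffers is a stand-in so the sketch compiles without MKL. Real code would pass the library's buffer-freeing routine as the function pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical guard state: has any threaded library routine run yet? */
typedef struct {
    bool warmed_up;
} buffer_guard;

/* Call this right after the first real library call (for example ddot). */
void guard_note_call(buffer_guard *g)
{
    assert(g != NULL);
    g->warmed_up = true;
}

/* Invokes free_fn only if a library call has been recorded.
   Returns true if free_fn was invoked, false if the call was skipped. */
bool guard_free_buffers(const buffer_guard *g, void (*free_fn)(void))
{
    assert(g != NULL && free_fn != NULL);
    if (!g->warmed_up)
        return false;
    free_fn();
    return true;
}

/* Stand-in so the sketch compiles without MKL; it only counts calls. */
int fake_free_calls = 0;
void fake_free_buffers(void) { fake_free_calls++; }
```

The point of the design is that the decision lives in one place: every site that wants to release buffers goes through guard_free_buffers, so a release that comes too early is skipped instead of crashing.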

I am more concerned about the ddot issue that started this thread. It's clearly a subtle issue, but it is 'new', and hopefully comparing ddot from a previous version of MKL with the latest one will show why it has arisen.

 

 

John_L_8
Beginner
104 Views

The "libiomp5md.dll!__kmp_suspend_64()" problem does not result only from ddot; it also occurs with OpenMP. And MKL_Thread_Free_Buffers() usually does not work. Who can solve this problem completely? Thank you very much.