Our software is failing QA on an 8-core/thread system: it hangs in the main thread. The culprit seems to be ddot_direct. No other user threads are running. This is a new problem introduced with the new <D,Z,S,C>dot direct calls. The call stack at the hang is:
ntdll.dll!NtWaitForSingleObject() + 0xa bytes
KernelBase.dll!WaitForSingleObjectEx() + 0x9c bytes
libiomp5md.dll!__kmp_suspend_64() + 0x1c0 bytes
libiomp5md.dll!__kmp_barrier() + 0x32d0 bytes
libiomp5md.dll!__kmp_join_barrier() + 0x5fe bytes
libiomp5md.dll!__kmp_join_call() + 0xf1 bytes
libiomp5md.dll!__kmpc_fork_call() + 0x76 bytes
mkl_intel_thread.dll!000007fed54250c3()
[Frames below may be incorrect and/or missing, no symbols loaded for mkl_intel_thread.dll]
mkl_intel_thread.dll!000007fed539a8ba()
ddot_direct() + 0x74 bytes
Vasci, could you give more details about the parameters passed to this function?
How do you link in this case?
Is there any specific CPU on which the problem occurs?
Parameters as below ( n=2576, incx=1, incy=1)
- n 0x000000000012ccf0 const int * 2576 const int
- x 0x000000ef58eca600 const double * 2.1437157556647435e-005 const double
- incx 0x000000000012cd00 const int * 1 const int
- y 0x0000000045184380 const double * 4.4600692790355519e-016 const double
- incy 0x000000000012cd0c const int * 1 const int
ret 0.00000000000000000 double
It is linked with parallel MKL on 64-bit Windows in Visual Studio.
CPU: Intel Xeon 3.6 GHz. It also happens on other Xeon machines.
SSE :Y
SSE2 :Y
SSE3 :Y
SSSE3 :Y
SSE41 :Y
SSE42 :Y
AVX :Y
AVX2 :N
----------
OS Enabled AVX :Y
AES :Y
CLMUL :Y
RDRAND :Y
F16C :Y
Maximum number of OpenMP threads:8
MKL Version:Intel(R) Math Kernel Library Version 11.2.3 Product Build 20150413 for Intel(R) 64 architecture applications
It fails at this call to ddot_direct:
/* {S,D}DOT_DIRECT */
static __inline double mkl_dc_ddot_convert(const MKL_INT *n, const double* x, const MKL_INT *incx, const double *y, const MKL_INT *incy) {
    double ret = 0.0;
    if (MKL_DC_DDOT_CHECKSIZE(n)) {
        ret = mkl_dc_ddot((n), (x), (incx), (y), (incy));
    } else {
        ret = ddot_direct((n), (x), (incx), (y), (incy));
    }
    return ret;
}
As I said, there are no other user threads running at the time; this is being called from the main thread.
I have removed the 'direct' calls so that the 'regular' DDOT is called. Interestingly, the problem persists.
The issue is 100% reproducible and can only be worked around by setting OMP_NUM_THREADS=1.
FYI, with the regular DDOT the code hangs at:
ntdll.dll!NtWaitForSingleObject() + 0xa bytes
KernelBase.dll!WaitForSingleObjectEx() + 0x9c bytes
libiomp5md.dll!__kmp_suspend_64() + 0x1c0 bytes
libiomp5md.dll!__kmp_barrier() + 0x32d0 bytes
libiomp5md.dll!__kmp_join_barrier() + 0x5fe bytes
libiomp5md.dll!__kmp_join_call() + 0xf1 bytes
libiomp5md.dll!__kmpc_fork_call() + 0x76 bytes
mkl_intel_thread.dll!000007fedf2450c3()
[Frames below may be incorrect and/or missing, no symbols loaded for mkl_intel_thread.dll]
mkl_intel_thread.dll!000007fedf1ba8ba()
ddot() + 0x83 bytes
Further analysis:
- Replacing DDOT with my 'own' naive DDOT makes the issue go away, as expected.
- A simple test program with the same input parameters does not reproduce the problem.
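For readers who want to try the same substitution, here is a minimal sketch of what such a naive replacement might look like. This is not the poster's code: the name naive_ddot is hypothetical, and the handling of negative strides follows the reference BLAS convention as an assumption. It keeps the Fortran-style argument list (all scalars passed by pointer) so it can stand in for the ddot call shown earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical drop-in with the same argument list as the Fortran-style ddot.
   Single-threaded, no OpenMP, so it cannot wait on a thread barrier. */
static double naive_ddot(const int *n, const double *x, const int *incx,
                         const double *y, const int *incy)
{
    assert(n && x && incx && y && incy);

    double sum = 0.0;
    if (*n <= 0) {
        return sum;  /* empty vectors: the dot product is 0 */
    }

    /* Assumed reference BLAS convention: with a negative stride,
       traversal starts from the far end of the vector. */
    ptrdiff_t ix = (*incx < 0) ? (ptrdiff_t)(1 - *n) * (*incx) : 0;
    ptrdiff_t iy = (*incy < 0) ? (ptrdiff_t)(1 - *n) * (*incy) : 0;

    for (int i = 0; i < *n; ++i) {
        sum += x[ix] * y[iy];
        ix += *incx;
        iy += *incy;
    }
    return sum;
}
```

Calling naive_ddot(&n, x, &incx, y, &incy) with the parameters reported above (n=2576, incx=1, incy=1) runs entirely on the calling thread, which is consistent with the hang disappearing when the threaded MKL path is bypassed.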
Just FYI,
This program crashes instantly on the call to MKL_Thread_Free_Buffers(). I know it's a bit perverse, but this is new in the latest update.
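#include <stdlib.h>
#include <tchar.h>
#include <mkl.h>

int _tmain(int argc, _TCHAR* argv[])
{
    int n=2576;
    double *x=(double *)malloc(sizeof(double) * n);
    double *y=(double *)malloc(sizeof(double) * n);
    int incx=1;
    int incy=1;
    /* fill the input vectors */
    for(int i=0;i<n;i++){
        x[i]=i;
        y[i]=i*2;
    }
    /* called before any other MKL routine: this is where the crash occurs */
    MKL_Thread_Free_Buffers();
    for(int j=0;j<10000000;j++){
        double res=ddot(&n, x, &incx, y, &incy);
    }
    return 0;
}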
Thank you for your detailed analysis. I could reproduce the crash with Intel MKL 11.2.3 on Windows when using dynamic linking (but not with static linking). We are looking into the issue in more detail.
As a potential workaround, inserting a call to ddot (or probably any MKL call large enough for thread initialization to occur) before the MKL_Thread_Free_Buffers() call appeared to prevent the crash. Can you please check whether this workaround works for you?
Explicitly, I modified your reproducer to be:
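#include <stdlib.h>
#include <tchar.h>
#include <mkl.h>

int _tmain(int argc, _TCHAR* argv[])
{
    int n=2576;
    double *x=(double *)malloc(sizeof(double) * n);
    double *y=(double *)malloc(sizeof(double) * n);
    int incx=1;
    int incy=1;
    /* fill the input vectors */
    for(int i=0;i<n;i++){
        x[i]=i;
        y[i]=i*2;
    }
    /* workaround: make one MKL call first so thread initialization occurs */
    ddot(&n, x, &incx, y, &incy);
    MKL_Thread_Free_Buffers();
    for(int j=0;j<10000000;j++){
        double res=ddot(&n, x, &incx, y, &incy);
    }
    return 0;
}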
Hi, I had already implemented your suggested workaround, that is, not calling MKL_Thread_Free_Buffers() until some calls into MKL have been made to initialise the buffers.
I am more concerned about the ddot issue that started this thread. It's clearly a subtle issue, but it is 'new', and hopefully comparing ddot from a previous version of MKL with the latest will show why it has arisen.
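The buffer-freeing workaround discussed above (only freeing once MKL has been used) could be wrapped in a small guard. The following is a rough sketch under stated assumptions, not code from either poster: it is single-threaded, all names are hypothetical, and a stub stands in for the real MKL_Thread_Free_Buffers() so the example compiles without MKL.

```c
/* Stub standing in for MKL_Thread_Free_Buffers(); hypothetical, so the
   sketch builds without MKL. It only counts how often it was called. */
static int stub_free_calls = 0;
static void thread_free_buffers_stub(void) { stub_free_calls++; }

/* Set once any MKL routine has run on this thread (hypothetical flag). */
static int mkl_was_used = 0;

/* Call this right after each real MKL call, e.g. after ddot(). */
static void note_mkl_call(void) { mkl_was_used = 1; }

/* Frees the buffers only if MKL has already been used.
   Returns 1 if the free was performed, 0 if it was skipped. */
static int free_buffers_if_initialised(void)
{
    if (!mkl_was_used) {
        return 0;  /* nothing initialised yet: skip the free */
    }
    thread_free_buffers_stub();
    return 1;
}
```

In a real program the stub would be replaced by the actual MKL call, and because that call acts on the calling thread's buffers, the flag would presumably need to be per-thread rather than a single global.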