Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

MKL Termination

Jianping_Z_
Beginner
599 Views

I have solved an optimization problem using "dsyytrs_" from MKL. The program terminates inside MKL after a fixed number of calls. (I do not know the exact number of MKL calls, but I do know that it is fixed.) I tested this by solving the same problem repeatedly, and the termination always happened on the 96th run.

I asked Microsoft to help analyze the cause of the termination, and we were told that it comes from the function __kmp_register_root in MKL. I have copied the analysis from Microsoft below. Does anyone have any clue?

------------------------------------------


I thought I would summarize everything we discussed on the phone today for your reference. The first dump that you sent us is good enough. Your code is calling exit/ExitProcess directly; the application is not terminating because of an exception, as we initially thought. This explains why we do not see a second-chance exception in the dump.



Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root is the function that terminates the application, because the condition "gtid < __kmp_threads_capacity" is not true. It looks like __kmp_threads_capacity is 64 and the gtid being checked is 64 or greater. Since we do not have private symbols, it is not clear what the actual value is. The code is in kmp_runtime.c, line 4356.
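To make the failing condition concrete, here is a minimal sketch of a fixed-capacity slot table that refuses a registration once every slot is taken. It is not the source of kmp_runtime.c: the names `thread_slots`, `register_root_sketch` and `registrations_until_failure` are hypothetical, the scan is assumed to start at slot 0, and the assumption that slots are never released is mine. Only the capacity of 64 and the condition "gtid < __kmp_threads_capacity" come from the dump.

```c
#include <assert.h>
#include <stddef.h>

/* Matches the value 0x40 read from __kmp_threads_capacity in the dump. */
#define THREADS_CAPACITY 64

/* Hypothetical stand-in for the runtime's table of registered threads. */
static void *thread_slots[THREADS_CAPACITY];
static int dummy_thread; /* placeholder object marking a slot as occupied */

/*
 * Scan for the first free slot and claim it.
 * Returns the slot index (the "gtid"), or -1 at the point where the real
 * runtime would fail the check "gtid < __kmp_threads_capacity" and exit.
 */
static int register_root_sketch(void)
{
    int gtid = 0;
    while (gtid < THREADS_CAPACITY && thread_slots[gtid] != NULL)
        gtid++;
    if (!(gtid < THREADS_CAPACITY))
        return -1; /* table is full */
    thread_slots[gtid] = &dummy_thread;
    return gtid;
}

/* Count how many registrations succeed before the table is exhausted. */
static int registrations_until_failure(void)
{
    int count = 0;
    while (register_root_sketch() >= 0) {
        count++;
        assert(count <= THREADS_CAPACITY); /* can never exceed the table size */
    }
    return count;
}
```

In this sketch the failure appears on the 65th registration if nothing is ever released. It does not by itself explain why the termination in the post shows up on the 96th run; that depends on how many new root threads each run registers, which the dump does not show.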



I suggest you debug this particular code to figure out the problem. You can also set a breakpoint on __kmp_debug_assert in the function __kmp_register_root() and backtrack from there. Since you have to run a certain number of iterations before the problem happens, the link http://msdn2.microsoft.com/en-us/library/yy96wbwd(vs.80).aspx will be useful: it describes how to specify how many times a breakpoint is skipped before breaking. Alternatively, as we did before, you can attach the debugger after a specified number of iterations and debug further from there.
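If the debugger's hit-count feature is not convenient, the same effect can be approximated in the test harness itself. The following is a rough sketch only: `hit_counter` and `hit_counter_tick` are hypothetical helper names, and the target of 96 is taken from the run number reported in the post.

```c
#include <stdbool.h>

/* Hypothetical helper that tracks how many times the solve loop has run. */
typedef struct {
    unsigned long calls;  /* iterations seen so far */
    unsigned long target; /* iteration on which to stop, e.g. 96 */
} hit_counter;

/* Returns true exactly once: on the call whose ordinal equals `target`. */
static bool hit_counter_tick(hit_counter *hc)
{
    hc->calls++;
    return hc->calls == hc->target;
}
```

Call `hit_counter_tick` at the top of each iteration, before the solve. When it returns true, either put an ordinary breakpoint on that branch or, with the Microsoft compiler, call the `__debugbreak()` intrinsic there so that an attached debugger stops just before the failing run.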



Also "native" debugging should help us figure out the problem in this scenario.





As mentioned earlier, your code appears to be calling exit directly, so the termination comes from logic in the code and not from an exception. Here is the code for Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root:





Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0x77:

093a168b 8b1580c73c09 mov edx,dword ptr [Brooks_EES_R2R_Core_R2RKernel!__kmp_threads (093cc780)]

093a1691 83c301 add ebx,1

093a1694 8b349a mov esi,dword ptr [edx+ebx*4]

093a1697 8d0c1b lea ecx,[ebx+ebx]

093a169a 03c9 add ecx,ecx

093a169c 85f6 test esi,esi

093a169e 75eb jne Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0x77 (093a168b)



Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0x8c:

093a16a0 894df0 mov dword ptr [ebp-10h],ecx



Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0x8f:

093a16a3 3b1d10c83c09 cmp ebx,dword ptr [Brooks_EES_R2R_Core_R2RKernel!__kmp_threads_capacity (093cc810)] ; comparison with __kmp_threads_capacity

093a16a9 7c17 jl Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0xae (093a16c2)



Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0x97:

093a16ab 6804110000 push 1104h

093a16b0 68009e3d09 push offset Brooks_EES_R2R_Core_R2RKernel!`string' (093d9e00)

093a16b5 6880ac3d09 push offset Brooks_EES_R2R_Core_R2RKernel!`string' (093dac80)

093a16ba e8252a0000 call Brooks_EES_R2R_Core_R2RKernel!__kmp_debug_assert (093a40e4)

093a16bf 83c40c add esp,0Ch





0:000> da 093d9e00

093d9e00 "../kmp_runtime.c"

0:000> da 093dac80

093dac80 "gtid < __kmp_threads_capacity"



0:000> x Brooks_EES_R2R_Core_R2RKernel!*threads_capacity*

093cc810 Brooks_EES_R2R_Core_R2RKernel!__kmp_threads_capacity =



0:000> dd 093cc810 l1

093cc810 00000040

0:000> ? 40

Evaluate expression: 64 = 00000040 ; the value of __kmp_threads_capacity that we compare against here







0:000> kb100

ChildEBP RetAddr Args to Child

075dda10 7c90e89a 7c81cd96 ffffffff 00000001 ntdll!KiFastSystemCallRet

075dda14 7c81cd96 ffffffff 00000001 00000001 ntdll!NtTerminateProcess+0xc

075ddb10 7c81cdee 00000001 77e8f3b0 ffffffff kernel32!_ExitProcess+0x62

075ddb24 79fa646a 00000001 00000001 00000001 kernel32!ExitProcess+0x14

075ddd4c 79fa6496 00000001 075ddd90 79f1c738 mscorwks!SafeExitProcess+0x11a

075ddd58 79f1c738 075ddddc 79fa3666 e84b8411 mscorwks!HandleExitProcessHelper+0x25

075ddde8 7901145b 00000001 79e80000 075dde0c mscorwks!CorExitProcess+0x242

075dddf8 10202269 00000001 79011416 79000000 mscoree!CorExitProcess+0x46

075dde0c 10202048 00000001 a987f4d1 00000001 msvcr80d!__crtCorExitProcess+0x39

075dde48 10201e00 00000001 00000000 00000000 msvcr80d!doexit+0x48

075dde5c 093a4141 00000001 00000000 075dde94 msvcr80d!exit+0x10

075dde6c 093a16bf 093dac80 093d9e00 00001104 Brooks_EES_R2R_Core_R2RKernel!__kmp_debug_assert+0x5d

075dde94 0939d329 00000000 0b34e028 075ddeac Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0xab

075ddea4 0939bd2a 075de17c 09359c78 093c774c Brooks_EES_R2R_Core_R2RKernel!__kmp_get_global_thread_id_reg+0x4d

075ddeac 09359c78 093c774c 0069004c 0065006e Brooks_EES_R2R_Core_R2RKernel!__kmpc_global_thread_num+0x16

075de17c 0934e9e6 075de343 075de1bc 075de1b0 Brooks_EES_R2R_Core_R2RKernel!mkl_blas_p4_dgemm+0x14





1 Reply
Intel_C_Intel
Employee

It appears that you have already discussed this problem, possibly with someone from the compiler team, which is responsible for libguide, the library that contains the kmp functions. Is there more to resolve?

Bruce
