I have been solving an optimization problem using "dsyytrs_" from MKL. The program is terminated inside MKL after a fixed number of calls (I do not know the exact number of calls, but I do know that it is fixed). I tested this by solving the same problem repeatedly, and the termination always happened on the 96th run.
I asked Microsoft to help analyze the cause of the termination, and we were told that it is caused by the function __kmp_register_root from MKL. I have copied Microsoft's analysis below. Does anyone have any clue?
------------------------------------------
I just thought I would summarize everything we discussed on the phone today for your reference. The first dump that you sent us is good enough. Your code is directly calling exit/ExitProcess, and the application is not terminating because of an exception as we initially thought. This explains why we do not see a second-chance exception in the dump.
Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root is the function that causes the application to terminate, because the condition "gtid < __kmp_threads_capacity" is not true. It looks like __kmp_threads_capacity is 64 and we are passing a value of 64 or greater. Since we do not have private symbols, it is not clear what the actual value is. The code is in kmp_runtime.c, line # 4356.
I would suggest debugging this particular code to figure out the problem. You can also set a breakpoint on __kmp_debug_assert in the function __kmp_register_root() and backtrack from there. Since you have to run x iterations before the problem happens, the link http://msdn2.microsoft.com/en-us/library/yy96wbwd(vs.80).aspx will be useful for specifying how many times to skip a breakpoint before breaking. Alternatively, as we did before, you can attach the debugger after a specified number of iterations to debug further.
Also "native" debugging should help us figure out the problem in this scenario.
Looks like your code is calling exit directly as mentioned earlier. So this has to do with some logic in the code and not any exception. Looking at the code for Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root:
Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0x77:
093a168b 8b1580c73c09 mov edx,dword ptr [Brooks_EES_R2R_Core_R2RKernel!__kmp_threads (093cc780)]
093a1691 83c301 add ebx,1
093a1694 8b349a mov esi,dword ptr [edx+ebx*4]
093a1697 8d0c1b lea ecx,[ebx+ebx]
093a169a 03c9 add ecx,ecx
093a169c 85f6 test esi,esi
093a169e 75eb jne Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0x77 (093a168b)
Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0x8c:
093a16a0 894df0 mov dword ptr [ebp-10h],ecx
Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0x8f:
093a16a3 3b1d10c83c09 cmp ebx,dword ptr [Brooks_EES_R2R_Core_R2RKernel!__kmp_threads_capacity (093cc810)] ; comparison with __kmp_threads_capacity
093a16a9 7c17 jl Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0xae (093a16c2)
Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0x97:
093a16ab 6804110000 push 1104h
093a16b0 68009e3d09 push offset Brooks_EES_R2R_Core_R2RKernel!`string' (093d9e00)
093a16b5 6880ac3d09 push offset Brooks_EES_R2R_Core_R2RKernel!`string' (093dac80)
093a16ba e8252a0000 call Brooks_EES_R2R_Core_R2RKernel!__kmp_debug_assert (093a40e4)
093a16bf 83c40c add esp,0Ch
0:000> da 093d9e00
093d9e00 "../kmp_runtime.c"
0:000> da 093dac80
093dac80 "gtid < __kmp_threads_capacity"
0:000> x Brooks_EES_R2R_Core_R2RKernel!*threads_capacity*
093cc810 Brooks_EES_R2R_Core_R2RKernel!__kmp_threads_capacity =
0:000> dd 093cc810 l1
093cc810 00000040
0:000> ? 40
Evaluate expression: 64 = 00000040 ; value of __kmp_threads_capacity that we compare against here
0:000> kb100
ChildEBP RetAddr Args to Child
075dda10 7c90e89a 7c81cd96 ffffffff 00000001 ntdll!KiFastSystemCallRet
075dda14 7c81cd96 ffffffff 00000001 00000001 ntdll!NtTerminateProcess+0xc
075ddb10 7c81cdee 00000001 77e8f3b0 ffffffff kernel32!_ExitProcess+0x62
075ddb24 79fa646a 00000001 00000001 00000001 kernel32!ExitProcess+0x14
075ddd4c 79fa6496 00000001 075ddd90 79f1c738 mscorwks!SafeExitProcess+0x11a
075ddd58 79f1c738 075ddddc 79fa3666 e84b8411 mscorwks!HandleExitProcessHelper+0x25
075ddde8 7901145b 00000001 79e80000 075dde0c mscorwks!CorExitProcess+0x242
075dddf8 10202269 00000001 79011416 79000000 mscoree!CorExitProcess+0x46
075dde0c 10202048 00000001 a987f4d1 00000001 msvcr80d!__crtCorExitProcess+0x39
075dde48 10201e00 00000001 00000000 00000000 msvcr80d!doexit+0x48
075dde5c 093a4141 00000001 00000000 075dde94 msvcr80d!exit+0x10
075dde6c 093a16bf 093dac80 093d9e00 00001104 Brooks_EES_R2R_Core_R2RKernel!__kmp_debug_assert+0x5d
075dde94 0939d329 00000000 0b34e028 075ddeac Brooks_EES_R2R_Core_R2RKernel!__kmp_register_root+0xab
075ddea4 0939bd2a 075de17c 09359c78 093c774c Brooks_EES_R2R_Core_R2RKernel!__kmp_get_global_thread_id_reg+0x4d
075ddeac 09359c78 093c774c 0069004c 0065006e Brooks_EES_R2R_Core_R2RKernel!__kmpc_global_thread_num+0x16
075de17c 0934e9e6 075de343 075de1bc 075de1b0 Brooks_EES_R2R_Core_R2RKernel!mkl_blas_p4_dgemm+0x14
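For anyone following the disassembly above, the failing check can be sketched in C roughly as follows. This is only a reconstruction from the listing, not the actual kmp_runtime.c source. The names `threads`, `register_root_sketch`, and `count_successful_registrations` are hypothetical, the starting index of the scan is an assumption, and the extra sentinel slot exists only to keep this sketch in bounds. The capacity of 64 is the value read from the dump.

```c
#include <stddef.h>
#include <string.h>

#define THREADS_CAPACITY 64 /* 0x40, the value of __kmp_threads_capacity in the dump */

/* Hypothetical stand-in for the runtime's slot table. The extra element is a
   NULL sentinel so that the scan below always stops inside the array. */
static void *threads[THREADS_CAPACITY + 1];

/* Scan for the first empty slot, then apply the same comparison as the
   assertion "gtid < __kmp_threads_capacity". Returns the slot index, or -1
   at the point where the real runtime would call __kmp_debug_assert. */
static int register_root_sketch(void *root)
{
    int gtid = 0; /* assumption: the listing does not show the start value */

    while (threads[gtid] != NULL) /* test esi,esi / jne */
        ++gtid;                   /* add ebx,1 */

    if (gtid >= THREADS_CAPACITY) /* cmp ebx,[capacity] / jl not taken */
        return -1;

    threads[gtid] = root;
    return gtid;
}

/* Register roots that are never released and count how many succeed. */
static int count_successful_registrations(void)
{
    static int dummy; /* any non-NULL pointer marks a slot as occupied */
    int count = 0;

    memset(threads, 0, sizeof threads); /* start from an empty table */
    while (register_root_sketch(&dummy) >= 0)
        ++count;
    return count;
}
```

In this sketch, once every slot is occupied and none is released, the next registration fails the comparison. Whether that is what happens in the application above is not established by the dump; it only shows that the check itself failed.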
It appears that you have already discussed this problem, possibly with someone from the compiler team, which is responsible for libguide, the library that contains the kmp functions. Is there more to resolve?
Bruce
