I am getting a runtime error (a segmentation fault, see the trace below) while running a deep learning model on a Xeon machine with Ubuntu 16.04. The compiler reports its version as gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4.
I1030 15:56:03.653736 115861 caffe.cpp:361] Performing Backward
*** Aborted at 1509404163 (unix time) try "date -d @1509404163" if you are using GNU date ***
PC: @ 0x7f1c75d85c40 mkl_blas_avx2_xscopy
*** SIGSEGV (@0xc35fb0) received by PID 115861 (TID 0x7f1c716cbd80) from PID 12804016; stack trace: ***
@ 0x7f1c90d2ecb0 (unknown)
@ 0x7f1c75d85c40 mkl_blas_avx2_xscopy
@ 0x7f1c79702fc5 mkl_blas_scopy
@ 0x7f1c7b2c4ac3 __kmp_invoke_microtask
@ 0x7f1c7b293257 __kmp_invoke_task_func
@ 0x7f1c7b2928d5 __kmp_launch_thread
@ 0x7f1c7b2c4fa4 _INTERNAL_26_______src_z_Linux_util_cpp_16f8393c::__kmp_launch_worker()
@ 0x7f1c8e914184 start_thread
@ 0x7f1c90df237d (unknown)
@ 0x0 (unknown)
Is there a minimum required gcc version? Any ideas about what is going wrong?
Hi Sujay,
Could you please provide details about your CPU and your Caffe version (is it Intel Caffe?). Please also set the environment variable with export MKL_VERBOSE=1, run the model again, and paste the output here. Thanks.
Best regards,
Fiona
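For anyone following along, here is a minimal sketch of running a command with MKL_VERBOSE=1 set and capturing its output so it can be pasted into the thread. The helper name run_with_mkl_verbose is hypothetical, and the command in the usage part is only a stand-in that echoes the variable; replace it with your own Caffe invocation.

```python
import os
import subprocess
import sys


def run_with_mkl_verbose(cmd):
    """Run cmd with MKL_VERBOSE=1 in its environment.

    Returns (return_code, combined stdout/stderr text).
    On POSIX, a negative return code -N means the process was
    terminated by signal N (for example -11 for SIGSEGV).
    """
    env = dict(os.environ, MKL_VERBOSE="1")  # copy, do not mutate os.environ
    proc = subprocess.run(
        cmd,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # Stand-in command (an assumption for illustration): a child Python
    # process that prints the variable, to show it reaches the child.
    code, out = run_with_mkl_verbose(
        [sys.executable, "-c", "import os; print(os.environ['MKL_VERBOSE'])"]
    )
    print("return code:", code)
    print("output:", out.strip())
```

Setting the variable only for the child process keeps the verbose logging from affecting other programs started from the same shell.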
CPU: Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz
Caffe version (custom): https://github.com/onalbach/caffe-deep-shading
I have changed the BLAS library to MKL in the config file and provided the necessary include/library paths. Is there anything more that needs to be done?
MKL_VERBOSE Intel(R) MKL 2018.0 Product build 20170720 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors, Lnx 3.00GHz lp64 intel_thread NMICDev:0
MKL_VERBOSE DSCAL(64,0x7fff87de4620,0x1b8bf80,1) 3.41ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24 WDiv:HOST:+0.000
MKL_VERBOSE SGEMM(N,N,65536,8,50,0x7fff87de4ae8,0x7fc01e382010,65536,0x1b96dc0,50,0x7fff87de4af0,0x7fc01f003010,65536) 26.74ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24 WDiv:HOST:+0.000
MKL_VERBOSE SGEMM(N,N,65536,8,1,0x7fff87de4af8,0x7fc03c2d5010,65536,0x1b8cdc0,1,0x7fff87de4b00,0x7fc01f003010,65536) 485.24us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24 WDiv:HOST:+0.000
MKL_VERBOSE SGEMM(N,N,65536,1,200,0x7fff87de4ae8,0x7fbfbcdff010,65536,0x1b79c80,200,0x7fff87de4af0,0x7fc01c039010,65536) 2.22ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24 WDiv:HOST:+0.000
MKL_VERBOSE SGEMM(N,N,65536,1,1,0x7fff87de4af8,0x7fc03c294010,65536,0x1b7a1f0,1,0x7fff87de4b00,0x7fc01c039010,65536) 815.90us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24 WDiv:HOST:+0.000
MKL_VERBOSE SCOPY(1024,0x7fc00c02d010,1,0x7fc00c02d010,1) 20.30us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24 WDiv:HOST:+0.000
MKL_VERBOSE SSCAL(1024,0x7fff87de4aa8,0x7fc00c02d010,1) 14.06us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24 WDiv:HOST:+0.000
MKL_VERBOSE SCOPY(1024,0x1b9a280,1,0x7fc00c02d010,1) 773ns CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24 WDiv:HOST:+0.000
MKL_VERBOSE SSCAL(1024,0x7fff87de4aa8,0x7fc00c02d010,1) 389ns CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24 WDiv:HOST:+0.000
MKL_VERBOSE SDOT(1024,0x1bf7d00,1,0x1bf8d10,1) 14.50us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24 WDiv:HOST:+0.000
MKL_VERBOSE SDOT(1,0x1b2aa00,1,0x1b87c70,1) 905ns CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24 WDiv:HOST:+0.000
Let me know if you need more information.
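As a rough aid for reading a log like the one above, the following sketch parses MKL_VERBOSE call lines into a routine name and argument list, and flags SCOPY/DCOPY calls whose source and destination pointers are identical. The function names are hypothetical, the line format is inferred only from the log pasted here, and a same-pointer copy is merely something to inspect, not a confirmed cause of the crash.

```python
import re

# Matches lines such as: MKL_VERBOSE SCOPY(1024,0x...,1,0x...,1) 20.30us ...
# The banner line ("MKL_VERBOSE Intel(R) MKL ...") does not match.
CALL_RE = re.compile(r"^MKL_VERBOSE\s+([A-Z0-9_]+)\(([^)]*)\)")


def parse_call(line):
    """Return (routine, [args]) for a call line, or None otherwise."""
    match = CALL_RE.match(line.strip())
    if match is None:
        return None
    args = [a.strip() for a in match.group(2).split(",")]
    return match.group(1), args


def is_aliased_copy(line):
    """True if the line is xCOPY(n, x, incx, y, incy) with x == y."""
    parsed = parse_call(line)
    if parsed is None:
        return False
    name, args = parsed
    return name in ("SCOPY", "DCOPY") and len(args) == 5 and args[1] == args[3]


if __name__ == "__main__":
    # Two lines copied from the log in this thread (timing suffix shortened).
    sample = [
        "MKL_VERBOSE SCOPY(1024,0x7fc00c02d010,1,0x7fc00c02d010,1) 20.30us",
        "MKL_VERBOSE SCOPY(1024,0x1b9a280,1,0x7fc00c02d010,1) 773ns",
    ]
    for line in sample:
        print(parse_call(line), "aliased:", is_aliased_copy(line))
```

Note that the verbose log only shows calls that returned, so the call that actually crashed would not appear in it.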
Hi Sujay,
According to the MKL verbose output you provided, the MKL routines are being called successfully. I suspect the problem occurs where Caffe calls the scopy function with an invalid argument or pointer. Please check whether your Caffe program passes any invalid or wrong arguments to the Caffe APIs, or use GDB to find the Caffe call stack that leads to this runtime error.
Best regards,
Fiona
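One way to follow the GDB suggestion is to run the program non-interactively and dump a backtrace for every thread when it crashes, since the trace above shows the fault inside an OpenMP worker thread. The sketch below only builds and prints such a command line; the helper name gdb_backtrace_command is hypothetical, and ./caffe_binary with its arguments is a placeholder for your actual invocation.

```python
import shlex


def gdb_backtrace_command(program, program_args=()):
    """Build a gdb command that runs the program in batch mode and,
    once it stops (for example on SIGSEGV), prints all thread backtraces."""
    return [
        "gdb",
        "--batch",                     # exit after the commands have run
        "-ex", "run",                  # start the program
        "-ex", "thread apply all bt",  # backtrace of every thread
        "--args", program,
    ] + list(program_args)


if __name__ == "__main__":
    # Placeholder program and arguments (assumptions, not from the thread).
    cmd = gdb_backtrace_command("./caffe_binary", ["<your>", "<arguments>"])
    print(" ".join(shlex.quote(part) for part in cmd))
```

The backtrace is far more useful if Caffe is built with debug symbols, because the frames between the Caffe layer code and mkl_blas_scopy then show file names and line numbers.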
