Hi all,
with the help of valgrind I noticed that LAPACKE_ssygvx calls LAPACKE_sge_nancheck on the input matrix.
I told LAPACKE_ssygvx to use only the lower triangular part of the symmetric matrix (the other part is not even initialized), but LAPACKE_sge_nancheck also accesses the upper triangular part, and the routine returns an error if it finds a NaN there.
The attached program verifies this. I tested it with 2018.3.222 and with 2019.3.199; both versions are affected.
I guessed MKLD-3999 (Fixed the issue LAPACKE_ssyevd fails when upper triangular part of the matrix is filled with random numbers) could be a fix, but it is not.
Compiler used: g++ 8.1.0, Linux
valgrind output:
==28297== Conditional jump or move depends on uninitialised value(s)
==28297== at 0x4023C7: LAPACKE_sge_nancheck (in mkl_bug)
==28297== by 0x401FCA: LAPACKE_ssygvx (in mkl_bug)
==28297== by 0x401AE1: main (in mkl_bug)
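To make the difference concrete, here is a minimal C++ sketch (not MKL's actual code) contrasting a check that scans the whole matrix with one that honors uplo = 'L'. The helper names has_nan_full, has_nan_lower and example_b are hypothetical, and column-major storage is assumed, so that index 2 of a 2x2 matrix is the upper off-diagonal entry.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: scans every entry of an n-by-n column-major matrix,
// the way a general-matrix NaN check would.
bool has_nan_full(const std::vector<float>& a, std::size_t n, std::size_t lda) {
    assert(lda >= n && a.size() >= lda * n);
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < n; ++i)
            if (std::isnan(a[i + j * lda])) return true;
    return false;
}

// Hypothetical helper: scans only the lower triangle (row i >= column j),
// i.e. the entries selected by uplo = 'L'.
bool has_nan_lower(const std::vector<float>& a, std::size_t n, std::size_t lda) {
    assert(lda >= n && a.size() >= lda * n);
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = j; i < n; ++i)
            if (std::isnan(a[i + j * lda])) return true;
    return false;
}

// 2x2 example: valid lower triangle, NaN in the unused upper entry (index 2).
std::vector<float> example_b() {
    return {0.7f, -0.3f, std::numeric_limits<float>::quiet_NaN(), 0.6f};
}
```

For example_b(), has_nan_full(example_b(), 2, 2) reports a NaN while has_nan_lower(example_b(), 2, 2) does not, which is the behavior I would expect from a triangle-aware check.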
This issue should have been fixed in the latest 2019 u3, and one of our customers confirmed the fix. Thanks for the reproducer, we will check.
]$ ./a.out
A=1 -0.1 0.5
B=0.7 -0.3 0.6
MKL_VERBOSE Intel(R) MKL 2019.0 Update 3 Product build 20190125 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled processors, Lnx 2.80GHz lp64 intel_thread
MKL_VERBOSE SSYGVX(1,V,I,L,2,0x246f080,2,0x246f0a0,2,0x7ffd3b506068,0x7ffd3b506070,2,2,0x7ffd3b506078,0,0x7ffd3b506280,0x7ffd3b506288,2,0x7ffd3b506178,-1,0x2480280,0x7ffd3b506290,0) 58.57us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:20
MKL_VERBOSE SSYGVX(1,V,I,L,2,0x246f080,2,0x246f0a0,2,0x7ffd3b506068,0x7ffd3b506070,2,2,0x7ffd3b506078,1,0x7ffd3b506280,0x7ffd3b506288,2,0x2481300,16,0x2480280,0x7ffd3b506290,0) 278.46us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:20
A=1.42857 0.57197 1.2684
B=0.83666 -0.358569 0.686607
Found eigenvalues: 1
Return: 0
w(lambda)=1.92603
z(x)=1.31147 0.955791
ifail=0
I have no valgrind available at the moment to check your exact case. Do you see the run-time problem in your application with the latest 2019 u3?
Gennady F. (Blackbelt) wrote: I have no valgrind available at the moment to check your exact case. Do you see the run-time problem in your application with the latest 2019 u3?
I have also seen the problem with version 2019.03.199.
You can verify it by activating this line in the reproducer:
B[2] = std::numeric_limits<float>::signaling_NaN();
In this case nothing is done (A and B are unchanged, no eigenvalue computation is performed) and the return value is -9.
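As a reading aid for the -9: LAPACKE routines report an invalid i-th argument by returning -i, and b is the ninth parameter of LAPACKE_ssygvx. The small C++ sketch below maps a negative return value to the parameter name; ssygvx_argument_name is a hypothetical helper, and the name list is my transcription of the LAPACKE_ssygvx prototype, so please check it against your header.

```cpp
#include <array>
#include <string>

// Hypothetical helper: translate a negative LAPACKE_ssygvx return value into
// the name of the offending parameter (position 1 = matrix_layout).
// Returns an empty string for non-negative values and for values outside the
// parameter range (e.g. LAPACKE's memory-allocation error codes).
std::string ssygvx_argument_name(int info) {
    static const std::array<const char*, 20> names = {
        "matrix_layout", "itype", "jobz", "range", "uplo",
        "n", "a", "lda", "b", "ldb",
        "vl", "vu", "il", "iu", "abstol",
        "m", "w", "z", "ldz", "ifail"};
    if (info >= 0) return "";
    const int position = -info;
    if (position > static_cast<int>(names.size())) return "";
    return names[static_cast<std::size_t>(position - 1)];
}
```

With this, ssygvx_argument_name(-9) gives "b", which matches the NaN having been written into B.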
$ ./a.out
A=1 -0.1 0.5
B=0.7 -0.3 0.6
A=1 -0.1 0.5
B=0.7 -0.3 0.6
Found eigenvalues: 0
Return: -9
a.out: mkl_bug.cpp:60: int main(): Assertion `m == 1' failed.
Do you see a similar result?
Gennady F. (Blackbelt) wrote: Do you see a similar result?
I get the same result. You can see that changing an entry (B[2] or A[2]) in the upper-right triangle changes the result, even though I told the function to use only the lower part ("L").
Thank you very much for your reproducer and thorough testing. You're right - this is an issue and will be fixed in an upcoming release of Intel MKL. In the meantime, if this is hampering your (or others') development, NaN checking can be disabled by setting the LAPACKE_NANCHECK environment variable to 0 or by calling the LAPACKE_set_nancheck function.
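A minimal sketch of the environment-variable workaround from inside the program, for POSIX systems only. The helper name disable_lapacke_nancheck_via_env is hypothetical, and the sketch assumes the library reads LAPACKE_NANCHECK lazily on the first LAPACKE call (the reference LAPACKE does this; I have not verified it for MKL), so the helper has to run before any LAPACKE routine.

```cpp
#include <stdlib.h>  // setenv (POSIX)

// Hypothetical helper: set LAPACKE_NANCHECK=0 for the current process.
// Assumption: must be called before the first LAPACKE routine, because the
// library is assumed to read the variable only once.
// Returns true if the variable was set successfully.
bool disable_lapacke_nancheck_via_env() {
    return setenv("LAPACKE_NANCHECK", "0", /*overwrite=*/1) == 0;
}
```

Calling disable_lapacke_nancheck_via_env() as the first statement of main() should be equivalent to starting the program with LAPACKE_NANCHECK=0 in the environment. The other route mentioned above, LAPACKE_set_nancheck, needs the MKL headers and is not shown here.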
>mkl_bug.exe
A=1 -0.1 0.5
B=0.7 -0.3 0.6
MKL_VERBOSE Intel(R) MKL 2020.0 Product build 20191125 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors, Win 2.60GHz cdecl intel_thread
MKL_VERBOSE SSYGVX(1,V,I,L,2,000001ACACF52660,2,000001ACACF52980,2,0000008FD24FF670,0000008FD24FF678,2,2,0000008FD24FF690,0,0000008FD24FF840,0000008FD24FF848,2,0000008FD24FF700,-1,000001ACACF57680,0000008FD24FF8 13.03us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:2
MKL_VERBOSE SSYGVX(1,V,I,L,2,000001ACACF52660,2,000001ACACF52980,2,0000008FD24FF670,0000008FD24FF678,2,2,0000008FD24FF690,1,0000008FD24FF840,0000008FD24FF848,2,000001ACACF79580,16,000001ACACF57680,0000008FD24FF8 266.67us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:2
A=1.42857 0.57197 1.2684
B=0.83666 -0.358569 0.686607
Found eigenvalues: 1
Hello!
The fix for this issue is available in the newest MKL version, 2020. You can take this version and check the problem on your side.