Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

CSYTRF/CSYTRS give different results 1 thread vs 8 threads (MKL 2019u5)

AndrewC
New Contributor III

THREADS=8
first 10 values of the Solution
           1 (-104034.0,20268.75)
           2 (-104099.4,20310.57)
           3 (-103879.3,20264.78)
           4 (-103980.5,20282.24)
           5 (-104128.0,20316.93)
           6 (-103976.7,20282.51)
           7 (-104120.5,20318.15)
           8 (-103958.9,20275.50)
           9 (-104034.3,20268.56)
          10 (-104085.2,20310.74)

THREADS=1

  first 10 values of the Solution
           1 (-104393.1,20376.32)
           2 (-104459.2,20418.33)
           3 (-104238.2,20372.31)
           4 (-104339.8,20389.88)
           5 (-104488.0,20424.71)
           6 (-104336.1,20390.15)
           7 (-104480.5,20425.96)
           8 (-104318.0,20383.11)
           9 (-104393.3,20376.11)
          10 (-104445.1,20418.51)

This is being tested with MKL 2019.5, Visual Studio 2017 64-bit compiler.

The test matrix is in the .zip file and needs to be unzipped into the directory of the executable.

 

Gennady_F_Intel
Moderator

Thanks for the report, Andrew. We will check it ASAP.

Gennady_F_Intel
Moderator

Yes, I see the problem with the latest 2019 u5, and we will escalate the issue.

AndrewC
New Contributor III

Thanks!

AndrewC
New Contributor III

Hi Gennady,

Was this confirmed as an issue, and is there going to be a fix at some point?

 

Andrew

Gennady_F_Intel
Moderator

Andrew,

We could not confirm that the reported case is a bug in Intel MKL. The MKL LAPACK routines cannot guarantee bitwise reproducible results, even in strict CNR mode (see the KB article at https://software.intel.com/en-us/articles/introduction-to-the-conditional-numerical-reproducibility-cnr). Because of that, and because of unavoidable round-off errors and the different order of arithmetic operations in the sequential and parallel code branches, the deviation in the solutions observed by the user should be expected.

Gennady
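
To illustrate the point about operation order, here is a contrived sketch in plain Python, not MKL code. It emulates single-precision addition and sums the same eight numbers once sequentially and once as per-chunk partial sums, the way a threaded reduction might. The helper names (`f32`, `sum_f32`, `chunked_sum_f32`) are hypothetical and introduced only for this example, and the chunking is an assumption: how MKL actually partitions work inside CSYTRF/CSYTRS is not described in this thread.

```python
import math
import struct


def f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_f32(values):
    """Left-to-right sum, rounding to single precision after every addition."""
    total = 0.0
    for v in values:
        total = f32(total + f32(v))
    return total


def chunked_sum_f32(values, n_chunks):
    """Sum each contiguous chunk separately, then add up the partial sums.

    This mimics a reduction split across n_chunks workers (an assumption
    made for illustration only).
    """
    size = math.ceil(len(values) / n_chunks)
    partials = [sum_f32(values[i:i + size]) for i in range(0, len(values), size)]
    return sum_f32(partials)


# Near 1e8 the spacing between single-precision numbers is 8, so adding 1.0
# to 1e8 is rounded away. Which 1.0 terms survive depends on the order.
values = [1.0e8, 1.0, 1.0, 1.0, -1.0e8, 1.0, 1.0, 1.0]

print("exact sum:      ", math.fsum(values))           # 6.0
print("sequential:     ", sum_f32(values))             # 3.0
print("2 partial sums: ", chunked_sum_f32(values, 2))  # 0.0
print("4 partial sums: ", chunked_sum_f32(values, 4))  # 2.0
```

The inputs are deliberately chosen to exaggerate the effect. For well-conditioned data the discrepancy between orderings is normally confined to the last few bits, and it grows as the problem becomes more ill-conditioned.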

AndrewC
New Contributor III

Well... I realize that we should not expect bit-for-bit identical results with threads=1 vs. threads=8, but the differences are more significant than I would expect. I do not see differences of a similar magnitude (3rd or 4th significant figure) in other MKL routines when varying the number of threads.
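
For reference, the size of the discrepancy can be computed directly from the ten values posted at the top of the thread. The snippet below is a minimal sketch in Python with illustrative variable names. It reports the per-entry relative difference and compares it with single-precision machine epsilon, on the basis that the C prefix in CSYTRF denotes single-precision complex data.

```python
# First 10 solution values as posted, THREADS=8
x8 = [
    complex(-104034.0, 20268.75), complex(-104099.4, 20310.57),
    complex(-103879.3, 20264.78), complex(-103980.5, 20282.24),
    complex(-104128.0, 20316.93), complex(-103976.7, 20282.51),
    complex(-104120.5, 20318.15), complex(-103958.9, 20275.50),
    complex(-104034.3, 20268.56), complex(-104085.2, 20310.74),
]

# First 10 solution values as posted, THREADS=1
x1 = [
    complex(-104393.1, 20376.32), complex(-104459.2, 20418.33),
    complex(-104238.2, 20372.31), complex(-104339.8, 20389.88),
    complex(-104488.0, 20424.71), complex(-104336.1, 20390.15),
    complex(-104480.5, 20425.96), complex(-104318.0, 20383.11),
    complex(-104393.3, 20376.11), complex(-104445.1, 20418.51),
]

# Relative difference per entry, taking the 1-thread result as the reference
rel = [abs(a - b) / abs(b) for a, b in zip(x8, x1)]
max_rel = max(rel)

# Machine epsilon for IEEE single precision (2^-23)
eps32 = 2.0 ** -23

print(f"max relative difference:  {max_rel:.3e}")
print(f"ratio to float32 epsilon: {max_rel / eps32:.0f}")
```

On these ten entries the relative difference is roughly 3.5e-3, which matches the "3rd or 4th significant figure" description and is several orders of magnitude above single-precision epsilon. Whether rounding alone can account for that depends on the conditioning of the test matrix, which the thread does not report.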
