AndrewC
New Contributor I
68 Views

CSYTRF/CSYTRS give different results 1 thread vs 8 threads (MKL 2019u5)

THREADS=8
first 10 values of the Solution
           1 (-104034.0,20268.75)
           2 (-104099.4,20310.57)
           3 (-103879.3,20264.78)
           4 (-103980.5,20282.24)
           5 (-104128.0,20316.93)
           6 (-103976.7,20282.51)
           7 (-104120.5,20318.15)
           8 (-103958.9,20275.50)
           9 (-104034.3,20268.56)
          10 (-104085.2,20310.74)

THREADS=1

  first 10 values of the Solution
           1 (-104393.1,20376.32)
           2 (-104459.2,20418.33)
           3 (-104238.2,20372.31)
           4 (-104339.8,20389.88)
           5 (-104488.0,20424.71)
           6 (-104336.1,20390.15)
           7 (-104480.5,20425.96)
           8 (-104318.0,20383.11)
           9 (-104393.3,20376.11)
          10 (-104445.1,20418.51)

This is being tested with MKL 2019.5, Visual Studio 2017 64-bit compiler.

The test matrix is in the .zip file and needs to be unzipped into the directory of the executable.

 

6 Replies
Gennady_F_Intel
Moderator

Thanks for the report, Andrew. We will check it as soon as possible.

Gennady_F_Intel
Moderator

Yes, I see the problem with the latest 2019 Update 5, and we will escalate the issue.

AndrewC
New Contributor I

Thanks!

AndrewC
New Contributor I

Hi Gennady,

Was this confirmed as an issue, and is there going to be a fix at some point?

 

Andrew

Gennady_F_Intel
Moderator

Andrew,

We could not confirm that the reported case is a bug in Intel MKL. The MKL LAPACK routines cannot guarantee bitwise reproducible results even in the strict CNR mode (see the KB article at https://software.intel.com/en-us/articles/introduction-to-the-conditional-numerical-reproducibility-cnr). Because of that, and because of unavoidable round-off errors and the different order of arithmetic operations in the sequential and parallel code branches, the deviation in the solutions observed by the user should be expected.

Gennady
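To illustrate the point about operation order, here is a minimal sketch in Python (not MKL code, and not how MKL actually partitions its work). It emulates single-precision accumulation, since CSYTRF/CSYTRS work in single-precision complex, and compares a left-to-right sum with a chunked sum of the kind a threaded reduction might perform. The helper names `f32`, `sum_sequential`, and `sum_chunked` are made up for this example.

```python
import struct

def f32(x):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_sequential(values):
    """Left-to-right accumulation, rounding to single precision after each add."""
    acc = 0.0
    for v in values:
        acc = f32(acc + v)
    return acc

def sum_chunked(values, chunks):
    """Sum each chunk separately, then combine the partial sums.

    This only mimics the shape of a threaded reduction; it is an
    assumption for illustration, not MKL's actual scheme.
    """
    size = (len(values) + chunks - 1) // chunks
    partials = [sum_sequential(values[i:i + size])
                for i in range(0, len(values), size)]
    return sum_sequential(partials)

# The exact sum of these four numbers is 2.0.
values = [1.0e8, 1.0, -1.0e8, 1.0]

print("sequential:", sum_sequential(values))
print("2 chunks:  ", sum_chunked(values, 2))
```

Both results differ from the exact sum and from each other, purely because the additions are grouped differently. How large such effects become in a real solve depends on the matrix, so this says nothing by itself about whether the deviation reported here is reasonable.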

AndrewC
New Contributor I

Well... I realize that we should not expect bit-for-bit identical results with threads=1 vs. threads=8, but the differences are more significant than I would expect. I do not see differences of a similar magnitude (3rd or 4th significant figure) in other MKL routines when varying the number of threads.
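To put a number on "3rd or 4th significant figure", the sketch below computes the norm-wise relative difference between the two sets of ten values posted above, and divides it by the single-precision unit roundoff (2^-24). Only the ten posted values are used, so this is a rough indication rather than a measure over the full solution vector. Reading the ratio as an amplification factor relies on the usual rule of thumb that forward error scales roughly with condition number times unit roundoff, which is an assumption here because the conditioning of the test matrix is not stated in the thread.

```python
import math

# First 10 solution values as posted in this thread (real, imaginary).
threads_8 = [
    complex(-104034.0, 20268.75), complex(-104099.4, 20310.57),
    complex(-103879.3, 20264.78), complex(-103980.5, 20282.24),
    complex(-104128.0, 20316.93), complex(-103976.7, 20282.51),
    complex(-104120.5, 20318.15), complex(-103958.9, 20275.50),
    complex(-104034.3, 20268.56), complex(-104085.2, 20310.74),
]
threads_1 = [
    complex(-104393.1, 20376.32), complex(-104459.2, 20418.33),
    complex(-104238.2, 20372.31), complex(-104339.8, 20389.88),
    complex(-104488.0, 20424.71), complex(-104336.1, 20390.15),
    complex(-104480.5, 20425.96), complex(-104318.0, 20383.11),
    complex(-104393.3, 20376.11), complex(-104445.1, 20418.51),
]

def norm2(vec):
    """Euclidean norm of a list of complex numbers."""
    return math.sqrt(sum(abs(z) ** 2 for z in vec))

# Norm-wise relative difference between the two runs.
diff = [a - b for a, b in zip(threads_8, threads_1)]
rel_diff = norm2(diff) / norm2(threads_1)

# Unit roundoff for IEEE-754 single precision.
eps_single = 2.0 ** -24
implied_amplification = rel_diff / eps_single

print(f"relative difference: {rel_diff:.2e}")
print(f"relative difference / unit roundoff: {implied_amplification:.1e}")
```

The relative difference comes out at roughly 3.5e-3, several tens of thousands of times the single-precision unit roundoff. Whether that is plausible round-off or a sign of something else depends on the condition number of the test matrix, which would have to be estimated from the matrix in the .zip file.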
