Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Some digits differ from serial to distributed Cholesky

Georgios_S_
New Contributor II
540 Views

I am using the double-precision Cholesky routine in C. To check the result, I also run the serial version. For smaller inputs the results are the same, but for a 6500x6500 matrix some digits differ. Should I worry?

Here are the last numbers of the results that differ (attached image: diff.png):

 

0 Kudos
1 Solution
Ying_H_Intel
Employee
540 Views

Hi George, 

Right, your program should not have a bug and the results are acceptable.

As for the digits differing between the serial and the distributed run: in most cases this behavior is acceptable, as explained in the KB article: https://software.intel.com/en-us/articles/getting-reproducible-results-with-intel-mkl

We have also introduced the Conditional Numerical Reproducibility (CNR) feature, described in https://software.intel.com/en-us/articles/introduction-to-the-conditional-numerical-reproducibility-cnr. If you would like to keep the results bitwise consistent, you may try this feature.

Best Regards,

Ying  

0 Kudos
8 Replies
TimP
Honored Contributor III
540 Views

A change in the least significant bit of the result could show up in the 16th decimal place, and is almost certainly well within the accuracy of your solution. Such changes are inevitable when the number of threads changes. Some might be avoided by following the advice to ensure 32-byte alignment of data, and more might be avoided (at the cost of performance) by invoking the conditional numerical reproducibility feature of MKL.
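
To illustrate why the grouping of operations matters, here is a minimal sketch in plain C. It is not MKL code and does not reproduce what the Cholesky routines do internally; `sum_chunked` and `max_rel_diff` are hypothetical helper names chosen for this example. The first function adds the same numbers in differently grouped partial sums, roughly the way a reduction is split across threads or processes, and the second reports the largest relative difference between two result arrays so that they can be compared against a tolerance instead of digit by digit.

```c
#include <stddef.h>

/* Absolute value without needing libm. */
static double abs_d(double v) { return v < 0.0 ? -v : v; }

/* Sum x[0..n) by splitting it into `chunks` contiguous pieces, summing each
   piece on its own, then adding the partial sums. Changing `chunks` changes
   the order of the floating-point additions, not the mathematical sum. */
double sum_chunked(const double *x, size_t n, size_t chunks)
{
    double total = 0.0;
    for (size_t c = 0; c < chunks; ++c) {
        size_t begin = c * n / chunks;
        size_t end = (c + 1) * n / chunks;
        double partial = 0.0;
        for (size_t i = begin; i < end; ++i)
            partial += x[i];
        total += partial;
    }
    return total;
}

/* Largest relative difference |a[i] - b[i]| / max(|a[i]|, |b[i]|)
   over two arrays of length n; entries that are both zero are skipped. */
double max_rel_diff(const double *a, const double *b, size_t n)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double fa = abs_d(a[i]), fb = abs_d(b[i]);
        double scale = fa > fb ? fa : fb;
        if (scale > 0.0) {
            double d = abs_d(a[i] - b[i]) / scale;
            if (d > worst)
                worst = d;
        }
    }
    return worst;
}
```

With `x = {0.1, 0.2, 0.3}`, calling `sum_chunked(x, 3, 1)` and `sum_chunked(x, 3, 2)` gives two values that differ in the last bit under standard IEEE double arithmetic, and `max_rel_diff` on those two values stays below `DBL_EPSILON` from `<float.h>`. For a large factorization the accumulated difference can be somewhat larger than one bit, so the tolerance to compare against is a judgment call.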

0 Kudos
Georgios_S_
New Contributor II
540 Views

So, do you think that my code is OK, or that it has some bug?

About the 32-byte alignment: how can this be achieved, or is that too broad a question? Also, do you have any tutorial on how to use the conditional reproducibility feature?

0 Kudos
TimP
Honored Contributor III
540 Views

If you are calling MKL from ifort, the option -align array32byte should help with data alignment. For C and C++, there are various GCC and MSVC alignment qualifiers. MKL conditional reproducibility is well covered in the MKL docs.
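
For C, one portable way to get 32-byte-aligned buffers is the C11 function `aligned_alloc`. The sketch below assumes a C11 compiler and C library; `alloc_aligned_doubles` and `is_aligned_32` are hypothetical helper names for this example, not MKL functions. MSVC does not provide `aligned_alloc` (it offers `_aligned_malloc` instead), and MKL has its own `mkl_malloc`/`mkl_free` pair that takes an alignment argument, which is not used here so that the example builds without MKL.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for n doubles on a 32-byte boundary using C11 aligned_alloc.
   Returns NULL on failure; release the buffer with free(). */
double *alloc_aligned_doubles(size_t n)
{
    const size_t alignment = 32;
    size_t bytes = n * sizeof(double);
    /* C11 asks for the size to be a multiple of the alignment, so round up. */
    size_t padded = (bytes + alignment - 1) / alignment * alignment;
    if (padded == 0)
        padded = alignment;
    return aligned_alloc(alignment, padded);
}

/* Return nonzero if p sits on a 32-byte boundary. */
int is_aligned_32(const void *p)
{
    return ((uintptr_t)p % 32) == 0;
}
```

A buffer obtained with `alloc_aligned_doubles(n)` can be filled and passed to the library like any other `double *`. On the reproducibility side, the MKL documentation describes the `mkl_cbwr_set` function and the `MKL_CBWR` environment variable; check there for the supported settings and their limits.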

Such small changes aren't indicative of a program fault, but evidently there was enough interest in this to motivate development of conditional reproducibility options.

0 Kudos
Georgios_S_
New Contributor II
540 Views

That means my program doesn't have a bug. That's what I needed, Tim, thanks.

0 Kudos
Georgios_S_
New Contributor II
540 Views

Hi Ying,

Thanks for the good links. About the number of threads: I think this applies to the number of processes too, since I am running a distributed job rather than a multithreaded one. Do you agree?

Best regards,

George

0 Kudos
Ying_H_Intel
Employee
540 Views

Hi George, 

Right, for a distributed job it is the number of processes that matters.

Best Regards,

Ying 

0 Kudos
Georgios_S_
New Contributor II
540 Views

Hi Ying,

Thanks. That means my project is likely ready (the professor seems to agree too), so this might be my last post here. However, I hope I will be able to help any future user who has questions related to my experience here.

Thanks for all the help, and bravo for your good forum (it is you, the people, who make it good!),

George

0 Kudos