FedyuninV
Beginner

MKL gemm results when multithreading

Good day!

I couldn't find this information in the documentation, so I'm asking here.

Does the gemm operation guarantee that the same call (same input data and sizes) returns exactly the same result when only OMP_NUM_THREADS differs between launches (OMP_NUM_THREADS=1,2,...)?

 

Is it possible to get different results between the sequential and threaded versions (with identical inputs)?

Do these rules apply to older MKL versions? (2018/19/20)

7 Replies
VidyalathaB_Intel
Moderator

Hi,

 

Thanks for reaching out to us.

 

>>Is it possible to get different results between the sequential and threaded versions (with identical inputs)?

 

We tried running a sample code that uses the gemm operation (dgemm) and observed that the results are the same in both the sequential and threaded versions.

The results are also unaffected when launching with different OMP_NUM_THREADS values, as described in your query (OMP_NUM_THREADS=1,2,3,...).

The sample code we used was taken from:

https://software.intel.com/content/www/us/en/develop/documentation/mkl-tutorial-c/top/measuring-effe...

 

If you observe any differences in results, please let us know and provide a sample reproducer.

 

>>Do these rules apply to older MKL versions? (2018/19/20)

 

We tested the same code with older versions as well, and the results are unchanged.

Could you please let us know your environment details (OS & version, MKL version, compiler)?

 

Regards,

Vidya.

 

Kirill_V_Intel
Employee

Hi!

 

The answer by @VidyalathaB_Intel is not correct. There can be differences between the sequential and multi-threaded versions. If you need bitwise-reproducible results, please refer to the so-called CNR mode in oneMKL:

https://software.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/...

and strict CNR mode (which I believe is what you need; "strict" refers to independence from the number of threads):
https://software.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/...
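
The underlying reason thread count can matter is that floating-point addition is not associative: a multithreaded gemm may partition the reduction into partial sums and combine them in a different order than the sequential code, so rounding can differ. A minimal Python sketch of the effect (not MKL-specific, just plain IEEE 754 doubles):

```python
# Floating-point addition is not associative, so the order in which
# partial sums are combined changes the rounded result.
a = [0.1, 0.2, 0.3]

seq = (a[0] + a[1]) + a[2]   # "1 thread": strict left-to-right reduction
par = a[0] + (a[1] + a[2])   # "2 threads": partial sums combined differently

print(seq == par)  # False: 0.6000000000000001 vs 0.6
```

The same reordering happens inside a threaded gemm, which is why CNR mode exists.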


Best,
Kirill

Spencer_P_Intel
Employee

Howdy,

So it turns out what you are asking about is what we in MKL call "conditional numerical reproducibility" (CNR); see https://software.intel.com/content/www/us/en/develop/articles/introduction-to-the-conditional-numeri....  Reproducibility is not guaranteed in MKL across all optimized code paths. However, there is an environment variable (or an API you can call from your application code) that enables CNR, essentially selecting different code paths designed for this feature, so that results are consistent from run to run for a fixed number of threads.  Documentation is here: https://software.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/...

Additionally, there is a "Strict Conditional Numerical Reproducibility" mode for a subset of MKL APIs, which allows the number of threads to be changed while still getting exactly the same results.  See https://software.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/... for more details.
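
As a sketch of the environment-variable route (the MKL_CBWR variable and the AUTO,STRICT value are documented for strict CNR; the program name below is hypothetical), no code changes are needed:

```shell
# Enable strict CNR: results stay bitwise identical even when the
# number of threads changes, for the routines that support it.
export MKL_CBWR=AUTO,STRICT

# Run your application (hypothetical binary name) with different
# thread counts; outputs should now match bit-for-bit.
OMP_NUM_THREADS=1 ./your_gemm_app
OMP_NUM_THREADS=4 ./your_gemm_app
```

The equivalent in code is the mkl_cbwr_set() API, called before any other MKL routine.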

Best,

Spencer

VidyalathaB_Intel
Moderator

Apologies for the incorrect reply from my side. Thanks, @Kirill_V_Intel for pointing it out. There can be differences between sequential and multi-threaded versions.




VidyalathaB_Intel
Moderator

Hi,


Reminder:


Has the information provided above helped? If yes, could you please confirm whether we can close this thread from our end?


Regards,

Vidya.


FedyuninV
Beginner

Yes. Thanks for the clarifications! Thread can be closed now.

VidyalathaB_Intel
Moderator

Hi,


Thanks for the confirmation.

As the issue is resolved, we are closing this thread. Please post a new question if you need any additional information from Intel, as this thread will no longer be monitored.


Regards,

Vidya.

