topic Re: MKL numerical stability and threading in Intel® oneAPI Math Kernel Library

MKL numerical stability and threading

millidred — Tue, 24 Nov 2009 13:31:18 GMT

Hello

I am wondering how, or whether, the number of threads affects the numerical results computed by MKL.

Section 8.1 of the MKL (version 10.1) user's guide states that

"With a given Intel MKL version, the outputs will be bit-for-bit identical provided all the following conditions are met:
the outputs are obtained on the same platform;
the inputs are bit-for-bit identical;
the input arrays are aligned identically at 16-byte boundaries."

Does this really mean that I can link the threaded MKL libraries, vary the value of OMP_NUM_THREADS, and still expect bit-for-bit identical results, provided the above conditions are met?

Thanks for your comments!
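
For readers who want to check the third condition in the quoted passage, here is a minimal sketch in C. It assumes one reading of "aligned identically at 16-byte boundaries": that each input array sits at the same offset from a 16-byte boundary in every run being compared. The helper name offset_mod_16 is hypothetical and is not part of MKL.

```c
#include <stdint.h>

/* Return the distance in bytes from the previous 16-byte boundary.
   A result of 0 means the pointer is 16-byte aligned.
   Hypothetical helper for illustration, not an MKL function. */
unsigned offset_mod_16(const void *p)
{
    return (unsigned)((uintptr_t)p % 16u);
}
```

Logging this value for every input array in each run would show whether two runs really had identically aligned inputs before their outputs are compared.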

Re: MKL numerical stability and threading

Dmitry_B_Intel — Tue, 24 Nov 2009 15:52:46 GMT

Hello,

One more condition must be met: MKL must run in sequential mode. In general, the exact result of a threaded algorithm may depend not only on the number of threads but also on the order in which the threads execute. For example, the OpenMP specification states: "...comparing one parallel run to another (even if the number of threads used is the same), there is no guarantee that bit-identical results will be obtained...".

Thanks
Dima

Re: MKL numerical stability and threading

dbacchus — Tue, 24 Nov 2009 17:15:03 GMT

Quoting - Dmitry Baksheev (Intel)

Dima, could you please provide a link or reference to the above-mentioned OpenMP specification? It makes a lot of sense, of course: e.g., if one calculates a product of many variables (in a parallel loop), the result will depend on the order of multiplication due to the truncation errors.
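
The order dependence is easy to reproduce without any threads. The sketch below uses addition rather than multiplication, and the function names sum_forward and sum_backward are made up for illustration. Both functions add the same values, but in opposite order.

```c
#include <assert.h>
#include <stddef.h>

/* Sum x[0..n-1] from the first element to the last. */
double sum_forward(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += x[i];
    return s;
}

/* Sum the same elements from the last to the first. */
double sum_backward(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    double s = 0.0;
    for (size_t i = n; i > 0; --i)
        s += x[i - 1];
    return s;
}
```

For the input {1.0, 1e100, -1e100}, the forward sum absorbs the 1.0 into 1e100 and ends at 0.0, while the backward sum cancels the two large terms first and ends at 1.0 (assuming the compiler is not allowed to reorder floating-point operations, i.e. no -ffast-math or equivalent). A parallel reduction that combines partial results in a varying order is exposed to the same effect.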

Re: MKL numerical stability and threading

TimP — Tue, 24 Nov 2009 17:19:15 GMT

Quoting - dbacchus

OpenMP is expected to produce numerical variations when reduction operators are in use.
The quotation appears in OpenMP standard http://www.openmp.org/mp-documents/spec30.pdf pg 98

Re: MKL numerical stability and threading

dbacchus — Tue, 24 Nov 2009 17:39:05 GMT

Thanks, tim18!

Re: MKL numerical stability and threading

TimP — Tue, 24 Nov 2009 18:22:49 GMT

Quoting - tim18

OpenMP is expected to produce numerical variations when reduction operators are in use.
The quotation appears in OpenMP standard http://www.openmp.org/mp-documents/spec30.pdf pg 98

By the way, and off the current topic, much more severe variations are observed in certain implementations of MPI reduction operators. The implementors of Intel MPI (and apparently Open MPI) have achieved satisfactory results, so it seems to be treated as a Quality of Implementation issue rather than a standards question.
Certain hybrid OpenMP/MPI applications have options to bypass OpenMP reduction so as to permit changing the number of threads (but not number of MPI processes), without producing numerical variations. Typically, there is a significant performance penalty involved in avoiding OpenMP or MPI reductions.

Re: MKL numerical stability and threading

yuriisig — Tue, 24 Nov 2009 19:48:41 GMT

Quoting - tim18

Is it possible to give a concrete example for OpenMP?

Re: MKL numerical stability and threading

TimP — Tue, 24 Nov 2009 20:04:47 GMT

Quoting - yuriisig

Is it possible to give a concrete example for OpenMP?

Do you mean an example of a commercial application which offers the user a choice of alternate code paths with or without OpenMP reduction? LS-DYNA/SMP and LS-DYNA/hybrid offer such options.

Re: MKL numerical stability and threading

yuriisig — Tue, 24 Nov 2009 21:19:50 GMT

Quoting - tim18

LS-DYNA/SMP and LS-DYNA/hybrid offer such options.

Is there a simpler example than LS-DYNA?

Re: MKL numerical stability and threading

TimP — Tue, 24 Nov 2009 22:01:21 GMT

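Do you mean how the source code offers a choice between OpenMP reduction and no reduction?
// read user option, set/reset omp_reduction_ok
// the if clause applies to the parallel construct: when omp_reduction_ok is 0, the loop runs on one thread
#pragma omp parallel for reduction(+:sumall) if(omp_reduction_ok)
...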

Compiler options that prevent vectorized sum reduction might also be set if there is no control over data alignment, e.g. /fp:source for Intel Windows compilers, or omitting -ffast-math for GNU compilers. Alignment-dependent code is not needed on recent CPUs such as Barcelona or Core i7, but compilers tend to generate it to optimize for earlier CPUs.
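
Another way to keep a threaded sum independent of the thread count, sketched here as an illustration and not taken from LS-DYNA or MKL, is to avoid the reduction clause: split the data into a fixed number of chunks, sum each chunk serially, and combine the partial sums in a fixed order. NCHUNKS and chunked_sum are hypothetical names, and the sketch assumes the compiler does not reorder floating-point operations.

```c
#include <assert.h>
#include <stddef.h>

#define NCHUNKS 8  /* hypothetical fixed chunk count, chosen independently of the thread count */

/* Sum x[0..n-1] so that the rounding does not depend on how many
   threads execute the loop. Assumes n * NCHUNKS fits in size_t. */
double chunked_sum(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    double partial[NCHUNKS];

    /* Each chunk is summed in index order by exactly one thread.
       If OpenMP is not enabled, the pragma is ignored and the loop is serial. */
    #pragma omp parallel for schedule(static)
    for (int c = 0; c < NCHUNKS; ++c) {
        size_t begin = n * (size_t)c / NCHUNKS;
        size_t end = n * (size_t)(c + 1) / NCHUNKS;
        double s = 0.0;
        for (size_t i = begin; i < end; ++i)
            s += x[i];
        partial[c] = s;
    }

    /* Combine the partial sums serially, always in the same order. */
    double total = 0.0;
    for (int c = 0; c < NCHUNKS; ++c)
        total += partial[c];
    return total;
}
```

Because the chunk boundaries and the final combination order depend only on n and NCHUNKS, changing OMP_NUM_THREADS changes which thread handles a chunk but not the sequence of additions. The price is the one mentioned earlier in the thread: the parallelism is capped at NCHUNKS and the final combination is serial.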

Re: MKL numerical stability and threading

millidred — Wed, 25 Nov 2009 09:52:56 GMT

Quoting - Dmitry Baksheev (Intel)

Hi Dima

It's strange... Using MKL 10.1 on intel64 I seem to get bit-identical results for 1 to 4 threads. I have tested e.g. the dnrm2 and dgemm BLAS functions.

Regards,
Roman

Re: MKL numerical stability and threading

TimP — Wed, 25 Nov 2009 14:26:36 GMT

Quoting - millidred

It's strange... Using MKL 10.1 on intel64 I seem to get bit-identical results for 1 to 4 threads. I have tested e.g. the dnrm2 and dgemm BLAS functions.

dgemm may not require reduction operators when implemented efficiently. The reduction-like operation in dnrm2 may not involve roundoff variations with the order of operations, even if it is OpenMP threaded. In any case, it may be difficult to expose and test every opportunity for the variations that the OpenMP standard warns about; the standard tells you only that there is no guarantee.

Re: MKL numerical stability and threading

Dmitry_B_Intel — Wed, 25 Nov 2009 17:03:56 GMT

Quoting - millidred

Hi Dima

It's strange... Using MKL 10.1 on intel64 I seem to get bit-identical results for 1 to 4 threads. I have tested e.g. the dnrm2 and dgemm BLAS functions.

Regards,
Roman

Hi Roman,

You've been lucky to get bit-to-bit reproducible results in your dgemm tests. Function dnrm2 is not parallel in that version of MKL, so no surprise there. I attach a dgemm test that will fail if run long enough.

Thanks
Dima
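
The attached test is not reproduced in this thread. As a rough sketch of the comparison step only, not of that test, the helper below checks two result arrays bit for bit. The name bitwise_equal is hypothetical. It uses memcmp on the raw bytes because == would treat +0.0 and -0.0 as equal and would never report two NaNs as equal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if a[0..n-1] and b[0..n-1] have identical bit patterns, 0 otherwise. */
int bitwise_equal(const double *a, const double *b, size_t n)
{
    assert((a != NULL && b != NULL) || n == 0);
    return n == 0 || memcmp(a, b, n * sizeof(double)) == 0;
}
```

A reproducibility test would run the same routine repeatedly on identical inputs, keep the first output as the reference, and call this helper on every later output.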

Re: MKL numerical stability and threading

millidred — Thu, 26 Nov 2009 14:10:06 GMT

Quoting - Dmitry Baksheev (Intel)

Hi Dima

Thanks for your test code. It clearly shows that the parallel dgemm routine does not produce bit-to-bit identical results. I must indeed have been lucky in my tests.

Regards,
Roman