
In that case the results cannot change if the initial conditions and constraints are the same ...

this is what happens in our implementation.

Thank you

Gianluca


Hi Gianluca,

I wrote you an email today; I hope it clears up the inconsistency problem.

Theoretically, provided the initial conditions and constraints are the same and the run-time execution order is the same, the implemented Trust Region method is deterministic.

Best Regards,

Ying

Some discussion from the Intel MKL User's Guide about floating-point computation and the MKL implementation:

Intel® Math Kernel Library (Intel® MKL) offers functions and environment variables that help you obtain Conditional Numerical Reproducibility (CNR) of floating-point results when calling the library functions from your application. These controls enable Intel MKL to run in a special mode in which functions return bitwise reproducible floating-point results from run to run under the following conditions:

• Calls to Intel MKL occur in a single executable

• The number of computational threads used by the library does not change during the run
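As a sketch of how these controls are typically used, CNR can be requested through the `MKL_CBWR` environment variable (documented in the MKL User's Guide) together with a fixed thread count; the application name below is hypothetical:

```shell
# Request bitwise-reproducible MKL results across runs (a sketch).
export MKL_CBWR=COMPATIBLE    # or a specific branch: SSE2, AVX, AVX2, or AUTO
export MKL_NUM_THREADS=4      # keep the thread count fixed from run to run
./your_mkl_application        # hypothetical application binary
```

The same mode can be selected programmatically with `mkl_cbwr_set()` before the first MKL call.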

It is well known that for general single and double precision IEEE floating-point numbers, the associative property does not always hold, meaning (a+b)+c may not equal a+(b+c). Let's consider a specific example. In infinite precision arithmetic, 2^-63 + 1 + (-1) = 2^-63. If this same computation is done on a computer using double precision floating-point numbers, a rounding error is introduced, and the order of operations becomes important:

(2^-63 + 1) + (-1) ≃ 1 + (-1) = 0

versus

2^-63 + (1 + (-1)) ≃ 2^-63 + 0 = 2^-63

This inconsistency in results due to the order of operations is precisely what the new functionality addresses.

The application-related factors that affect the order of floating-point operations within a single executable program include the selection of a code path based on run-time processor dispatching, alignment of data arrays, variation in the number of threads, threaded algorithms, and internal floating-point control settings. You can control most of these factors by controlling the number of threads and the floating-point settings and by taking steps to align memory when it is allocated (see the Getting Reproducible Results with Intel® MKL knowledge base article for details). However, run-time dispatching and certain threaded algorithms do not allow users to make changes that can ensure the same order of operations from run to run.

Intel MKL does run-time processor dispatching in order to identify the appropriate internal code paths to traverse for the Intel MKL functions called by the application. The code paths chosen may differ across a wide range of Intel processors and Intel architecture-compatible processors and may provide differing levels of performance. For example, an Intel MKL function running on an Intel® Pentium® 4 processor may run one code path, while on the latest Intel® Xeon® processor it will run another code path. This happens because each unique code path has been optimized to match the features available on the underlying processor. One key way that the new features of a processor are exposed to the programmer is through the instruction set architecture (ISA). Because of this, code branches in Intel MKL are designated by the latest ISA they use for optimizations: from the Intel® Streaming SIMD Extensions 2 (Intel® SSE2) to the Intel® Advanced Vector Extensions 2 (Intel® AVX2). The feature-based approach introduces a challenge: if any of the internal floating-point operations are done in a different order or are re-associated, the computed results may differ.

Dispatching optimized code paths based on the capabilities of the processor on which the code is running is central to the optimization approach used by Intel MKL. So it is natural that consistent results require some performance trade-offs. If limited to a particular code path, the performance of Intel MKL can in some circumstances degrade by more than half. To understand this, note that matrix-multiply performance nearly doubled with the introduction of new processors supporting Intel AVX2 instructions. Even if the code branch is not restricted, performance can degrade by 10-20% because the new functionality restricts algorithms to maintain the order of operations.


OK, thank you. Now it is clear.

BR

Gianluca


For developers, please see the summary post at

https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/737444

Best Regards,

Ying
