mriedman

Novice

04-11-2019
02:07 AM

I'm seeing slight numerical deviations between Sandy Bridge and Haswell when I run the very same executable.

It is built with ifort 17, plain Fortran, run with a single thread, not using MKL or other external libs, linked with -static-intel, and run on the same OS (RHEL7). The compile switches are quite aggressive (-O3 -xAVX -fp-model fast=2 -no-prec-div -no-prec-sqrt -ftz -fast-transcendentals), but that does not explain it: sequential execution of the same binary should reproduce exactly.

The usual suspect would then be uninitialized data. None is detected by any of the known runtime checks. If uninitialized data caused the deviation, I should be able to reproduce it even on the same processor type, but I can't. I tried a number of different machines: v3 and v4 produce identical results on any machine.

So I can't see any further reason for such behaviour except maybe changes in the FPUs between v1 and v3. Is there any data or explanation?

The deviations are really small; however, in my industry such behaviour is not easily accepted.

Accepted Solutions

Steve_Lionel

Black Belt Retired Employee

04-11-2019
07:18 AM

No, -fp-model has no effect on the math library. What you may want is -fimf-arch-consistency=true.

--

Steve (aka "Doctor Fortran") - https://stevelionel.com/drfortran

6 Replies

jimdempseyatthecove

Black Belt

04-11-2019
05:13 AM

At the beginning of your program add:

    USE, INTRINSIC :: IEEE_ARITHMETIC
    TYPE(IEEE_ROUND_TYPE) ROUND
    ...
    CALL IEEE_GET_ROUNDING_MODE(ROUND) ! Stores the rounding mode
    PRINT *, ROUND

Run this on both CPU types to verify that the initial rounding mode is the same.

You may also want to call IEEE_SET_ROUNDING_MODE with your choice of mode: IEEE_DOWN, IEEE_NEAREST, IEEE_TO_ZERO, or IEEE_UP.
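
A minimal sketch of this rounding-mode check in standard Fortran. IEEE_ROUND_TYPE is an opaque derived type, so some compilers may not accept it directly in a PRINT list; this version compares the mode against the named constants instead. The module and function names (rounding_check, rounding_mode_name) are made up for illustration.

```fortran
module rounding_check
  use, intrinsic :: ieee_arithmetic
  implicit none
contains
  ! Return a readable name for the rounding mode currently in effect.
  function rounding_mode_name() result(name)
    character(len=12) :: name
    type(ieee_round_type) :: mode

    call ieee_get_rounding_mode(mode)
    if (mode == ieee_nearest) then
      name = 'IEEE_NEAREST'
    else if (mode == ieee_to_zero) then
      name = 'IEEE_TO_ZERO'
    else if (mode == ieee_up) then
      name = 'IEEE_UP'
    else if (mode == ieee_down) then
      name = 'IEEE_DOWN'
    else
      name = 'IEEE_OTHER'
    end if
  end function rounding_mode_name
end module rounding_check

program show_rounding
  use rounding_check
  implicit none
  ! Print the mode once at startup; compare the output across machines.
  print '(a)', trim(rounding_mode_name())
end program show_rounding
```

If the same name is printed on both CPU types, the rounding mode can be ruled out as the source of the deviation.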

Jim Dempsey

Steve_Lionel

Black Belt Retired Employee

04-11-2019
05:48 AM

The math library does CPU dispatching, which can result in small differences. See sc13.supercomputing.org/sites/default/files/WorkshopsArchive/pdfs/wp129s1.pdf for more details.
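
To see which math library results actually differ between two CPUs, one option is to print the raw bit patterns of a few intrinsic results and diff the output of both runs. The following is only a sketch: hex_bits and the sample inputs are illustrative and not taken from the application discussed in this thread.

```fortran
module bit_dump
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
contains
  ! Format a double as 16 hex digits of its raw bit pattern.
  function hex_bits(x) result(s)
    real(real64), intent(in) :: x
    character(len=16) :: s
    write (s, '(z16.16)') transfer(x, 0_int64)
  end function hex_bits
end module bit_dump

program dump_intrinsics
  use bit_dump
  implicit none
  real(real64) :: x
  integer :: i

  do i = 1, 5
    ! command_argument_count() is only known at run time, which keeps the
    ! compiler from evaluating log and ** at compile time.
    x = 1.0_real64 + 0.1_real64 * real(i + command_argument_count(), real64)
    print '(f6.2, 2(1x, a))', x, hex_bits(log(x)), hex_bits(x**1.7_real64)
  end do
end program dump_intrinsics
```

Any line that differs between the two machines points at an intrinsic whose result depends on the dispatched code path.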

--

Steve (aka "Doctor Fortran") - https://stevelionel.com/drfortran

mriedman

Novice

04-11-2019
06:23 AM

Thanks, gentlemen,

the rounding mode is always IEEE_NEAREST.

@Steve, will -fp-model precise disable this CPU dispatching?

Michael

mriedman

Novice

04-12-2019
04:27 AM

Thanks, indeed that option resolved the issue.

The price is a 10% runtime increase. Compiled code is not affected, but the math intrinsics (pow, log) are: their runtime roughly doubles.
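
A rough way to measure this cost in isolation is a small loop over log and ** that is built once with and once without -fimf-arch-consistency=true. This is a sketch under assumptions: the loop count, the inputs and the names (intrinsic_timing, work) are arbitrary, and the ratio will depend on the compiler version and on whether the loop is vectorized.

```fortran
module intrinsic_timing
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
contains
  ! Sum log(x) + x**p over n inputs. The sum is returned and printed by the
  ! caller so that the compiler cannot discard the loop.
  function work(n, p) result(total)
    integer, intent(in) :: n
    real(real64), intent(in) :: p
    real(real64) :: total, x
    integer :: i

    total = 0.0_real64
    do i = 1, n
      x = 1.0_real64 + real(i, real64) * 1.0e-6_real64
      total = total + log(x) + x**p
    end do
  end function work
end module intrinsic_timing

program time_intrinsics
  use intrinsic_timing
  implicit none
  integer(int64) :: t0, t1, rate
  real(real64) :: total

  call system_clock(t0, rate)
  total = work(10000000, 1.7_real64)
  call system_clock(t1)

  ! The checksum doubles as a consistency check across machines.
  print '(a, es23.16)', 'checksum: ', total
  print '(a, f8.3, a)', 'elapsed:  ', real(t1 - t0, real64) / real(rate, real64), ' s'
end program time_intrinsics
```

Comparing the elapsed time of the two builds on one machine gives the overhead; comparing the checksum of one build across machines shows whether the results are consistent.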

Anyway I now have an explanation and - if somebody insists - a solution.

Steve_Lionel

Black Belt Retired Employee

04-12-2019
05:16 AM

As I wrote in the presentation: Accuracy, Performance, Consistency: Pick two.

--

Steve (aka "Doctor Fortran") - https://stevelionel.com/drfortran
