Intel® oneAPI DPC++/C++ Compiler

2025.1 floating point issue

may_ka
Beginner

Hi,

 

After installing the 2025.1 oneAPI Base Kit and HPC Kit release, I stumbled over substantial differences in calculation results.

 

After some digging, it turned out that the culprit is icpx.

Comparing the respective calculations, I found that g++ 14.2, clang++ 19.1, and icpx 2024.0.2 all deliver the same results irrespective of the floating-point model (i.e., irrespective of whether "-ffast-math" is set). icpx 2025.1 also delivers the same results, but only when using "-O0 -g" or "-O3 -fno-fast-math".

 

Is this a bug, or does icpx 2025.1 require different flags?

 

The calculations involve multiplication, addition, subtraction, square root, power, and absolute value; for the last three, std::sqrt, std::pow, and std::abs are used.

 

Any idea?

 

Thanks

 

OS: Linux

Kernel: 6.14

Sravani_K_Intel
Moderator

Could you provide a test case that demonstrates the differences in results?

may_ka
Beginner

Hi.

 

I tried hard but could not isolate the problem into a small reproducer; my current code base is over 100,000 lines. However, I have gotten to the point where there appears to be something wrong with loop optimization.

 

What I have found so far is that the options "-O0", "-O1", and "-O2" without any "fp-model" specification do not exhibit the problem. My understanding is that "-fp-model=fast" is the default irrespective of the "-O" setting.

 

However, "-O2 -qopenmp" produces wrong results, while "-O2 -qopenmp -fp-model=precise" produces correct results.

 

I was wondering whether I could run tests with "-O2" plus, successively, each of the options that "-O3" sets.

I have tried to identify all options that "-O3" enables, but the manual says nothing specific about them:

"Performs O2 optimizations and enables more aggressive loop
transformations such as Fusion, Block-Unroll-and-Jam, and collapsing IF
statements.
This option may set other options. This is determined by the compiler,
depending on which operating system and architecture you are using. The
options that are set may change from release to release."

 

Maybe you can provide those options.

Sravani_K_Intel
Moderator

The default behavior of fp-model=fast in the oneAPI C/C++ compiler 2024.2 has been updated to honor NaNs and infinities. This is a change from versions 2024.1 and earlier, where fp-model=fast assumed NaN and infinite values would not appear as inputs or outputs. Can you try the options -fno-honor-nans -fno-honor-infinities to see if that helps your case?

may_ka
Beginner

Hi,

 

I tried with the following compile flags:

(1) icpx -march=cascadelake -std=c++20 -fPIE -std=gnu++20 -ferror-limit=4 -frelaxed-template-template-args -O3 -qopenmp -fno-honor-nans -fno-honor-infinities

(2) icpx -march=cascadelake -std=c++20 -fPIE -std=gnu++20 -ferror-limit=4 -frelaxed-template-template-args -O3 -fno-honor-nans -fno-honor-infinities

(3) icpx -march=cascadelake -std=c++20 -fPIE -std=gnu++20 -ferror-limit=4 -frelaxed-template-template-args -g -D_GLIBCXX_DEBUG_PEDANTIC -g -O0 -fp-model=precise -c

where the first two invocations produce results different from the third and from g++ and clang++. That is, your proposed flags did not resolve the issue.

Sravani_K_Intel
Moderator

Thanks for trying. As for the fp-model, the above-mentioned change is the only one made in the 2024.2 release. We will need a test case to investigate the behavior you are observing further.
