Intel® oneAPI DPC++/C++ Compiler

NaN problems continue to persist

janez-makovsek
New Contributor I

Dear Sir/Madam,


Here you have one report:


https://community.intel.com/t5/Intel-C-Compiler/Armadillo-Intel-icpx-and-NaN-s/m-p/1397322#M40132

 

It seems that this problem is still not fixed in the latest Intel DPC++ release. This:

 

double *p;

...

if (p[i] == 0)

{

}

will still evaluate to "true" if p[i] contains NaN and the optimization level is O2. It fails for all code paths (AVX512, AVX2, etc.) and for both double and float precision.

 

Disabling optimizations for the entire code in some cases is not really an option.

 

Kind Regards!
Atmapuri

1 Solution
SantoshY_Intel
Moderator

Hi,

 

When using fp-model=fast, which is the default in icx, the compiler assumes that no NaN values are present in the compilation unit, either as arguments or as results. gcc and clang make the same assumption when fast-math is enabled (for example with -ffast-math); icc does not.

Consider the code below:

#include <cmath> 
#include <iostream> 
#include <limits> 

void f(double *d) {
   *d = std::numeric_limits<double>::quiet_NaN();
} 

int main() {
   int i;
   double buf[4];
   for (i = 0; i < 4; ++i)
     f(&buf[i]);
   for (i = 0; i < 4; ++i)
     std::cout << "buf[" << i << "] = " << buf[i] << "\n";
   for (i = 0; i < 4; ++i)
     if (std::isnan(buf[i]))
       std::cout << "buf[" << i << "] is NaN\n";
     else
       std::cout << "buf[" << i << "] is not NaN\n";
   return 0;
}

 

When compiling this code with icx, clang, or gcc using "-O2 -ffast-math", the output is:

buf[0] = nan
buf[1] = nan
buf[2] = nan
buf[3] = nan
buf[0] is not NaN
buf[1] is not NaN
buf[2] is not NaN
buf[3] is not NaN 

 

This happens because the call to std::numeric_limits<double>::quiet_NaN() returns a hard-coded value, which the compiler does not modify before storing it. Streaming this value to std::cout does not modify it either, so the streamer prints "nan". However, the code that calls std::isnan() is optimized by the compiler, which assumes that the program contains no NaN values because of the fast-math option. It therefore optimizes the check away, and the program prints "buf[0] is not NaN".

 

This behavior is expected and intended, although it may not match your expectations. The icc compiler does not make the no-NaN assumption with fp-model=fast, and therefore, it does not have this problem. It produces different output for the example above, which is more consistent with what you would expect to see.

 

For the ifx compiler, we do not make the no-NaN assumption unless the user explicitly requests it. This is because the Fortran standard requires NaN comparisons to be respected. For icx and dpcpp, we decided to align our behavior with clang and gcc.

 

If you want the icc behavior, you can use the -fhonor-nan-compares option (or -fhonor-nans, which is an alias that is also accepted by clang but not gcc). This option will have a minor performance impact, but not nearly as much as using -fp-model=precise. If you want gcc-compatible options, you can use -fno-finite-math-only, but this option also stops the compiler from assuming that there are no infinity values, so it will have a slightly larger performance impact.
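As a rough sketch, the options above could be applied as shown in the comments below, and the code can report whether finite-math-only assumptions are active through the `__FINITE_MATH_ONLY__` macro that GCC and Clang predefine. The file name `nan_check.cpp` and the helper `finite_math_only_assumed` are hypothetical. How icx sets this macro under its default fp-model=fast, or with -fhonor-nan-compares alone, is an assumption that should be checked against the compiler in use.

```cpp
// Example build lines using the options named above (file name is hypothetical):
//   icpx -O2 -fhonor-nan-compares nan_check.cpp
//   icpx -O2 -fno-finite-math-only nan_check.cpp

// Hypothetical helper: reports whether this translation unit was compiled
// with finite-math-only assumptions, based on the GCC/Clang predefined macro.
// If the macro is not defined at all, the helper conservatively returns false.
constexpr bool finite_math_only_assumed() {
#if defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__
    return true;
#else
    return false;
#endif
}
```

A project that depends on NaN comparisons could use such a check in a `static_assert` to fail the build when the wrong options are in effect, instead of discovering the problem at run time.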

 

Thanks & Regards,

Santosh

 

8 Replies
SantoshY_Intel
Moderator

Hi,

 

Thanks for posting in the Intel communities.

 

>>>"And it seems that also in the latest Intel DPC++ this problem is still not fixed."

Optimizations increase speed but may affect the accuracy or reproducibility of floating-point computations. This is expected behavior when optimizations are enabled.

 

>>>"Disabling optimizations for the entire code in some cases is not really an option."

We need not disable the optimizations. Instead, we can make use of options such as: -fp-model=precise, while enabling the optimizations as mentioned in the article.

 

 

Thanks & Regards,

Santosh

 

janez-makovsek
New Contributor I

Dear Santosh,

 

>This is an expected behavior while enabling optimizations.

 

In general yes, but not when it is a regression compared to the classic C++ compiler, from which we are still trying to migrate; DPC++ is still too buggy for this.

 

>Instead, we can make use of options such as: -fp-model=precise

 

And do you guarantee that enabling this switch does not decrease performance? Then why is it not on by default?


Thanks!
Atmapuri

SantoshY_Intel
Moderator

Hi,


Could you please provide us with a sample reproducer code so that we can try reproducing the issue from our end?


Thanks & Regards,

Santosh


janez-makovsek
New Contributor I

Dear Santosh,

 

The problem is conceptual. If this switch:

-fp-model=precise

is not enabled by Intel by default, the question is of course: why?

 

The assumed answer would be: because it affects performance.

How much performance, and where?

If we were to study this on our own code, it would be a lot of work. If done separately for every compiler switch, the costs could be higher than Intel's budget allocated for DPC++ development.

 

One of the main challenges with the DPC++ compiler in the past was that it was not able to reach the performance levels of the Intel classic compiler (on average across millions of lines of code). If you now request special switches for otherwise normal operations, which should work out of the box, this does not look very reassuring.

 

Thanks!
Atmapuri

 

SantoshY_Intel
Moderator

Hi,

 

Thanks for reporting this issue. We were able to reproduce it and we have informed the development team about it.

 

Best Regards,

Santosh

 

janez-makovsek
New Contributor I

Dear Santosh,


This is a very qualified and exact answer. I appreciate your efforts very much.
Thank you for having a look at this.

 

Kind Regards!
Atmapuri

SantoshY_Intel
Moderator

Hi,


Thanks for accepting our solution. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.


Thanks & Regards,

Santosh

