Intel® oneAPI DPC++/C++ Compiler

Bug report: Incorrect FMA emission on non-FMA-capable targets.

piotrtopnotch
Affects icx 2025.3.1 and older versions on x86-64.
 
With just "-O2" the following program:
 
#include <cmath>

double fn_test(double x, double y, double z) {
    return std::fma(x, y, z);
}
 
is translated into:
 
fn_test(double, double, double):
        mulsd   xmm0, xmm1
        addsd   xmm0, xmm2
        ret
 
which is totally wrong. Per the C++ std::fma specification: "Computes x * y + z as if to infinite precision and rounded only once to fit the result type."
 
The sequence above rounds twice (once after the multiply and once after the add), so the single-rounding guarantee is lost. On targets that lack an FMA instruction, the compiler should emit a call to a library function that emulates FMA correctly, as clang does, for example:
 
fn_test(double, double, double):
        jmp     fma@PLT
 
Severity: very high, as techniques like double-double immediately fail due to incorrect rounding.
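
To make the double-double concern concrete, here is a minimal sketch of the usual error-free product (often called TwoProduct), which relies on std::fma rounding only once. The names DoubleDouble, two_product_fused and two_product_unfused are hypothetical and chosen for illustration. The second function is my assumption of how to model the mulsd + addsd lowering in portable code: it uses a volatile temporary to force the product to be rounded before the add.

```cpp
#include <cmath>

// Hypothetical helper type: hi + lo is meant to equal a * b exactly.
struct DoubleDouble {
    double hi;  // product rounded to double
    double lo;  // rounding error of that product
};

// TwoProduct with a correctly rounded fma: a * b - p is computed exactly
// inside the fma, and the result is representable, so lo is the exact
// rounding error of p.
DoubleDouble two_product_fused(double a, double b) {
    double p = a * b;
    double e = std::fma(a, b, -p);
    return {p, e};
}

// Models the mulsd + addsd sequence (an assumption made for illustration):
// the volatile store forces the product to be rounded to double before the
// subtraction, so the error term is always lost.
DoubleDouble two_product_unfused(double a, double b) {
    double p = a * b;
    volatile double rounded_product = a * b;
    double e = rounded_product - p;
    return {p, e};
}
```

For example, with a = b = 1 + 2^-27 the exact product is 1 + 2^-26 + 2^-54, which rounds to 1 + 2^-26. The fused version recovers the missing 2^-54 in lo, while the unfused version returns lo == 0, so every later double-double step works with a silently truncated value.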
 
 
 

 

Viet_H_Intel
Moderator

This happens because icx uses different default floating-point settings than Clang.

By default, icx uses -fp-model=fast. If you compile with -fp-model=precise, you get:

        jmp     fma@PLT    # TAILCALL

 

Likewise, if you compile with clang -O2 -ffp-model=fast, you get:

        mulsd   xmm0, xmm1
        addsd   xmm0, xmm2
        ret
piotrtopnotch

Thank you. This default is very dangerous, but otherwise your remark is valid.  
