Intel® Fortran Compiler

Not using FMA?

rudi-gaelzer
New Contributor I

Using:

Intel(R) Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 16.0.3.210 Build 20160415

Processor: Intel(R) Core(TM) i7-4960X

Fedora Linux 24

According to Intel's specifications for this processor in:

http://ark.intel.com/products/77779/Intel-Core-i7-4960X-Processor-Extreme-Edition-15M-Cache-up-to-4_00-GHz

it supports the Instruction Set Extensions SSE4.2, AVX, and AES.

Hence, according to the Intel® Fortran Compiler 16.0 User and Reference Guide, if I compile with "-fma" and "-march=core-avx2" or "-xHost" the compiler should create code with fused multiply-add (FMA) instructions.

I decided to test if this is the case for me.  I've found a simple test program in:

https://www.pgroup.com/lit/articles/insider/v3n3a4.htm

which I adapted as

program testfma3
implicit none
double precision :: a, b, c, d

!a = Z'3c54c9b71a0e6500'       !  4.507E-018
a = z'bF1A28A5F3777D60'
b = Z'bf43a04556d864ae'       ! -5.989E-004
c = Z'bfc55364b6b08299'       ! -0.166

d = 0.0d0
d= a + b*c
write(6,100) "Result: ",d,"(",d,")"

100 format (" ",a15,Z,a1,e22.16,a1)
end program testfma3

I compiled the code with

ifort -march=core-avx2 -fma testfma3.f90 -o testfma3x

and

ifort -xHost -fma testfma3.f90 -o testfma3x

In both cases I got the result:

Result:                       0(0.0000000000000000E+00)


when, according to the test, I should have obtained

BBBAD89127ADE008(-.5684854190555145E-20)

Does it mean that my processor does not generate FMA instructions after all?

Thanks.

TimP
Honored Contributor III

As your CPU supports AVX, but not AVX2, there is no FMA. 
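On an x86 Linux system such as the Fedora install mentioned above, one way to check this directly is to look at the feature flags the kernel reports in /proc/cpuinfo. The following is a minimal Python sketch, not a definitive tool: the helper names `cpu_flags` and `report` are made up for illustration, and the `sample` text is a hypothetical, heavily shortened excerpt rather than real output from an i7-4960X.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. non-x86 or empty input)


def report(cpuinfo_text):
    """Say whether the flags relevant to this thread are present."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in ("avx", "avx2", "fma")}


# Hypothetical, shortened excerpt for a CPU with AVX but no AVX2/FMA.
sample = "processor\t: 0\nflags\t\t: fpu sse4_2 aes avx\n"
print(report(sample))

# On a real Linux machine, read the actual file instead:
# with open("/proc/cpuinfo") as f:
#     print(report(f.read()))
```

If `fma` is reported as absent, the hardware cannot execute FMA instructions regardless of the compiler options used.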

You might check to see whether your floating point expressions are evaluated at compile time, particularly as it seems the compiler didn't try to generate an AVX2 instruction when so requested.

Also, it appears that 0.0 may be the expected result, FMA or not.

rudi-gaelzer
New Contributor I

Oh, OK.  Thanks.

Was the AVX2 instruction set implemented only from generation 5 processors?

How exactly do I "check to see whether your floating point expressions are evaluated at compile time"?


TimP
Honored Contributor III

When I set the -S option to generate assembly code, it appears that the result was evaluated at compile time, as there are no floating-point instructions.
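For a larger assembly listing, scanning by eye is tedious, so the check can be scripted. Below is a small Python sketch that looks for FMA3 mnemonics (the vfmadd/vfmsub/vfnmadd/vfnmsub families) in assembly text. The function name `fma_lines` and the `sample_asm` snippet are hypothetical and are not taken from actual ifort output; the pattern is deliberately a little broader than the real mnemonic list.

```python
import re

# Matches FMA3 mnemonics such as vfmadd213sd, vfnmsub231pd, vfmaddsub132ps.
FMA3 = re.compile(
    r"\bvfn?m(?:addsub|subadd|add|sub)(?:132|213|231)(?:ps|pd|ss|sd)\b",
    re.IGNORECASE,
)


def fma_lines(asm_text):
    """Return the stripped lines of asm_text that contain an FMA3 mnemonic."""
    return [line.strip() for line in asm_text.splitlines() if FMA3.search(line)]


# Hypothetical assembly, for illustration only.
sample_asm = """
        vmovsd      a(%rip), %xmm0
        vfmadd213sd c(%rip), %xmm1, %xmm0
        vaddsd      %xmm1, %xmm0, %xmm0
"""
print(fma_lines(sample_asm))

# With a real listing (file name assumed from the compile commands above):
# with open("testfma3.s") as f:
#     print(fma_lines(f.read()))
```

An empty list means no FMA instruction was emitted, which is also what one sees when the whole expression has been folded at compile time.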

rudi-gaelzer
New Contributor I

Tim P. wrote:

Also, it appears that 0.0 may be the expected result, FMA or not.

Not according to the text I cited.  But I cannot verify, as I don't have access to FMA...

TimP
Honored Contributor III

The paper you referred to quoted 0.0 as the expected result without FMA, with rounding mode set to "up."  As you didn't set rounding mode, it is required to default to "nearest."  I don't know whether there are any set requirements on rounding mode and precision for compile-time evaluation.
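The difference between the two evaluations can be examined without FMA hardware by doing the arithmetic exactly in software. The Python sketch below is one such check, under these assumptions: the three bit patterns are the ones from the test program, rounding is the default round-to-nearest, and a fused multiply-add is modelled as an exact rational computation followed by a single rounding to double. The helper names `from_bits` and `to_bits` are made up for illustration.

```python
import struct
from fractions import Fraction


def from_bits(hex_bits):
    """Interpret a 16-digit hex string as an IEEE 754 double."""
    return struct.unpack(">d", bytes.fromhex(hex_bits))[0]


def to_bits(x):
    """Return the IEEE 754 bit pattern of a double as a hex string."""
    return struct.pack(">d", x).hex()


# Operands from the test program in the original post.
a = from_bits("bf1a28a5f3777d60")
b = from_bits("bf43a04556d864ae")
c = from_bits("bfc55364b6b08299")

# Separate multiply and add: the product is rounded, then the sum is rounded.
separate = a + b * c

# Model of FMA: compute a + b*c exactly, then round once to double.
exact = Fraction(a) + Fraction(b) * Fraction(c)
fused = float(exact)

print("separate:", to_bits(separate), separate)
print("fused:   ", to_bits(fused), fused)
```

The bit pattern printed for `fused` can be compared with the BBBAD89127ADE008 value quoted in the original post, and the `separate` line shows what a non-fused evaluation in round-to-nearest produces.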
