Intel® Fortran Compiler

FMA support

yuriisig
Beginner
2,153 Views

When does the compiler apply optimization using FMA instructions?

7 Replies
TimP
Honored Contributor III

a) The target architecture must include FMA, e.g. -mavx2, -xHost on an AVX2 platform, -mmic, -arch:AVX2 (or, in the past, IA-64, PPC, MIPS).

b) The code must contain a floating-point multiply-add sequence, e.g. a*b+c or c-a*b.

c) The compiler may analyze latencies; e.g. on IA, a dot product needs a higher degree of riffling to maximize FMA performance.
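Points b) and c) can be illustrated with a small sketch. It is written in C rather than Fortran purely for illustration, and it assumes that "riffling" refers to interleaving several independent partial sums. The function names `dot_serial` and `dot_interleaved4` are hypothetical. Each `acc += x[i] * y[i]` is a multiply-add sequence of the kind in b) that a compiler targeting FMA hardware may fuse. With a single accumulator every multiply-add has to wait for the previous one, while four accumulators give the hardware four independent chains to overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Single accumulator: each multiply-add depends on the previous result,
   so the latency of the operation limits throughput. */
double dot_serial(const double *x, const double *y, size_t n) {
    assert(x != NULL && y != NULL);
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += x[i] * y[i];   /* candidate for fusion into one FMA */
    return acc;
}

/* Four interleaved accumulators: four independent dependency chains
   that can be in flight at the same time. */
double dot_interleaved4(const double *x, const double *y, size_t n) {
    assert(x != NULL && y != NULL);
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }
    for (; i < n; ++i)        /* leftover elements when n is not a multiple of 4 */
        s0 += x[i] * y[i];
    return (s0 + s1) + (s2 + s3);
}
```

The two functions add the terms in a different order, so their results can differ in the last bits for general floating-point data. An optimizing compiler may perform this kind of interleaving on its own when it is allowed to reorder the sum.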

yuriisig
Beginner

I know points a) and b), but I didn't understand point c). The compiler has the /Qfma option, but it doesn't affect the generated code. My processor is an Intel® Core™ i7-5820K.

JVanB
Valued Contributor II

I always use /QxCORE-AVX2 to get ifort to output FMA instructions for my Core i7-4770.

 

yuriisig
Beginner

>I always use /QxCORE-AVX2 to get ifort to output FMA instructions for my Core i7-4770.

Did you look at the assembly code? Does it contain FMA instructions?

 

JVanB
Valued Contributor II

Most definitely /QxCORE-AVX2 generated assembly code with FMA instructions. I have posted disassemblies lately in this forum.

yuriisig
Beginner

>Most definitely /QxCORE-AVX2 generated assembly code with FMA instructions.

I ran the test: the FMA instructions appeared. Thanks.
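Besides reading the disassembly, a run-time check can hint at whether a compiler fused a multiply-add. The following is a minimal sketch in C (chosen for illustration, not taken from the thread), with hypothetical function names. It relies on the fact that an FMA rounds once, while a separate multiply and add round twice. For a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, the exact product is 1 - 2^-54, which rounds to 1 in double precision, so the unfused result is 0 and the fused result is -2^-54. One caveat: a target that evaluates in extended precision (such as x87) also yields the nonzero value without any FMA, so a nonzero result is only an indication.

```c
#include <assert.h>

/* Computes a*b + c for inputs chosen so that fused and unfused
   evaluation give different results. The volatile qualifiers keep the
   compiler from evaluating the expression at compile time. */
double product_sum_residual(void) {
    volatile double a = 1.0 + 0x1p-27;
    volatile double b = 1.0 - 0x1p-27;
    volatile double c = -1.0;
    double r = a * b + c;   /* the compiler may or may not fuse this */
    /* Only two outcomes are expected: 0 (product rounded before the add)
       or -2^-54 (product kept exact, as an FMA does). */
    assert(r == 0.0 || r == -0x1p-54);
    return r;
}

/* Returns 1 if the product was not rounded before the addition,
   which is what a fused multiply-add would produce; 0 otherwise. */
int product_sum_was_fused(void) {
    return product_sum_residual() != 0.0;
}
```

The result describes only this one expression under the options used to compile it. As noted elsewhere in the thread, a compiler may choose FMA in some places and not in others.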

TimP
Honored Contributor III

/Qfma has no effect in ifort unless the CORE-AVX2 target is set (often done via /QxHost). With that target, FMA is enabled by default; /Qfma- can be used to prevent the use of FMA. As I mentioned, ifort might still choose not to use FMA in situations where its longer latency would reduce performance.

I don't know why ifort would not accept /arch:AVX2 when some other compilers require that rather than /arch:CORE-AVX2.
