topic Do ?axpy work for you? in Intel® oneAPI Math Kernel Library

How to use fused multiply–add with MKL?

Konovalov__Pavel — Tue, 12 Dec 2017 14:08:28 GMT

I want to compute the basic element-wise operation a*x + b, where a, x and b are vectors (or matrices), using the processor's FMA capabilities. I think that if I use v?Mul followed by v?Add, I will get two separate operations. How can I use FMA with MKL and the Intel compiler? Must I use FMA intrinsics only?

TimP — Tue, 12 Dec 2017 15:22:13 GMT

Does ?axpy work for you?

Konovalov__Pavel — Tue, 12 Dec 2017 16:04:11 GMT

No, Tim. In ?axpy, a is a scalar, not a vector, and there is no vector b.
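To make the mismatch concrete, here is a minimal sketch of what the BLAS ?axpy operation computes, y := alpha*x + y, for unit strides. The function name ref_daxpy is a hypothetical stand-in written in plain C for illustration; it is not the MKL routine itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper (not an MKL call): mirrors the BLAS
   ?axpy semantics y := alpha*x + y with unit strides. */
void ref_daxpy(size_t n, double alpha, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        /* alpha is a single scalar shared by every element,
           and y is both the addend and the destination */
        y[i] = alpha * x[i] + y[i];
    }
}
```

Because alpha is one scalar and y is overwritten in place, this shape covers neither a per-element multiplier a[i] nor a separate addend b[i] that must be preserved.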

McCalpinJohn — Tue, 12 Dec 2017 19:13:39 GMT

The C and Fortran compilers will generate FMA instructions from ordinary source code loops for which the FMA operation is appropriate, provided that

the target instruction set includes FMA (AVX2 or newer -- note that the default is SSE, which does not support FMA), and
the optimization level is high enough (at least O1, but preferably O2), and
you have not prohibited FMA with a different compiler flag (-no-fma or some options to the -fp-model flag).

There are a few cases involving reduction operations where the compiler will choose not to use FMA operations because it estimates that there will be a shorter critical path by splitting the operation (doing the multiplication earlier and the addition later).
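As an illustration of the kind of ordinary loop this applies to, here is a minimal sketch of the element-wise operation from the original question. The function name muladd and the example command lines in the comment are assumptions for illustration; whether FMA instructions are actually emitted depends on the compiler, its version, and the flags in effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: y[i] = a[i]*x[i] + b[i].
   Example builds (assumed, adjust to your toolchain):
     icc -O2 -xCORE-AVX2 muladd.c
     gcc -O2 -mavx2 -mfma muladd.c
   restrict promises the compiler that the arrays do not overlap,
   which makes vectorization easier. */
void muladd(size_t n, const double *restrict a, const double *restrict x,
            const double *restrict b, double *restrict y)
{
    assert(a != NULL && x != NULL && b != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        /* one multiply feeding one add: a candidate for a single FMA */
        y[i] = a[i] * x[i] + b[i];
    }
}
```

No intrinsics are needed in this form; the multiply and the add sit in one expression, so the compiler is free to fuse them when the conditions listed above hold.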

Konovalov__Pavel — Tue, 12 Dec 2017 22:51:13 GMT

Thank you, John! Is there any indication in the compiler output that a loop has been "FMA-ed" :) ? Something like the vectorization report?
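One way to probe this from the program side, independent of any compiler report, is a small numerical check: pick inputs for which a fused multiply-add and a separately rounded multiply and add give different results. The sketch below is only an assumption-laden illustration. The function name muladd_is_fused is hypothetical, it assumes IEEE 754 double arithmetic with round-to-nearest, and it reports only how this one expression was compiled under the current flags, not what happened to any particular loop.

```c
#include <assert.h>

/* Hypothetical probe: returns 1 if the expression a*b + c below was
   evaluated with a single rounding (fused), 0 if the product was
   rounded to double before the addition. */
int muladd_is_fused(void)
{
    /* volatile stops the compiler from folding the result at compile time */
    volatile double a = 1.0 + 0x1p-27;
    volatile double b = 1.0 - 0x1p-27;
    volatile double c = -1.0;

    /* The exact product is 1 - 2^-54. Rounded to double it becomes 1.0,
       so the unfused result is 0.0; a fused evaluation keeps -2^-54. */
    double r = a * b + c;

    assert(r == 0.0 || r == -0x1p-54);
    return r != 0.0;
}
```

For a specific loop, a more direct check is to ask the compiler for assembly output (for example with -S) and look for vfmadd instructions such as vfmadd231pd in the loop body.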