The C and Fortran compilers will generate FMA instructions from ordinary source code loops for which the FMA operation is appropriate, provided that

- the target instruction set includes FMA (AVX2 or newer; note that the default target is SSE, which does not support FMA), and
- the optimization level is high enough (at least -O1, but preferably -O2), and
- you have not prohibited FMA with a different compiler flag (-no-fma, or certain settings of the -fp-model flag).

There are a few cases involving reduction operations where the compiler will choose not to use FMA operations because it estimates that there will be a shorter critical path by splitting the operation (doing the multiplication earlier and the addition later).
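
As a rough illustration of the two situations described above, here is a minimal sketch in C. The function names `muladd` and `dot` are made up for this example. The first loop is the kind of plain multiply-add a compiler can typically map onto FMA instructions when the conditions above are met; the second is a reduction, where the compiler may decide to split the multiply from the add. Whether FMA is actually emitted depends on your compiler version and flags, so check the generated assembly (for example with -S) rather than relying on this sketch.

```c
#include <stddef.h>

/* Elementwise multiply-add: every iteration computes a*b + c,
   which is the pattern a single FMA instruction implements.
   (Hypothetical helper, not taken from any library.) */
void muladd(size_t n, const double *a, const double *b,
            const double *c, double *out)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] * b[i] + c[i];
}

/* Dot-product reduction: each iteration depends on the previous
   value of sum, so the add sits on the loop's critical path.
   A compiler may choose separate multiply and add instructions here
   if it estimates that this shortens the dependency chain.
   (Hypothetical helper, not taken from any library.) */
double dot(size_t n, const double *x, const double *y)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

Note that an FMA rounds once instead of twice, so results can differ in the last bit from the separate multiply and add; with small integer-valued inputs both forms give identical results.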

No, Tim. In ?axpy, a is a scalar, not a vector, and there is no vector b.
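
To make the shape of the operation concrete, here is a simplified sketch of what ?axpy computes, y := a*x + y, with a scalar a and vectors x and y. The name `ref_daxpy` is hypothetical, and the sketch assumes unit stride, so it omits the increment arguments of the real BLAS routine. It is a reference loop for illustration only, not the library implementation.

```c
#include <stddef.h>

/* Unit-stride sketch of the axpy update y := a*x + y.
   a is a scalar; x is read-only; y is both input and output.
   There is no third vector. (Hypothetical reference loop.) */
void ref_daxpy(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Each iteration is a single multiply-add, so a loop like this is a typical candidate for FMA code generation.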
