a) the target architecture includes FMA, e.g. -mavx2, -xHost on an AVX2 platform, -mmic, -arch:AVX2 (or, in the past, IA64, PPC, MIPS)
b) the code contains a floating-point multiply-add sequence, e.g. a*b+c or c-a*b
c) the compiler may analyze latencies; e.g. on IA a dot product needs a higher degree of riffling (interleaving several independent partial sums) to maximize FMA performance
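To illustrate point c), here is a minimal sketch in C of a dot product "riffled" by hand into four partial sums. The function name `dot_riffled` and the choice of four accumulators are assumptions made for illustration only; a compiler may pick a different degree of interleaving, or none at all, depending on its own latency analysis. Each `s += x*y` line is a multiply-add sequence of the kind described in b), which a compiler targeting FMA hardware may fuse into a single instruction.

```c
#include <stddef.h>

/* Dot product with four independent accumulators ("riffling").
 * Each accumulator forms its own dependency chain, so one
 * multiply-add does not have to wait for the previous one to finish. */
double dot_riffled(const double *x, const double *y, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop: four independent multiply-add chains per iteration. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }

    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; ++i)
        s0 += x[i] * y[i];

    /* Combine the partial sums once, after the loops. */
    return (s0 + s1) + (s2 + s3);
}
```

Splitting the sum this way changes the order of the additions, so the rounded result can differ slightly from a single-accumulator loop. That is one reason a compiler needs a sufficiently relaxed floating-point model before it will apply the transformation automatically.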
I know points a) and b), but I didn't understand point c). The compiler has a /Qfma option, but it doesn't affect the generated code. My processor is an Intel® Core™ i7-5820K.
I always use /QxCORE-AVX2 to get ifort to output FMA instructions for my Core i7-4770.
>I always use /QxCORE-AVX2 to get ifort to output FMA instructions for my Core i7-4770.
Did you look at the assembly code? Does it contain FMA instructions?
Most definitely, /QxCORE-AVX2 generates assembly code with FMA instructions. I have posted disassemblies in this forum lately.
>Most definitely, /QxCORE-AVX2 generates assembly code with FMA instructions.
I ran the test and the FMA instructions appeared. Thanks.
/Qfma has no effect in ifort unless a core-avx2 target is set (often via /QxHost). With that target, FMA is enabled by default, and /Qfma- can be used to prevent it. As I mentioned, ifort may still choose not to use FMA in situations where its longer latency would reduce performance.
I don't know why ifort doesn't accept /arch:AVX2, when some other compilers require that spelling rather than /arch:CORE-AVX2.