Intel® Fortran Compiler

How do I determine at runtime which vector instructions are being used when compiling with -ax?

crtierney42
New Contributor I

In a few weeks, we will have another generation of Intel HPC systems. We will then have systems that support SSE4.2 (Nehalem, Westmere), AVX (Sandy Bridge, Ivy Bridge), and CORE-AVX2 (Haswell) optimizations. Since the compile nodes are being upgraded to Haswell as well, I want to tell users to specify something other than -xHost when using Intel Fortran, so that binaries remain backward compatible and run on any of the clusters. I plan to tell users to compile with -xSSE4.2 -axCORE-AVX2,AVX (an SSE4.2 baseline plus alternate AVX and CORE-AVX2 code paths).

 

My questions are:

 

1) Is this the best way to compile codes so that they use the appropriate vector instruction set for whichever system they run on?

2) How do I confirm that the correct code path is taken on a given system? I have looked at the assembly and can see that the instructions are there, but I want to verify at runtime that the intended code path is actually being executed.

 

Thanks,

Craig

3 Replies
Steven_L_Intel1
Employee

1) Yes, that's good advice.

2) I don't think there's any supported method to determine that. You can certainly query the processor to see which instruction sets it supports, though the method of doing that is rather arcane. You could run the program under VTune on selected systems and look at instruction counts for specific regions.
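For the "query the processor" part, here is a minimal sketch that assumes a Linux node, where the kernel lists CPU feature flags in /proc/cpuinfo. The function names `cpu_flags` and `best_isa` are made up for this illustration. Note that this only reports what the processor supports; it does not show which code path the binary actually took.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. non-x86 or empty input)


def best_isa(flags):
    """Map Linux CPU flags to the most capable of the three targets in this thread."""
    # Haswell-class processors report both 'avx2' and 'fma'.
    if "avx2" in flags and "fma" in flags:
        return "CORE-AVX2"
    if "avx" in flags:
        return "AVX"
    if "sse4_2" in flags:
        return "SSE4.2"
    return "older than SSE4.2"
```

On a compute node this could be called as `best_isa(cpu_flags(open("/proc/cpuinfo").read()))`, for example from a job prologue, to record which of the three classes each node falls into.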

crtierney42
New Contributor I

Steve,

Querying the processor isn't enough. I need to know that the CORE-AVX2-specific instructions, not the SSE4.2 instructions, are being executed when the program runs on Haswell.

I will fire up VTune and see what it tells me.

Thanks, Craig

Steven_L_Intel1
Employee

Just keep in mind that the compiler decides whether it is worthwhile to generate a path with the AVX2 instructions. Depending on your code, it may choose not to do so.

Another thing you can do to test is run the program under gdb, set a breakpoint at the routine, and step through the instructions to see what it does. This may be difficult with high optimization and inlining, though.
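As a rough aid for the gdb approach, the sketch below assumes you have saved the instructions gdb shows at the breakpoint (for example, the output of `x/40i $pc`) to a text file, and scans that listing for vector evidence. It is a heuristic, not a supported method, and the function name `widest_vector_evidence` is made up for this illustration. It relies on the fact that FMA instructions (vfmadd..., vfnmsub..., and so on) are not available to the SSE4.2 or AVX targets, so seeing one executed points to the CORE-AVX2 path. A CORE-AVX2 path that happens to contain no FMA instructions would only show up as "ymm" here.

```python
import re

# FMA3 mnemonics: vfmadd*, vfmsub*, vfnmadd*, vfnmsub*, vfmaddsub*, vfmsubadd*
_FMA = re.compile(r"\bvfn?m(?:add|sub)")
_YMM = re.compile(r"\bymm\d+")  # 256-bit registers (AVX or CORE-AVX2)
_XMM = re.compile(r"\bxmm\d+")  # 128-bit registers (SSE, or 128-bit AVX forms)


def widest_vector_evidence(disassembly):
    """Classify a disassembly listing by the strongest vector evidence found.

    Returns "fma"  if an FMA instruction appears (CORE-AVX2 path),
            "ymm"  if 256-bit registers appear (AVX or CORE-AVX2 path),
            "xmm"  if only 128-bit registers appear,
            "none" if no vector registers are referenced.
    """
    if _FMA.search(disassembly):
        return "fma"
    if _YMM.search(disassembly):
        return "ymm"
    if _XMM.search(disassembly):
        return "xmm"
    return "none"
```

Running the same breakpoint-and-dump procedure on a Westmere, a Sandy Bridge, and a Haswell node and comparing the three results would give a quick, if coarse, indication of whether the dispatch is picking different paths.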
