I need to build with the option -axAVX using the Intel C++ compiler (icpc), so that the executable runs on both AVX-capable and non-AVX Intel processors. The build succeeds, but the executable crashes with a segmentation fault when I run it on a non-AVX processor. When I debugged, I could see that the crash occurred in one of the automatically dispatched functions. Without -axAVX the executable works fine.
My assumption is that -ax will generate code specialized for processors specified by <codes> while also generating generic IA-32 instructions.
My machine doesn't support AVX, so the generic version should do the job, but it doesn't work for me.
Any thoughts on this? Please help me resolve it.
It may be interesting to know more specifics about the pre-AVX target as well as to see whether your debug session can reveal whether the crash occurs on a specific asm instruction or might be another of the common causes of segmentation fault.
We've had situations in the past where the low level of testing on CPUs 5 years out of production raised difficulties.
This may require full information sufficient to file a problem report on premier.intel.com.
According to the document (for -ax):
This option tells the compiler to generate multiple, feature-specific auto-dispatch code paths for Intel® processors if there is a performance benefit. It also generates a baseline code path. The Intel feature-specific auto-dispatch path is usually more optimized than the baseline path. Other options, such as O3, control how much optimization is performed on the baseline path.
The baseline code path is determined by the architecture specified by options -m or -x (Linux* OS and OS X*) or options /arch or /Qx (Windows* OS). While there are defaults for the x option that depend on the operating system being used, you can specify an architecture and optimization level for the baseline code that is higher or lower than the default. The specified architecture becomes the effective minimum architecture for the baseline code path.
The baseline code path depends on -m or -x. Did you specify one of those options too? If not, note that the default for -m or -x is NOT generic IA-32; the default is SSE2 on Windows and Linux.
So, if your CPU is really old and you want to execute only IA-32 instructions, you may try "-axAVX -mia32".
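The relationship between the baseline path and the feature-specific path described above can be pictured with a small sketch. This is only a conceptual model written in plain C, not the compiler's actual dispatch mechanism; the `isa_level` enum, `select_sum`, and both `sum_*` functions are hypothetical names introduced for illustration.

```c
#include <stddef.h>

/* Hypothetical feature levels, ordered from least to most capable. */
typedef enum { ISA_IA32 = 0, ISA_SSE2, ISA_SSE3, ISA_SSE4_2, ISA_AVX } isa_level;

typedef float (*sum_fn)(const float *, size_t);

/* Baseline path: in a real build this may only use instructions
   at or below the level chosen with -m or -x. */
float sum_baseline(const float *x, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) acc += x[i];
    return acc;
}

/* Stand-in for the feature-specific path. It is plain C here so the
   sketch compiles anywhere; a real -ax path would use wider instructions. */
float sum_specialized(const float *x, size_t n) {
    float acc0 = 0.0f, acc1 = 0.0f;
    size_t i = 0;
    for (; i + 1 < n; i += 2) { acc0 += x[i]; acc1 += x[i + 1]; }
    if (i < n) acc0 += x[i];
    return acc0 + acc1;
}

/* Pick a code path for a CPU at level `cpu`. Returns NULL when the CPU
   is below the baseline, i.e. no generated path is safe to run on it. */
sum_fn select_sum(isa_level cpu, isa_level baseline, isa_level specialized) {
    if (cpu >= specialized) return sum_specialized;
    if (cpu >= baseline)    return sum_baseline;
    return NULL;
}
```

In this model, an SSE3-level CPU with an SSE3 baseline and an AVX specialized path should always land on the baseline function, so a crash inside a dispatched function on such a CPU would point at the selection step or at the baseline code itself.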
Thanks for your info, Shenghong.
My options also included a baseline code path. Given below are the Intel compiler (icpc) options with which I had the issue:
-Zp16 -fPIC -xSSE3 -axAVX -fvisibility=hidden
So I have SSE3 as the baseline code path. As you said, -mia32 is for old processors, and if I use it instead of -xSSE3 I may get link errors, since my processor is not that old.
I also tried another set of build options for the same code:
-Zp16 -fPIC -xSSE3 -axSSE4.2 -fvisibility=hidden
This build was successful and the executable runs on my system. That's why I said the problem is with the -axAVX build. Is there any alternative that would make the -axAVX build work?
Regarding the segmentation fault (which Tim mentioned), the code is fine, since the other build option (-axSSE4.2) works and my processor is not that old. Please check the CPU info flags.
Given below is my CPU info:
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Pentium(R) 4 CPU 2.66GHz
stepping : 1
microcode : 0xd
cpu MHz : 2659.970
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc up pebs bts pni dtes64 monitor ds_cpl tm2 cid cx16 xtpr
bogomips : 5319.94
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
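As a rough cross-check of the flags line above, the sketch below tests it for a few instruction-set tokens. It assumes the usual Linux /proc/cpuinfo names (`pni` for SSE3, `ssse3`, `sse4_1`, `sse4_2`, `avx`); `has_flag` and `P4_FLAGS` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if `name` appears as a whole whitespace-separated token
   in `flags` (so "sse" does not match inside "sse2"). */
bool has_flag(const char *flags, const char *name) {
    size_t len = strlen(name);
    if (len == 0) return false;
    const char *p = flags;
    while ((p = strstr(p, name)) != NULL) {
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        bool ends   = p[len] == '\0' || p[len] == ' ' || p[len] == '\t';
        if (starts && ends) return true;
        p += 1;  /* keep searching after this partial match */
    }
    return false;
}

/* The flags line copied from the CPU info in this thread. */
const char *P4_FLAGS =
    "fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 "
    "clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc up "
    "pebs bts pni dtes64 monitor ds_cpl tm2 cid cx16 xtpr";
```

For this flags line, `has_flag(P4_FLAGS, "pni")` is true while the checks for `ssse3`, `sse4_1`, `sse4_2`, and `avx` are all false. So this CPU would have to take the SSE3 baseline path in both the -axSSE4.2 build and the -axAVX build.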
Yes, your processor supports SSE2 and is not that old. :)
It would be a compiler bug if -axSSE4.2 works but -axAVX does not.
When I debugged, I could see that the crash occurred in one of the automatically dispatched functions.
>> Could you test with a 'helloworld' program to see whether it also reproduces the issue? It would also help if you could share the code/binary and your debug session to show where (at which instruction) it crashed.
Thanks Shenghong for your reply.
The helloworld (sample) program works fine. I don't have a debugger, so I can't share debug info. I am using printf calls to trace where the code crashes, and I can see that it crashes in a function that is automatically dispatched. Below is the code snippet that crashed.
Int32 CritBandGroup( Math * restrict InputBins,
Math * restrict OutputBands,
Int32 NumCritBands, /* Num of Critical Bands */
Int32 * restrict CritBandPartitions )
Int32 n, k, ptr;
ptr = 0;
for( n = 0; n < NumCritBands; n++ )
temp = ( Math )0.0;
for( k = 0; k < CritBandPartitions
temp += InputBins[ptr];
The issue is with the address of CritBandPartitions.
FYI: when I build, the compiler shows which functions are targeted for automatic CPU dispatch.
I can't tell from this piece of code. Could you provide a test case that can be built? I can then try to reproduce it on my system. If your code cannot be published publicly (confidential), you may send it to me by private message or submit it on premier.intel.com.
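A buildable test case along those lines might start from something like the sketch below. It is not the poster's code: the posted snippet is truncated, so the inner loop bound `CritBandPartitions[n]`, the `ptr++` step, the store to `OutputBands[n]`, and the typedefs for `Math` and `Int32` are all assumptions. The extra `NumBins` parameter and the name `CritBandGroupSketch` are hypothetical, added so that an out-of-range partition is reported instead of read.

```c
#include <stdint.h>
#include <stddef.h>

typedef float   Math;   /* assumption: the real Math type is not shown */
typedef int32_t Int32;  /* assumption: the real Int32 type is not shown */

/* Sums consecutive runs of input bins into bands.
   Returns the number of bins consumed, or -1 if a partition width is
   negative or would read past NumBins (hypothetical safety check). */
Int32 CritBandGroupSketch(const Math *restrict InputBins, Int32 NumBins,
                          Math *restrict OutputBands, Int32 NumCritBands,
                          const Int32 *restrict CritBandPartitions)
{
    Int32 n, k, ptr = 0;
    for (n = 0; n < NumCritBands; n++) {
        Math temp = (Math)0.0;
        Int32 width = CritBandPartitions[n];  /* assumed loop bound */
        if (width < 0 || width > NumBins - ptr) return -1;
        for (k = 0; k < width; k++) {
            temp += InputBins[ptr];
            ptr++;                            /* assumed advance */
        }
        OutputBands[n] = temp;                /* assumed store */
    }
    return ptr;
}
```

Calling this from a small `main` with fixed arrays, built once with `-xSSE3 -axSSE4.2` and once with `-xSSE3 -axAVX`, would show whether the crash follows the function or the rest of the application.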
Now I tried building the same code base for x64 (Intel 64 compiler) with the same command line (-axAVX), and to my surprise the code builds successfully and works fine (no crash). The crash occurs only in the x86 build from icpc, which suggests the issue is compiler specific.
When I built for x64, the CritBandGroup function was not automatically dispatched. Has anyone faced similar issues?