Intel® C++ Compiler

Illegal instruction when running AVX program

missing__zlw
Beginner
1,640 Views

I used icpc to compile an AVX program that uses intrinsics such as _mm256_maskload_epi32.

The program compiles and links fine. When I run it, I get an "Illegal instruction" error.

This is on a Linux machine, "SLES11SP2-2 Revision 0 ia32e".

I do see "avx" in "/proc/cpuinfo".

If I only use intrinsics like _mm256_sub_ps, it is fine.

Please help. If it is a hardware limitation, how should I check?

Thanks,

21 Replies
TimP
Black Belt
1,598 Views

If you had -mavx set when compiling, you should not see this promoted to AVX2.  You might check the asm file generated by -S.

Your  /proc/cpuinfo report seems to indicate avx should be OK.

missing__zlw
Beginner
1,598 Views

Thank you for your suggestion.

I have tried with "-mavx" and also with "-march=core-avx2"; both give me the same output: Illegal instruction

It seems this only happens with certain AVX instructions. For example, _mm256_sub_ps is fine, but _mm256_fmaddsub_pd is not.

What am I missing here?

 

Kittur_G_Intel
Employee
1,598 Views

@zlw:  
The "_mm256_maskload_epi32" intrinsic is an AVX2 intrinsic, so a binary that includes it will only run on a system that supports the AVX2 instruction set, for example a Haswell (HSW) system. You can generate the asm file with the -S option and check that the equivalent instruction is "vpmaskmovd", operating on ymm registers. The "_mm256_sub_ps" intrinsic, on the other hand, works on a system that supports AVX (like Sandy Bridge, SNB), and its equivalent instruction in the asm file is "vsubps".

So, if your code has AVX intrinsics then you'll need to compile with -xAVX, and if you have any AVX2 intrinsics then compile with the -xCORE-AVX2 switch. Of course, you'll need to run the binary on a processor that supports the intrinsics you use in your code. The guide at https://software.intel.com/sites/landingpage/IntrinsicsGuide/ lists the supported intrinsics, including AVX and AVX2.
If your code doesn't use any intrinsics per se, then the -xHOST switch will use the highest SIMD instruction set available on the build system, and the generated asm will reflect that accordingly.

_Kittur

Kittur_G_Intel
Employee
1,598 Views

BTW, you can use the manual dispatch procedures to dispatch a particular routine to the processor of choice. The article  https://software.intel.com/en-us/articles/how-to-manually-target-2nd-generation-intel-core-processor... should give some details on its usage so you can target sections of code to the processor of choice as well which might be useful.

_Kittur
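For readers who want to see the manual dispatch idea in code, here is a minimal sketch. It is not the Intel compiler mechanism described in the linked article: it assumes a GCC-compatible compiler and uses the `__builtin_cpu_supports` builtin and the `target("avx2")` function attribute instead. The function names (`maskload8`, `maskload8_scalar`, `maskload8_avx2`) are hypothetical. The idea is to check the CPU at run time and call the AVX2 version only when the processor reports AVX2, falling back to plain scalar code otherwise.

```cpp
#include <cstdint>

// Assumption: GCC or Clang on x86. On any other toolchain only the scalar path is built.
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define SKETCH_X86_DISPATCH 1
#else
#define SKETCH_X86_DISPATCH 0
#endif

// Scalar fallback with the semantics of _mm256_maskload_epi32:
// lane i is loaded only when the most significant bit of mask[i] is set,
// otherwise the lane is zero (and src[i] is not read).
static void maskload8_scalar(const std::int32_t* src, const std::int32_t* mask,
                             std::int32_t* out) {
    for (int i = 0; i < 8; ++i) {
        out[i] = (mask[i] < 0) ? src[i] : 0;
    }
}

#if SKETCH_X86_DISPATCH
// AVX2 version. The target attribute lets this one function use AVX2
// instructions even when the rest of the file is built without AVX2 flags.
__attribute__((target("avx2")))
static void maskload8_avx2(const std::int32_t* src, const std::int32_t* mask,
                           std::int32_t* out) {
    __m256i m = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(mask));
    __m256i v = _mm256_maskload_epi32(reinterpret_cast<const int*>(src), m);
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(out), v);
}
#endif

// Run-time dispatch: take the AVX2 path only when the CPU reports AVX2.
void maskload8(const std::int32_t* src, const std::int32_t* mask,
               std::int32_t* out) {
#if SKETCH_X86_DISPATCH
    if (__builtin_cpu_supports("avx2")) {
        maskload8_avx2(src, mask, out);
        return;
    }
#endif
    maskload8_scalar(src, mask, out);
}
```

Called on an AVX-only machine (such as Sandy Bridge), `maskload8` takes the scalar path instead of raising an illegal instruction. On an AVX2 machine it executes `vpmaskmovd`.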

missing__zlw
Beginner
1,598 Views

Thank you. 

I tried with -xHOST and got same error message.

So if I only see avx in the /proc/cpuinfo, without avx2, does it mean my system doesn't support avx2?

Thank you.

Kittur_G_Intel
Employee
1,598 Views

Hi,
As I noted earlier, the _mm256_maskload_epi32() intrinsic is provided only by the AVX2 instruction set, not by AVX. Because it is an intrinsic, the compiler generates the corresponding AVX2 instruction for it. Hence the binary will fail with an illegal instruction if you run it on a system that supports only AVX and not AVX2.

That said, the -xHOST switch tells the compiler to use the highest SIMD instruction set available on the system you compile on. But since you're explicitly using an intrinsic, the compiler emits the equivalent instruction for it, in this case the AVX2 instruction "vpmaskmovd", as you can see in the asm file. In general you should use the -xAVX switch to compile for AVX and -xCORE-AVX2 to generate code for processors supporting AVX2. And if you have functions that need to be dispatched according to the processor the application runs on, then you need to use the manual dispatch procedures that I noted earlier.

And yes, if "cat /proc/cpuinfo" shows only avx, then the system supports only AVX. If it shows "avx2", then it supports AVX2. So, a Sandy Bridge (2nd gen) system only supports AVX, while a Haswell (4th gen) system supports AVX2. I hope this makes it clear.

_Kittur


 

 

Gosadfasdf
Beginner
874 Views

I am encountering the same problem. My cpuinfo is listed below. How can I set a compiler flag for this CPU?

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5450 @ 3.00GHz
stepping : 6
microcode : 0x60f
cpu MHz : 2992.497
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm pti dtherm
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips : 5984.99
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

Viet_H_Intel
Moderator
851 Views

The _mm256_maskload_epi32 intrinsic is for AVX2, whereas _mm256_sub_ps is an intrinsic for AVX. So, if you use _mm256_maskload_epi32, make sure your system supports AVX2.
cat /proc/cpuinfo | grep -i avx2 should show whether your system supports AVX2 (the flag is listed in lowercase, as "avx2").
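The same check can be done from a program. The following is a minimal sketch, assuming a Linux system where /proc/cpuinfo has a line beginning with "flags"; the helper names `has_cpu_flag` and `cpuinfo_has_flag` are hypothetical. It compares whole space-separated tokens, so "avx" is not mistaken for "avx2".

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Returns true when `feature` appears as a whole, whitespace-separated token
// in a /proc/cpuinfo "flags" line. Whole-token matching keeps "avx" and
// "avx2" distinct.
bool has_cpu_flag(const std::string& flags_line, const std::string& feature) {
    std::istringstream tokens(flags_line);
    std::string token;
    while (tokens >> token) {
        if (token == feature) {
            return true;
        }
    }
    return false;
}

// Reads the first "flags" line of /proc/cpuinfo (Linux only, an assumption
// of this sketch) and checks it for `feature`. Returns false if the file or
// the line is missing.
bool cpuinfo_has_flag(const std::string& feature) {
    std::ifstream cpuinfo("/proc/cpuinfo");
    std::string line;
    while (std::getline(cpuinfo, line)) {
        if (line.compare(0, 5, "flags") == 0) {
            return has_cpu_flag(line, feature);
        }
    }
    return false;
}
```

For the Xeon E5450 flags line quoted in this thread, `has_cpu_flag(line, "avx")` and `has_cpu_flag(line, "avx2")` are both false, while `has_cpu_flag(line, "sse4_1")` is true.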

Gosadfasdf
Beginner
844 Views

My CPU does not support AVX. It only supports:

"fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm pti dtherm"

If I really want to use the Intel compiler 2021, what flag should I set so that it generates only instructions that work on this CPU? My computer is "model name : Intel(R) Xeon(R) CPU E5450 @ 3.00GHz".

The error information when I am running the program is:

"forrtl: severe (168): Program Exception - illegal instruction
Image PC Routine Line Source
libifcoremt.so.5 00007FA125F7462C for__signal_handl Unknown Unknow

libmpi.so.12.0.0 00007FA1200E439A impi_malloc Unknown Unknown

libmpi.so.12.0.0 00007FA12017C1DD Unknown Unknown Unknown
libmpi.so.12.0.0 00007FA12017B84B MPI_Init Unknown Unknown
libmpifort.so.12. 00007FA11FA58D9B MPI_INIT Unknown Unknown
abinit 0000000002A82936 m_xmpi_mp_xmpi_in 722 m_xmpi.F90
abinit 000000000040A865 MAIN__ 202 abinit.F90
abinit 000000000040A3F2 Unknown Unknown Unknown
libc-2.28.so 00007FA11CAF37B3 __libc_start_main Unknown Unknown
abinit 000000000040A2FE Unknown Unknown Unknown

"

I have tried using -xHost when compiling the program, but it seems the program still contains illegal instructions.

Best wishes

Jiahao

Viet_H_Intel
Moderator
839 Views

If you use intrinsics, you have to make sure your system supports them. Are you using any intrinsics in your code?

 

Gosadfasdf
Beginner
760 Views

I am using abinit-9.4.1. I have no way to tell whether the code uses any specific intrinsic functions. I was wondering whether there are any compiler flags that would let me avoid such functions. PS: when I use the GNU compilers, the code works.
Best wishes

Jiahao

Viet_H_Intel
Moderator
755 Views

What Intel compiler version are you using?

I am not sure what happened. With the -xHost option, the compiler only generates code for the highest instruction set available on the compilation host machine, so when you run on the same host, it should be able to run.

Can you provide the complete set of options you use?

Thanks,

Bernard
Black Belt
1,598 Views

@zlw

Can you check which Intel CPU you have?

Kittur_G_Intel
Employee
1,598 Views

@zlw: Is the issue resolved now at your end? @iliyapolak: I think he's running on SNB  but yes it'd be nice to know the cpu info.

_Kittur

Bernard
Black Belt
1,598 Views

@Kittur

I was thinking the same.

missing__zlw
Beginner
1,598 Views

Sorry for the late response.

Yes, the problem is resolved. As Kittur said, I tried on one machine with the avx flag in cpuinfo, and certain instructions don't work. I got another machine with the same avx flag (no avx2 flag) and it works.

So I guess the first machine is SNB and the second is HSW? Honestly, I can't tell.

Kittur_G_Intel
Employee
1,598 Views

@zlw: Thanks for the confirmation. Well, if the processor on the first system indicates avx, then it supports AVX (and is generally a SNB system). If the system indicates avx2, then it's a Haswell system. So the instruction you're trying can only work on a Haswell system and will fail to run on a Sandy Bridge system (where cpuinfo will only show avx), fyi.

_Kittur

Kittur_G_Intel
Employee
1,598 Views

@zlw: I assume your question has been answered and will close this issue accordingly, thx.

_Kittur

missing__zlw
Beginner
1,598 Views

Hi Kittur,

I think you misread my answer. I said the second machine also shows only the avx flag, and my program works there.

Thanks

TimP
Black Belt
1,598 Views

I suppose if your Linux isn't up to date, it may not show avx2 in the flags on a Haswell CPU. You could look up your CPU at ark.intel.com. If the OS supports AVX, then AVX2 would work when the CPU supports it.
