Beginner
210 Views

Illegal instruction when running AVX program

I used icpc to compile an AVX program that uses intrinsics such as _mm256_maskload_epi32.

The program compiles and links fine, but when I run it I get an "Illegal instruction" error.

This is on a Linux machine, "SLES11SP2-2 Revision 0 ia32e".

I do see "avx" in /proc/cpuinfo.

If I only use intrinsics like _mm256_sub_ps, it runs fine.

Please help. If it is a hardware limitation, how should I check?

Thanks,

15 Replies

Black Belt

If you had -mavx set when compiling, you should not see this promoted to AVX2. You might check the asm file generated by -S.

Your /proc/cpuinfo report seems to indicate that AVX should be OK.

Beginner

Thank you for your suggestion.

I have tried both "-mavx" and "-march=core-avx2"; both give me the same result: Illegal instruction.

It seems this only happens with certain intrinsics. For example, _mm256_sub_ps is fine, but _mm256_fmaddsub_pd is not.

What am I missing here?

Employee

@zlw:
"_mm256_maskload_epi32" is an AVX2 intrinsic. Once it is part of your code, the binary will only run on a system that supports the AVX2 instruction set, for example Haswell (HSW). You can generate the asm file with the -S option and check that the equivalent instruction is "vpmaskmovd", operating on ymm registers. The intrinsic "_mm256_sub_ps", on the other hand, works on any system that supports AVX (such as Sandy Bridge, SNB); its equivalent instruction in the asm file is "vsubps".

So, if your code has AVX intrinsics you'll need to compile with -xAVX, and if it has any AVX2 intrinsics, compile with the -xCORE-AVX2 switch. Of course, you'll also need to run the binary on a processor that supports the intrinsics you use. The guide at https://software.intel.com/sites/landingpage/IntrinsicsGuide/ lists the supported intrinsics, including AVX and AVX2, and shows which instruction set each one requires.
If your code doesn't use any intrinsics itself, the -xHOST switch makes the compiler target the highest SIMD instruction set available on the build machine, and the generated asm will reflect that.

_Kittur

Employee

BTW, you can use the manual dispatch procedures to dispatch a particular routine to the processor of your choice. The article https://software.intel.com/en-us/articles/how-to-manually-target-2nd-generation-intel-core-processor... gives some details on their usage, so you can target sections of code at specific processors, which might be useful here.
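To illustrate the general idea of dispatching by CPU capability, here is a minimal sketch. It is not the Intel manual dispatch mechanism described in the article; it assumes GCC or Clang on x86 and uses their `__builtin_cpu_supports` builtin and `target` function attribute instead. The function names (`sum_first_n`, `sum_first_n_avx2`, `sum_first_n_scalar`) are made up for this example. The AVX2 path uses `_mm256_maskload_epi32`, and the scalar path is used on any CPU that does not report AVX2, so the binary does not hit an illegal instruction on an AVX-only machine.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1

/* AVX2 path: only this function is compiled with AVX2 enabled
   (GCC/Clang "target" attribute; an assumption of this sketch). */
__attribute__((target("avx2")))
static int sum_first_n_avx2(const int *src, int n)
{
    int mask_words[8];
    int lanes[8];
    int sum = 0;

    /* A lane is loaded only if the top bit of its mask word is set. */
    for (int i = 0; i < 8; ++i)
        mask_words[i] = (i < n) ? -1 : 0;

    __m256i mask = _mm256_loadu_si256((const __m256i *)mask_words);
    __m256i v = _mm256_maskload_epi32(src, mask); /* vpmaskmovd; masked-off lanes read as 0 */
    _mm256_storeu_si256((__m256i *)lanes, v);

    for (int i = 0; i < 8; ++i)
        sum += lanes[i];
    return sum;
}
#endif

/* Portable fallback that runs on any CPU. */
static int sum_first_n_scalar(const int *src, int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += src[i];
    return sum;
}

/* Sum the first n (clamped to 0..8) ints of src, choosing the path at run time. */
int sum_first_n(const int *src, int n)
{
    assert(src != NULL);
    if (n < 0) n = 0;
    if (n > 8) n = 8;
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx2"))
        return sum_first_n_avx2(src, n);
#endif
    return sum_first_n_scalar(src, n);
}
```

Calling `sum_first_n(data, 3)` gives the same result on a Sandy Bridge and a Haswell machine; only the code path differs.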

_Kittur

Beginner

Thank you. 

I tried -xHOST and got the same error message.

So if I only see avx in /proc/cpuinfo, without avx2, does that mean my system doesn't support AVX2?

Thank you.

Employee

Hi,
As I noted earlier, the _mm256_maskload_epi32() function is provided only by the AVX2 instruction set, not by AVX. Because it is an intrinsic, the compiler generates the corresponding AVX2 instruction for it regardless of the target switch. Hence the binary will fail with an illegal-instruction error when you run it on a system that supports only AVX and not AVX2.

That said, the -xHOST switch tells the compiler to use the highest SIMD instruction set available on the system you build on. It does not help here, because you are explicitly using an intrinsic that maps to one specific instruction, in this case the AVX2 instruction "vpmaskmovd", as you can see in the asm file. In general, use the -xAVX switch to compile for AVX and -xCORE-AVX2 to generate code for processors supporting AVX2. And if you have functions that need to be dispatched according to the processor the application runs on, use the manual dispatch procedures I noted earlier.

And yes, if "cat /proc/cpuinfo" shows only avx, the system supports only AVX. If it shows "avx2", it supports AVX2. So a Sandy Bridge (2nd gen) system supports only AVX, while a Haswell (4th gen) system supports AVX2. Hope this makes it clear?
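The /proc/cpuinfo check can also be done from a program. The following is only a sketch, assuming a Linux system whose /proc/cpuinfo has a line starting with "flags"; the helper names `has_cpu_flag` and `cpuinfo_has_flag` are hypothetical. The one thing to get right is matching whole words, so that a search for "avx" is not satisfied by "avx2".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `flag` occurs as a whole whitespace-separated word in
   `flags_line`, else 0. "avx" therefore does not match "avx2". */
int has_cpu_flag(const char *flags_line, const char *flag)
{
    assert(flags_line != NULL && flag != NULL);

    size_t len = strlen(flag);
    if (len == 0)
        return 0;

    const char *p = flags_line;
    while ((p = strstr(p, flag)) != NULL) {
        int starts_word = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
        char next = p[len];
        int ends_word = (next == '\0' || next == ' ' || next == '\t' || next == '\n');
        if (starts_word && ends_word)
            return 1;
        p += 1; /* keep searching after this partial match */
    }
    return 0;
}

/* Check the first "flags" line of /proc/cpuinfo (Linux only).
   Returns 1 or 0, or -1 if the file cannot be opened. */
int cpuinfo_has_flag(const char *flag)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (f == NULL)
        return -1;

    char line[8192];
    int result = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) == 0) {
            result = has_cpu_flag(line, flag);
            break;
        }
    }
    fclose(f);
    return result;
}
```

With this, `cpuinfo_has_flag("avx2")` returning 0 while `cpuinfo_has_flag("avx")` returns 1 corresponds to the "only avx" case discussed above.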

_Kittur

Black Belt

@zlw

Can you check which Intel CPU you have?

Employee

@zlw: Is the issue resolved now at your end? @iliyapolak: I think he's running on SNB, but yes, it'd be nice to know the CPU info.

_Kittur

Black Belt

@Kittur

I was thinking the same.

Beginner

Sorry for the late response.

Yes, the problem is resolved. As Kittur said, on the first machine, which has the avx flag in cpuinfo, certain instructions don't work. I got another machine with the same avx flag (no avx2 flag), and there the program works.

So I guess the first machine is SNB and the second is HSW? Honestly, I can't tell.

Employee

@zlw: Thanks for the confirmation. If cpuinfo on the first system shows only avx, that processor supports AVX but not AVX2 (generally a SNB system). If a system shows avx2, it's a Haswell system. So the instruction you're trying can only work on a Haswell system and will fail on a Sandy Bridge system (where cpuinfo shows only avx), FYI.

_Kittur

Employee

@zlw: I assume your question has been answered and will close this issue accordingly, thx.

_Kittur

Beginner

Hi Kittur,

I think you misread my answer. I said the second machine also shows only the avx flag, and my program works there.

Thanks

Black Belt

I suppose if your Linux isn't up to date, it may not show avx2 in the flags on a Haswell CPU. You could look up your CPU at ark.intel.com. If the OS supports AVX, then AVX2 would work when the CPU supports it.
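If the kernel's flag list can't be trusted, one option is to ask the CPU directly with the CPUID instruction. The sketch below assumes GCC or Clang on x86 and their `<cpuid.h>` helper `__get_cpuid_count`; the function names are hypothetical. The bit positions are the architectural ones: AVX is CPUID leaf 1, ECX bit 28; FMA is leaf 1, ECX bit 12; AVX2 is leaf 7 (subleaf 0), EBX bit 5. It reads only the CPU feature bits and does not check that the OS saves the ymm registers.

```c
#include <assert.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#endif

/* Return 1 if bit number `bit` (0..31) of `reg` is set, else 0. */
static int bit_is_set(unsigned int reg, unsigned int bit)
{
    assert(bit < 32u);
    return (int)((reg >> bit) & 1u);
}

/* Decoders for the raw CPUID register values. */
int leaf1_ecx_has_avx(unsigned int ecx)  { return bit_is_set(ecx, 28u); }
int leaf1_ecx_has_fma(unsigned int ecx)  { return bit_is_set(ecx, 12u); }
int leaf7_ebx_has_avx2(unsigned int ebx) { return bit_is_set(ebx, 5u); }

/* Return 1 if the CPU itself reports AVX2, 0 otherwise
   (always 0 when not built with GCC/Clang for x86). */
int cpu_reports_avx2(void)
{
#ifdef HAVE_X86_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    /* __get_cpuid_count returns 0 if leaf 7 is not available. */
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return leaf7_ebx_has_avx2(ebx);
#endif
    return 0;
}
```

A machine where `cpu_reports_avx2()` returns 1 but /proc/cpuinfo lists only avx would fit the outdated-kernel explanation above.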

Employee

@zlw: Np, thanks for your reply. BTW, Tim's response answers your question. If there's no AVX2 support and your code has an AVX2 instruction, it should fail! The exception would be if you had used the manual dispatch procedure to dispatch the same routine on two different CPUs, which I don't think is the case based on what you describe.


_Kittur
