Christian_B_5
Beginner
3,016 Views

Please verify that both the operating system and the processor support Intel(R) MOVBE, F16C, FMA, BMI, LZCNT and AVX2 instructions


Hi everybody,

We just built Python today using icc 2016. On the dev machine everything works fine, but on our CI server all our automated tests for Python crash. When I start the Python command line there, I get the following error message: "Please verify that both the operating system and the processor support Intel(R) MOVBE, F16C, FMA, BMI, LZCNT and AVX2 instructions." After printing this message, Python exits.

I printed out /proc/cpuinfo for both machines:

dev machine:

model name      : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz
stepping        : 2
microcode       : 49
cpu MHz         : 2599.785
cache size      : 20480 KB
physical id     : 0
siblings        : 16
core id         : 7
cpu cores       : 8
apicid          : 15
initial apicid  : 15
fpu             : yes
fpu_exception   : yes
cpuid level     : 15
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid fsgsbase bmi1 avx2 smep bmi2 erms invpcid

on the CI machine:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 45
model name      : Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
stepping        : 2
cpu MHz         : 2599.999
cache size      : 20480 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts xtopology tsc_reliable nonstop_tsc aperfmperf unfair_spinlock pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx hypervisor lahf_lm ida arat epb pln pts dts

So it's apparent that the CI machine does not have all the flags that the dev machine has. The question is: which compiler flags can/must I (un)set to get the Python binaries to work on both machines?
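
To make the comparison above less error-prone than reading the two flags lines by eye, here is a minimal sketch that diffs the flags entries of two /proc/cpuinfo dumps. The function names parse_flags and missing_flags are hypothetical helpers written for this illustration, not part of any tool, and the two short strings stand in for the full dumps.

```python
def parse_flags(cpuinfo_text):
    """Return the CPU feature flags from the first 'flags' line of a dump."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found


def missing_flags(reference_text, target_text):
    """Flags present on the reference machine but absent on the target."""
    return sorted(parse_flags(reference_text) - parse_flags(target_text))


# Shortened stand-ins for the two dumps (illustrative, not the full lists).
dev = "flags : fpu sse avx fma avx2 bmi1"
ci = "flags : fpu sse avx"
print(missing_flags(dev, ci))  # ['avx2', 'bmi1', 'fma']
```

Applied to the two flags lines posted above, the instruction-set entries missing on the CI machine include fma, movbe, f16c, abm, bmi1, bmi2 and avx2. On Linux, LZCNT is reported as abm, so this matches the list in the error message.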

 


5 Replies
Yuan_C_Intel
Employee
3,016 Views

Hi, Christian

Your dev machine uses an E5-2640 v3, which supports AVX2, while your CI machine (E5-2670) supports AVX only.

Which compiler flags did you use on your dev machine? Is it -xHost?

You may try -xAVX -axCORE-AVX2 to run on both machines.
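
As a rough mental model of that suggestion: -xAVX sets the baseline every machine must meet, and -axCORE-AVX2 adds an alternate path that is only taken when the processor supports it. The sketch below is a conceptual illustration only, not the compiler's actual dispatch logic. The name select_code_path and the flag sets are assumptions, using the Linux /proc/cpuinfo spellings of the features named in the error message.

```python
# Assumption: Linux /proc/cpuinfo names for the features in the error
# message (LZCNT appears as "abm", BMI as "bmi1"/"bmi2").
AVX2_PATH_FLAGS = {"movbe", "f16c", "fma", "bmi1", "bmi2", "abm", "avx2"}
BASELINE_FLAGS = {"avx"}


def select_code_path(cpu_flags):
    """Model which path a binary built with -xAVX -axCORE-AVX2 could take."""
    flags = set(cpu_flags)
    if not BASELINE_FLAGS <= flags:
        # Below the baseline: this is the situation in which the program
        # refuses to run, as happened on the CI machine with an AVX2 baseline.
        raise RuntimeError("CPU lacks the AVX baseline required by -xAVX")
    if AVX2_PATH_FLAGS <= flags:
        return "core-avx2"  # alternate path for the newer processor
    return "avx"            # baseline path


print(select_code_path({"avx", "sse4_2"}))            # avx
print(select_code_path(AVX2_PATH_FLAGS | {"avx"}))    # core-avx2
```

In this model the CI machine lands on the avx path and the dev machine on the core-avx2 path, so the two machines do not run identical code.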

Hope this helps.

Thanks.

TimP
Black Belt
3,016 Views

Yolanda's answer appears correct, but -mAVX may be sufficient.  It would be difficult to see sufficient performance advantage in AVX2 to compensate for code expansion with multiple architecture paths, and you may wish to test fully on both machines if they are taking different code paths.

AVX optimization in Intel compilers is sometimes better tuned than AVX2, so AVX may actually run faster on an AVX2 platform, although that appears to border on an actionable bug.

Christian_B_5
Beginner
3,016 Views

Hi Yolanda,

Yes, -xHost is set. We used to have an older dev machine where this flag didn't cause any problems. I'll go through the docs and play with the options.

 

Thanks,

Christian

 


Christian_B_5
Beginner
3,016 Views

Thanks Tim!


Yuan_C_Intel
Employee
3,017 Views

Hi, Tim and Christian

AVX2 doubles the width of integer vector instructions to 256 bits, and the same processor generation adds FMA.

I agree it's worth testing fully on both machines, since they take different code paths. In some cases AVX may run faster on an AVX2 platform, but in most cases I have seen, AVX2 is still better.

Hope it helps.

Thanks.

 
