Beginner

Mixed (with/without AVX) project

Hello

I have a static lib built with -xAVX. At run time, if the CPU has no AVX, the lib is unused. However, the mere presence of this lib makes my app crash. In the link map file I see that ICC generates the same mangled name for std calls, but the code differs with and without AVX. For example, std::numeric_limits<float>::infinity() has the name __ZNSt14numeric_limitsIfE8infinityEv, but it's built with AVX instructions. So the app crashes when this function is called from non-AVX code.
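A likely mechanism, for readers hitting the same crash: inline and template functions such as std::numeric_limits<float>::infinity() are emitted as weak (COMDAT) symbols in every object file that calls them out of line, and the linker keeps a single copy. If the copy it keeps came from an object built with -xAVX, code built without AVX ends up calling it. The sketch below shows one narrow mitigation for this particular symbol, assuming a C++11 compiler: inside the AVX-only translation unit the value is captured in a constexpr constant, so the compiler should not need an out-of-line call. The names kInfinity and min_of are hypothetical, and this does not cover other inline functions shared between the two builds.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

namespace {  // internal linkage: nothing in here is shared with other object files

// Evaluated at compile time (C++11 and later), so this translation unit
// should not need an out-of-line copy of numeric_limits<float>::infinity().
constexpr float kInfinity = std::numeric_limits<float>::infinity();

// Hypothetical helper standing in for code inside the AVX-only library.
float min_of(const float* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    float best = kInfinity;
    for (std::size_t i = 0; i < count; ++i) {
        if (values[i] < best) best = values[i];
    }
    return best;
}

}  // namespace
```

Checking the link map again after such a change is one way to confirm whether the shared symbol is still pulled from the AVX object.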

How to solve?

Thx

0 Kudos
8 Replies

Hi Igor,

This is the expected behavior of the -xAVX option. You may refer to the compiler documentation for -x (https://software.intel.com/en-us/node/522845):

The specialized code generated by this option may only run on a subset of Intel® processors. The resulting executables created from these option code values can only be run on Intel® processors that support the indicated instruction set.

To resolve your issue, you need to use -mavx, which will generate both AVX and non-AVX code.

Thanks,

Shenghong

0 Kudos
Beginner

Hi, Shenghong

I can't check because I can't reproduce the situation; now it works ;-( Can you please explain the difference between -xAVX and -mavx in more detail? If a library is built for AVX, it should use those instructions; that's normal and expected. But built-in calls compiled with AVX should not be used outside the lib. Does -mavx provide this, and how? I can't find it in the documentation.

Thx

0 Kudos
Black Belt

-xAVX implies some checking for an Intel AVX capable CPU, presumably throwing an error to stderr if the check fails. If you want a code path for CPUs which don't pass the test, you need -axAVX. -mavx doesn't involve any such checks.

0 Kudos
Employee

Tim’s right: the option -axavx gives both an SSE code path (at the default SSE level) and an AVX code path, and you can use the -x or -m (/arch:) switches to modify the default SSE code path. For example: "-axavx -xsse4.2" targets both Nehalem and AVX.

BTW, the article https://software.intel.com/en-us/articles/how-to-compile-for-intel-avx/ should give you more details as well.

_Kittur   

0 Kudos

Hi Igor,

Yes, Tim and Kittur are correct: you need to use -ax instead of -m/-x (sorry for the mistake in my first reply. :( )

-xAVX: generates instructions for Intel processors that support AVX. A run-time check tests the processor type, and if it is not an AVX processor, the program will crash.

-mavx: similar to -x, but the code will also work on non-Intel processors that support AVX.

-axavx: auto-dispatch at run time; generates a baseline code path and an AVX code path. The baseline path is decided by -x or -m. By default, it is SSE2 (the -m default is SSE2 and the -x default is none, so...).

Some Examples:

-axavx: runs on processors supporting SSE2 and above, and may optimize specifically for AVX processors.

-axavx -xsse4.2: runs on Intel processors supporting SSE4.2 and above, and optimizes for Intel AVX processors.

-axavx -msse4.2: runs on any processor supporting SSE4.2 and above, and optimizes for Intel AVX processors.
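To see which baseline a given translation unit was actually compiled for, one option is to inspect the instruction-set macros the compiler predefines. This is a minimal sketch assuming the GCC-style macros __AVX__, __SSE4_2__ and __SSE2__, which, as far as I know, ICC also defines on Linux and macOS for the matching -x/-m options; the function names baseline_isa and baseline_requires_avx are hypothetical.

```cpp
#include <cassert>
#include <string>

// Reports the highest instruction set that this translation unit's
// baseline code path may use, based on compiler-predefined macros.
std::string baseline_isa() {
#if defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "unknown";
#endif
}

// True if code in this translation unit may execute AVX instructions
// unconditionally, i.e. without any run-time dispatch in front of it.
bool baseline_requires_avx() {
    const std::string isa = baseline_isa();
    assert(!isa.empty());  // baseline_isa() always returns a label
    return isa == "AVX";
}
```

Printing baseline_isa() from each library at startup is a cheap way to spot an object that was built with a higher baseline than intended.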

The compiler document has a more detailed introduction on these options. Hope it helps this time!

Thanks,

Shenghong

0 Kudos

Hi Igor,

Just to add to the above discussion: when you want the code to run on a machine which doesn't support AVX, generation of multiple code paths has to be enabled using the /Qax option, as mentioned above. With multiple code paths, the compiler inserts a CPUID check to determine which architecture the code has been deployed on.

So in the sample asm generated when you use /QxAVX,SSE4.2, the code generation resembles an if-else condition:

main PROC
 

.B1.3::                       
        mov       eax, DWORD PTR [__intel_cpu_feature_indicator] ;71.1
        
.B1.4::                         
        add       rsp, 8                                      
        jmp       main.R                                    
                               
.B1.6::                         ; Preds .B1.3
        test      BYTE PTR [__intel_cpu_feature_indicator], 1  
        je        .B1.8         ; Prob 10%                    
                                
.B1.7::                         
        add       rsp, 8                                        
        jmp       main.A    

As you can see from the asm code above (I have stripped the unnecessary parts), there are 2 paths for the same main: either main.R or main.A is picked, based on the value of "__intel_cpu_feature_indicator".
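In C++ terms, the dispatch in the listing behaves roughly like the sketch below: a global feature word is tested against a mask, and control goes to one of two versions of the same function. All names here (feature_indicator, kRequiredFeatures, total_specialized, total_baseline) are hypothetical stand-ins. The real layout of __intel_cpu_feature_indicator is internal to the compiler runtime and is not reproduced here, and the sketch makes no claim about which of main.R and main.A is the baseline.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the feature word the runtime fills in at startup.
std::uint64_t feature_indicator = 0;

// Illustrative mask only; the real bit assignments are an assumption here.
constexpr std::uint64_t kRequiredFeatures = 0x1;

enum class Path { Baseline, Specialized };

// The "if-else" from the listing: all required feature bits must be set.
Path select_path(std::uint64_t indicator) {
    return (indicator & kRequiredFeatures) == kRequiredFeatures
               ? Path::Specialized
               : Path::Baseline;
}

// Two versions of the same function. A compiler would generate the second
// one with wider instructions; here both are plain C++ with equal results.
int total_baseline(const int* v, std::size_t n) {
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += v[i];
    return sum;
}

int total_specialized(const int* v, std::size_t n) {
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += v[i];
    return sum;
}

// Entry point that callers see; it only tests the feature word and forwards.
int total(const int* v, std::size_t n) {
    assert(v != nullptr || n == 0);
    if (select_path(feature_indicator) == Path::Specialized) {
        return total_specialized(v, n);
    }
    return total_baseline(v, n);
}
```

The point of the pattern is that the specialized version is only ever reached through the check, so it never runs on a CPU that lacks the required features.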

Hope this gives some idea.

Regards,

Sukruth H V
                               

0 Kudos
Beginner

Hi All

Thx for your replies. When I replaced one dylib with a static lib, the problem appeared again. I tried -mavx: no difference, same crash at run time. With -axAVX I got a lot of unresolved symbols. I printed the macros and saw that __AVX__ is no longer defined. After adding it to the preprocessor definitions I got this compile error:

error: identifier "__popcnt" is undefined

Please note that I don't need the compiler to generate "auto-detect" code; the library does the detection explicitly (at least it should). Any ideas?
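For context, explicit dispatch of the kind described here is often structured as below: a run-time CPU check selects a function pointer once, and only the AVX kernel's translation unit is built with AVX flags. This is a sketch under assumptions, not the library's actual code: the names sum_baseline, sum_avx, cpu_has_avx and select_sum are hypothetical, and the check uses the GCC/Clang builtin __builtin_cpu_supports rather than any ICC-specific facility, falling back to "no AVX" on other compilers.

```cpp
#include <cassert>
#include <cstddef>

using SumFn = float (*)(const float*, std::size_t);

// Baseline kernel: belongs in a translation unit built WITHOUT AVX flags.
float sum_baseline(const float* data, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += data[i];
    return s;
}

// AVX kernel: in a real project this would live in its own translation
// unit, the only one built with AVX flags. It is plain C++ here so that
// the sketch compiles anywhere.
float sum_avx(const float* data, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += data[i];
    return s;
}

// Run-time CPU check (assumption: GCC or Clang on x86).
bool cpu_has_avx() {
#if (defined(__GNUC__) || defined(__clang__)) && !defined(__INTEL_COMPILER) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;  // conservative fallback for compilers without the builtin
#endif
}

// Resolve the kernel once; later calls reuse the cached pointer.
SumFn select_sum() {
    static const SumFn chosen = cpu_has_avx() ? sum_avx : sum_baseline;
    return chosen;
}

// Public entry point, compiled without AVX flags.
float sum(const float* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    return select_sum()(data, n);
}
```

With this layout the dispatcher and every caller stay in non-AVX objects, so a crash on a non-AVX CPU would point to AVX code reached through some path other than the function pointer.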

Thx

0 Kudos

Hi Igor,

Could you create a simple test case that shows your issue and usage, so that the discussion can be more specific to your case?

Thanks,

Shenghong

0 Kudos