Hello
I have a static library built with -xAVX. At run time, if the CPU has no AVX, the library is not used. However, the mere presence of this library makes my app crash. In the link map file I see that ICC generates the same mangled name for std calls, but the code differs with and without AVX. For example, for std::numeric_limits<float>::infinity() the name is __ZNSt14numeric_limitsIfE8infinityEv, but it's built with AVX instructions. So it crashes when it's called from non-AVX code.
How to solve?
Thx
Hi Igor,
This is expected behavior of -xAVX option. You may refer to the compiler document for -x (https://software.intel.com/en-us/node/522845):
The specialized code generated by this option may only run on a subset of Intel® processors. The resulting executables created from these option code values can only be run on Intel® processors that support the indicated instruction set.
To resolve your issue, you need to use -mavx, which will generate both AVX and non-AVX code.
Thanks,
Shenghong
Hi, Shenghong
I can't check because I can't reproduce the situation; it works now ;-( Could you please explain the difference between -xAVX and -mavx in more detail? If a library is built for AVX, it should use those instructions; that's normal and expected. But built-in calls compiled with AVX should not be used outside the lib. Does -mavx provide this, and how? I can't find it in the docs.
Thx
-xAVX implies some checking for an Intel AVX capable CPU, presumably throwing an error to stderr if the check fails. If you want a code path for CPUs which don't pass the test, you need -axAVX. -mavx doesn't involve any such checks.
Tim's right: the option -axavx gives both SSE (the default SSE level) and AVX code paths, and you can use the -x or -m (/arch:) switches to modify the default SSE code path. For example: "-axavx -xsse4.2" targets both Nehalem and AVX.
BTW, the article https://software.intel.com/en-us/articles/how-to-compile-for-intel-avx/ should give you more details as well.
_Kittur
Hi Igor,
Yes, Tim and Kittur are correct: you need to use -ax instead of -m/-x (sorry for the mistake in my first reply. :( )
-xAVX: generates instructions for Intel processors supporting AVX. The run-time will check the processor type, and if it is not an AVX processor, it will crash.
-mavx: similar to -x, but also works on non-Intel processors that support AVX.
-axavx: auto-dispatch at run time; generates a baseline code path and an AVX code path. The baseline path is decided by -x or -m. By default, it is SSE2 (-m default is SSE2 and -x default is none, so...).
Some Examples:
-axavx: runs on processors supporting SSE2 and above, and may optimize specifically for AVX processors.
-axavx -xsse4.2: runs on Intel processors supporting SSE4.2 and above, and optimizes for Intel AVX processors.
-axavx -msse4.2: runs on any processor supporting SSE4.2 and above, and optimizes for Intel AVX processors.
The compiler document has a more detailed introduction on these options. Hope it helps this time!
Thanks,
Shenghong
Hi Igor,
Just to add to the above discussion: when you want the code to also run on a machine which doesn't support AVX, multiple code path generation has to be enabled using the /Qax option as mentioned above. With multiple code path generation, the compiler emits a CPUID-based check to determine which architecture the code is running on.
So in the sample asm generated when you use /QxAVX,SSE4.2, the code generation looks something like an if-else condition:
main PROC
.B1.3::
        mov       eax, DWORD PTR [__intel_cpu_feature_indicator]  ;71.1
.B1.4::
        add       rsp, 8
        jmp       main.R
.B1.6::                         ; Preds .B1.3
        test      BYTE PTR [__intel_cpu_feature_indicator], 1
        je        .B1.8         ; Prob 10%
.B1.7::
        add       rsp, 8
        jmp       main.A
You can see from the above asm code (I have stripped the unnecessary parts) that it has 2 paths for the same main: either main.R or main.A is picked based on the value of "__intel_cpu_feature_indicator".
Hope this gives some idea.
Regards,
Sukruth H V
Hi All
Thx for your replies. When I replaced one dylib with a static one, the problem appeared again. I've tried -mavx: no difference, same crash at run time. With -axAVX I get a lot of unresolved symbols. I printed the macros and see that __AVX__ is no longer defined. After adding it to the preprocessor definitions I get this compile error:
error: identifier "__popcnt" is undefined
Please note that I don't need to generate "auto-detect" code; the library does it explicitly (at least it should). Any idea?
Thx
Hi Igor,
Could you create a simple test case that shows your issue and usage, so that the discussion can be more specific?
Thanks,
Shenghong