Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Code generated by ICC with /arch:SSE3 can't run on AMD CPUs that support SSE3

xunxun
Beginner
Hi,

I found that code built with /arch:SSE3 won't run on AMD CPUs that support SSE3.

According to http://techreport.com/articles.x/8327/1, AMD's SSE3 implementation supports 11 of the 13 SSE3 instructions.

So I'd like to know: how can I build with an SSE3 baseline and still produce an executable that is compatible with AMD CPUs?

Also, does ICC's SSE3 baseline include some dispatch mechanism that checks the CPU architecture?

Thanks.
6 Replies
mecej4
Honored Contributor III
The version of the Intel C compiler that you used has a major bearing on this behavior, so please ascertain and report the version. If possible, download and install a recent version of the compiler.
xunxun
Beginner
Hi,

I'm using 12.1.3.300.

I think it's the latest version.
TimP
Honored Contributor III
By "can't run," do you mean that your .exe displays the message that it was built with an option that doesn't match your CPU (that is, the run-time check decides your CPU doesn't match), or do you mean that it fails with an illegal instruction (and if so, which instruction)?
If you set /arch:SSE3 and didn't otherwise ask for dispatching or another architecture setting, the compiler-generated code should have only the SSE3 code path and no dispatching. Dispatching by CPU identification would still occur in library code. You should be able to avoid dispatching in the math library by setting /Qimf-arch-consistency:true. There are other means of avoiding functions such as intel_fast_memcpy, which choose CPU-dependent paths.
If you can show an example where the run-time library has dispatched non-SSE3 code, and give the specific identification of your CPU, that would be worth filing an issue on premier.intel.com (or here).
If it runs with /arch:SSE2 (the default), but not with /arch:SSE3 (which instruction fails?), that would be interesting information.
Normally, SSE3 code would be advantageous compared with SSE2 only if you have significant vectorized complex math. So it's even possible that no one has checked this issue on your CPU model.
Did you check whether your CPU is reporting via CPUID that it has SSE3?
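
One way to answer the CPUID question is to read leaf 1 directly: SSE3 is reported in bit 0 of ECX. The following is a minimal sketch, not code from this thread. The helper names (ecx_reports_sse3, read_leaf1_ecx, cpu_reports_sse3) are hypothetical, and it assumes either the __cpuid intrinsic from <intrin.h> (MSVC-compatible compilers) or __get_cpuid from <cpuid.h> (GCC/Clang).

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define HAVE_X86_CPUID 1
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

// CPUID leaf 1 reports SSE3 in bit 0 of ECX (hypothetical helper name).
constexpr bool ecx_reports_sse3(std::uint32_t ecx) {
    return (ecx & 1u) != 0;
}

// Reads ECX from CPUID leaf 1. Returns false when CPUID is unavailable,
// for example on a non-x86 target.
bool read_leaf1_ecx(std::uint32_t& ecx) {
#if HAVE_X86_CPUID && defined(_MSC_VER)
    int regs[4];                 // EAX, EBX, ECX, EDX
    __cpuid(regs, 1);
    ecx = static_cast<std::uint32_t>(regs[2]);
    return true;
#elif HAVE_X86_CPUID
    unsigned int a = 0, b = 0, c = 0, d = 0;
    if (!__get_cpuid(1, &a, &b, &c, &d)) return false;
    ecx = c;
    return true;
#else
    (void)ecx;
    return false;
#endif
}

// True only if the processor itself advertises SSE3. The vendor string
// is not consulted.
bool cpu_reports_sse3() {
    std::uint32_t ecx = 0;
    return read_leaf1_ecx(ecx) && ecx_reports_sse3(ecx);
}
```

Calling cpu_reports_sse3() on an affected AMD machine would show whether the processor advertises SSE3, independent of any check the run-time library performs.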
xunxun
Beginner
Thanks for the information.
I don't have an AMD CPU.
The reason I'm asking is this:

When I build Firefox with /arch:SSE2, it works well for everyone whose CPU supports SSE2.
But when I build with /arch:SSE3, users with AMD CPUs that also support SSE3 can't run it. It fails with "Fatal Error: This program was not built to run on the processor in your system." (Someone reported this to me; I don't have an AMD CPU to check.)

TimP
Honored Contributor III
I doubt you would want to build Firefox with SSE3, as you could not get a performance advantage, and SSE2 would avoid questions such as the one you have run into. /QaxSSE3 should work, as it would look only for SSE2 on AMD CPUs and should generate only SSE2 code, unless the compiler finds rare opportunities for SSE3 in certain functions.
It does appear that the ICL library has a bug, in that it is not recognizing the SSE3 capability of these AMD desktop CPUs and is presenting the architecture error message. This would merit a problem report on premier.intel.com.
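
If the library's processor check is the obstacle, one possible workaround is to keep the SSE2 baseline and select an SSE3 routine by hand, based on the feature flag alone. The sketch below is only an illustration under stated assumptions: sum_baseline, sum_sse3, SumFn, and select_sum are hypothetical names, both kernels are plain scalar stand-ins (a real SSE3 kernel would live in a separately compiled file), and the cpu_has_sse3 flag is assumed to come from a CPUID query. It is not how /QaxSSE3 is implemented.

```cpp
#include <cstddef>

// Hypothetical baseline kernel: safe on any CPU the build targets.
float sum_baseline(const float* v, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += v[i];
    return total;
}

// Hypothetical SSE3 kernel. This stand-in is scalar so the sketch stays
// portable; a real version would be built with SSE3 enabled.
float sum_sse3(const float* v, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += v[i];
    return total;
}

using SumFn = float (*)(const float*, std::size_t);

// Chooses a kernel from the feature flag only. The CPU vendor is
// deliberately not part of the decision.
SumFn select_sum(bool cpu_has_sse3) {
    return cpu_has_sse3 ? sum_sse3 : sum_baseline;
}
```

A program would call select_sum once at startup and reuse the returned pointer, so the per-call cost is a single indirect call.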
xunxun
Beginner
Yes. But some people say /arch:SSE3 is better than /arch:SSE2 /QaxSSE3 on Intel CPUs, while others say /arch:SSE2 /QaxSSE3 is faster than /arch:SSE3.

I don't have a simple test case or an AMD CPU, so I don't know how to submit the issue.

You can try the prebuilt Firefox build:

http://pcxfirefox.googlecode.com/files/Firefox-11.0-enUS-pcx-win32-120324-pureICC-sse3-betterpgo.7z