Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

11.0.075 Win32 produces SSE2 instructions with -QxSSE

thorvald_natvig
Beginner
Hi,

I'm compiling my application with -QxSSE -GL, since I have users that have non-SSE2 capable machines. I just got a minidump from such a user, and the compiler has issued a 'movsd xmm, mem' instruction. The subroutine deals only with floats, but does have some SSE intrinsics.

As far as I can tell, the code which causes the problems is:

mem[2] = _mm_setr_ps(_mem[8], _mem[9], 0, 0);
den[2] = _mm_setr_ps(_den[8], _den[9], 0, 0);

mem[] and den[] are __m128, while _mem and _den are float *.
The compiler cleverly restructures each line into a single movsd (for _mem[8], _mem[9]) followed by xorps (for the 0, 0) and movlhps (to merge the two). The problem is that movsd is an SSE2 instruction, which -QxSSE should have disabled. As far as I can see, this is the only SSE2 instruction used.
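
One thing that might be worth trying is to spell the load out with an intrinsic that maps to an SSE1 instruction. The sketch below is not from the original code: the helper name load_low2_zero_high is hypothetical, and it assumes the compiler keeps the intrinsic as written. _mm_loadl_pi corresponds to movlps, which loads 64 bits into the low half of the register and leaves the high half of its first argument untouched, so the result should match _mm_setr_ps(p[0], p[1], 0, 0).

```cpp
#include <xmmintrin.h>  // SSE1 intrinsics only; <emmintrin.h> (SSE2) is deliberately not included

// Hypothetical helper (name is an assumption): returns { p[0], p[1], 0, 0 }.
static inline __m128 load_low2_zero_high(const float *p)
{
    // Start from an all-zero register (xorps), then load two floats into
    // the low 64 bits with movlps. The high 64 bits keep their zero value.
    return _mm_loadl_pi(_mm_setzero_ps(),
                        reinterpret_cast<const __m64 *>(p));
}
```

With this, the two lines would read mem[2] = load_low2_zero_high(&_mem[8]); and den[2] = load_low2_zero_high(&_den[8]);. Whether the optimizer still rewrites this into movsd under -GL would need to be checked in the disassembly.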

If I remove '-GL', the problem goes away, but so does some of the performance, and the users on non-SSE2-capable processors are the ones who need the optimizations the most.

Is there a workaround I can apply to tell the compiler that SSE is ok, but SSE2 really isn't, no matter how fancy it is?

Apologies if this is fixed in 11.1.048; I keep getting linker errors about symbol files with that release, so I've had to stay on 11.0 for now.
TimP
Honored Contributor III
Since ICL 11.0, the only option which doesn't generate SSE2 is /arch:ia32. 10.0 had a -QxK option for SSE, but it wasn't reliable in library support.
JenniferJ
Moderator
Quoting - tim18
Since ICL 11.0, the only option which doesn't generate SSE2 is /arch:ia32. 10.0 had a -QxK option for SSE, but it wasn't reliable in library support.

Tim is right. Please use /arch:ia32.

Quoting - thorvald.natvig
Apologies if this is fixed in 11.1.048; I keep getting linker errors about symbol files with that release, so I've had to stay on 11.0 for now.

Do you mean the .sbr file issue below? If so, it's being fixed as we speak.

BSCMAKE: error BK1506 : cannot open file 'C:Dev_build_intDSPTestRelDebugDspFilter.sbr': No such file or directory

Jennifer
thorvald_natvig
Beginner
No, in 11.1.048 I'm seeing

mumble_pch.obj : fatal error LNK1318: Unexpected PDB error; RPC (23) '(0x000006BA)'

The same code compiles without any problems using 11.0.075, apart from the unwanted SSE2 code, that is.

I can't use 10.1 for this, as that gives me missing vtable symbols in declspec(dllimport)ed C++ classes.

Right now it looks like I'll have to split out my performance-critical code into a DLL without any external C++ classes, compile that with 10.1 -QxK, and compile the rest with 11.0 and -arch:ia32. That is more than a little messy, though, and I'd really like to avoid it if possible. Compiling all the code with -arch:ia32 isn't an option, as I need the vectorized speedup of the performance-critical parts to be able to run in real time on the non-SSE2 processors.
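
If the code does end up split into separately compiled parts, the caller could check at run time which instruction sets the CPU reports before picking a code path. The following is only a sketch of that check, not something from my build: the struct and function names are hypothetical. It relies on CPUID leaf 1, where EDX bit 25 indicates SSE and EDX bit 26 indicates SSE2.

```cpp
#if defined(_MSC_VER)
#include <intrin.h>   // __cpuid (MSVC, and ICL on Windows)
#else
#include <cpuid.h>    // __get_cpuid (GCC/Clang)
#endif

// Hypothetical result type: which SIMD levels the CPU reports.
struct CpuFeatures {
    bool sse;
    bool sse2;
};

// Hypothetical helper: query CPUID leaf 1 and decode the SSE/SSE2 bits.
static CpuFeatures detect_cpu_features()
{
    unsigned int edx = 0;
#if defined(_MSC_VER)
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 1);                       // regs = {EAX, EBX, ECX, EDX}
    edx = static_cast<unsigned int>(regs[3]);
#else
    unsigned int eax = 0, ebx = 0, ecx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        edx = 0;                            // leaf 1 unavailable: report nothing
#endif
    CpuFeatures f;
    f.sse  = (edx & (1u << 25)) != 0;       // EDX bit 25: SSE
    f.sse2 = (edx & (1u << 26)) != 0;       // EDX bit 26: SSE2
    return f;
}
```

The caller would run this once at startup and select, for example, a function pointer to the SSE-only build of the filter when f.sse is set and f.sse2 is not. It does not remove the need to compile the SSE-only path with a compiler and options that really stay within SSE.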

JenniferJ
Moderator
Quoting - thorvald.natvig
No, in 11.1.048 I'm seeing

mumble_pch.obj : fatal error LNK1318: Unexpected PDB error; RPC (23) '(0x000006BA)'


This issue was reported before and has since been fixed. I verified the original testcase; it is indeed fixed.

So this may be caused by a different scenario. Is it possible for you to send me more info or a testcase?

Thanks,
Jennifer
thorvald_natvig
Beginner
I haven't been able to create any minimal testcase for this; it happens when I link my application, but doesn't happen on smaller tests.
I'll test some more and see if I can narrow it down a bit, and if so I'll post a followup here.