My program does some computing on matrices.
I was surprised to find that the Visual Studio compiler generated faster code than icl.
I checked the project settings in VS and I think everything is set correctly.
Command line looks like this:
/c /O2 /Qipo /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /EHsc /MT /GS /arch:SSE3 /fp:fast /Fo"x64\\Release/" /W1 /nologo /Qopenmp
I disassembled the .obj file and found that icl generates movaps instructions in place of the _mm_load_pd intrinsics (WTF?),
and the generated code uses only 8 xmm registers. (WTF2?)
Code generated by vs looks OK, movapd in place of _mm_load_pd and all available registers used.
Am I doing something wrong? Are there hidden compiler settings or something? Really, wtf?
My configuration:
Intel C++ Intel 64 Compiler XE 12.0.2.154
Visual Studio 2005
Windows 7 Professional 64 bit
CPU: Intel Pentium T4200
Motherboard: Acer JV50
1 Reply
You don't give much information here. Use of movaps in place of movapd is a standard optimization, saving 1 byte of code. It's conceivable that accidental alignments might come out worse. MSVC from VS2005 is often not as fast as the ones from VS2008SP1 or VS2010.
You can't tell from the number of different named registers whether there will be a physical difference, as hardware register renaming will make use of more registers. I'm trying to remember how long it's been since I saw a CPU without hardware renaming; it makes me feel my age.
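To illustrate the movaps/movapd point, here is a minimal sketch, assuming an x86 target with SSE2 and a C++11 compiler (`alignas` is not available in VS2005, where `__declspec(align(16))` would be needed instead). The helper name `aligned_loads_match` is made up for this example. It loads the same 16 bytes once through `_mm_load_pd` and once through `_mm_load_ps`, then compares the results bit for bit. An aligned 128-bit load only moves bits, so the compiler is free to pick the shorter movaps encoding for either one.

```cpp
#include <emmintrin.h>  // SSE2: _mm_load_pd, _mm_store_pd, _mm_castps_pd
#include <xmmintrin.h>  // SSE:  _mm_load_ps
#include <cstring>      // std::memcmp

// Hypothetical helper: returns true when an aligned 16-byte load done as
// packed-double and one done as packed-single give bit-identical registers.
bool aligned_loads_match() {
    alignas(16) double src[2] = {1.5, -2.25};

    // The load as written in the source; typically emitted as movapd or movaps.
    __m128d as_pd = _mm_load_pd(src);

    // The same 16 bytes loaded as four floats, then reinterpreted as two doubles.
    // _mm_castps_pd generates no instruction; it only changes the static type.
    __m128d as_ps = _mm_castps_pd(
        _mm_load_ps(reinterpret_cast<const float*>(src)));

    // Store both registers back to memory and compare the raw bytes.
    alignas(16) double a[2];
    alignas(16) double b[2];
    _mm_store_pd(a, as_pd);
    _mm_store_pd(b, as_ps);
    return std::memcmp(a, b, sizeof a) == 0;
}
```

Calling `aligned_loads_match()` from a small `main` and disassembling the result is one way to see which encoding a given compiler picks. The function returns true either way, because the choice of load instruction does not change the value loaded.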