I am using the Intel C++ Compiler for Windows (Professional Edition) v11 and am facing the following issue. Has anyone else seen the same problem? If so, please share your thoughts, comments, or possible workarounds.
I cannot explicitly convert double values to signed int, because doing so hangs the process at run time on certain Intel processors. However, the same conversion to unsigned int works (see test.cpp below):
[cpp]
double doubleVal = 1;
unsigned int uintVal = (unsigned int)doubleVal; // Works fine
int intVal = (int)doubleVal; // Hangs at run time
[/cpp]
NOTES:
- Using the compiler in conjunction with Microsoft Visual Studio 2008 (VC9). The operating system is Windows 7 Ultimate 64-bit.
- Compiled with the following flags: icl.exe -Qvc9 -Qms0 -QxK -nologo -Od -Gy -GF -EHa -fp:precise -Zc:wchar_t -Zc:forScope -W3 -G7 test.cpp
The problem occurs on the following machines (where we have observed it):
- Fujitsu Siemens Celsius R650 Intel Processor Xeon CPU E5440
- Dell Vostro 420 Intel Core 2 Duo
Program output (problem scenario):
Now attempting double to uint conversion
uintVal must be 1: 1
Now attempting double to int conversion
At this point the program hangs indefinitely instead of displaying "intVal must be 1: 1" and exiting.
The problem does NOT occur on the following machines:
- Fujitsu Siemens Celsius R670 Intel Processor Xeon CPU X5570
- Dell Vostro 430 Intel Core i5 processor.
- HP Z800 Intel Processor Xeon CPU X5650
Program output (no-problem scenario):
Now attempting double to uint conversion
uintVal must be 1: 1
Now attempting double to int conversion
intVal must be 1: 1
The best that could be hoped for from the no-longer-documented /QxK, after the separate SSE and SSE2 library support was eliminated, was that it would always work the same as /arch:IA32, but I didn't see any promise made of that.
I did agree already that the compiler didn't make enough effort to warn against obsolete options.
If the supported /arch:IA32 gave trouble back when 11.1 was the current compiler, a test case demonstrating it should have been submitted on premier.intel.com.
I don't know myself whether both of the quoted platforms are old enough to make it likely that the quoted options were never tested on the platform. The quoted option string seems rather complicated, enough that the exact combination might never have been tested significantly. It does even happen that the same Dell platform model is used for more than one CPU over time, such as an original Core2 not supporting SSE4 and a later one which does support SSE4. When working within supported options, and using up to date compilers, of course it's entirely reasonable to expect any issues which come up with an updated CPU model to be fixed.
I'm willing to assume, although 11.0 compiler is mentioned, that it was 11.1, since Windows 7 support didn't come until the middle of the 11.1 run (long after /QxK or /G7 would have been tested during development). It would have been relatively difficult to run 11.0 on Win7.
That is very unlikely, and probably not relevant. The more specific the combination of machine, OS, compiler, and compiler options needed to trigger the error, the less likely it is that someone else has stumbled on the same problem.
You did not post the code for test.cpp. I do not see how anyone can comment on a bug that has not been seen.
This program works fine with w_cproc_p_11.1.070, using your command line. I wondered why you are using compiler options relevant to obsolete 32-bit CPUs on a modern 64-bit CPU and a 64-bit OS (/QxK, /G7).
[cpp]
#include <stdio.h>
#include <iostream>
using namespace std;
int main() {
    double doubleVal = 1;
    unsigned int uintVal = (unsigned int)doubleVal;
    int intVal = (int)doubleVal;
    printf("D : %.3e U : %u I : %d\n", doubleVal, uintVal, intVal);
}
[/cpp]
In case it's of any interest to you, the cast from double to signed int should be accomplished inline with 2 swaps of the x87 rounding mode (there is no way /G7 could avoid the performance problem), while the cast from double to unsigned int should generate a library function call, thus disabling many optimizations. As the 11.1 compiler has no library versions corresponding to /QxK, the library call has to be the same as for /QxIA32, so it would have been tested in that mode. The two conversions therefore take entirely different code paths, and the effect of a bug would be entirely different for each, regardless of whether the bug is largely your responsibility (like using an unsupported option) or the compiler's (like not warning you vociferously enough about that option).
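To illustrate why a rounding-mode swap is needed at all: a C++ cast from double to int always truncates toward zero, whereas the default floating-point mode rounds to nearest. The sketch below mimics the "swap, convert, swap back" sequence in portable C++ using `<cfenv>`. It is only an analogy for what x87 code has to do, not the compiler's actual generated code, and the function names are made up for this example. The `volatile` variables are there to keep an optimizer from folding or reordering the conversion.

```cpp
#include <cfenv>
#include <cmath>

// Plain cast: the language requires truncation toward zero.
int cast_to_int(double v) { return (int)v; }

// Conversion that honours whatever rounding mode is currently set
// (round-to-nearest unless something has changed it).
long lrint_current_mode(double v) {
    volatile double input = v;
    volatile long result = std::lrint(input);
    return result;
}

// The "2 swaps" pattern: switch to truncation, convert, restore.
long lrint_toward_zero(double v) {
    const int saved = std::fegetround();   // remember the current mode
    std::fesetround(FE_TOWARDZERO);        // swap 1: truncate
    volatile double input = v;
    volatile long result = std::lrint(input);
    std::fesetround(saved);                // swap 2: restore
    return result;
}
```

With the default mode, `lrint_current_mode(2.7)` rounds up to 3, while both `cast_to_int(2.7)` and `lrint_toward_zero(2.7)` give 2.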
bharathdude's sample program should not hang regardless of the use of old/deprecated option switches. The compiler is (evidently) generating a bad code path decision, i.e. it queries the system as to what instructions are available and then takes one of multiple paths. One of the paths is bunged up.
The sample program looks small, so bharathdude could add the option to produce an assembler listing file, then attach the file. The bad code path should easily be discovered with a visual walk-through of the listing.
Jim Dempsey
I might add...
The compiler should be able to detect ill-advised option combinations, report them with a warning, then produce code that works on the least capable (most compatible) system. IOW, under this circumstance it should produce code that works, at the sacrifice of performance.
MHO
Jim Dempsey
I discovered the following:
PROBLEM with: icl.exe -Qvc9 -Qms0 -QxK -nologo -Od -Gy -GF -EHa -fp:precise -Zc:wchar_t -Zc:forScope -W3 -G7 test.cpp
NO PROBLEM with: icl.exe -Qvc9 -Qms0 -QxSSE2 -nologo -Od -Gy -GF -EHa -fp:precise -Zc:wchar_t -Zc:forScope -W3 -G7 test.cpp
According to (http://software.intel.com/en-us/articles/performance-tools-for-software-developers-intel-compiler-options-for-sse-generation-and-processor-specific-optimizations/), QxK, which is now QxSSE, will not work on processors earlier than those that use a specific version of SSE. From where I stand, it is definitely a bug in the Intel compiler that QxSSE works on the Intel Core i5 but does not work on the Intel Core 2 Duo merely because the Core 2 Duo is older than the Core i5. Anyway, from the above article:
2. Processor-specific options of the form /Qx<code> on Windows* (-x<code>
on Linux* or Mac OS* X) generate specialized code for processors
specified by <code>. The resulting executables from these
processor-specific options can only be run on the specified or later
Intel processors, as they incorporate optimizations specific to those
processors and use a specific version of the Streaming SIMD Extensions
(SSE) instruction set and/or the Intel Advanced Vector Extensions
(AVX) instruction set. This switch enables some optimizations not
enabled with the corresponding switches /arch:<code> or -m<code>.
A run-time check is inserted in the resulting executable that will halt
the application if run on an incompatible processor. This is intended
to help you quickly find out that the program was not intended for the
processor it is running on and potentially avoids an illegal
instruction error. For this check to be effective, the source file
containing the main program or the dynamic library main function should
be compiled with this option enabled.
Where the value for <code> can be:
AVX | May generate Intel AVX, SSE4.2, SSE4.1, SSSE3, SSE3, SSE2 and SSE instructions for Intel processors. Optimizes for 2nd generation Intel Core processors. |
SSE4.2 | May generate Intel SSE4.2, SSE4.1, SSSE3, SSE3, SSE2 and SSE instructions for Intel processors. Optimizes for the Intel Core i7, i5 and i3 processor families and the Intel Xeon 55XX, 56XX and 75XX series. |
SSE4.1 | May generate Intel SSE4.1, SSSE3, SSE3, SSE2 and SSE instructions for Intel processors. Optimizes for the 45nm Hi-k next generation Intel Core microarchitecture. |
SSSE3 | May generate Intel SSSE3, SSE3, SSE2 and SSE instructions for Intel processors. Optimizes for Intel Core microarchitecture. -xssse3 is the default for the Intel 64 compiler on Mac OS* X. |
SSE3_ATOM | May generate Intel SSSE3, SSE3, SSE2 and SSE instructions for Intel processors. Optimizes for the Intel Atom processor family and Intel Centrino Atom Processor Technology. |
SSE3 | May generate Intel SSE3, SSE2 and SSE instructions. Optimizes for the enhanced Pentium M processor microarchitecture and Intel Netburst microarchitecture. -xsse3 is the default for the IA-32 compiler on Mac OS* X. |
SSE2 | May generate Intel SSE2 and SSE instructions. Optimizes for the Intel Netburst microarchitecture. |
So, instead of QxK, which of the above options should I use if my code has to work on all of the processors below?
- Intel Processor Xeon CPU E5440 and later
- Intel Processor Xeon CPU X5570 and later
- Intel Core 2 Duo and later.
Should I assume that it is quite safe to use QxAVX, so that my program will work on all existing processors and future ones too?
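One way to reason about this question is to treat the table above as an ordered list of levels: per the quoted article, a binary built with /Qx<level> only passes its run-time check on a processor at or above that level. The sketch below models just that rule. The type and function names are invented for illustration, SSE3_ATOM is left out for simplicity, and the levels I assign to the three CPUs in the usage note are my own assumption based on Intel's public specifications, not something stated in this thread.

```cpp
#include <algorithm>
#include <vector>

// Instruction-set levels in increasing order, following the table
// quoted from the article (SSE3_ATOM omitted for simplicity).
enum class Isa { SSE2, SSE3, SSSE3, SSE4_1, SSE4_2, AVX };

// A /Qx<level> binary runs only where the CPU supports that level
// or a later one.
bool runs_on(Isa built_for, Isa cpu_supports) {
    return cpu_supports >= built_for;
}

// Highest /Qx level that every CPU in the list can run.
// Falls back to the lowest level for an empty list.
Isa highest_common_level(const std::vector<Isa>& cpus) {
    if (cpus.empty()) return Isa::SSE2;
    return *std::min_element(cpus.begin(), cpus.end());
}
```

Assuming the Xeon E5440 tops out at SSE4.1, the Xeon X5570 at SSE4.2, and an early Core 2 Duo at SSSE3, `highest_common_level` returns `Isa::SSSE3`, and `runs_on(Isa::AVX, ...)` is false for all three. Under those assumptions QxAVX would be the least safe choice, not the safest.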
Intel C++ Compiler Professional for applications running on IA-32, Version 11.1 Build 20090511
If you don't use float data types, there wouldn't have been any advantage in /QxK even on the old compiler versions which implemented it.
.or.
use the option that adds a processor-optimized code path (selected at run time): -Qax{max arch}
-QaxAVX or -QaxSSE4.2
Note the difference between "x" and "ax" in -Q...
The executable will be larger with -Qax... and will (may) run slightly slower on the targeted system.
Jim Dempsey
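The difference between "x" and "ax" can be pictured with a toy model: a -Qx build has only the optimized path and halts on a lesser processor, while a -Qax build also carries a baseline path and picks between the two at run time. This is a conceptual sketch only; the names `CpuLevel`, `Build`, `Outcome` and `run` are hypothetical and do not correspond to anything in the compiler or its runtime library.

```cpp
// Ordered instruction-set levels, lowest first.
enum class CpuLevel { SSE2, SSE3, SSSE3, SSE4_1, SSE4_2, AVX };

// Hypothetical description of how a binary was built.
struct Build {
    bool has_baseline_path;  // true for -Qax style, false for -Qx style
    CpuLevel optimized_for;  // level of the processor-specific path
};

enum class Outcome { RunsOptimized, RunsBaseline, Halts };

// Models the run-time decision made when the program starts.
Outcome run(const Build& build, CpuLevel cpu) {
    if (cpu >= build.optimized_for) return Outcome::RunsOptimized;
    return build.has_baseline_path ? Outcome::RunsBaseline
                                   : Outcome::Halts;
}
```

In this model a `Build{true, CpuLevel::AVX}` still runs (on the baseline path) on an SSSE3 processor, whereas `Build{false, CpuLevel::AVX}` halts there. The extra path is also where the larger executable size comes from.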
A combination like -QaxSSE3 -arch:ia32 will both support pre-SSE2 CPUs and take advantage of SSE3, with a larger executable size, as Jim said.
However, the program succeeds when I disable optimization, i.e. when I use the -Od flag instead of -O2.
I downloaded the trial version of Intel Compiler XE 12.0 and tried it out. All options work with -O2 as well as -Od. However, I've decided to stick with the -QaxSSE4.1 option to cater to all processor types from Core 2 Duo up to Xeon E5550.
I am 100 percent sure that this is a bug in the Intel compiler 11.1, since the same options that don't work in 11.1 work in 12.0 on all processors.
Thanks to everyone for your suggestions.