I would like to report the following behavior.
[cpp]#include[/cpp]
In debug mode the output is:
which is the expected behavior. In Release mode, however, the output is:
I am using ICL Version 12.1 Build 20120130 with Visual Studio Version 10.0.40219.1 SP1Rel on Windows 7, running on an HP EliteBook 8540w (Intel Core i7 Q840 CPU).
Could you please provide the full list of options you used to build the project?
Regards,
Hubert.
Compiler flags:
/Zi /nologo /W3 /O2 /Oi /Qipo /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /EHsc /GS /Gy /fp:precise /Zc:wchar_t /Zc:forScope /Fp"Release\intel_compiler_bug.pch" /Fa"Release" /Fo"Release" /Fd"Release\vc100.pdb" /Gd
Linker flags:
/OUT:"Visual Studio 2010\Projects\intel_compiler_bug\Release\intel_compiler_bug.exe" /INCREMENTAL:NO /NOLOGO "kernel32.lib" "user32.lib" "gdi32.lib" "winspool.lib" "comdlg32.lib" "advapi32.lib" "shell32.lib" "ole32.lib" "oleaut32.lib" "uuid.lib" "odbc32.lib" "odbccp32.lib" /MANIFEST /ManifestFile:"Release\intel_compiler_bug.exe.intermediate.manifest" /ALLOWISOLATION /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /DEBUG /PDB:"Visual Studio 2010\Projects\intel_compiler_bug\Release\intel_compiler_bug.pdb" /SUBSYSTEM:CONSOLE /OPT:REF /OPT:ICF /PGD:"Visual Studio 2010\Projects\intel_compiler_bug\Release\intel_compiler_bug.pgd" /LTCG /TLBID:1 /DYNAMICBASE /NXCOMPAT /MACHINE:X86
I created the project from scratch so these should be the default options.
Hubert.
This answer is surprising, and the result is still incorrect: with /fp:fast I now get:
0 0 0 4 4 4 8 8 8
Thanks for the clarification. Let me investigate further.
Hubert.
Workarounds would be to disable the vectorizer for the inner loop (add #pragma novector in front of the inner for loop) or to turn off optimization for the whole function test (wrap the function in #pragma optimize("", off) / #pragma optimize("", on)).
Did you see the problem only recently (with a compiler update), or has it existed for a longer time?
Hubert.
Thank you for your answer. I cannot tell whether this bug is present in earlier versions of the compiler -- I found it while attempting to compile with the Intel Compiler a project that has so far always been compiled with the Visual C++ compiler.
Fixing a function once the bug has been found to affect it is indeed easy. The solutions you propose work fine. I found that changing the code into
[cpp]
for (int i = 0; i < 3; ++i) {
    int i0 = (i < 2 ? i : i+1);
    for (int j = 0; j < 3; ++j) {
        M[i+3*j] = A[i0 + 4 * j];
    }
}
[/cpp]
also works fine. However, my concern is to make sure this bug does not silently affect other functions. If you had a more specific description of the bug and guidelines to avoid it, that would be great, because if I understand correctly, the only safe options right now are to disable vectorization completely or to give up /O2.
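One way to look for the same problem elsewhere is to compare the suspect loop against a reference that computes the same mapping without the ternary in the inner loop. The following is only a minimal sketch: the names `gather_ternary`, `gather_reference` and `results_match` are hypothetical, and the input `A[k] = k` is an assumption, since the original test input is not shown in this thread. The reference is only meaningful if it is itself compiled correctly, for example in a build with optimization turned off.

```cpp
// Loop under suspicion: the ternary index is evaluated inside the inner loop,
// as in the reported function.
void gather_ternary(const double* A, double* M) {
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            M[i + 3 * j] = A[(i < 2 ? i : i + 1) + 4 * j];
        }
    }
}

// Reference: the same mapping, with the source row looked up in a table
// (rows 0, 1, 3 of a 4-row, column-major block).
void gather_reference(const double* A, double* M) {
    static const int row[3] = {0, 1, 3};
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            M[i + 3 * j] = A[row[i] + 4 * j];
        }
    }
}

// Returns true when both versions agree on a simple input.
// A[k] = k is an assumed input; it makes every source index visible in M.
bool results_match() {
    double A[16];
    for (int k = 0; k < 16; ++k) {
        A[k] = static_cast<double>(k);
    }
    double M1[9];
    double M2[9];
    gather_ternary(A, M1);
    gather_reference(A, M2);
    for (int k = 0; k < 9; ++k) {
        if (M1[k] != M2[k]) {
            return false;
        }
    }
    return true;
}
```

Calling `results_match()` from a Release build and checking that it returns true gives a quick regression check for this one pattern; it does not prove that other loops are unaffected.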
From the coding perspective (and for the autovectorizer/optimizer) it is better anyway to hoist the ternary operator out of the inner loop: assign its value to a temporary variable and use that variable in the loop.
Disabling the vectorizer or the high-level optimizer as a workaround should be applied to the affected loops/functions only. Switching them off globally may hurt overall performance significantly.
But it's definitely a bug in the Intel Compiler; it should generate correct code when it optimizes/vectorizes, in any case. I'm going to file a defect.
The optimizer workaround at function level looks like:
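[cpp]
#include <algorithm>
#include <iostream>
#include <iterator>

// Turn optimization off for this function only.
#pragma optimize("", off)
void test(double const * A)
{
    double M[9];
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            M[i+3*j] = A[(i < 2 ? i : i+1) + 4 * j];
        }
    }
    // The post is cut off after "std::ostream_iterator"; the rest of this
    // statement and the closing pragma are assumed from the description above.
    std::copy(M, M + 9, std::ostream_iterator<double>(std::cout, " "));
}
#pragma optimize("", on)
[/cpp]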
The workaround for disabling the vectorizer (on inner loop) looks like:
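The code for this workaround did not survive in the thread. As a rough sketch only, assuming the same loop nest as in the reported function and `#pragma novector` placed directly in front of the inner loop as described earlier, it could look like the following. The function name `fill_novector` and the output parameter `M` are hypothetical, chosen so the result can be inspected; the pragma is guarded because `novector` is specific to the Intel compiler.

```cpp
// Copies rows 0, 1 and 3 of a 4-row, column-major block A into the 3x3 block M.
void fill_novector(const double* A, double* M) {
    for (int i = 0; i < 3; ++i) {
        // Ask the Intel compiler not to vectorize the loop that follows.
        // Guarded so that other compilers do not warn about an unknown pragma.
#ifdef __INTEL_COMPILER
#pragma novector
#endif
        for (int j = 0; j < 3; ++j) {
            M[i + 3 * j] = A[(i < 2 ? i : i + 1) + 4 * j];
        }
    }
}
```

The rest of the function stays at the normal optimization level, so this is the narrower of the two workarounds.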
I hope this helps. I'll let you know once I have news about a bugfix.
Regards,
Hubert.