I have used Intel C++ Compiler 11.0 to compile our code on Linux and it works fine. The compiler is able to auto-parallelize, vectorize, and also parallelize the OpenMP parts successfully. I have now ported the code to Windows and am using MS Visual Studio 2008 to compile it with Intel C++ Compiler 11.0 for Windows. On Windows the OpenMP parts are parallelized successfully, but auto-parallelization does not seem to be active. I call the compiler with these options:
/c /O3 /Og /Ob2 /Ot /Qipo /GA /EHsc /RTC1 /MT /GS /fp:fast=2 /Fo"x64\Release/"
/W1 /nologo /Qopenmp /Qfp-speculation:fast /Qparallel
/Quse-intel-optimized-headers /Qprof_gen /Qprof_dir "x64\Release"
/Qopt-report-file:"C:\Documents and Settings\Sophia\My Documents\Visual Studio 2008\Projects\CBSM\opt.txt" /Qopenmp-lib:compat
I should also say that the compiler on Windows did not detect the OpenMP pragmas until I added /Qopenmp-lib:compat. As I said, many loops are auto-parallelized and vectorized by the auto-parallelization feature on Linux, but the same code is not auto-parallelized on Windows. Besides, I found that Intel Visual Fortran for Windows has two features called "High Performance Parallel Optimizer (HPO)" and "Automatic Vectorizer". Are they also included in the Intel C++ Compiler on Windows and/or Linux?
Thanks
The HPO and auto-vectorization are also in the Intel C++ Compiler for Windows & Linux & Mac OS.
The differences you saw with Windows and Linux may or may not be a bug. Please provide a testcase. If it is, we can fix it.
Thanks,
Jennifer
Thanks,
D.
I see.
Your option list includes /Qprof-gen. That option instruments the code for profiling, and it disables optimizations because the intention is to rebuild with /Qprof-use later.
First of all, I'd recommend upgrading to 11.1.065.
About auto-vectorization: /arch:SSE2 is the default in the 11.x releases, so you should get auto-vectorization with SSE2 instructions. But if you want to target newer processors, or a specific processor with Intel SSE3 or SSE4, you can use the following options (11.1 release):
. /arch:[IA32,SSE2,SSE3,SSSE3,SSE4.1]
. /Qax[SSE2,SSE3,SSSE3,SSE4.1,SSE4.2,AVX]
. /Qx[SSE2,SSE3,SSSE3,SSE4.1,SSE4.2,AVX,SSE3_ATOM]
See this article for more info about those: http://software.intel.com/en-us/articles/performance-tools-for-software-developers-intel-compiler-options-for-sse-generation-and-processor-specific-optimizations/
So you could change your options as follows (current options first, suggested options after the arrow):
/c /O3 /Og /Ob2 /Ot /Qipo /GA /EHsc /RTC1 /MT /GS /fp:fast=2 /Fo"x64\Release/"
/W1 /nologo /Qopenmp /Qfp-speculation:fast /Qparallel
/Quse-intel-optimized-headers /Qprof_gen /Qprof_dir "x64\Release"
/Qopt-report-file:"C:\Documents and Settings\Sophia\My Documents\Visual Studio 2008\Projects\CBSM\opt.txt" /Qopenmp-lib:compat
==>
/c /O3 /Og /Ob2 /Ot /Qipo /GA /EHsc /RTC1 /MT /GS /fp:fast=2 /Fo"x64\Release/"
/W1 /nologo /Qopenmp /Qfp-speculation:fast /Qparallel
/Quse-intel-optimized-headers /Qprof_use /Qopt-report-file:"C:\Documents and Settings\Sophia\My Documents\Visual Studio 2008\Projects\CBSM\opt.txt" /Qopenmp-lib:compat /QxSSE2
/O3 is the high level optimization provided by Intel Compiler.
Jennifer
1. Use auto-parallelization feature.
2. Report correctly which loops are parallelized/vectorized using OpenMP or auto-parallelization feature.
3. Activate auto-vectorization, and let me know how you did it.
4. Activate the High Performance Optimizer, and let me know how you used it.
By the way, I have upgraded to version 11.1 but nothing changed, and I still cannot use the auto-parallelization feature.
Thanks,
D.
About loop #1 (OpenMP), it can be simplified as follows:
test1
[cpp]
// Loop #1
#pragma omp parallel for reduction (+:nSum1)
for (i=nStart; i<=100*nEnd; ++i)
    nSum1+=i;
[/cpp]
About loop #2 and loop #3: if you change the code as shown below, the loops will be auto-parallelized. That the original form is not auto-parallelized is a bug, and I'll file a ticket for it.
int __inline getsum(int i)
{
    int nSum=0;
    nSum+=(int)sqrt(cos((sqrt(i*1.22234)*2.445)));
    .......
    return nSum;
}
int main()
{
    ...
    // Loop #2
    int k;
    for (i=0; i<=100*100000; ++i)
        for (j=0; j<=10000; ++j)
            nSum2 += getsum(i);
    ...
}
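For anyone who wants to experiment with this pattern, here is a minimal self-contained sketch of it: the loop body calls a small side-effect-free helper and accumulates into a scalar. It is not the original test case. The helper body, the `std::fabs` guard (added so `sqrt` never receives a negative value), the name `sum_over_grid`, and the parameterized loop bounds are all assumptions made for illustration.

```cpp
#include <cmath>

// Hypothetical stand-in for the thread's getsum(): it reads only its argument
// and has no side effects, so calls for different i are independent.
inline int getsum(int i)
{
    // std::fabs is an addition for this sketch; it keeps the outer sqrt
    // argument non-negative so the conversion to int is always well defined.
    double x = std::cos(std::sqrt(i * 1.22234) * 2.445);
    return static_cast<int>(std::sqrt(std::fabs(x)) * 10.0);
}

// Nested counted loops with a scalar sum, in the same shape as Loop #2.
// The bounds are parameters so the function can be run with small values.
long long sum_over_grid(int outer, int inner)
{
    long long total = 0;
    for (int i = 0; i <= outer; ++i)
        for (int j = 0; j <= inner; ++j)
            total += getsum(i);
    return total;
}
```

Whether a given compiler version parallelizes this loop is exactly what the thread is about, so treat it as a test input rather than a guaranteed result.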
To see whether a loop is auto-parallelized, use the options /Qparallel /Qpar-report3.
To see whether a loop is auto-vectorized, use the option /Qvec-report3. Auto-vectorization is enabled with these options: /arch:[IA32|SSE2|SSE3|SSE4...], /Qax[...], /Qx[...]
Please refer to this article for more details on targeting different architectures.
The Intel C++ compiler has a "parallel lint" feature that can diagnose existing and potential issues with OpenMP parallelization; the option is /Qdiag-enable:sc-parallel.
To use HLO, use /O3. You can see more detail with /Qopt-report.
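A small file like the following may be handy for trying the report options above, since it contains one loop of each kind. This is only a sketch: the function names `omp_sum` and `scale_add` are made up for this post, and whether each loop is actually parallelized or vectorized depends on the compiler version and options, which is what the reports are meant to show.

```cpp
#include <cstddef>
#include <vector>

// OpenMP loop: parallelized only when OpenMP is enabled (/Qopenmp).
// Without it the pragma is ignored and the loop runs serially with
// the same result.
long long omp_sum(int nStart, int nEnd)
{
    long long nSum1 = 0;
    #pragma omp parallel for reduction(+:nSum1)
    for (int i = nStart; i <= nEnd; ++i)
        nSum1 += i;
    return nSum1;
}

// Candidate for auto-vectorization: a unit-stride, element-wise loop
// with no dependence between iterations.
void scale_add(std::vector<float>& y, const std::vector<float>& x, float a)
{
    const std::size_t n = y.size() < x.size() ? y.size() : x.size();
    for (std::size_t k = 0; k < n; ++k)
        y[k] += a * x[k];
}
```

Compiling this with /Qopenmp /Qparallel /Qpar-report3 /Qvec-report3 should produce a report entry for each loop, which makes it easy to compare the Windows and Linux builds.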
Thanks,
Jennifer
/fp:source and the like disable certain auto-vectorizations. I don't know whether sum reduction vectorizations, such as those mentioned in this thread, are disabled only for float data types.
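A likely reason for this is that floating-point addition is not associative, so a vectorized sum reduction, which keeps several partial sums and combines them at the end, can give a different answer than the strict left-to-right sum. The sketch below illustrates the effect in plain C++; the function names and the two-lane layout are assumptions for illustration and do not describe what the compiler generates.

```cpp
#include <cstddef>

// Strict left-to-right sum: the evaluation order written in the source.
double sum_in_order(const double* v, std::size_t n)
{
    double s = 0.0;
    for (std::size_t k = 0; k < n; ++k)
        s += v[k];
    return s;
}

// Two interleaved partial sums combined at the end, mimicking the
// reordering of a 2-lane vectorized reduction. n is assumed to be even.
double sum_two_lanes(const double* v, std::size_t n)
{
    double lane0 = 0.0, lane1 = 0.0;
    for (std::size_t k = 0; k + 1 < n; k += 2) {
        lane0 += v[k];
        lane1 += v[k + 1];
    }
    return lane0 + lane1;
}
```

With IEEE double arithmetic and no fast-math style options, the input {1e20, 1.0, -1e20, 1.0} gives 1.0 for the in-order sum and 2.0 for the two-lane sum. Integer addition does not have this problem, because reordering it does not change the result.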
Yes, Tim is right.
Jennifer
This bug is fixed in 14.0 and 15.0. The 2nd and 3rd loops can all be auto-parallelized now.
Jennifer
