Beginner
27 Views

Guided Auto Parallelism

Hi,

I ran the Guided Auto Parallelism (GAP) analysis and I see 0 remarks in the output window. Does that mean the code needs no changes for optimization?

Also, when I run the application built with MSVC it takes around 10 hours to complete, whereas the build from the Intel C++ compiler takes around 21 hours, roughly double the time. I started using the Intel C++ trial version to confirm the performance improvement before buying the licensed version. Given this result, should I end my evaluation, or do you have any other suggestions for boosting performance with the Intel C++ compiler?

Thanks,

Pradeep 

0 Kudos
13 Replies
Employee

Dear Pradeep,

GAP (Guided Auto Parallelism) is not a very powerful feature, and I'd suggest using the Intel Advisor tool instead. It has dozens of
new features to support vectorization, and it also gives advice on what to do in each particular case in the code.
Also, please use the Optimization Report (-opt-report option), which has been improved in the latest versions and integrates nicely with Visual Studio.
I wouldn't recommend using GAP since it may be quite outdated.
We need to check the compiler options you used in both cases (MSVC and the Intel compiler). Could you please provide the compilation command lines?

Regards,
Igor

0 Kudos
Beginner

Hi,

My C++ application is compatible with VS 2008 and I have installed Intel Composer XE 2013 XP1. Could you share the URL where I can download a version of the Intel Advisor tool that is compatible with VS 2008?

I have also attached the compiler settings that were used for both the MSVC and Intel C++ compilers.

Thanks,

Pradeep

0 Kudos
Black Belt

It seems unlikely that performance with the compiler setting /Od (basic debug settings?) would be addressed by GAP or by Advisor. Advisor is included in current Parallel Studio packages (including the evaluation), so that advice doesn't apply to old compilers such as the ones you mention (either the Intel or Microsoft ones). You do need debug symbols, e.g. -debug:inline-debug-info (along with release optimization settings), to work with Advisor.

Performance comparison between CL and ICL is not a simple question. For example, you must set /fp:fast /Ox to see the optimizations in CL which you get with ICL even when you back down to /O2 /fp:source. In the 32-bit CL, you probably require /arch: settings (along with an upgrade to VS2013 or 2015, as required by current ICL). I would not recommend ICL simply on the basis that the default release settings are more aggressive than in CL.
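
One reason the floating-point model flags matter in such a comparison: a relaxed model lets the compiler reorder floating-point additions, which is what a vectorized sum reduction does, and the reordering can change the result. The sketch below imitates that by hand. The function names and the test data are hypothetical and chosen only to make the effect visible; this is not output from either compiler.

```cpp
#include <cassert>

// Strict left-to-right sum: the order a value-safe floating-point
// model has to preserve.
float sequential_sum(const float* a, int n) {
    float s = 0.0f;
    for (int i = 0; i < n; ++i) s += a[i];
    return s;
}

// Four independent partial sums combined at the end: the kind of
// reordering a 4-wide vectorized reduction performs.
// Assumption for this sketch: n is a multiple of 4.
float four_lane_sum(const float* a, int n) {
    assert(n % 4 == 0);
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int i = 0; i < n; i += 4)
        for (int k = 0; k < 4; ++k) lane[k] += a[i + k];
    return ((lane[0] + lane[1]) + lane[2]) + lane[3];
}
```

For the input {1e8f, 1, 1, 1, -1e8f, 1, 1, 1}, a build that evaluates float arithmetic in IEEE single precision returns 3 from sequential_sum (the small terms are absorbed by 1e8f before it cancels) and 6 from four_lane_sum. Neither compiler is wrong here; the two simply answer to different floating-point models, so timings are only comparable when the models match.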

0 Kudos
Employee

As Tim pointed out, you have the /Od option set, which disables all optimizations, so vectorization is off as well. Please set at least /O2 (the default for the Release configuration). By the way, why don't you use the latest 17.0 version of the compiler (Parallel Studio XE 2017 Composer Edition)?

0 Kudos
Beginner

Hi Tim,

I see an improvement in performance when I change the optimization setting from /Od to /O3. Could you help me with the options that I need to set for

1. Add Processor-Optimized Code Path, and

2. Intel Processor-Specific Optimization

in VS 2008 for an Intel(R) Core(TM) i5 processor? I have attached the available options.

Thanks,

Pradeep

 

0 Kudos
Beginner

Hi Igor V,

My application is compatible with VS 2008 only. Does Parallel Studio XE 2017 Composer Edition have a VS 2008 plugin?

Thanks,

Pradeep 

0 Kudos
Black Belt

/QxHost is suitable if you compile and profile on the same CPU.

0 Kudos
Beginner

Hi Tim,

Thanks for the prompt reply. Should I set the "Add Processor-Optimized Code Path" option to "None"?

Regards,

Pradeep

 

0 Kudos
Employee

We don't support integration with VS2008 in the 2017 version (even VS2010 is not supported).
Note that the /Qaxcode option (Add Processor-Optimized Code Path) tells the compiler to generate multiple, feature-specific auto-dispatch code paths for Intel® processors. The default code path is SSE2, but you can specify more code paths.
The /Qxcode option tells the compiler which processor features it may target, including which instruction sets and optimizations it may generate.
It generates only one code path. The option Tim suggested, /QxHost, tells the compiler to generate instructions for the highest instruction set available on the compilation host processor.
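
To make "auto-dispatch code paths" more concrete, here is a hand-written sketch of the general idea: several versions of one routine, with the version chosen at run time from the detected CPU features. Everything in it (the Isa enum, sum_baseline, sum_tuned, select_sum) is hypothetical and written for illustration only. It does not show the code the compiler generates or how its dispatcher is implemented.

```cpp
#include <cassert>

// Hypothetical feature levels. A real dispatcher would query the CPU
// (for example via CPUID) instead of taking this as a parameter.
enum class Isa { Sse2, Avx2 };

// Stand-in for the default (baseline) code path.
static int sum_baseline(const int* a, int n) {
    int s = 0;
    for (int i = 0; i < n; ++i) s += a[i];
    return s;
}

// Stand-in for a path tuned for a newer instruction set
// (here simply unrolled by four, with a scalar remainder loop).
static int sum_tuned(const int* a, int n) {
    assert(n >= 0);
    int s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    int i = 0;
    for (; i + 3 < n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; ++i) s0 += a[i];
    return s0 + s1 + s2 + s3;
}

using SumFn = int (*)(const int*, int);

// Pick one implementation based on the detected feature level.
SumFn select_sum(Isa detected) {
    return detected == Isa::Avx2 ? sum_tuned : sum_baseline;
}
```

In this picture, a single-path build corresponds to compiling only one of the two functions and calling it unconditionally, which is why a binary built that way for a newer instruction set should only be run on processors that support it.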

 

 

0 Kudos
Beginner

Hi,

I have enabled the optimization diagnostic phase "The High Performance Optimizer phase" (/Qopt-report-phase:hpo), and after compilation I see about 478 remarks.

Most of the remarks follow this pattern:

filename (32:5-32:5):VEC:?BinSearch@@YAXHPAH0QAH@Z:  loop was not vectorized: unsupported loop structure

Could you help me decode this remark? Does it include the line number and any other info?

I have given another remark below. Could you also point me to a webpage where all these different types of remarks are listed, along with their respective solutions?

filename:(46:9-46:9):VEC:?BinSearch@@YAXHPAH0QAH@Z:  loop was not vectorized: nonstandard loop is not a vectorization candidate

Thanks,

Pradeep

0 Kudos
Black Belt

A "standard" do loop would typically be an f77-style counted loop. You could post a few of the problem loops if you can't figure it out.
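
The loops in BinSearch were not posted, so the following is only a guess at the general shape involved. It contrasts a search loop, whose trip count depends on the data and which leaves from the middle of the body, with a counted loop whose trip count is known on entry. Both functions are hypothetical and written for illustration.

```cpp
#include <cassert>

// Search loop: the number of iterations depends on the data, each
// iteration depends on the previous one, and the loop can exit early.
// Loops of this kind are typically not vectorization candidates.
int find_sorted(const int* a, int n, int key) {
    assert(n >= 0);
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == key) return mid;
        if (a[mid] < key) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1;
}

// Counted loop: trip count n is known on entry, there is a single
// exit, and the iterations are independent of each other.
void scale_add(float* out, const float* x, const float* y, float k, int n) {
    for (int i = 0; i < n; ++i) out[i] = k * x[i] + y[i];
}
```

A binary search is inherently serial, so a "not vectorized" remark on it is usually nothing to fix. The remarks worth chasing are those on counted loops like the second one, where most of the run time is spent.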
0 Kudos
Beginner

Hi,

I got remark #30535 on a couple of lines, but the loops there contain no code related to exception handling. Could you explain why this remark is issued?

Thanks,

Pradeep 

0 Kudos
Black Belt

If you refer to "protects exception", it usually means you have a conditional operation which raises questions about optimization. Among the ways to encourage the compiler to optimize are '#pragma vector always' or '#pragma omp simd'.
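
As a minimal sketch of such a loop (the poster's code is not shown, so safe_divide is a hypothetical example): the condition guards a division, and computing the division for every element, as a vectorized version would, could raise a floating-point exception that the scalar code never triggers. No C++ try/catch is involved. The pragma tells the compiler that vectorizing is acceptable anyway.

```cpp
#include <cassert>

// Hypothetical loop: the branch protects the division from a zero
// denominator. Assumption for this sketch: out does not overlap
// num or den.
void safe_divide(float* out, const float* num, const float* den, int n) {
    assert(n >= 0);
#pragma omp simd
    for (int i = 0; i < n; ++i) {
        if (den[i] != 0.0f) out[i] = num[i] / den[i];
        else out[i] = 0.0f;
    }
}
```

A compiler without OpenMP SIMD support enabled simply ignores the pragma, so the function gives the same results either way; only the generated code differs.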

0 Kudos