
How should I optimize for speed and memory access cycles with the Intel C++ compiler?


My program contains a lot of loops with memory accesses. I currently build with O2 (Maximize Speed). Should I use O3 (Highest optimization) instead? What other adjustments can I make?

1 Solution
TimP
Black Belt
150 Views
-O3 mainly adds optimizations for multi-level loop nests, at the possible expense of larger generated code. You can see what it adds for your application by comparing compiler reports, e.g. -Qopt-report-file=source.txt -Qopt-report4. Those reports are invaluable for showing whether compiler optimizations were applied to your critical loops, and of what kind. The exact meaning of the numeric suffix on opt-report varies with compiler version. As you've no doubt read elsewhere, you should start with analysis to determine the location and nature of any performance bottlenecks. ICL offers the -Qprofile-loops option for such purposes. Recent VTune versions have made great improvements in the general analysis category.
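To make "multiple level loops" concrete, here is a minimal sketch of a two-level loop nest whose traversal order affects memory access. The function names `sum_column_first` and `sum_row_first` are hypothetical and not from the original poster's program. Whether a given compiler version restructures the first form at -O3 is not guaranteed; building the file once at -O2 and once at -O3 with the opt-report options quoted above, then comparing the two reports, is the way to find out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: the matrix is stored row-major in a flat vector,
// so element (i, j) lives at index i * cols + j.

// Column-first traversal: the inner loop jumps `cols` elements per step,
// so consecutive iterations touch memory locations that are far apart.
double sum_column_first(const std::vector<double>& a,
                        std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    double total = 0.0;
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            total += a[i * cols + j];
    return total;
}

// Row-first traversal: the inner loop walks memory with unit stride,
// which is friendlier to the cache and to vectorization.
double sum_row_first(const std::vector<double>& a,
                     std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    double total = 0.0;
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            total += a[i * cols + j];
    return total;
}
```

Both functions compute the same sum; only the order of memory accesses differs. If the report shows the compiler left the column-first nest as written, swapping the loops by hand is one adjustment worth measuring.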


6 Replies
150 Views

I'm already using VTune. But where are the optimization reports, and how do I use the -Qprofile-loops option?

TimP
Black Belt
150 Views

If you use the Visual Studio GUI, I suppose you must add the opt-report and profile-loops options under the additional command line options, and perhaps examine the results in a text editor.

If you want the opt-report results to appear in your build log, you would of course omit the opt-report-file option, but in my opinion that makes it harder to compare reports and see the effect of changing your compile options and source code.

Are you trying to get by without the user guide?

150 Views

I am reading the user guide. I added /Qprofile-loops:all and /Qopt-report-file:$(IntDir)$(TargetName).rep to the compiler command line. I set it up the following way:

[Attached screenshot: a.PNG]

I now have a file named ParallelSearch.rep and a diagnostic file, but I did not see any log for profile-loops.

.diag

icl: command line warning #10333: Loop profiler cannot be used when generating parallel code. Disabling '/Qprofile-loops'

.rep

<;-1:-1;IPO UNREFERENCED VAR REMOVING;;0>
  UNREF VAR REMOVAL ROUTINE-SYMTAB (....)

  UNREF VAR REMOVAL ROUTINE-SYMTAB (....)

  UNREF VAR REMOVAL ROUTINE-SYMTAB (....)

  UNREF VAR REMOVAL ROUTINE-SYMTAB (_main):VARS(8),PACKS (8)

 

I did not understand any of this. What should I be analyzing?

TimP
Black Belt
150 Views
It's probably best to begin profiling and vectorization work with threaded parallelization turned off. As you're using VTune, you probably don't need profile-loops, but it's easy to be misled when starting out in VTune with parallelization on. Learning to read the opt-report output is particularly important with parallelization.
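One way to keep a serial build available for this kind of first-pass profiling is sketched below. It assumes the threading comes from OpenMP pragmas, which the thread does not confirm; if the project relies on a different mechanism, the same idea applies but the switch will differ. The names `count_matches` and `built_with_threads` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Reports whether this translation unit was compiled with OpenMP enabled.
// _OPENMP is defined by the compiler only when OpenMP support is switched on.
bool built_with_threads() {
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}

// Hypothetical search kernel: counts how many elements equal `key`.
// The pragma is compiled in only when OpenMP is enabled, so the same
// source yields a plain serial loop when the OpenMP option is left off.
long long count_matches(const std::vector<int>& data, int key) {
    long long hits = 0;
    const long long n = static_cast<long long>(data.size());
#ifdef _OPENMP
#pragma omp parallel for reduction(+:hits)
#endif
    for (long long i = 0; i < n; ++i) {
        if (data[static_cast<std::size_t>(i)] == key)
            ++hits;
    }
    return hits;
}
```

Built without the OpenMP option, the loop is ordinary serial code that can be profiled and checked in the opt-report first; threading can be switched back on once the serial loop behaves as expected.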
QIAOMIN_Q_
New Contributor I
150 Views

As the warning says, /Qprofile-loops is disabled when generating parallel code. The loop profiler inserts instrumentation calls at each function's entry and exit points and before and after instrumentable loops; these calls may not work well in a parallel context and would make the results hard to analyze.
