Intel® C++ Compiler
Support and discussions for creating C++ code that runs on platforms based on Intel® processors.

What did PGO do?

I have some static code that I've been tuning as much as I can, trying all sorts of things. The fastest I could get was 9.5 seconds. After I ran the program multiple times with prof-gen and then rebuilt with prof-use, the run time dropped to 8.2 seconds. Wow.
Is there any way I can find out what it did? Any hints/reports? I've looked through the assembly, but I would like some hints on how to write the source code that way in the first place.
1 Reply
Knowing how the program actually behaves at run time, PGO (Profile Guided Optimization) is able to do a number of optimizations, such as:

- Better data & code layout
- Frequently accessed code placed close together
- Better instruction cache usage & fetching
- Improved branch prediction - good for branchy apps/large switch blocks (e.g. moving the most frequently taken branch higher up in the block).
- Better function inlining (e.g. inline hot functions, not cold ones)
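
If you want to carry some of this back into the source by hand, one rough approach is to mimic what the profile tells the compiler. The sketch below is only an illustration under stated assumptions, not what the compiler does internally: the `Kind` values and the functions `count_kinds`, `hottest_kind`, `handle` and `handle_error` are hypothetical names made up for this example, and the return values are placeholders. It counts how often each case occurs on a sample input, then tests the hottest case first and keeps the rare path in its own function.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical message kinds, invented for this illustration.
enum Kind { kData = 0, kAck = 1, kError = 2 };
constexpr std::size_t kKindCount = 3;

// Step 1: a hand-rolled "profile". Count how often each kind occurs
// in a representative input.
std::array<std::size_t, kKindCount> count_kinds(const std::vector<int>& input) {
    std::array<std::size_t, kKindCount> counts{};  // zero-initialized
    for (int k : input) {
        assert(k >= 0 && static_cast<std::size_t>(k) < kKindCount);
        ++counts[static_cast<std::size_t>(k)];
    }
    return counts;
}

// Index of the most frequent kind (the first one, if there is a tie).
int hottest_kind(const std::array<std::size_t, kKindCount>& counts) {
    auto it = std::max_element(counts.begin(), counts.end());
    return static_cast<int>(std::distance(counts.begin(), it));
}

// Rare path kept in its own function so it stays out of the hot code.
int handle_error() { return -1; }

// Step 2: order the tests from hottest to coldest. This ordering
// assumes the counts showed kData to be the most frequent kind.
int handle(int kind) {
    if (kind == kData) return 10;  // assumed hottest case, tested first
    if (kind == kAck) return 20;   // less frequent
    return handle_error();         // cold path
}
```

Running `count_kinds` on a typical workload and checking `hottest_kind` tells you which test to put first in `handle`. Whether this helps on top of PGO is something to measure, since the compiler may already have reordered the branches for you.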

You could generate a PGO optimization report (using the -opt-report and -opt-report-phase options) to see what kinds of transformations took place. Use "pgo" as the opt-report phase. Use -opt-report-help to get a list of the optimization phases that you can get a report on.

You could also add -O3, and then -ipo, to the PGO options and see whether you get more performance.