Solved: Lifetime of PGO profile data

racker · 01-28-2009

Hi all,

When using profile-guided optimization (PGO), I was wondering how long the data gathered from a prof-gen run stays valid.

My problem:
First I compile my application and do a _lengthy_ automated test run to gather profiling information.
Then I recompile using prof-use.

What happens if I want to make changes to my application and recompile using prof-use without doing the lengthy profiling again? Is that possible? If yes, what kind of changes will make the profiling data unusable? Are there any diagnostics if PGO fails because of invalid profiles?

I did some test runs and the compiler isn't complaining. I can't say anything about the impact on performance, though.

Any ideas? Thanks!

TimP · 01-28-2009

Changes to a function that alter its line numbering are likely to invalidate the profile data for that function. When the compiler doesn't find matching data for a function, it normally acts as if no profile data exist for that function. Current compilers have been designed to do well with "static profiling" (no prof-gen data). You can see the profiling assumptions in comments if you generate the asm file.
prof-gen/prof-use has received less emphasis since the advent of SPEC 2006, where it is not permitted for base performance quotations. Likewise, optimization for IA64, where PGO would have been necessary, had become archaic by then.

racker · 01-29-2009

tim18,

That answers my question, thank you.

I still wonder why Intel would adjust its strategy so much to CPU benchmarks. With the latest Intel compiler I see a significant performance gain when combining IPO with PGO for our applications. So PGO obviously has higher potential than "static profiling" alone.

In my opinion, the consequence should be to modify SPEC so that it reflects this impact on real application performance, rather than to discontinue PGO.

TimP · 01-29-2009

I am curious to know in which situations PGO helps. I suppose the guesses the compiler makes without PGO must be wrong in those cases.
For example, when no clues are present, the default assumption is a loop count of 100. It should be possible to set #pragma loop count in order to optimize for shorter or longer loops.
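As a rough sketch of that kind of hint (not taken from any real code base): the function name sum_small and the trip count of 4 are made up for illustration, and I am assuming the underscore spelling #pragma loop_count from the Intel C/C++ documentation. Other compilers should simply ignore the pragma, possibly with a warning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums a short vector. The caller is assumed to pass
   only a handful of elements, far fewer than the default guess of 100
   iterations mentioned above. */
double sum_small(const double *v, size_t n)
{
    assert(v != NULL);
    double total = 0.0;
    /* Intel-specific hint that the loop typically runs about 4 times.
       It only guides optimization; the loop stays correct for any n. */
#pragma loop_count (4)
    for (size_t i = 0; i < n; ++i) {
        total += v[i];
    }
    return total;
}
```

Whether this closes the gap to a prof-use build would have to be measured; the hint covers only this one loop, whereas PGO collects counts for the whole run.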
PGO also may be used to optimize for one particular case, in code which includes branches for many cases,