Information on interpreting floating point events and ratios?

croucar1 · 11-19-2008

I'm starting to investigate an old (F77/VAX) acoustic code that is supposed to be floating point intensive. So far I've found that the FP instruction ratio is 0.08, so maybe it isn't HPC code anymore. Based on the performance issues of the day, it is packed full of branches designed to minimize math ;-)

I haven't been able to find a guide, or even a good example, for interpreting profile data related to floating point performance. Can anybody point to one?

Thanks!

TimP · 11-19-2008

The biggest downside of legacy code which puts in branches to skip unnecessary operations is the way it prevents multi-level loop optimization and vectorization. At the initial stage, you need profiling only to show you where the code spends enough time to need fixing. You don't need to figure out how much time is spent on events which will be changed entirely when you fix the code.
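
As a rough illustration of the kind of branch in question, here is a minimal sketch in C (not taken from the code under discussion; the function names `scale_add_branchy` and `scale_add_straight` and the array layout are hypothetical). The first loop tests each coefficient to skip a multiply-add, which can inhibit vectorization; the second always does the arithmetic and is a much simpler target for the compiler.

```c
#include <stddef.h>

/* Legacy style (hypothetical example): a branch skips the multiply-add
 * whenever the coefficient is zero. The data-dependent test inside the
 * loop body makes the loop harder for a compiler to vectorize. */
void scale_add_branchy(size_t n, const float *a, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i) {
        if (a[i] != 0.0f) {
            y[i] += a[i] * x[i];
        }
    }
}

/* Straight-line version: the arithmetic is done unconditionally.
 * Results match the branchy loop as long as x[i] is finite, because
 * 0 * x[i] is then 0. If x[i] can be Inf or NaN, the two differ.
 * restrict asserts that the arrays do not overlap (an assumption). */
void scale_add_straight(size_t n, const float *restrict a,
                        const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] += a[i] * x[i];
    }
}
```

Whether the second form actually runs faster depends on the compiler, the target, and how often the branch was taken, so it is worth checking the compiler's vectorization report and re-profiling rather than assuming.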

Some such codes were written with the goal of preventing them from running faster on a compiler or machine different from the author's original favorite, rather than optimizing in general.

croucar1 · 11-19-2008

My problem is primarily marketing - I understand that the code is a dog because it targets an obsolete architecture (pre-cache, pre-threads, pre-SSE, slow ALU...). I need to collect the data to convince the original developers, now upper-level bosses, that their code doesn't run nearly as well as they remember, and to explain why in gory detail. Otherwise, no money and no fix. (These bosses rejected one of my proposals with the statement that GPGPUs would "violate Moore's Law".)

But just the Tuning Advice that floating point ops aren't an important component of the run time might help. ;-)