I am having trouble accurately assessing the time spent in vital classes and functions in our code. The program is compiled with VC++6 and runs multi-threaded on a two-CPU, dual-core Dell 690.
The call graph option "Total Time" seems to be exactly what I want, but profiling based on call graphs slows my program down immensely. Execution time of the whole program goes from 74 s to 589 s, roughly an eightfold slowdown, which also makes the time spent in the individual functions inaccurate. Is this normal behaviour, and if so, is there another way to get this data accurately?
The VTune analyzer attempts to subtract the instrumentation overhead from the timing information for each function. The call graph times, however, are really meant for comparing functions against one another within the same data collection, not as absolute measurements.
Sampling is a very low overhead data collection mechanism, but it will not give you calling relationships. That information is costly to collect and will always add overhead.
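If you only need inclusive wall-clock time for a handful of vital functions, one option outside VTune is to time them by hand. The following is a minimal sketch, not a VTune feature: `TimingRegistry`, `ScopedTimer`, `hot_function` and `timed_hot_function` are hypothetical names made up for illustration. It assumes a C++11 compiler for `std::chrono` and `std::mutex`; VC++6 does not have these, so there you would have to substitute something like the Win32 `QueryPerformanceCounter` call and a critical section.

```cpp
#include <chrono>
#include <cstdio>
#include <map>
#include <mutex>
#include <string>

// Hypothetical helper: accumulates wall-clock time and call counts per label.
// A mutex guards the table so several threads can report into it.
class TimingRegistry {
public:
    void add(const std::string& label, double seconds) {
        std::lock_guard<std::mutex> lock(mutex_);
        Entry& e = entries_[label];
        e.total_seconds += seconds;
        e.calls += 1;
    }

    double total(const std::string& label) const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::map<std::string, Entry>::const_iterator it = entries_.find(label);
        return it == entries_.end() ? 0.0 : it->second.total_seconds;
    }

    long calls(const std::string& label) const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::map<std::string, Entry>::const_iterator it = entries_.find(label);
        return it == entries_.end() ? 0 : it->second.calls;
    }

    // Print one line per label: call count and accumulated seconds.
    void report() const {
        std::lock_guard<std::mutex> lock(mutex_);
        for (std::map<std::string, Entry>::const_iterator it = entries_.begin();
             it != entries_.end(); ++it) {
            std::printf("%-20s calls=%ld total=%.6f s\n",
                        it->first.c_str(), it->second.calls,
                        it->second.total_seconds);
        }
    }

private:
    struct Entry {
        double total_seconds = 0.0;
        long calls = 0;
    };
    mutable std::mutex mutex_;
    std::map<std::string, Entry> entries_;
};

// Records the time between construction and destruction under one label.
class ScopedTimer {
public:
    ScopedTimer(TimingRegistry& registry, const char* label)
        : registry_(registry), label_(label),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        registry_.add(label_, elapsed.count());
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    TimingRegistry& registry_;
    const char* label_;
    std::chrono::steady_clock::time_point start_;
};

// Hypothetical stand-in for one of the "vital" functions.
double hot_function(int n) {
    double sum = 0.0;
    for (int i = 1; i <= n; ++i) sum += 1.0 / i;
    return sum;
}

// Wrapper that times a single call to hot_function.
double timed_hot_function(TimingRegistry& registry, int n) {
    ScopedTimer timer(registry, "hot_function");
    return hot_function(n);
}
```

Calling `timed_hot_function(registry, n)` from your worker threads and then `registry.report()` at exit prints the accumulated time per label. Keep in mind that this measures elapsed wall-clock time including any time the thread spent waiting, and that the lock and clock reads add their own small cost per call, so it suits coarse-grained functions better than tiny ones called millions of times.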