Optimum Time values for the best performance

Mona_Ezz · ‎07-26-2009

What are the optimum time values that I should aim for to achieve the best performance?
After running call graph on the application, I get the functions and their running times. I want to know what target I should reach for the best performance of the application.
For example, if I have a function with Total Time = 10,487,379, I want to know whether this time needs to be reduced or is normal, and according to what criteria.

Also, could you point us to good materials on VTune performance analysis?

TimP · ‎07-26-2009

The Knowledge Bases > Product Documentation links on this forum page should keep you busy for a while.
VTune isn't very good at setting your goals, only at showing you where opportunities exist.

Peter_W_Intel · ‎07-26-2009

Thanks for Tim's comments.

Additionally, call graph data collection helps the user find the critical path, which is where most of the CPU time is spent. On the critical path, the user can find the function with the biggest "Self Time" and see whether it is called frequently by its parent functions (based on the "call count" data), then decide whether the algorithm needs adjusting. For example: (a) Are all the function calls necessary, or can the user eliminate some of them? (b) Can the user change serial code to parallel code?
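As a rough illustration of that triage, here is a minimal Python sketch that ranks functions by self time and relates it to call count. It assumes the call graph results have already been exported by hand into simple records; the `FunctionStats` type, the function names, and all numbers are hypothetical and are not VTune output or a VTune API.

```python
from dataclasses import dataclass


@dataclass
class FunctionStats:
    """One row of call graph data (hypothetical values, arbitrary time units)."""
    name: str
    self_time: int   # time spent in the function body only
    total_time: int  # self time plus time spent in callees
    calls: int       # how many times the function was called


def rank_by_self_time(stats):
    """Return the functions ordered from largest to smallest self time."""
    return sorted(stats, key=lambda f: f.self_time, reverse=True)


def self_time_per_call(f):
    """Average self time of a single call; 0.0 if the function was never called."""
    return f.self_time / f.calls if f.calls else 0.0


# Made-up sample data standing in for a call graph export.
stats = [
    FunctionStats("main", 50, 10_000, 1),
    FunctionStats("parse_input", 1_200, 2_500, 10),
    FunctionStats("compute_row", 6_000, 7_000, 4_000),
    FunctionStats("write_output", 450, 450, 10),
]

ranked = rank_by_self_time(stats)
for f in ranked:
    print(f"{f.name:14s} self={f.self_time:6d} calls={f.calls:5d} "
          f"self/call={self_time_per_call(f):8.2f}")
```

In this made-up data, `compute_row` has the largest self time but a small cost per call, which suggests looking at the call count first (can some calls be avoided?). A function with few calls and a large cost per call would instead point at the function body itself.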

Tim is right: call graph results show you where opportunities exist.

Second, sampling data collection provides a CPI (Cycles per Instruction) value for each module and function. The smaller the CPI value, the better; a value of about 0.75 is usually considered acceptable (if the function contains no FP or SSE instructions). See http://www.intel.com/software/products/documentation/vlin/mergedprojects/analyzer_ec/mergedprojects/reference_olh/pentiumm_hh/adviceb_hh/poor_cpi.htm

So the user can work on the hot source lines to bring the CPI value below its original value.
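To make the CPI figure concrete, here is a minimal Python sketch that computes CPI as clockticks divided by instructions retired for each function and flags the ones above the 0.75 guideline mentioned above. The function names and event counts are hypothetical; in practice they would come from the sampling results.

```python
CPI_GUIDELINE = 0.75  # the rule-of-thumb value quoted in this thread


def cpi(clockticks, instructions_retired):
    """Cycles per instruction for one function or module."""
    if instructions_retired == 0:
        raise ValueError("no instructions retired; CPI is undefined")
    return clockticks / instructions_retired


# Hypothetical per-function event counts: (clockticks, instructions retired).
samples = {
    "compute_row": (9_000_000, 6_000_000),
    "parse_input": (1_200_000, 2_000_000),
}

# Functions whose CPI is above the guideline are candidates for a closer look.
flagged = [name for name, (clk, inst) in samples.items()
           if cpi(clk, inst) > CPI_GUIDELINE]

for name, (clk, inst) in samples.items():
    print(f"{name:12s} CPI = {cpi(clk, inst):.2f}")
print("above guideline:", flagged)
```

The threshold is only a starting point for inspection. As noted above, the 0.75 figure applies to functions without FP or SSE instructions.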

Hope it helps.

Regards, Peter

Mona_Ezz · ‎07-28-2009

Thanks, Tim and Peter, for your helpful tips.

But I want to ask whether the CPI value (0.75) applies to the smallest function in the module or to the module itself.

For example, suppose I have a function that calls other functions, and the CPI values of all the called functions are below 0.75, but the CPI of the module/function itself exceeds this number.

Peter_W_Intel · ‎07-28-2009

Sampling data collection is not like call graph data collection. The CPI value is for the function itself only and does not include its subroutines, so you have to look for high CPI values in other functions, which are what make your module's CPI value high.
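A small Python sketch can show why a high module CPI points at some other function. It assumes the module's CPI is the module's total clockticks divided by its total instructions retired, summed over its functions (an assumption about how the aggregate is formed, not something confirmed by the tool's documentation). Function names and counts are hypothetical.

```python
def module_cpi(functions):
    """Aggregate CPI: total clockticks / total instructions retired (assumed definition)."""
    total_clk = sum(clk for clk, _ in functions.values())
    total_inst = sum(inst for _, inst in functions.values())
    return total_clk / total_inst


# Hypothetical per-function counts within one module: (clockticks, instructions retired).
functions = {
    "helper_a": (600, 1000),   # CPI 0.6
    "helper_b": (700, 1000),   # CPI 0.7
    "hot_loop": (3000, 1000),  # CPI 3.0
}

per_function = {name: clk / inst for name, (clk, inst) in functions.items()}
aggregate = module_cpi(functions)

# Under this definition the aggregate is a weighted average of the per-function
# values, so it always lies between the smallest and largest of them.
worst = max(per_function, key=per_function.get)

print(f"module CPI = {aggregate:.2f}, highest per-function CPI: {worst}")
```

Under that assumption, a module CPI above 0.75 means at least one function in the module is itself above 0.75, even if the functions looked at so far are all below it.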

Regards, Peter