Mona_Ezz
Beginner
64 Views

Optimum Time values for the best performance

What are the optimum time values that I should aim for to achieve the best performance?
After running call graph on the application, I get the list of functions and their running times. I want to know what target I should reach for the best performance of the application.
For example, if I have a function with Total Time = 10,487,379, I want to know whether this time needs to be reduced or is normal, and according to what criteria.

Also, could you provide us with good materials on VTune performance analysis?
4 Replies
TimP
Black Belt

The Knowledge Bases > Product Documentation links on this forum page should keep you busy for a while.
VTune isn't very good at setting your goals, only at showing you where opportunities exist.
Peter_W_Intel
Employee

Thanks for Tim's comments.

Additionally, Call graph data collection helps the user find the critical path, which is where most of the CPU time is spent. On the critical path, the user can find the function with the biggest "Self Time" and see whether it was called frequently by its parent functions (based on the "call count" data), then decide whether the algorithm needs to be adjusted. For example: a. Are all of the function calls necessary, or can the user reduce some of them? b. Can the user change serial code to parallel code?
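As a rough illustration of the triage described above, here is a minimal Python sketch that ranks call graph rows by Self Time and then looks at the call count. The row layout, function names, and numbers are all made up for the example; they are not VTune's export format.

```python
from dataclasses import dataclass


@dataclass
class CallGraphRow:
    # Hypothetical row layout; field names are assumptions, not a VTune format.
    name: str
    self_time: int   # time spent in the function body only
    total_time: int  # self time plus time spent in callees
    calls: int       # how many times the function was called


def rank_hotspots(rows, top=3):
    """Return the rows with the largest Self Time, biggest first."""
    return sorted(rows, key=lambda r: r.self_time, reverse=True)[:top]


def self_time_per_call(row):
    """Average Self Time per call; small values with huge call counts
    suggest reducing the number of calls rather than tuning the body."""
    return row.self_time / row.calls if row.calls else 0.0


# Made-up sample data for illustration only.
rows = [
    CallGraphRow("main", 50, 10_000, 1),
    CallGraphRow("parse_input", 1_200, 1_500, 10),
    CallGraphRow("compute_cell", 7_000, 7_200, 2_000_000),
    CallGraphRow("write_output", 800, 1_000, 5),
]

hot = rank_hotspots(rows)
for r in hot:
    print(r.name, r.self_time, r.calls, self_time_per_call(r))
```

In this made-up data, `main` has the largest Total Time but almost no Self Time, so the function worth inspecting is the one with the largest Self Time and a very high call count.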

Tim is right - the Call graph result shows you where opportunities exist.

Secondly, the Sampling data collection result provides a CPI (Cycles per Instruction) value for each module/function. The smaller the CPI value, the better; a value of about 0.75 is usually considered acceptable (if the function contains no FP or SSE instructions) - see http://www.intel.com/software/products/documentation/vlin/mergedprojects/analyzer_ec/mergedprojects/...

So the user can work on the hot source lines to bring the CPI value below its original value.
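To make the CPI figure concrete, here is a minimal Python sketch that computes CPI as clockticks divided by instructions retired and compares it with the 0.75 guideline mentioned above. The helper names and the event counts are assumptions chosen for illustration, not output from a real run.

```python
CPI_GUIDELINE = 0.75  # guideline value quoted in this thread


def cpi(clockticks, instructions_retired):
    """Cycles per instruction from two event counts."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return clockticks / instructions_retired


def needs_attention(clockticks, instructions_retired, threshold=CPI_GUIDELINE):
    """True when the CPI is above the chosen guideline."""
    return cpi(clockticks, instructions_retired) > threshold


# Made-up counts: the same 2,000,000 instructions before and after tuning.
before = cpi(3_000_000, 2_000_000)  # 1.5 cycles per instruction
after = cpi(1_200_000, 2_000_000)   # 0.6 cycles per instruction
print(before, after)
```

A lower CPI only means better performance when the instruction count stays roughly the same, so it is worth comparing clockticks as well as CPI before and after a change.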

Hope it helps.

Regards, Peter
Mona_Ezz
Beginner


Thanks, Tim and Peter, for your helpful tips.

I want to ask whether the CPI value (0.75) applies to the smallest function in the module or to the module itself.

For example:

Suppose I have a function that calls other functions, and the CPI values of all the called functions are below 0.75, but the module/function itself exceeds this number.

Peter_W_Intel
Employee

Sampling data collection is not like Call graph data collection. The CPI value is for the function itself only and does not include its subroutines, so you have to look for high CPI values in the other functions, which are what make your module's CPI value high.
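A small Python sketch of why this holds, assuming the module CPI is the module's total clockticks divided by its total instructions retired (an assumption about how the per-function counts roll up). Under that assumption the module CPI is a weighted average of the per-function CPIs, so a module above 0.75 must contain at least one function above 0.75. The function names and counts are made up.

```python
def cpi(clockticks, instructions_retired):
    """Cycles per instruction from two event counts."""
    return clockticks / instructions_retired


# Made-up per-function counts: (clockticks, instructions retired).
# Each entry covers the function's own code only, not its callees.
functions = {
    "helper_a": (600_000, 1_000_000),   # CPI 0.6
    "helper_b": (350_000, 500_000),     # CPI 0.7
    "caller": (2_400_000, 1_500_000),   # CPI 1.6
}

per_function = {name: cpi(c, i) for name, (c, i) in functions.items()}

# Assumed roll-up: sum the raw counts, then divide.
module_clockticks = sum(c for c, _ in functions.values())
module_instructions = sum(i for _, i in functions.values())
module_cpi = cpi(module_clockticks, module_instructions)

# Functions to inspect first: those above the 0.75 guideline.
offenders = sorted(n for n, v in per_function.items() if v > 0.75)
print(module_cpi, offenders)
```

Here both helpers are below 0.75, yet the module is above it because the calling function's own code has a high CPI, which is the case described in the question.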

Regards, Peter