Total Elapsed Time (TET) value

srimks
New Contributor II
Hi.

I am using VTune 9.1. When I run the "First Use Wizard" and supply the application, executable and arguments, I get a "Total Elapsed Time" (benchmark) value along with the module, function and process utilization percentages.

But when I run the First Use Wizard again without modifying the source code, I get a different "Total Elapsed Time" (benchmark) value. Why do I get a different TET on each run, when I would expect a constant value?
Any clue how to get a constant Total Elapsed Time (TET) or benchmark value?

~BR
Peter_W_Intel
Employee

Hi,

If you run a sample application in two sessions, I would expect the elapsed time to differ, especially for a tiny application; e.g. try "time ./vtunedemo". The reason is that the operating system has different resources available in each session, and other processes affect your application differently each time.

I think this also affects the VTune Analyzer results. You may try an application with a bigger workload; the difference between results should then be minimal.
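
To see how much of the difference is plain run-to-run noise, one option is to time the same command several times outside VTune and look at the spread. The following is a minimal Python sketch and not part of VTune; the helper names `time_runs` and `summarize` are made up for illustration, and the command is whatever you would otherwise pass to `time`.

```python
import statistics
import subprocess
import time


def time_runs(cmd, runs=5):
    """Run cmd `runs` times and return the wall-clock seconds of each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output so terminal I/O does not add noise.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return mean, standard deviation and min-to-max spread (% of the fastest run)."""
    fastest, slowest = min(samples), max(samples)
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "spread_pct": 100.0 * (slowest - fastest) / fastest,
    }
```

For example, `print(summarize(time_runs(["./vtunedemo"], runs=10)))` would report the spread over ten runs. If the spread is already a few percent without any profiler attached, the variation is coming from the system rather than from VTune.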

Regards, Peter
srimks
New Contributor II

Hi.

I tried a big workload, namely AutoDock4, using the "First Use Wizard". I ran it once to get the TET, then closed the module/process information window with the "X" at its top right, and then deleted the same workspace from the pane at the bottom left.

I then loaded the same AutoDock4 executable again by clicking the yellow "plus" and used the same application and arguments, but the TET differs by 3%-5%, which I think is a large number when effective and reliable profiling is needed.

What should I do to get almost no difference in TET between successive runs of the "First Use Wizard" for AutoDock4?

~BR
Peter_W_Intel
Employee

As I said before, it depends on the system environment and other active applications; even the VTune GUI will interfere with the result. But I think a 3%-5% difference is acceptable.

Have you tried /opt/intel/vtune/samples/vtunedemo/vtunedemo with the "First Use Wizard"? I ran it twice: the first result was 0.6 s and the second was 0.9 s. That is why I said a 3%-5% difference is acceptable.

Regards, Peter
srimks
New Contributor II

Hi

How do I avoid interference from the "system environment and active applications, and also the VTune GUI"?

Even if my application's TET is about 218 s or longer, a 3%-5% difference is not acceptable, because an improvement of 3%-5% can bring the TET down to almost 204 s, which is still better than 218 s.

Can using RDTSC give me accurate results that exclude system and other interference?

~BR
Peter_W_Intel
Employee

Hi,

You can close other unrelated applications on the system and then use VTune Performance Analyzer, or use the VTL command instead of the VTune GUI (and finally import the VTL results into the VTune GUI for analysis). See "man vtl". I also suggest using CPI = CPU_CLK_UNHALTED.CORE / INST_RETIRED.ANY to find the cycles per instruction for your modules and functions, so you can measure performance and how it improves.
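
As a small worked example of the CPI ratio: divide the CPU_CLK_UNHALTED.CORE count by the INST_RETIRED.ANY count for the same module or function. The Python sketch below uses invented counts purely for illustration; `cpi` is a hypothetical helper, not a VTune API.

```python
def cpi(clk_unhalted_core, inst_retired_any):
    """Cycles per instruction = CPU_CLK_UNHALTED.CORE / INST_RETIRED.ANY."""
    if inst_retired_any <= 0:
        raise ValueError("INST_RETIRED.ANY must be positive")
    return clk_unhalted_core / inst_retired_any


# Invented event counts for one function, before and after a code change.
before = cpi(clk_unhalted_core=9_000_000, inst_retired_any=6_000_000)
after = cpi(clk_unhalted_core=7_500_000, inst_retired_any=6_000_000)
print(f"CPI before: {before:.2f}, after: {after:.2f}")
```

Because CPI is a ratio rather than an absolute time, it can serve as a second metric next to TET when comparing two versions of the same function.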

Using RDTSC will not exclude other interference in the system.
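
To illustrate why a raw timestamp difference does not filter out interference: a timestamp counter keeps advancing while the process is waiting or descheduled, so the measured interval includes that time too. The Python sketch below is an analogy and does not execute RDTSC itself. It assumes `time.perf_counter()` as a stand-in for a wall-clock counter and `time.process_time()` for the CPU time charged to this process only; `timed_section` is a hypothetical helper.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_section(result):
    """Record wall-clock and process CPU seconds spent inside the with-block."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield result
    finally:
        result["wall"] = time.perf_counter() - wall_start
        result["cpu"] = time.process_time() - cpu_start


timings = {}
with timed_section(timings):
    time.sleep(0.05)                    # off-CPU time, standing in for interference
    sum(i * i for i in range(100_000))  # actual work done by this process

print(f"wall: {timings['wall']:.3f}s  cpu: {timings['cpu']:.3f}s")
```

The wall-clock figure includes the time the process spent not running, while the CPU figure does not, which is the same reason a pair of timestamp reads around a code section still picks up whatever else the system was doing.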

Regards, Peter
srimks
New Contributor II
Peter,

You did mention "You can close other non-related applications in system environment, then use VTune Performance Analyzer, either use VTL command instead of use VTune GUI (finally import VTL results to VTune GUI for analysis)."

Could you elaborate on the steps? What has to be done with the VTune command-line interface to get the elapsed time only for a section of code within a file, or for one or multiple files? I am currently using VTune 9.1 (update 2-226) on a Linux x86_64 machine.

I want to exclude all the system-wide instrumentation that VTune does and profile only a section of code, or maybe a block within a file.

~BR


Peter_W_Intel
Employee

Hi,

I meant that if you close other active applications, your application will make better use of the processor, but OS modules will still be active. Sampling data collection collects all active modules system-wide, including your app.

VTune Analyzer has online help that explains how to use the VTL command; you can also use "man vtl".

If you are interested in performance data for your application only, you can use call graph data collection.

Regards, Peter