VTune Amplifier XE 2013: irrelevant values in "Estimated Call Count" column

Boris_Sunik
Beginner
2,816 Views

I profiled our application with Lightweight Hotspots and got completely implausible values in the "Estimated Call Count" column.

Most routines were estimated as executed between 100,000,000 and 300,000,000 times, although some of them were actually executed only a few times and others a few thousand times.

In one more case, the data shows 3.5 seconds of execution time and zero in the "Estimated Call Count" column.

The processor is an Intel Xeon W3670; the OS is Windows 7.

0 Kudos
19 Replies
Peter_W_Intel
Employee
If the estimated call counts were huge for a function that was called only a limited number of times, and you saw this in the top-down tree report, that can make sense: its subroutines may have been called many times, and their counts are included in the parent function. If you saw this in the bottom-up report, it should be a bug: there, estimated call counts are attributed to each hot function itself, not taken from its parents. Please submit a ticket at https://premier.intel.com with your test case.
0 Kudos
Boris_Sunik
Beginner
I am trying out Amplifier XE 2013. The application is not in my product list on Premier Support, and there is a 50 MB size limit on files, while the project zip is 200 MB (1 GB of data). So I failed to upload the file.
0 Kudos
Dmitry_Chichkov
Beginner
I also gave Amplifier 2013/Linux x64 a quick evaluation, and it looks like Lightweight Hotspots + Stack + Counters are completely broken. With a trivial test case:

[cpp]
/* Unsigned accumulators start at 0 and wrap on overflow (defined behavior). */
unsigned fA(void) { unsigned i, j = 0; for (i = 0; i < 100000; i++) j += i % 12345; return j; }
unsigned fB(void) { unsigned i, j = 0; for (i = 0; i < 1000000; i++) j += i % 12345; return j; }

int main(void)
{
    unsigned a, c = 0;
    /* 1000 calls each to fA and fB */
    for (a = 0; a < 1000; a++) { c += fA(); c += fB(); }
    return (int)(c & 0xff);
}
[/cpp]

Functions are present twice in the call tree. I'm getting fA/fB counters that differ from each other and are off by an order of magnitude: 9372 / 3124. CPU time is zero. Weird wait times. And so on.

OS: Linux x64/Ubuntu 11.10; CPU: Core 2 Quad. Built with gcc -O0 -g. Profiled with 1 minute est. run time. Execution time ~7 seconds. Attaching screenshot.
lwhspt.jpg
Best, Dmitry
0 Kudos
Peter_W_Intel
Employee
Thank you for the example code. I verified this on my machine and found two critical issues (I will update this thread if any clue or solution is found):

1. The function main() is missing from the list as a caller.
2. The call count is zero, which is wrong.

# amplxe-cl -version
Intel(R) VTune(TM) Amplifier XE 2013 (build 243421) Command Line Tool
Copyright (C) 2009-2012 Intel Corporation. All rights reserved.

# gcc -g test_callstack.c -o test_callstack
# amplxe-cl -collect lightweight-hotspots -knob enable-stack-collection=true -knob enable-call-counts=true -- ./test_callstack
# amplxe-cl -report callstacks
Using result path `/home/peter/problem_report/r004lh'
Executing actions 50 % Generating a report
Function      Call Stack  Module          CPU Time:Total  CPU Time:Self
------------  ----------  --------------  --------------  -------------
fB                        test_callstack  90.58%          3.499
fA                        test_callstack  9.26%           0.358
do_wp_page                vmlinux         0.1%            0.004
do_lookup_x               ld-2.5.so       0.02%           0.0007519
0 Kudos
Dmitry_Chichkov
Beginner
Any updates?
0 Kudos
Dmitry_Chichkov
Beginner
2 weeks... by chance, any updates from devs?
0 Kudos
Peter_W_Intel
Employee
It looks like there was a VTune driver installation issue on the old Linux kernel, but the tool gave no message about it. Solution: please use the call stack/call count feature on a recent Linux OS. I tried the same steps on another box (redhat-el6) and saw the main() function in the report.

# amplxe-cl -report callstacks|more
Using result path `/home/peter/problem_report/r000lh'
Executing actions 50 % Generating a report
Function           Call Stack         Module          CPU Time:Total  CPU Time:Self
-----------------  -----------------  --------------  --------------  -------------
__libc_start_main                     libc-2.12.so    99.59%          0
main                                  test_callstack  99.59%          0
                   __libc_start_main  libc-2.12.so    99.59%          0
fB                                    test_callstack  90.42%          2.634
                   main               test_callstack  90.42%          2.634
                   __libc_start_main  libc-2.12.so
0 Kudos
Dmitry_Chichkov
Beginner
Interesting. Out of curiosity, what fA/fB call counts and fA/fB CPU times are you getting? Incidentally, my call counts weren't zero like yours. They were just wrong by an order of magnitude (9400 instead of 1000). The installation was on x64, Linux 3.0.0-16-server, Ubuntu, Xeon E5345 [Clovertown].
0 Kudos
Peter_W_Intel
Employee
Here is a screenshot of my result.
0 Kudos
Dmitry_Chichkov
Beginner
It looks like the results are more consistent in your case: you are getting similar counter values for fA and fB. But the fA call count is 7,935 instead of the expected 1000. Any ideas on how to rectify that?
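One way to get a ground truth to compare the estimate against is to count the calls inside the program itself. The following is a minimal sketch, not part of the original test case: the counters `fA_calls` and `fB_calls` and the helper `run_workload` are hypothetical names added for illustration, and the accumulators are unsigned so that overflow wraps.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical ground-truth counters; not part of the original test case. */
unsigned long fA_calls = 0;
unsigned long fB_calls = 0;

unsigned fA(void)
{
    unsigned i, j = 0;
    fA_calls++;                       /* one increment per real call */
    for (i = 0; i < 100000; i++) j += i % 12345;
    return j;
}

unsigned fB(void)
{
    unsigned i, j = 0;
    fB_calls++;                       /* one increment per real call */
    for (i = 0; i < 1000000; i++) j += i % 12345;
    return j;
}

/* Calls fA and fB `rounds` times each, then prints the exact call counts. */
unsigned run_workload(unsigned rounds)
{
    unsigned a, c = 0;
    for (a = 0; a < rounds; a++) { c += fA(); c += fB(); }
    /* Both functions are called once per round, so the counts must match. */
    assert(fA_calls == fB_calls);
    printf("fA calls: %lu, fB calls: %lu\n", fA_calls, fB_calls);
    return c;
}
```

Calling `run_workload(1000)` from `main` reproduces the loop of the test case posted earlier in this thread. The printed counts are exact, so they can be set side by side with the "Estimated Call Count" column.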
0 Kudos
Peter_W_Intel
Employee
Call stacks and call counts are available only on Linux* kernel 2.6.28 or later. Please check your system, and note that the tool does not give any warning about this.
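Since the tool reportedly gives no warning, the kernel requirement can be checked by hand before a collection. A minimal sketch, assuming the 2.6.28 threshold stated in this thread and a release string in the usual `major.minor.patch[-suffix]` form; the function names are hypothetical and are not part of VTune.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Returns 1 if `release` (e.g. "2.6.32-279.14.1.el6.x86_64") is 2.6.28 or
 * later, 0 if it is older, and -1 if the string cannot be parsed. */
int release_at_least_2_6_28(const char *release)
{
    int major = 0, minor = 0, patch = 0;
    assert(release != NULL);
    /* Accept "major.minor" as well; a missing patch level counts as 0. */
    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 2)
        return -1;
    if (major != 2) return major > 2;
    if (minor != 6) return minor > 6;
    return patch >= 28;
}

/* Applies the same check to the running kernel via uname(2). */
int running_kernel_ok(void)
{
    struct utsname u;
    if (uname(&u) != 0)
        return -1;
    return release_at_least_2_6_28(u.release);
}
```

A caller might print a warning when `running_kernel_ok()` returns 0, before starting a collection with stack collection enabled.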
0 Kudos
Dave_G_
Beginner
I'm trying to run it and don't even get call counts. What am I missing?

As background, we just installed XE 2013 Update 2 (build 253325) for Linux on a RHEL 6.3 machine. uname -a gives:

Linux a5.colo.ucirrus.com 2.6.32-279.14.1.el6.x86_64 #1 SMP Mon Oct 15 13:44:51 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

I am running as root, as I can't get the driver loaded as myself, despite adding my login ID to /etc/group for the vtune user (which is all that had to be done on RHEL 5). For the analysis type (lightweight hotspots), I selected collect stacks, estimate call counts, and analyze user tasks.

The hardware is Nehalem/Westmere. From /proc/cpuinfo, I have the following:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 47
model name      : Intel(R) Xeon(R) CPU E7- 4860 @ 2.27GHz
stepping        : 2
cpu MHz         : 2261.178
cache size      : 24576 KB
physical id     : 0
siblings        : 20
core id         : 0
cpu cores       : 10
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt aes lahf_lm arat dts tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4522.35
clflush size    : 64
cache_alignment : 64
address sizes   : 44 bits physical, 48 bits virtual
power management:

The system has 40 cores (20 + hyperthreading).

When I launch the capture, I start paused. When our program reaches the right place to start performance analysis, I resume the collection, capture for about 10 seconds, and then stop it (in the VTune GUI). The command that the VTune GUI claims to execute is "amplxe-cl -collect nehalem-general-exploration -knob enable-stack-collection=true -- /home/daveg/SVN/trunk/bin/Release/_pvm". _pvm is built with the Intel C++ compiler. I'm attaching the top part of the GUI output.
0 Kudos
Peter_W_Intel
Employee
I am confused: your screenshot shows lightweight-hotspots, but you said you used nehalem-general-exploration. So, use:

amplxe-cl -collect lightweight-hotspots -knob enable-stack-collection=true -knob enable-call-counts=true -- /home/daveg/SVN/trunk/bin/Release/_pvm

Note that you have to add the "-g" option to generate debug info when building "_pvm".
0 Kudos
Kyung_Seok_L_
Beginner
I have the same problem as Dave G. I'm trying to run it and don't get "estimated call counts". I installed XE 2013 (build 261256) for Linux on an Ubuntu 10.04 machine. I am running as root. For the analysis type (lightweight hotspots), I selected collect stacks, estimate call counts, and analyze user tasks. A capture of the GUI output is attached. The command that the VTune GUI claims to execute is "amplxe-cl -collect lightweight-hotspots -knob enable-stack-collection=true -knob enable-call-counts=true -- /home/jisung/test/test". test is built with gcc with the debug option. CPU name: 3rd generation Intel(R) Core(TM) processor family. Can anyone help me?
0 Kudos
Peter_W_Intel
Employee
Is it possible that the "estimated call count" column is off-screen to the right, so that you need to scroll right? Also, if you are working on an old OS, estimated call stacks and estimated call counts are not supported; please try this feature on a recent OS. Also check "lsmod | grep vtsspp" to ensure the driver has been installed.
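`lsmod` formats the contents of `/proc/modules`, so the same driver check can be done from a program. A rough sketch under that assumption: `module_listed` and `vtsspp_loaded` are hypothetical helper names, and the substring match on the module-name field mirrors `grep vtsspp` rather than any official VTune check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scans a /proc/modules-style stream; returns 1 if the first field of any
 * line (the module name) contains `name`, otherwise 0. */
int module_listed(FILE *modules, const char *name)
{
    char line[512];
    char first[128];
    assert(modules != NULL && name != NULL);
    while (fgets(line, sizeof line, modules) != NULL) {
        if (sscanf(line, "%127s", first) == 1 && strstr(first, name) != NULL)
            return 1;
    }
    return 0;
}

/* Returns 1 if a module matching "vtsspp" is loaded, 0 if not,
 * and -1 if /proc/modules cannot be opened. */
int vtsspp_loaded(void)
{
    FILE *f = fopen("/proc/modules", "r");
    int found;
    if (f == NULL)
        return -1;
    found = module_listed(f, "vtsspp");
    fclose(f);
    return found;
}
```

A result of 0 from `vtsspp_loaded()` corresponds to `lsmod | grep vtsspp` printing nothing.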
0 Kudos
Kyung_Seok_L_
Beginner
Thank you for your reply, Peter. My OS is Ubuntu 12.04.1 LTS, which is quite recent. I ran "lsmod | grep vtsspp" to check the drivers. What else should I check to see "estimated call count"?
0 Kudos
Peter_W_Intel
Employee
Nothing else is needed: just add the options "-knob enable-stack-collection=true -knob enable-call-counts=true" to amplxe-cl, or enable them in the GUI. If you still can't see call counts, submit your results to https://premier.intel.com for investigation.
0 Kudos
Kyung_Seok_L_
Beginner

I recently found out why I didn't get Estimated Call Count information.

The problem was the code.

I used test code that was very short, and VTune wasn't able to work out call count info from it.

When I profiled my own project, there was no problem.

If anyone runs into this problem, try a longer-running program.
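A short test program can be made to run longer by repeating its work until a CPU-time budget is used up. A minimal sketch with hypothetical names (`work_unit`, `run_for_cpu_seconds`); the thread does not say how long is long enough, so the budget is left to the caller.

```c
#include <assert.h>
#include <time.h>

/* Written on every round so the compiler cannot drop the loop. */
volatile unsigned sink;

/* One unit of CPU-bound work, modelled on the loop bodies in this thread. */
void work_unit(void)
{
    unsigned i, j = 0;
    for (i = 0; i < 100000; i++) j += i % 12345;
    sink += j;
}

/* Repeats work_unit() until at least `seconds` of CPU time have been used,
 * as measured by clock(). Returns the number of rounds executed. */
unsigned long run_for_cpu_seconds(double seconds)
{
    clock_t start = clock();
    unsigned long rounds = 0;
    assert(seconds > 0.0);
    do {
        work_unit();
        rounds++;
    } while ((double)(clock() - start) / CLOCKS_PER_SEC < seconds);
    return rounds;
}
```

For example, `run_for_cpu_seconds(10.0)` keeps the loop busy for about ten seconds of CPU time, and the returned round count doubles as an exact call count for `work_unit`.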


By the way, is there any way to collect "Estimated call count" with amplxe-cl?

I looked at

http://software.intel.com/sites/products/documentation/hpc/amplifierxe/en-us/2011Update/lin/ug_docs/GUID-D45A7CB9-C2BF-472B-8E65-00C9305538ED.htm

and cannot find a way to collect call counts.

0 Kudos
Peter_W_Intel
Employee

If no samples are captured, there is no call stack info; this is what is called "statistical call stack" info.

Use the command line, for example:

amplxe-cl -collect lightweight-hotspots -knob enable-stack-collection=true -knob enable-call-counts=true -- target-app

0 Kudos