
Clocktick issues in a two-Xeon CPU system

mfcking
Beginner
944 Views
Hi,
I am running RH Linux EL 4.0 with two Xeon CPUs. After the sampling is done, I added the clockticks of CPU0 to the clockticks of CPU1 and found that the sum is much higher than the total clockticks. The same thing happens with the L2 load miss rate. Any idea about this problem?
Thanks a lot,
Liang
0 Kudos
8 Replies
jeffrey-gallagher
944 Views
Lots of questions for this one, but the biggies are: what version of VTune are you using, are you using the Linux remote data collector, the command line, or the GUI, and is EL4 listed in the release notes for your product under supported OS versions?
Other questions that come to mind: Kernel version? Is call graph working?
Does the installation log show any installation related failures? default location:
/opt/log/install.log
0 Kudos
mfcking
Beginner
944 Views
We use remote data collector.
VTune(Windows side): 7.2 with the latest patches.
VTune (Linux side): 3.0.
RDC for Linux kernel 2.6 with patches for Red Hat EL 4.0.
Kernel version: 2.6.9.
Call graph does work.
Installation is successful.
The incorrect data for one function in the kernel module is shown below (we created a bi-directional smart flow using a smartbit 600. All interrupts are routed only to CPU0 and no interrupts go to CPU1):
Clockticks Total %(42): 35.64
Clockticks Processor0 %(42): 0.47
Clockticks Processor1 %(42): 99.66
Instructions Retired Total %(42): 0.08
Instructions Retired Processor0 %(42): 0.00
Instructions Retired Processor1 %(42): 33.33
BTW, can anyone tell me what the 42 in parentheses means?
Thanks,


0 Kudos
jeffrey-gallagher
944 Views
Good list. Let's see if this rings a bell for anybody out there who can report back to us.
In the meantime, to be sure, I think it's important to view the install.log file, even if you got a "successful" message at the end of the install.sh run.
# vi /var/log/install.log
Look for any failures. All successes?
0 Kudos
mfcking
Beginner
944 Views
Yeah, all successes.
0 Kudos
guillermo_marcus
Beginner
944 Views
Hi Liang,

The (42) is the Activity ID.

The clockticks are normally associated to an internal counter in the processor, so each processor keeps its own clocktick count. Is your application consuming the clockticks or another process in the same CPU? Can you provide more information?

Best Regards,
GM
0 Kudos
mfcking
Beginner
944 Views

Hi, thanks for your reply. Actually, we are profiling a network card driver rather than a specific application. We have two NICs. Sometimes I find that the total clockticks equal the average of the CPU0 and CPU1 clockticks, but sometimes they don't.

BTW, why is this activity ID different each time I run sampling? What is the meaning behind this activity number?

0 Kudos
guillermo_marcus
Beginner
944 Views
Hi,

Actually, it would be better to call them Activity Result IDs, to make the difference clear (and to keep them separate from the Activity short number). Each time you run an activity, it creates a Result and associates an ID with it. From my observations, this number is ever-increasing and global across all projects. You will also notice that it seems to skip some numbers: if you run a Clockticks + Instructions Retired sampling multiple times, you will get 42, 44, 46... but not 43, 45, 47... This is, I guess, because it assigns a number to each event (one to IR, another to CT), but you use only the first one to access the full result set. You can also see the tree related to this by running "vtl show -all" on the project.

A tip: to use a VTune Eclipse project with the command-line interface, use "vtl project path-to-.vpj-file-in-eclipse-workspace"; the file is normally in a hidden directory inside the Eclipse project directory. See "man vtl".

Now, regarding the clockticks. First, let's remember that this is a performance counter that counts cycles inside the processor. However, you can get three different kinds: Non-halted, Non-sleep, and Time-stamp clockticks (see the VTune Reference Manual for a description of each). Whether each one counts depends on the state of the processor. In addition, if you happen to have a processor with Hyper-Threading, you will sometimes be looking at the logical processor's counter and sometimes at the physical processor's counter, so the counts may not match or may overlap. Once again, see the reference manual for a more detailed explanation.

The totals, as I have seen, are no more than the sum of each of the processor columns, so they should always match. Total CPU clockticks = CPU0 + CPU1, and the percentage is just the relation between the number of samples on a processor and the total.
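
To make that arithmetic concrete, here is a minimal Python sketch. It assumes (this is my reading, not something confirmed by the VTune documentation) that each percentage column is the function's sample count divided by all samples in that same column. The sample counts are hypothetical and are not taken from the data posted above. Under that assumption the raw sample counts add up, but the Total % is neither the sum nor, in general, the plain average of the per-processor percentages; it equals the average only when both processors collected the same number of samples.

```python
def percent(part, whole):
    """Return part as a percentage of whole (0.0 when whole is 0)."""
    return 100.0 * part / whole if whole else 0.0


def column_percentages(func_samples, all_samples):
    """Compute per-CPU and total percentages for one function.

    func_samples: samples attributed to the function, one entry per CPU.
    all_samples:  all samples collected, one entry per CPU.
    Assumption: each percentage is relative to its own column's total.
    """
    per_cpu = [percent(f, a) for f, a in zip(func_samples, all_samples)]
    total = percent(sum(func_samples), sum(all_samples))
    return per_cpu, total


# Hypothetical case 1: both CPUs collected the same number of samples,
# so the total percentage equals the plain average of the two columns.
print(column_percentages([10, 90], [1000, 1000]))

# Hypothetical case 2: CPU0 collected three times as many samples,
# so the total is pulled toward the CPU0 percentage.
print(column_percentages([10, 90], [3000, 1000]))
```

This may be why the total sometimes matches the average of the two processors and sometimes does not: it depends on how evenly the samples are spread across CPU0 and CPU1.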

The sampling collector for the activity is configured to generate a sample after N occurrences of each event you are sampling. Once that number of events has occurred, the collector records a sample. AFAIK, while it is recording the sample, VTune does not sample the system, so depending on your case you may be losing some data, which makes your result less accurate. I would suggest increasing the buffer size (so that it writes the sampling info to memory first, and less often to disk), or fine-tuning the collector settings (but do not set them too low, or you will spend all your time collecting).
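
A rough Python sketch of the trade-off described above. The names sample_after and event_rate_hz are my own, and the 3 GHz clock rate is a hypothetical figure, not a measurement from this system. It relies only on the idea that one sample is taken every N events, so the event count can be estimated as samples times N, and a smaller N means more samples (and more collection overhead) per second.

```python
def estimated_events(samples, sample_after):
    """Estimate how many events occurred, given one sample per N events."""
    return samples * sample_after


def samples_per_second(event_rate_hz, sample_after):
    """Estimate how many samples per second the collector must record."""
    return event_rate_hz / sample_after


# Hypothetical 3 GHz processor counting clockticks.
clock_hz = 3_000_000_000

# One sample every 3,000,000 clockticks: about 1,000 samples per second.
print(samples_per_second(clock_hz, 3_000_000))

# One sample every 30,000 clockticks: about 100,000 samples per second,
# which leaves far less time between samples for the collector to keep up.
print(samples_per_second(clock_hz, 30_000))

# 500 samples at N = 3,000,000 correspond to roughly 1.5 billion clockticks.
print(estimated_events(500, 3_000_000))
```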

Hope it helps,
GM.
0 Kudos