Hi, I'm doing an experiment with CPU-intensive algorithms (=PI calculation) and I'd like to use the PCM plugin for Perfmon to monitor the system.
It took me a while, but I managed to install the PCM plugin for Windows Perfmon and I also succeeded in using the Remote Monitoring capabilities of Perfmon to monitor the counters remotely. The only problem that remains now is that sometimes I get 0-values (or just gaps) when using certain PCM counters ("clockticks" is one of them) and I was wondering if this is just performance-related or did I do something wrong during installation?
It occurs on two different systems (a Sandy Bridge i5 2410m/Win7 and a Haswell i5 4250u/Win8.1), both using the WinRing0 drivers, and it also happens when I monitor locally (so running Perfmon on the system that's being monitored). I already gave the PCM-service the highest priority, but sometimes the "gaps" also occur for Process-related counters (so non-PCM counters), and therefore I thought it might be a performance issue.
Has anyone here experienced the same thing, or do you have any tips? Thanks in advance!
Since the PCM metrics can be shown at all, I doubt it is an installation issue. Could you uninstall PCM-Service and try to run the pcm.exe command line utility for a longer period of time? Do you see zero clock-ticks in the output as well?
Thanks for your reply. I'll try uninstalling PCM-service and then running pcm.exe, but I've only been using the PCM-Service so I'll need to figure out the pcm.exe commands... In the meantime, here is another example of a failed (remote) log, but now with gaps (instead of 0-values), and you can see that it's a combination of Processor, Process and PCM Core Counters.
But it might just be a performance issue (I'm not sure what to call it), because when I exclude the % Processor Time related counters, I can use all the PCM counters without a problem (see second attachment).
Roman, here's an update:
- I uninstalled the PCM-service on the Haswell system and then I ran (as Administrator) pcm.exe, which then gave the following error: "The program can't start because MSVCP100D.dll is missing from your computer. Try reinstalling the program to fix this problem". I have both the x86 and x64 versions of Visual C++ 2010 SP1 Redistributable Package installed and I checked that the missing file is present in the System32 folder... So did I do something wrong in the build process? I only have Visual Studio on the Sandy Bridge system and I assumed that if I build Release versions, I could just use the pcm executable on a different system, but maybe I'm mistaken (=I'm not really an expert in Visual Studio).
- Then, on the Sandy Bridge system, the missing-DLL error doesn't occur, but I get the errors shown in the attached file. However, a few days ago pcm.exe did work on this system, so I'm not sure why it's not working anymore.
Thanks in advance!
MSVCP100D.dll is the debug version of the library.
Does Visual C++ 2010 SP1 Redistributable Package install debug versions of libraries?
Try to find MSVCP100D.dll and copy it into PCM executable directory.
Thanks for the tips :) pcm.exe works on both systems now:
- I forgot to stop the pcmservice on the Sandy Bridge system
- I re-copied the same pcm.exe version to the Haswell system and now the dll-missing error is gone.
Though, now I'm facing a new challenge:
- Exporting to CSV: both systems start showing the counter-measurements and I used the -csv instruction, but I'm not sure how to stop the process and I can't locate the CSV file when I stop via Ctrl + S or C (see attachment)
- Also, both executables give the following error message during initialization: "Can not read memory controller counter information from PCI configuration space. Access to memory bandwidth counters is not possible". Or is this just related to optional counters?
To clarify, what I meant by zero values is that they happen sometimes, as you can see in the attached screenshot of a CSV file, which is from a (remote*) PerfMon+PCM-service run on the SB system. So I was wondering: does the PCM plugin work in a way that it sometimes can't access the counters, or could it be unrelated to PCM?
*remote: I use the Haswell system to remotely monitor (via wired network) the SB system
"memory bandwidth counters" are optional, you can ignore them.
Unfortunately I could not read the screenshot with csv data. (it just contains two measurements, probably need to run it longer to see the issue). Are you able to kill pcm in Windows Task Manager?
I'm not sure if I understand your question about csv correctly. The next version of PCM will have an option to specify a file for redirection, thanks to a colleague who implemented this. For now, you can simply redirect the output to a file as I have described in my blog.
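For anyone who prefers to script the redirection and the stopping of the process, here is a minimal Python sketch of the same idea: start a command, send its standard output to a file, and terminate it after a fixed time. The helper name `run_and_redirect`, the output file name and the timeout are made up for illustration, and the demo uses a stand-in command instead of pcm.exe so that it runs anywhere; the only pcm.exe option assumed is the -csv switch mentioned above.

```python
import os
import subprocess
import sys
import tempfile

def run_and_redirect(command, out_path, timeout_s=None):
    """Run `command`, write its stdout to `out_path`, stop it after `timeout_s` seconds."""
    with open(out_path, "w") as out_file:
        proc = subprocess.Popen(command, stdout=out_file)
        try:
            proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # Same effect as ending the process by hand (Ctrl+C / Task Manager).
            proc.terminate()
            proc.wait()
    return proc.returncode

# Stand-in command that prints two CSV lines. For a real run, the command
# would be something like ["pcm.exe", "-csv"] (assumption: pcm.exe is in the
# current directory and the script runs with Administrator rights).
demo_command = [sys.executable, "-c", "print('a,b'); print('1,2')"]
out_path = os.path.join(tempfile.mkdtemp(), "pcm_log.csv")
run_and_redirect(demo_command, out_path, timeout_s=30)

with open(out_path) as f:
    lines = f.read().splitlines()
print(lines)
```

Everything the command prints ends up in the file, so the CSV can be opened afterwards no matter how the process was stopped.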
Thanks Thomas, for the tip: now I understand the syntax of the command. I did search for it, but unfortunately I wasn't able to discover this useful post on my own... maybe I used the wrong keywords.
@Roman, now that I know how to put the pcm.exe output in a csv file, I will create a longer pcm.exe log for you. So, to be continued...
P.s. the previously attached PCM_ZeroValues.png was created via PerfMon with the PCM-service plugin. I realise it's not really a detailed or long example, but I included it just to give an example of what I meant with the zero values.
I had a look at the metrics you posted in the screenshot. I can imagine that zero values are possible because Windows can sometimes put the cores into a deep sleep state (core parking). As a result, the core does not execute any instructions and the effective frequency drops (no active cycles).
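To illustrate why a parked core would show up as zeros in several metrics at once, here is a small Python sketch. It is a simplified model and not PCM's actual formulas: it assumes the relative frequency is roughly the active clockticks divided by the nominal clockticks in the interval, and that instructions per clocktick is reported as 0 when there are no clockticks. The function name and the numbers are hypothetical.

```python
def interval_metrics(clockticks, instructions, nominal_ticks):
    """Derive per-interval metrics from raw counter deltas (simplified model)."""
    # Active cycles as a fraction of the cycles a core running at nominal
    # frequency would have completed in the same interval.
    rel_freq = clockticks / nominal_ticks if nominal_ticks else 0.0
    # Guard against division by zero when the core was asleep all interval.
    ipc = instructions / clockticks if clockticks else 0.0
    return {"rel_freq": rel_freq, "ipc": ipc}

# Hypothetical 1-second interval on a core with a 2.3 GHz nominal clock.
busy = interval_metrics(2_300_000_000, 3_450_000_000, 2_300_000_000)
parked = interval_metrics(0, 0, 2_300_000_000)
print(busy)
print(parked)
```

In this model a core that sleeps through the whole sampling interval gives zero clockticks, zero instructions and therefore zero for the derived values too, while the sleep-state residency counters would still carry valid numbers.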
That might be it. I'll try to reproduce the 0-values symptoms through pcm.exe, but I'm running another test at the moment.
In the meantime, I had another question: I tried to install the PCM-service on a Core2Duo system, but this failed since (I believe) it's not supported. Is it true that PCM doesn't work on certain/older systems? If so, then I know for sure I won't have to try it again.
Thanks again for all your quick and helpful responses, and I enjoy working with your program :)
Out of curiosity: What keywords did you use in your search? I included "Intel PCM" and "CSV" in the title under the assumption that this would be keywords that people use.
Here is the longer PCM log using pcm.exe. Since I hardly see any 0-values in this log, I assume it's a PerfMon-related issue: when I monitor through PerfMon with the PCM plugin, it happens roughly every 5 to 20 seconds for several counters (namely Rel. Freq., Clockticks, Instruc. Retired, Instruc./Clocktick, L2/L3 Cache Misses). Furthermore, the C0 and C6 state counters never fail, and the problem happens more often (about every 5 s) when one of the CPUs is 99-100% in C0 state.
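To put a number on "hardly any" versus "every 5 to 20 seconds", the zero values and gaps in an exported CSV log can be counted per column. Below is a minimal Python sketch; the column names and the sample rows are invented for illustration and do not match the real PerfMon or pcm.exe headers.

```python
import csv
import io

def count_bad_samples(csv_text):
    """Return {column: (zeros, gaps)} for a CSV log that has a header row."""
    counts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name, value in row.items():
            if name is None:
                continue  # extra fields beyond the header are ignored
            zeros, gaps = counts.get(name, (0, 0))
            text = (value or "").strip()
            if text == "":
                gaps += 1  # empty cell: the sample is missing
            else:
                try:
                    if float(text) == 0.0:
                        zeros += 1
                except ValueError:
                    pass  # non-numeric field such as a timestamp
            counts[name] = (zeros, gaps)
    return counts

# Hypothetical three-sample log: one zero sample and one gap.
SAMPLE = """time,clockticks,instructions_retired
00:00:01,1200000,900000
00:00:02,0,0
00:00:03,,850000
"""
result = count_bad_samples(SAMPLE)
print(result)
```

Running the same count on a PerfMon log and on a pcm.exe log of the same workload would show whether the zeros really are specific to PerfMon.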
I think I'll just start using pcm.exe for my experiment since it seems to be a lot more reliable than using PerfMon (btw the PerfMon Process counters are also not completely stable). Furthermore, I checked whether there is any difference between Local and Remote monitoring and there isn't: the same 0-values occur in the PerfMon log when I monitor locally. The only downside of using pcm.exe: I don't know if I can use it to monitor remotely*... probably not, right?
Let me know if you come across the same problems in the future... Maybe I'm just creating unrealistic scenarios in which the system becomes overloaded, or maybe it's just a performance issue in PerfMon since pcm.exe doesn't seem to have these problems :)
*remotely: run pcm.exe + log data of sys2 on sys1, which is possible in PerfMon. If this is not possible, then I'll just run it on sys2. But it would be a pity since I spent a lot of time on getting the remote performance logging feature of PerfMon to work and now PerfMon itself appears to be not stable enough, even with local monitoring...