I am running a program that reads and writes a large (8 GB) dataset with a fair amount of random access. I want to find the number of L3 cache misses incurred during execution and am using PCM to measure the memory statistics. I can measure the number of memory reads and writes, but the L3 cache miss count reported by PCM is always 0. The code is as follows:
PCM* m = PCM::getInstance();
SystemCounterState before_state = getSystemCounterState();
SystemCounterState after_state = getSystemCounterState();
long long unsigned int L3M = getL3CacheMisses(before_state, after_state);
cout << L3M << endl;
I am using the Processor Counter Monitor (https://github.com/opcm/pcm) on a dual-socket Xeon E5-2650 v2 server. My program is multithreaded (using OpenMP). Moreover, when I run the executable, I get these errors:
ERROR: QPI LL monitoring device (0:127:8:2) is missing. The QPI statistics will be incomplete or missing.
ERROR: QPI LL monitoring device (0:127:9:2) is missing. The QPI statistics will be incomplete or missing.
Socket 0: 1 memory controllers detected with total number of 4 channels. 0 QPI ports detected.
ERROR: QPI LL monitoring device (0:255:8:2) is missing. The QPI statistics will be incomplete or missing.
ERROR: QPI LL monitoring device (0:255:9:2) is missing. The QPI statistics will be incomplete or missing.
Please let me know how to fix these issues.
Many systems with Intel Xeon E5 v2 processors hide the monitoring devices for QPI traffic. This is usually the explanation for the errors that you are getting. Details can be found in this article.
Have you verified that your program has all the necessary access rights? In particular, does the programming of the PMU work? You can check the return value of program():
if (m->program() != PCM::Success) { /* PMU programming failed */ }
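The ordering that matters can be sketched as follows: program the PMU and check the result, take one snapshot before the workload, run the workload, then take the second snapshot. The code in the question takes both snapshots back to back; if that is how the real program is written (and not just shortened for the post), the difference would be close to zero regardless of access rights. The helper name measureDelta and its parameters are hypothetical and written generically so the block compiles without the PCM headers. The PCM usage in the trailing comment assumes the same API as in the question, and runWorkload is a placeholder.

```cpp
#include <utility>

// Hypothetical helper (not part of PCM): take a counter snapshot, run the
// workload, take a second snapshot, and return whatever `diff` computes
// from the two snapshots.
template <class Snapshot, class Diff, class Work>
auto measureDelta(Snapshot takeSnapshot, Diff diff, Work&& work)
{
    const auto before = takeSnapshot();   // e.g. getSystemCounterState()
    std::forward<Work>(work)();           // the code being measured runs here
    const auto after = takeSnapshot();    // second snapshot, after the work
    return diff(before, after);           // e.g. getL3CacheMisses(before, after)
}

// Intended use with PCM (assumption: API as in the question; not compiled here):
//
//   PCM* m = PCM::getInstance();
//   if (m->program() != PCM::Success) {
//       // Counters were not programmed; any reading would be meaningless.
//   }
//   auto misses = measureDelta(
//       [] { return getSystemCounterState(); },
//       [](const SystemCounterState& a, const SystemCounterState& b) {
//           return getL3CacheMisses(a, b);
//       },
//       [&] { runWorkload(); });   // runWorkload is a placeholder name
//   m->cleanup();
```

If program() does not return PCM::Success, it is worth fixing that first (for example by running with sufficient privileges) before interpreting any counter values.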