Hey Korso and Roman, I am new to PCM. I am also trying to use Intel Performance Counter Monitor to measure the L2 and L3 cache hit ratios in my custom code. I use exactly the same code, and I compile it with the following command:
g++ -O mycode.cpp -o mycode -I ./PCM/ -L ./PCM/cpucounters.o ./PCM/msr.o -lpthread
However, the linker still reports:
undefined reference to `PCM::getInstance()'
undefined reference to `PCM::program(PCM::ProgramMode, void*)'
undefined reference to `getSystemCounterState()'
I also tried passing the other .o files with -L, but it still doesn't work. Could you give me some hints on how to link the PCM library into my custom code? I couldn't find a detailed user manual on using PCM in custom code other than this page: http://software.intel.com/en-us/articles/intel-performance-counter-monitor-a-better-way-to-measure-c... Thanks a lot!
I'm guessing that you are calling the undefined-reference functions with arguments that don't agree with the declaration.
The declaration for getInstance() is:
static PCM * getInstance();
Is that how you are calling it?
What happens when you compile it like:
g++ -O mycode.cpp ./PCM/cpucounters.cpp ./PCM/msr.cpp -o mycode -I ./PCM/ -lpthread
The problem might also be the ordering of the object files. If g++ uses a single-pass linker, you might need to repeat cpucounters.o, like:
g++ -O mycode.cpp -o mycode -I ./PCM/ -L ./PCM/cpucounters.o ./PCM/msr.o ./PCM/cpucounters.o -lpthread
but I doubt this is the issue (since msr.cpp doesn't use getInstance()).
There is also the possibility that C++ name mangling is the issue: perhaps you are enabling mangling in some cases and not in others.
But to me, this error is almost always a C++ compiling/linking issue, not a problem with PCM.
Thanks for the fast reply. It works!!!
If I use the first solution you mentioned, the command line is as follows:
g++ -O mycode.cpp ./PCM/cpucounters.cpp ./PCM/msr.cpp ./PCM/pci.cpp ./PCM/client_bw.cpp -o 8-4 -I ./PCM -lpthread
It compiles, with the following warnings:
./IntelPerformanceCounterMonitorV2.5.1/cpucounters.cpp: In function ‘void print_mcfg(const char*)’:
./IntelPerformanceCounterMonitorV2.5.1/cpucounters.cpp:2606:61: warning: ignoring return value of ‘ssize_t read(int, void*, size_t)’, declared with attribute warn_unused_result [-Wunused-result]
./IntelPerformanceCounterMonitorV2.5.1/cpucounters.cpp:2615:65: warning: ignoring return value of ‘ssize_t read(int, void*, size_t)’, declared with attribute warn_unused_result [-Wunused-result]
./IntelPerformanceCounterMonitorV2.5.1/pci.cpp: In constructor ‘PciHandleMM::PciHandleMM(uint32, uint32, uint32, uint32)’:
./IntelPerformanceCounterMonitorV2.5.1/pci.cpp:610:65: warning: ignoring return value of ‘ssize_t read(int, void*, size_t)’, declared with attribute warn_unused_result [-Wunused-result]
./IntelPerformanceCounterMonitorV2.5.1/pci.cpp:615:69: warning: ignoring return value of ‘ssize_t read(int, void*, size_t)’, declared with attribute warn_unused_result [-Wunused-result]
Afterward, I ran the code, and the result is as follows:
Instructions per clock:-1L3 cache hit ratio:0.479371Bytes read:461888
Probably. IPC should not be -1. I would start by looking at why IPC is -1. Or, if there are error messages before that, fix those first.
I assume a correctly compiled PCM doesn't report IPC = -1, so I would look for what you are doing differently from what unchanged PCM does. I'm sorry, but I don't have time to debug your code.
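For readers wondering how a ratio like IPC can come out as exactly -1: the sketch below is one plausible explanation, not PCM's actual source. It assumes that the metric is computed from the difference between two counter snapshots, and that -1 is returned as a sentinel when no clock cycles were counted in between (for example, because the counters were never started). The `Snapshot` struct and both function names are hypothetical stand-ins, not PCM's `SystemCounterState` or its accessors.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a counter snapshot; NOT PCM's SystemCounterState.
struct Snapshot {
    std::uint64_t instructions;  // instructions retired so far
    std::uint64_t cycles;        // unhalted clock cycles so far
    std::uint64_t l3Hits;        // L3 cache hits so far
    std::uint64_t l3Misses;      // L3 cache misses so far
};

// Instructions per clock over the interval [before, after].
// Returns -1 when no cycles were counted (assumed sentinel convention).
double ipcBetween(const Snapshot& before, const Snapshot& after) {
    // Counters are assumed to be monotonic between the two snapshots.
    assert(after.cycles >= before.cycles);
    assert(after.instructions >= before.instructions);
    const std::uint64_t cycles = after.cycles - before.cycles;
    if (cycles == 0) {
        return -1.0;  // nothing was counted, so the ratio is undefined
    }
    const std::uint64_t retired = after.instructions - before.instructions;
    return static_cast<double>(retired) / static_cast<double>(cycles);
}

// L3 hit ratio over the interval: hits / (hits + misses).
// Returns 0 when neither hits nor misses were counted.
double l3HitRatioBetween(const Snapshot& before, const Snapshot& after) {
    assert(after.l3Hits >= before.l3Hits);
    assert(after.l3Misses >= before.l3Misses);
    const std::uint64_t hits = after.l3Hits - before.l3Hits;
    const std::uint64_t misses = after.l3Misses - before.l3Misses;
    const std::uint64_t total = hits + misses;
    if (total == 0) {
        return 0.0;
    }
    return static_cast<double>(hits) / static_cast<double>(total);
}
```

Under these assumptions, two identical snapshots give an IPC of -1 and a hit ratio of 0, which would point at counters that are not running rather than at the measured code.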
Sure, Rolf, the code is as follows:
#include <iostream>
#include "cpucounters.h"  // PCM declarations

#define F 200
#define C 200
using namespace std;

int main()
{
    PCM *m = PCM::getInstance();
    SystemCounterState before_sstate = getSystemCounterState();
    // Begin of custom code
    cout<<"bingyi's code is working"<<endl;
    cout<<"bingyi's code is finished!!"<<endl;
    // End of custom code
    SystemCounterState after_sstate = getSystemCounterState();
    cout << "Instructions per clock:" << getIPC(before_sstate,after_sstate)<<endl;
    cout <<"L3 Cache Misses"<< getL3CacheMisses(before_sstate,after_sstate)<<endl;
    cout << "L2 Cache Misses"<<getL2CacheMisses(before_sstate,after_sstate)<<endl;
    cout << "L3 cache hit ratio:" << getL3CacheHitRatio(before_sstate,after_sstate)<<endl;
    cout << "L2 cache hit ratio"<<getL2CacheHitRatio(before_sstate, after_sstate)<<endl;
    return 0;
}
It seems you are missing a call to the program method. May I suggest that you add:
[cpp]m->program (PCM::DEFAULT_EVENTS, NULL);[/cpp]
right after:
[cpp]PCM* m = PCM::getInstance ();[/cpp]
I got the following result on my machine:
[plain]bingyi's code is working
bingyi's code is finished!!
Instructions per clock:0.584452
L3 Cache Misses: 6893
L2 Cache Misses: 11591
L3 cache hit ratio: 0.405314
L2 cache hit ratio: 0.409767[/plain]
Hope this helps,
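To illustrate why the call order matters, here is a toy model of the sequence discussed above: obtain the shared instance, program the counters, then take snapshots around the work. Everything in it (`ToyMonitor`, `program`, `retire`, `snapshot`, `measure`) is a hypothetical stand-in written for this note and is not PCM's class or API. It also assumes that counters which were never programmed simply stay frozen, which is one way to end up with zero deltas.

```cpp
#include <cstdint>

// Toy stand-in for a hardware counter monitor. All names are hypothetical;
// this only models the call order, it is not PCM.
class ToyMonitor {
public:
    // Single shared instance, created on first use.
    static ToyMonitor& instance() {
        static ToyMonitor monitor;
        return monitor;
    }

    // Models enabling the counters.
    void program() { programmed_ = true; }

    // Models executing n instructions; frozen counters do not advance.
    void retire(std::uint64_t n) {
        if (programmed_) {
            instructions_ += n;
        }
    }

    // Current value of the instruction counter.
    std::uint64_t snapshot() const { return instructions_; }

private:
    ToyMonitor() = default;
    bool programmed_ = false;
    std::uint64_t instructions_ = 0;
};

// Takes a snapshot, runs `work` instructions, takes another snapshot,
// and returns the difference the monitor observed.
std::uint64_t measure(ToyMonitor& monitor, std::uint64_t work) {
    const std::uint64_t before = monitor.snapshot();
    monitor.retire(work);
    return monitor.snapshot() - before;
}
```

In this model, calling `measure` before `program()` observes a delta of zero no matter how much work runs, while the same call after `program()` observes the full amount.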
I just wrote a post that got queued for review for some reason.
To add to that post: it seems that your matrix access is out of bounds (using F and C instead of i and j).
I'll repost my previous comment if it gets lost.
Thanks so much for your reply!! I appreciate it very much!!
I added the code
m->program (PCM::DEFAULT_EVENTS, NULL);
However, the output turns out to be:
Num logical cores: 24
Num sockets: 2
Threads per core: 2
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 2200000000 Hz
Package thermal spec power: 95 Watt; Package minimum power: 46 Watt; Package maximum power: 145 Watt;
WARNING: Core 0 IA32_PERFEVTSEL0_ADDR are not zeroed 5439548
bingyi's code is working
bingyi's code finished!
Instructions per clock:-1
L3 Cache Misses4301948
L2 Cache Misses4301948
L3 cache hit ratio:0
L2 cache hit ratio0
It doesn't seem right to me. Is that because of the warning? Should I set IA32_PERFEVTSEL0_ADDR to zero?