I ran a few tests with VTune and the results are not what I expected.
lbm_Soff_1p1c: Instructions Retired: 362,259,200,000
lbm_Soff_2p2c: Instructions Retired: 1,261,966,300,000
lbm_Son_1p1c: Instructions Retired: 1,261,717,900,000
lbm_Son_2p1c: Instructions Retired: 1,262,373,400,000
leslie3d_Soff_1p1c: Instructions Retired: 1,202,847,100,000
leslie3d_Soff_2p2c: Instructions Retired: 824,536,200,000
leslie3d_Son_1p1c: Instructions Retired: 1,552,713,900,000
leslie3d_Son_2p1c: Instructions Retired: 1,553,852,400,000
libquantum_Soff_1p1c: Instructions Retired: 2,307,256,500,000
libquantum_Soff_2p2c: Instructions Retired: 484,984,900,000
libquantum_Son_1p1c: Instructions Retired: 1,124,465,400,000
libquantum_Son_2p1c: Instructions Retired: 2,307,302,500,000
The numbers above are examples of data that look wrong. Son/Soff refers to SMT (Hyper-Threading) on/off. 2p1c means 2 threads running on 1 core; similarly, 2p2c means 2 threads running on separate cores. That is the basic setup of my experiment. My platform is an Intel Xeon E5-2650 v3 (Haswell) running CentOS 7.0.
Regardless of the running environment, the total number of instructions retired by a single process that runs to completion should be constant (is this right?). Since VTune reported a MUX reliability higher than 0.995, I suspect these miscounts come from the PMUs themselves. Is there any chance that PMUs are affected by external factors such as elevated temperature, causing counting failures? Are other data reported by the PMUs at similar risk of being untrustworthy?
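To make the discrepancy concrete, here is a minimal Python sketch that compares the counts listed above within each benchmark. It assumes, purely for illustration, that the largest count in each group is the reference value and that a deviation of more than 5% is suspicious. Neither assumption comes from VTune, and the function name `flag_outliers` is hypothetical.

```python
# Instructions Retired per configuration, copied from the VTune results above.
COUNTS = {
    "lbm": {
        "Soff_1p1c": 362_259_200_000,
        "Soff_2p2c": 1_261_966_300_000,
        "Son_1p1c": 1_261_717_900_000,
        "Son_2p1c": 1_262_373_400_000,
    },
    "leslie3d": {
        "Soff_1p1c": 1_202_847_100_000,
        "Soff_2p2c": 824_536_200_000,
        "Son_1p1c": 1_552_713_900_000,
        "Son_2p1c": 1_553_852_400_000,
    },
    "libquantum": {
        "Soff_1p1c": 2_307_256_500_000,
        "Soff_2p2c": 484_984_900_000,
        "Son_1p1c": 1_124_465_400_000,
        "Son_2p1c": 2_307_302_500_000,
    },
}


def flag_outliers(counts, tolerance=0.05):
    """Return configurations whose count falls more than `tolerance`
    below the largest count in the group.

    Assumption: the largest count is treated as the reference value.
    """
    reference = max(counts.values())
    return [
        name
        for name, value in counts.items()
        if value < (1.0 - tolerance) * reference
    ]


if __name__ == "__main__":
    for benchmark, counts in COUNTS.items():
        reference = max(counts.values())
        for name in flag_outliers(counts):
            ratio = counts[name] / reference
            print(f"{benchmark}_{name}: {ratio:.1%} of the largest count")
```

Under these assumptions the runs within a benchmark that agree do so to well under 1%, while the flagged runs are short by a large fraction rather than by a small amount.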
You may want to double-check that the NMI watchdog is disabled. VTune should do this, but....
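One way to check this on Linux is to read the kernel's sysctl file directly. The sketch below assumes the setting is exposed at /proc/sys/kernel/nmi_watchdog, where 1 means enabled and 0 means disabled; the helper name `watchdog_enabled` is hypothetical.

```python
# Assumed location of the NMI watchdog setting on Linux.
NMI_WATCHDOG_PATH = "/proc/sys/kernel/nmi_watchdog"


def watchdog_enabled(text):
    """Interpret the contents of the nmi_watchdog file: '1' means enabled."""
    return text.strip() == "1"


if __name__ == "__main__":
    try:
        with open(NMI_WATCHDOG_PATH) as f:
            state = "enabled" if watchdog_enabled(f.read()) else "disabled"
        print(f"NMI watchdog is {state}")
    except OSError as err:
        # The file may be missing on non-Linux systems or some kernel configs.
        print(f"Could not read {NMI_WATCHDOG_PATH}: {err}")
```

If it reports the watchdog as enabled while a collection is running, the watchdog is likely holding one of the performance counters.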
CentOS 7.0 is pretty old -- can you upgrade to a newer version? (I found some bugs in Intel contributions to the kernel code in 7.2 that are fixed (sort of) in 7.3, and other bugs in 7.3 that only seem to apply to the newer Xeon Scalable Processors. I have not tested CentOS 7.4.)