I am running some experiments with Cache Allocation Technology (CAT) on an Intel Xeon D-1541 (Broadwell microarchitecture).
As part of these experiments, I first allocate the full cache (12MB, the default configuration) and read some amount of data from memory. In the immediately following iteration, I read data again from the same memory address, with the size given in the table below. I have also disabled all prefetchers in the BIOS, and I write the value 0x0041412E to the IA32_PERFEVTSELx register.
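To make the counter configuration explicit, here is a minimal sketch that splits 0x0041412E into the IA32_PERFEVTSELx fields, assuming the bit layout as I understand it from the Intel SDM. The function name `decode_perfevtsel` and the dictionary keys are illustrative only and do not come from any library.

```python
# Decode an IA32_PERFEVTSELx value into its fields.
# Assumed layout (per my reading of the Intel SDM): bits 7:0 event select,
# 15:8 unit mask, 16 USR, 17 OS, 18 edge detect, 19 pin control,
# 20 APIC interrupt enable, 21 AnyThread, 22 enable, 23 invert, 31:24 counter mask.
# decode_perfevtsel is a hypothetical helper name, not a library function.
def decode_perfevtsel(value: int) -> dict:
    return {
        "event_select": value & 0xFF,
        "umask": (value >> 8) & 0xFF,
        "usr": (value >> 16) & 1,
        "os": (value >> 17) & 1,
        "edge": (value >> 18) & 1,
        "pin_control": (value >> 19) & 1,
        "int": (value >> 20) & 1,
        "any_thread": (value >> 21) & 1,
        "en": (value >> 22) & 1,
        "inv": (value >> 23) & 1,
        "cmask": (value >> 24) & 0xFF,
    }


fields = decode_perfevtsel(0x0041412E)
for name, field_value in fields.items():
    print(f"{name:12s} = {field_value:#x}")
```

If this layout is right, the value selects event 2EH with umask 41H, sets USR and EN, and leaves OS clear, so the counter would count user-mode activity only.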
After performing the memory transactions, I read the LONGEST_LAT_CACHE.MISS performance monitoring counter (Event Num: 2EH, UMASK: 41H) and also the IA32_QM_CTR register with Event ID 2. I am getting the data shown below:

| First iteration | Immediate second iteration | LLC_MISS value after second iteration | IA32_QM_CTR value after second iteration |
|---|---|---|---|
| 12MB | 12MB | 196000 Bytes | 0 |
| 12MB | 11MB | 180220 Bytes | 0 |
Theoretically, the value I get from the IA32_QM_CTR register (0) is correct, since I am not fetching anything from main memory; all the data comes from the cache.
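For reference, this is roughly how a raw IA32_QM_CTR reading could be interpreted. It is only a sketch under assumptions: I take bit 63 as the error flag, bit 62 as the "unavailable" flag, and bits 61:0 as the data, and I assume the data must be multiplied by the upscaling factor reported by CPUID.(EAX=0FH, ECX=1):EBX. The helper name `decode_qm_ctr` and the factor 65536 used in the example are placeholders, not values read from this system.

```python
# Interpret a raw 64-bit IA32_QM_CTR reading.
# Assumed layout: bit 63 = Error, bit 62 = Unavailable, bits 61:0 = data.
# decode_qm_ctr is a hypothetical helper, not part of any library.
QM_CTR_ERROR = 1 << 63
QM_CTR_UNAVAILABLE = 1 << 62
QM_CTR_DATA_MASK = (1 << 62) - 1


def decode_qm_ctr(raw: int, upscale_factor: int):
    """Return the reading in bytes, or None if the data is flagged unavailable."""
    if raw & QM_CTR_ERROR:
        raise ValueError("IA32_QM_CTR reports an error (unsupported RMID or event ID)")
    if raw & QM_CTR_UNAVAILABLE:
        return None
    return (raw & QM_CTR_DATA_MASK) * upscale_factor


# Placeholder factor; the real one comes from CPUID.(EAX=0FH, ECX=1):EBX.
EXAMPLE_UPSCALE = 65536
print(decode_qm_ctr(0, EXAMPLE_UPSCALE))
```

A raw value of 0 with neither flag set decodes to 0 bytes. If the event is a bandwidth counter, it is cumulative, so the meaningful number would be the difference between readings taken before and after the access loop.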
I expected a similar result from the LLC_MISS performance monitoring counter, but it reports approximately (total transaction size / 64), which suggests that all the data is being fetched from main memory.
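The arithmetic behind the "(total transaction size / 64)" estimate can be checked with a short sketch, assuming a 64-byte cache line and that "MB" means 2**20 bytes. The observed counts are the ones from my table; the helper name `expected_lines` is just illustrative.

```python
# Compare the observed LLC miss counts with the number of 64-byte
# cache lines touched in the second iteration.
CACHE_LINE_BYTES = 64   # assumed cache line size
MB = 1024 * 1024        # assuming "MB" means 2**20 bytes


def expected_lines(size_bytes: int) -> int:
    """Number of cache lines covered by a buffer of size_bytes."""
    return size_bytes // CACHE_LINE_BYTES


# second-iteration size in MB -> observed LLC_MISS count (from the table)
observed = {12: 196000, 11: 180220}

for size_mb, misses in observed.items():
    lines = expected_lines(size_mb * MB)
    print(f"{size_mb}MB: {lines} lines, {misses} misses, ratio {misses / lines:.4f}")
```

12MB corresponds to 196608 lines and 11MB to 180224 lines, so in both cases the miss count is close to one miss per cache line read.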
Why are these two results different? Do I need to make any additional configuration for the LLC_MISS performance counter?
KHTS, thank you for your patience.
Regarding this matter, the best guidance would come from our Resource & Design Center for Development with Intel: https://www.intel.com/content/www/us/en/design/resource-design-center.html. Please visit that site for more information.