qdeng1984
Beginner
69 Views

What is the exact meaning of MEM_UNCORE_RETIRED.LOCAL/REMOTE_DRAM?

Hi, could anybody tell me what MEM_UNCORE_RETIRED.LOCAL/REMOTE_DRAM counts? I collected this event for my program and the count varies from one execution to the next. When I run 2 or 3 identical processes at the same time on different cores (I bind each process to its core), the per-core counts vary even more.

My program is essentially a streaming workload in which almost every memory access instruction leads to an LLC miss. My processor is an i7 920 with 2*3G DDR3 DRAM.
Thank you!
3 Replies
Peter_W_Intel
Employee

These MEM_UNCORE_RETIRED events are for Intel Core i7 processors.

MEM_UNCORE_RETIRED.LOCAL indicates all memory references that are located in the local cache or the local memory socket.
MEM_UNCORE_RETIRED.REMOTE_DRAM indicates all memory references that are located in a remote socket's cache or in remote DRAM on a sibling core.

VTune Performance Analyzer provides documentation; see VTuneHelppmn.chm for more detail.

Regards, Peter
qdeng1984
Beginner


Hi Peter, thanks for your reply, but I am still confused about this. I collected LLC_MISSES, MEM_UNCORE_RETIRED.LOCAL_DRAM and MEM_UNCORE_RETIRED.REMOTE_DRAM. The value of the first event (LLC_MISSES) is 600000, but the second is only 200000 and the third is 0. So where do the remaining 600000 - 200000 = 400000 LLC misses go? Neither local DRAM nor remote DRAM? My program simply loops over a large array that is much larger than the LLC, with a stride greater than the cache line size (64 bytes); every operation is a "++" on an element of the array. I bind each process to a different core and have disabled hardware prefetching and turbo mode.
Thanks!
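
The accounting in the post above can be written out as a short sketch. It only restates the arithmetic on the three reported counter values and adds a helper for how many cache lines one pass of a strided walk would touch. The function names, the 64 MiB array size, and the 128-byte stride are illustrative assumptions and do not come from the thread; nothing here reads real hardware counters.

```python
import math

CACHE_LINE_BYTES = 64  # cache line size quoted in the post


def expected_line_touches(array_bytes: int, stride_bytes: int) -> int:
    """Accesses in one pass over the array.

    With a stride of at least one cache line, every access lands on a
    different line, so this is also the number of distinct lines touched.
    """
    if stride_bytes < CACHE_LINE_BYTES:
        raise ValueError("this sketch assumes stride >= one cache line")
    return math.ceil(array_bytes / stride_bytes)


def unaccounted_misses(llc_misses: int, local_dram: int, remote_dram: int) -> int:
    """LLC misses that neither DRAM event accounts for."""
    return llc_misses - (local_dram + remote_dram)


# Counter values as reported in the post.
gap = unaccounted_misses(llc_misses=600_000, local_dram=200_000, remote_dram=0)
print(gap)  # 400000

# Hypothetical sizes (not from the post): 64 MiB array, 128-byte stride.
print(expected_line_touches(64 * 1024 * 1024, 128))  # 524288
```

Comparing the expected number of line touches per pass against the measured LLC_MISSES is one way to check whether the miss count itself looks plausible before asking where the misses were serviced.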
Peter_W_Intel
Employee

This is a very interesting thing! My guess is that your LLC (L3) is big: on the first run the LLC is empty, but on the second run the cache lines are hot. A big stride is not good, because it hurts performance. I suggest you use MEM_LOAD_RETIRED.LLC_UNSHARED_HIT and MEM_LOAD_RETIRED.OTHER_CORE_L2_HIT_HITM to measure. These two events indicate L2 data access misses.

Regards, Peter