
How to distinguish between partial and complete cache reads.

Rakhi_H_

Hi Everyone.

I am using an Intel Xeon machine with CPUID family 6, model 45 (0x2D). Is there some way to distinguish between partial and complete cache line reads?

For example:

The Intel® 64 and IA-32 Architectures Software Developer's Manual (Order Number 325384-047US, June 2013), Vol. 3B, page 18-42, states that in offcore response event monitoring, if bit 0 of the offcore response MSR (at address 0x1A6) is set, the counter counts all requests to partial and complete cache lines.

Is this also the case for the other ways of measuring cache misses, for example the LLC Reference and LLC Misses events in Table 19-1 of the same document?

Currently, to measure the number of bytes fetched, I am doing something like this:

Total number of bytes fetched by the L3 cache = CR * block size

where CR is the value of counter 3 using offcore response monitoring, with the request type set to DMND_DATA_RD + PF_LLC_DATA_RD, the response type set to Any, and the snoop type set to SNP_NONE.
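To make the calculation concrete, here is a minimal Python sketch of the same arithmetic. The bit positions (DMND_DATA_RD at bit 0, PF_LLC_DATA_RD at bit 7, Any at bit 16, SNP_NONE at bit 31) are my reading of the offcore response table for this processor and should be checked against the manual; the 64-byte block size and the helper names are assumptions for illustration only.

```python
# Sketch only: bit positions are assumed from the SDM offcore response
# table for this processor family; verify them before programming the MSR.
DMND_DATA_RD = 1 << 0     # demand data reads (request type)
PF_LLC_DATA_RD = 1 << 7   # LLC prefetch data reads (request type)
ANY_RESPONSE = 1 << 16    # "Any" response type
SNP_NONE = 1 << 31        # snoop info: no snoop needed/sent

CACHE_LINE_BYTES = 64     # assumed block size


def offcore_rsp_value(*fields):
    """OR the selected bit fields into a single MSR value."""
    value = 0
    for field in fields:
        value |= field
    return value


def bytes_fetched(counter_value, line_bytes=CACHE_LINE_BYTES):
    """Convert a count of cache line requests into bytes (CR * block size)."""
    return counter_value * line_bytes


msr_value = offcore_rsp_value(DMND_DATA_RD, PF_LLC_DATA_RD, ANY_RESPONSE, SNP_NONE)
print(hex(msr_value))            # value that would go into MSR 0x1A6
print(bytes_fetched(1_000_000))  # bytes for a hypothetical count of 1,000,000
```

Note that this treats every counted request as a full cache line, which is exactly the assumption I am unsure about.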

Thanks

Rakhi

Rakhi_H_

Could someone please reply to this? I hope my question is understandable; if not, please tell me.

Rakhi

McCalpinJohn

Partial cache line reads should be extremely rare. In most system configurations they cannot even be generated as the result of user-mode instructions. They can be generated by kernel or driver code that executes loads to uncached memory-mapped IO space, but these should be infrequent.

For the Xeon E5-2600 (06_2D) processor family, it would probably be more accurate to use the uncore performance counters to measure traffic between memory and the L3, though in that case you lose the connection between the traffic and the core that requested it.
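As a rough illustration of the uncore approach, the sketch below converts memory-controller counts into bytes and bandwidth. It assumes you have already collected read and write CAS counts (for example from the integrated memory controller's CAS count events), summed over all memory channels, and that each CAS transfers one 64-byte cache line; the function names and sample numbers are hypothetical.

```python
# Sketch only: inputs are assumed to be CAS read/write counts summed over
# all memory channels, with one 64-byte cache line moved per CAS.
CACHE_LINE_BYTES = 64


def dram_traffic_bytes(cas_reads, cas_writes, line_bytes=CACHE_LINE_BYTES):
    """Total bytes moved between DRAM and the chip for the given counts."""
    return (cas_reads + cas_writes) * line_bytes


def bandwidth_gb_per_s(total_bytes, seconds):
    """Average bandwidth in GB/s (10^9 bytes per second) over the interval."""
    return total_bytes / seconds / 1e9


# Hypothetical counts collected over a 2-second interval.
total = dram_traffic_bytes(cas_reads=1000, cas_writes=500)
print(total)
print(bandwidth_gb_per_s(total, seconds=2.0))
```

Because these counts come from the memory controllers, they cover all cores on the socket together and cannot be attributed to an individual core.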

Rakhi_H_

Thanks, John. I had no way of knowing that partial cache line reads are rare and that they can be generated by kernel or driver code. This information solves much of my problem.

Now that we have installed a newer kernel, I will be using uncore events for my experiments soon.

Thanks a ton!

Rakhi
