I don't know of any public documentation on the QPI protocol. Most of the information that I have been able to come up with is implicit in the performance counter events described in the uncore performance monitoring manuals for the various processor generations. The information about the QPI protocol is spread across multiple sections of those documents, particularly the QPI Link Layer and Ring to QPI (R3QPI) interface sections, but there is also a lot of information in the CBo and HA sections, especially in relation to matching on QPI message class and opcode.
You did not mention what sort of rates of HITM events you are seeing. Are these events happening at anywhere near the rates you are expecting for memory accesses for your benchmark, or could they be due to background OS activity?
I am running Linux with cores isolated, housekeeping threads moved out of the way, and no_hz mode. I should have said that, so it is not OS activity. I will confess that the remote HITM events I see are whatever perf c2c considers HITM events. perf c2c does show the line of code where the event occurred, so I am sure it is happening in my code and at the expected places. It just seemed very odd to me that the modified state of one socket would be transferred to the other socket on a read-only request.
I was trying to find some additional resources. This paper was helpful. Maybe the read request is acting like the RFO case described, so it is moving the modified state to the other socket. With some additional prefetching, I seem to be able to hide most of the latency and hits that were occurring. I wish there was better documentation around QPI for sure.
There are lots of choices in how to handle Data Read requests that hit Modified lines.
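To make the "lots of choices" concrete, here is a toy sketch of three textbook ways a coherence protocol could resolve a remote data read that hits a line held in Modified. This is not taken from any QPI documentation: the policy names, the `read_hits_modified` function, and the `is_coherent` check are all hypothetical and only illustrate the general design space. The second policy resembles the behavior described above, where the dirty line moves to the reading socket.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


# Hypothetical policy names, for illustration only (not QPI terminology).
POLICIES = (
    "downgrade_and_writeback",
    "forward_dirty_ownership",
    "invalidate_and_writeback",
)


def read_hits_modified(policy):
    """Resolve a remote data read that hits a line the owner holds in M.

    Returns (owner_state, requester_state, memory_updated).
    """
    if policy == "downgrade_and_writeback":
        # Owner keeps a clean copy, memory is updated, both caches share it.
        return State.S, State.S, True
    if policy == "forward_dirty_ownership":
        # Owner gives up the line; the requester takes over the dirty copy,
        # so memory stays stale for now.
        return State.I, State.M, False
    if policy == "invalidate_and_writeback":
        # Owner writes back and drops the line; requester holds the only
        # (clean) copy.
        return State.I, State.E, True
    raise ValueError(f"unknown policy: {policy}")


def is_coherent(owner, requester, memory_updated):
    """Check the usual single-writer invariants for two caches."""
    states = (owner, requester)
    exclusive_like = [s for s in states if s in (State.M, State.E)]
    if exclusive_like:
        # An M or E copy must be the only valid copy.
        if len(exclusive_like) > 1 or any(s is State.S for s in states):
            return False
    # If memory was not updated, some cache must still hold the dirty data.
    if not memory_updated and State.M not in states:
        return False
    return True


if __name__ == "__main__":
    for policy in POLICIES:
        owner, requester, memory_updated = read_hits_modified(policy)
        print(policy, owner.name, requester.name, memory_updated)
```

Each outcome is coherent, but they differ in who pays later: handing over the dirty copy favors a reader that is about to write (migratory data), while downgrading to Shared favors lines that are read by several agents.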
Thanks, John. I appreciate the insights. It will be interesting to see if the same thing is happening on Skylake Xeon; I just got a test box and will try it out. I am running in early snoop mode, as the latency profile seems better than HS w/ Dir+OSB mode, but I will be re-testing that as well for worst-case performance.