Abnormal C2C latency in Intel Memory Latency Checker when setting window size to 1KB

111alan · ‎03-14-2020

When running the C2C latency test, I noticed that if I set the window size to 1KB, the reported latency is extremely low, even lower than that of the receiving core's local L2 cache. With any window size larger than 1KB, the results are generally normal. The following picture lists the results, run from thread 2 to thread 0 with -e -r.

Is there a technical explanation for this phenomenon, or is this just a software bug?

Thx. The CPUs tested are a Xeon Platinum 8275 and a Xeon Platinum 8260. The documentation for the C2C latency test that I used is quoted here:

mlc --c2c_latency -c2 -w22 -b200000 -C128
The above command is used to measure the time taken to transfer a modified line from L2 cache to another core on a different socket. Writer thread ‘w’ pinned to cpu 22 modifies 128KB of data (as specified in -C parameter) and transfers control to reader thread ‘c’ on cpu 2. Now, this thread reads the same 128KB of data that is currently resident in L2 of thread 22. Since those lines are in M state, the snoop responses would be Hit-modified (aka HITM) and the line would be transferred from the cache to the requester. Then the control is transferred back to the writer thread and this thread would move the window to another 128KB range in the buffer specified by -b parameter and the process will be repeated.
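
To make the window sizes in the description above concrete, here is a rough sketch of the arithmetic only; it is not MLC's actual implementation. It assumes 64-byte cache lines, that -b is given in KB like -C, and that the window advances sequentially through the buffer and wraps around (the quoted text only says the window moves to "another" range). The names `window_plan` and `window_offsets` are made up for illustration.

```python
CACHE_LINE_BYTES = 64  # assumption: 64-byte cache lines


def window_plan(buffer_kb, window_kb, line_bytes=CACHE_LINE_BYTES):
    """Return (cache lines per window, full windows that fit in the buffer)."""
    window_bytes = window_kb * 1024
    buffer_bytes = buffer_kb * 1024  # assumption: -b is in KB, like -C
    lines_per_window = window_bytes // line_bytes
    full_windows = buffer_bytes // window_bytes
    return lines_per_window, full_windows


def window_offsets(buffer_kb, window_kb, rounds):
    """Byte offset of the window for each writer/reader hand-off.

    Assumption: the window advances sequentially and wraps at the end
    of the buffer; the real tool may pick ranges differently.
    """
    window_bytes = window_kb * 1024
    _, full_windows = window_plan(buffer_kb, window_kb)
    return [(r % full_windows) * window_bytes for r in range(rounds)]


if __name__ == "__main__":
    # -b200000 -C128: the documented example
    print(window_plan(200000, 128))        # (2048, 1562)
    # -b200000 -C1: the 1KB case from the question
    print(window_plan(200000, 1))          # (16, 200000)
    print(window_offsets(200000, 128, 3))  # [0, 131072, 262144]
```

Under these assumptions, a 128KB window means the writer modifies 2048 lines before each hand-off, while a 1KB window covers only 16 lines per hand-off.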