While running the C2C test, I noticed that if I set the window size to 1KB, the reported latency is extremely low, even lower than the latency of the receiving core's local L2 cache. With any window size larger than 1KB, the results look generally normal. The following picture lists the results, run with thread 2 to 0 and -e -r.
Is there a technical explanation for this phenomenon, or is this just a software bug?
Thanks. The CPUs tested are Xeon Platinum 8275 and 8260. The documentation I used for the C2C latency test is here:
mlc --c2c_latency -c2 -w22 -b200000 -C128
The above command is used to measure the time taken to transfer a modified line from L2 cache to another core on a different socket. Writer thread ‘w’ pinned to cpu 22 modifies 128KB of data (as specified in -C parameter) and transfers control to reader thread ‘c’ on cpu 2. Now, this thread reads the same 128KB of data that is currently resident in L2 of thread 22. Since those lines are in M state, the snoop responses would be Hit-modified (aka HITM) and the line would be transferred from the cache to the requester. Then the control is transferred back to the writer thread and this thread would move the window to another 128KB range in the buffer specified by -b parameter and the process will be repeated.
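To put the window sizes in perspective, here is a rough sketch of the arithmetic behind the description above. It is not part of mlc. The function name `window_layout` is made up for illustration, and it assumes 64-byte cache lines and that the -b value is given in KB, like the -C value.

```python
# Hypothetical helper, not part of mlc: estimates how many cache lines one
# window covers and how many whole windows fit in the buffer.
CACHE_LINE_BYTES = 64  # assumption: 64-byte cache lines


def window_layout(window_kb, buffer_kb, line_bytes=CACHE_LINE_BYTES):
    """Return (cache lines per window, whole windows per buffer)."""
    window_bytes = window_kb * 1024
    buffer_bytes = buffer_kb * 1024  # assumption: -b is given in KB
    lines_per_window = window_bytes // line_bytes
    windows_per_buffer = buffer_bytes // window_bytes
    return lines_per_window, windows_per_buffer


if __name__ == "__main__":
    for window_kb in (1, 128):
        lines, windows = window_layout(window_kb, 200000)
        print(f"-C{window_kb}: {lines} lines per window, {windows} windows in the buffer")
```

Under these assumptions a 1KB window covers only 16 cache lines per handoff between writer and reader, compared with 2048 lines for the 128KB window in the documented example, so each measurement pass is very short.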