Intel Memory Latency Checker with Windows support released

We just released v2.3 of Intel Memory Latency Checker (http://www.intel.com/software/mlc). This release adds support for Windows; earlier versions already supported Linux. In addition, single-socket Xeon processors (E3) are now supported.

Intel Memory Latency Checker can be used to measure memory latencies and bandwidth on Intel Xeon processors.

Vish

McCalpinJohn
Black Belt

I am seeing strange latency results with version 2.3. In particular, the result of the "--idle_latency" test (with no other options) is much higher than any of the values reported by "--latency_matrix". If the "--idle_latency" test is given the "-c" and "-i" options to place the threads and data, the results seem fine.

E.g., on a Xeon E5-2680 with cores 0-7 in socket 0 and 8-15 in socket 1, I see:

c557-603:~/Stampede/IntelMemoryLatencyChecker:2014-11-11T13:28:08 $ ./mlc_2-3 --latency_matrix
Intel(R) Memory Latency Checker - v2.3
Command line parameters: --latency_matrix 

Using buffer size of 200.000MB
Measuring idle latencies (in ns)...
	Memory node
Socket	     0	     1	
     0	  66.9	 116.6	
     1	 116.6	  66.9	


c557-603:~/Stampede/IntelMemoryLatencyChecker:2014-11-11T13:27:54 $ ./mlc_2-3 --idle_latency 
Intel(R) Memory Latency Checker - v2.3
Command line parameters: --idle_latency 

Using buffer size of 200.000MB
Each iteration took 362.0 core clocks (	134.1	ns)


c557-603:~/Stampede/IntelMemoryLatencyChecker:2014-11-11T13:27:24 $ ./mlc_2-3 --idle_latency -c4 -i4
Intel(R) Memory Latency Checker - v2.3
Command line parameters: --idle_latency -c4 -i4 

Using buffer size of 200.000MB
Each iteration took 179.5 core clocks (	66.5	ns)
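
As a quick sanity check on the two "--idle_latency" results above, the following minimal Python sketch parses the "Each iteration took ..." lines and derives the core clock implied by each clocks/ns pair. The function names are illustrative and not part of the tool; the regular expression assumes the output format shown in this post.

```python
import re

# Matches lines such as: "Each iteration took 362.0 core clocks (	134.1	ns)"
ITERATION_RE = re.compile(
    r"Each iteration took\s+([\d.]+)\s+core clocks\s+\(\s*([\d.]+)\s*ns\)"
)


def parse_iteration(line):
    """Return (core_clocks, nanoseconds) from an idle-latency result line."""
    match = ITERATION_RE.search(line)
    if match is None:
        raise ValueError(f"not an idle-latency result line: {line!r}")
    return float(match.group(1)), float(match.group(2))


def implied_clock_ghz(core_clocks, nanoseconds):
    """Cycles per nanosecond is numerically the clock frequency in GHz."""
    return core_clocks / nanoseconds


if __name__ == "__main__":
    default_run = "Each iteration took 362.0 core clocks (\t134.1\tns)"
    pinned_run = "Each iteration took 179.5 core clocks (\t66.5\tns)"
    for label, line in (("default", default_run), ("-c4 -i4", pinned_run)):
        clocks, ns = parse_iteration(line)
        ghz = implied_clock_ghz(clocks, ns)
        print(f"{label}: {ns} ns, implied clock {ghz:.2f} GHz")
```

Both runs imply a core clock of about 2.70 GHz, so the clocks-to-ns conversion is consistent between them; the difference is in the measured latency itself, with the default run taking roughly twice as long as the pinned run (134.1 ns versus 66.5 ns).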

 


John,

Thanks for reporting this issue. With this release we moved the dummy threads to the first CPU in each socket. When --idle_latency is invoked without any parameters, the dummy thread and the latency thread end up running on the same core, which results in higher latencies. We missed this case in testing because we typically expect the -c option to be specified when --idle_latency is used. We normally take care not to schedule dummy threads and measurement threads on the same CPU, but this one case slipped through. I have fixed the code to handle it, and the next release should include the fix.

Are you seeing any other issues? We really appreciate your testing and feedback to make the tool better.

Thanks

Vish
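
The collision described in this reply can be illustrated with a small Python sketch. It is not the tool's actual code: it only assumes, as stated above, that one dummy thread runs on the first CPU of each socket, and it shows one hypothetical way to pick a default measurement CPU that avoids those CPUs. The topology in the test data is the one John reported (cores 0-7 in socket 0, 8-15 in socket 1); all function names are made up for illustration.

```python
def dummy_thread_cpus(sockets):
    """CPUs assumed to host a dummy thread: the first CPU of each socket."""
    return {min(cpus) for cpus in sockets.values()}


def default_measurement_cpu(sockets, socket_id=0):
    """Pick the lowest-numbered CPU in the socket that hosts no dummy thread.

    Hypothetical policy for illustration only; the real fix in the tool
    may resolve the conflict differently.
    """
    reserved = dummy_thread_cpus(sockets)
    for cpu in sorted(sockets[socket_id]):
        if cpu not in reserved:
            return cpu
    raise RuntimeError(f"socket {socket_id} has no CPU free of dummy threads")


if __name__ == "__main__":
    # Topology from the report: cores 0-7 in socket 0, cores 8-15 in socket 1.
    sockets = {0: list(range(0, 8)), 1: list(range(8, 16))}
    print("dummy threads on:", sorted(dummy_thread_cpus(sockets)))
    print("default measurement CPU:", default_measurement_cpu(sockets, 0))
```

Under this assumption, a default that naively starts on CPU 0 shares a core with the socket 0 dummy thread, whereas an explicit "-c4" lands on a CPU with no dummy thread.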

McCalpinJohn
Black Belt

I thought that might be the problem -- I tried using the "-c" option to bind to each core on each socket and saw no degradation on any of the cores, but it looks like specifying the core activated the "collision avoidance" logic.

There are some funny numbers on one of my Haswell 2-socket boxes, but it looks like the DRAM configuration is not optimal for this 3-channel processor (Xeon E5-2603 v3).  

The rest of the results look good -- thanks for supporting this tool!

zeuch__Steffen
Beginner

Hello,

The documentation of the Intel Memory Latency Checker states that the -bXXX option specifies the buffer size, for example to measure caches instead of DRAM. However, this option does not seem to be honored during execution: both the "Using buffer size of" message and the measured values indicate that it has no effect. For example, mlc --idle_latency –b3000 –c0 –t3, taken from the documentation, does not work. Is there a workaround?

Kind regards,
Steffen
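
One way to check whether a buffer-size option took effect is to compare the "Using buffer size of" line across runs. The Python sketch below assumes that output format (as shown earlier in this thread) and a hypothetical binary path "./mlc"; the helper names are illustrative. Since the thread does not state the unit of the -b argument, the sketch only extracts the reported size so that runs can be compared with each other.

```python
import re
import subprocess

# Matches lines such as: "Using buffer size of 200.000MB"
BUFFER_RE = re.compile(r"Using buffer size of\s+([\d.]+)\s*MB")


def reported_buffer_mb(output):
    """Return the buffer size (in MB) that the tool printed, or None."""
    match = BUFFER_RE.search(output)
    return float(match.group(1)) if match else None


def run_mlc(args, binary="./mlc"):
    """Run the tool with the given arguments and return its stdout.

    The binary path is an assumption; adjust it to the local install.
    """
    result = subprocess.run(
        [binary, *args], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    sample = "Intel(R) Memory Latency Checker - v2.3\nUsing buffer size of 200.000MB\n"
    print(reported_buffer_mb(sample))
```

For example, comparing reported_buffer_mb(run_mlc(["--idle_latency", "-c0"])) with the same call plus "-b3000" shows whether the option changed anything; identical values suggest it was not picked up.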
