Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

Intel Memory Latency Checker v2.0 released

Thomas_W_Intel
Employee

A new version, Intel Memory Latency Checker v2.0 (Intel MLC), has recently been posted at http://www.intel.com/software/mlc.

Apart from the unloaded memory latency, Intel MLC can now measure memory bandwidth and loaded latencies as well.

Krishnaswa_V_Intel

We just released version 2.1 of the Intel Memory Latency Checker tool. This version automatically launches spinner threads during bandwidth tests to ensure the best possible memory bandwidth for remote accesses. It also measures remote memory latencies correctly on newer Linux kernels where the NUMA balancing feature is enabled. Please give it a try and let us know if you have any feedback. Thanks.
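
For readers who want to know whether automatic NUMA balancing is active on their own Linux system before comparing remote-latency numbers, here is a minimal sketch. It assumes the kernel exposes the setting at `/proc/sys/kernel/numa_balancing`, which is only present on kernels built with NUMA balancing support. The function name `numa_balancing_state` is hypothetical and is not part of Intel MLC.

```python
from pathlib import Path

# Assumption: kernels with automatic NUMA balancing expose this sysctl file.
NUMA_BALANCING_PATH = Path("/proc/sys/kernel/numa_balancing")


def numa_balancing_state(path=NUMA_BALANCING_PATH):
    """Return the sysctl value as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    state = numa_balancing_state()
    if state is None:
        print("NUMA balancing setting not found (feature absent or not Linux)")
    elif state == 0:
        print("NUMA balancing is disabled")
    else:
        print(f"NUMA balancing is enabled (value {state})")
```

A nonzero value means the kernel may migrate pages between nodes while a benchmark runs, which is the situation the 2.1 release notes describe.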

Ming_C_
Beginner

It is a great tool, but I got some strange results when I ran it on a 2-socket E5-2670 system. The idle latencies were not consistent between sockets 0 and 1, although the memory bandwidth test looked fine. What could cause the inconsistent idle latencies between sockets 0 and 1? The results are below.

[root@localhost mlc]# ./mlc
Intel(R) Memory Latency Checker - v2.1

Using buffer size of 200.000MB
Measuring idle latencies (in ns)...
        Memory node
Socket       0       1
     0    88.4   150.8
     1    13.2     9.6

Measuring Peak Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using traffic with the following read-write ratios
ALL Reads        :      40054.3
3:1 Reads-Writes :      34445.4
2:1 Reads-Writes :      33157.9
1:1 Reads-Writes :      30032.6
Stream-triad like:      32967.6

Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)

Using Read-only traffic type
        Memory node
 Socket      0       1
     0  20022.8 11494.4
     1  11724.8 20030.0

Thanks!
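
The inconsistency in the idle-latency matrix above can also be flagged mechanically. The sketch below parses the matrix text and compares mirrored entries. It assumes a symmetric multi-socket system in which local latencies should agree across sockets and remote latencies should agree in both directions. The 10% tolerance and the function names `parse_latency_matrix` and `inconsistent_pairs` are illustrative choices, not part of Intel MLC.

```python
def parse_latency_matrix(text):
    """Extract the numeric rows of an MLC-style latency matrix as lists of floats."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with an integer socket id; header rows do not.
        if len(fields) >= 2 and fields[0].isdigit():
            rows.append([float(value) for value in fields[1:]])
    return rows


def inconsistent_pairs(matrix, tolerance=0.10):
    """Return pairs of matrix cells expected to match that differ by more than tolerance.

    Assumption: the system is symmetric, so local latencies (diagonal) should
    agree with each other and remote latencies should agree in both directions.
    """
    def differs(a, b):
        return abs(a - b) > tolerance * max(a, b)

    mismatches = []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            if differs(matrix[i][i], matrix[j][j]):
                mismatches.append(((i, i), (j, j)))
            if differs(matrix[i][j], matrix[j][i]):
                mismatches.append(((i, j), (j, i)))
    return mismatches


# Idle-latency matrix copied from the output above.
REPORTED = """
        Memory node
Socket       0       1
     0    88.4   150.8
     1    13.2     9.6
"""

if __name__ == "__main__":
    matrix = parse_latency_matrix(REPORTED)
    print(inconsistent_pairs(matrix))
    # prints [((0, 0), (1, 1)), ((0, 1), (1, 0))]
```

Both the local pair and the remote pair are flagged, which points at the socket 1 row as a whole. The check only reports disagreement; it does not say which of the two values is the correct one.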

 
Krishnaswa_V_Intel

Hi Ming, can you please try the following two commands and send me the output?

./mlc --latency_matrix -r -l128

./mlc --latency_matrix -l128 -v

Ming_C_
Beginner

Hi Vish,

Here are the results for the suggested commands. It seems the first command produced consistent latency numbers between sockets 0 and 1.

[root@localhost mlc]# ./mlc --latency_matrix -r -l128
Intel(R) Memory Latency Checker - v2.1
Command line parameters: --latency_matrix -r -l128

Using buffer size of 200.000MB
Measuring idle latencies (in ns)...
        Memory node
Socket       0       1
     0    88.7   151.4
     1   151.0    88.5
[root@localhost mlc]# ./mlc --latency_matrix -l128 -v
Intel(R) Memory Latency Checker - v2.1
Command line parameters: --latency_matrix -l128 -v
OS core id:   0: Socket id:   0 Hyperthread id:   0
OS core id:   1: Socket id:   0 Hyperthread id:   1
OS core id:   2: Socket id:   0 Hyperthread id:   0
OS core id:   3: Socket id:   0 Hyperthread id:   1
OS core id:   4: Socket id:   0 Hyperthread id:   0
OS core id:   5: Socket id:   0 Hyperthread id:   1
OS core id:   6: Socket id:   0 Hyperthread id:   0
OS core id:   7: Socket id:   0 Hyperthread id:   1
OS core id:   8: Socket id:   0 Hyperthread id:   0
OS core id:   9: Socket id:   0 Hyperthread id:   1
OS core id:  10: Socket id:   0 Hyperthread id:   0
OS core id:  11: Socket id:   0 Hyperthread id:   1
OS core id:  12: Socket id:   0 Hyperthread id:   0
OS core id:  13: Socket id:   0 Hyperthread id:   1
OS core id:  14: Socket id:   0 Hyperthread id:   0
OS core id:  15: Socket id:   0 Hyperthread id:   1
OS core id:  16: Socket id:   1 Hyperthread id:   0
OS core id:  17: Socket id:   1 Hyperthread id:   1
OS core id:  18: Socket id:   1 Hyperthread id:   0
OS core id:  19: Socket id:   1 Hyperthread id:   1
OS core id:  20: Socket id:   1 Hyperthread id:   0
OS core id:  21: Socket id:   1 Hyperthread id:   1
OS core id:  22: Socket id:   1 Hyperthread id:   0
OS core id:  23: Socket id:   1 Hyperthread id:   1
OS core id:  24: Socket id:   1 Hyperthread id:   0
OS core id:  25: Socket id:   1 Hyperthread id:   1
OS core id:  26: Socket id:   1 Hyperthread id:   0
OS core id:  27: Socket id:   1 Hyperthread id:   1
OS core id:  28: Socket id:   1 Hyperthread id:   0
OS core id:  29: Socket id:   1 Hyperthread id:   1
OS core id:  30: Socket id:   1 Hyperthread id:   0
OS core id:  31: Socket id:   1 Hyperthread id:   1
Detected 2 sockets

Using buffer size of 200.000MB
Test running on 2600.00 MHZ processor(s)
Core 20 is running a busy loop to keep socket 1 from low frequency states
Core 4 is running a busy loop to keep socket 0 from low frequency states
Measuring idle latencies (in ns)...

Socket  0 (core   2) measuring latency to memory on socket  0 (allocated by core   2)..
Allocated 1600000 cache lines...
Initializing memory...memory initialized
Start loop for latency measurement...
Each iteration took 230.6 core clocks ( 88.7    ns)

Socket  0 (core   2) measuring latency to memory on socket  1 (allocated by core  18)..
Allocated 1600000 cache lines...
Initializing memory...memory initialized
Start loop for latency measurement...
Each iteration took 393.7 core clocks ( 151.4   ns)


Socket  1 (core  18) measuring latency to memory on socket  0 (allocated by core   2)..
Allocated 1600000 cache lines...
Initializing memory...memory initialized
Start loop for latency measurement...
Each iteration took 110.7 core clocks ( 42.6    ns)

Socket  1 (core  18) measuring latency to memory on socket  1 (allocated by core  18)..
Allocated 1600000 cache lines...
Initializing memory...memory initialized
Start loop for latency measurement...
Each iteration took 71.1 core clocks (  27.3    ns)

[root@localhost mlc]#

Thanks!
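
The verbose output above reports each measurement both in core clocks and in nanoseconds. As a worked check of that arithmetic, the sketch below converts the four clock counts using the 2600.00 MHz frequency printed by the tool. The helper name `clocks_to_ns` is hypothetical, and the assumption is that the nanosecond figure is simply the clock count divided by the reported core frequency.

```python
def clocks_to_ns(core_clocks, freq_mhz):
    """Convert a core-clock count to nanoseconds at the given frequency in MHz."""
    # One clock at f MHz lasts 1000 / f nanoseconds.
    return core_clocks * 1000.0 / freq_mhz


# (measuring socket, memory socket, core clocks), copied from the verbose run above.
MEASUREMENTS = [
    (0, 0, 230.6),
    (0, 1, 393.7),
    (1, 0, 110.7),
    (1, 1, 71.1),
]
FREQ_MHZ = 2600.0  # "Test running on 2600.00 MHZ processor(s)"

if __name__ == "__main__":
    for cpu_socket, mem_socket, clocks in MEASUREMENTS:
        ns = clocks_to_ns(clocks, FREQ_MHZ)
        print(f"socket {cpu_socket} -> memory {mem_socket}: {ns:.1f} ns")
    # prints:
    # socket 0 -> memory 0: 88.7 ns
    # socket 0 -> memory 1: 151.4 ns
    # socket 1 -> memory 0: 42.6 ns
    # socket 1 -> memory 1: 27.3 ns
```

The converted values match the nanosecond figures in the log, so the low socket 1 numbers come from the measured clock counts themselves and not from the unit conversion.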

 

 
