Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

RAM latency is too high when system is idle


Hello, I originally posted this question in the Processors section of the forum, but I believe this section is more appropriate.

I have noticed that two of the systems I have access to exhibit RAM latencies that are too high (500 ns), but only when the system is idle. The result is the unexpected behavior that an application's performance improves if another application is running concurrently on the system.

The latency for a given thread is better when all the other cores in the system are performing memory accesses at maximum throughput than when that thread is the only one running.

This can be observed while running the Intel® Memory Latency Checker loaded latency test.

For one of the systems, the output is this:



Inject  Latency Bandwidth
Delay   (ns)    MB/sec
 00000  380.24    68920.1
 00002  380.22    68886.4
 00008  380.38    68881.2
 00015  380.11    68846.8
 00050  376.15    68501.7
 00100  372.86    68304.9
 00200  282.19    69399.9
 00300  115.94    51590.0
 00400   96.13    39300.5
 00500   92.62    31755.3
 00700   87.91    23047.5
 01000   85.88    16441.6
 01300   84.75    12858.1
 01700   83.99    10033.5
 02500   83.41     7084.0
 03500   85.19     5267.4
 05000   92.23     3857.7
 09000  122.13     2283.1
 20000  219.69     1083.2




As expected, the latency decreases as the load from the rest of the system decreases, but when this load drops below a certain point, the latency rises again. When the load is close to nonexistent, the latency is at its highest.
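To make the turning point in the table easier to see, here is a minimal Python sketch that finds the lowest-latency row and checks whether latency climbs again at the lightest load. The rows are copied from the MLC output above; the function names are hypothetical helpers for illustration and are not part of MLC.

```python
# Loaded-latency rows copied from the MLC output above:
# (inject delay, latency in ns, bandwidth in MB/s)
ROWS = [
    (0, 380.24, 68920.1),
    (2, 380.22, 68886.4),
    (8, 380.38, 68881.2),
    (15, 380.11, 68846.8),
    (50, 376.15, 68501.7),
    (100, 372.86, 68304.9),
    (200, 282.19, 69399.9),
    (300, 115.94, 51590.0),
    (400, 96.13, 39300.5),
    (500, 92.62, 31755.3),
    (700, 87.91, 23047.5),
    (1000, 85.88, 16441.6),
    (1300, 84.75, 12858.1),
    (1700, 83.99, 10033.5),
    (2500, 83.41, 7084.0),
    (3500, 85.19, 5267.4),
    (5000, 92.23, 3857.7),
    (9000, 122.13, 2283.1),
    (20000, 219.69, 1083.2),
]


def latency_minimum(rows):
    """Return the row with the lowest measured latency."""
    return min(rows, key=lambda row: row[1])


def rises_after_minimum(rows):
    """True if the last (lightest-load) row is slower than the minimum row."""
    latencies = [row[1] for row in rows]
    low = latencies.index(min(latencies))
    return low < len(latencies) - 1 and latencies[-1] > latencies[low]


best = latency_minimum(ROWS)
print(f"lowest latency: {best[1]} ns at inject delay {best[0]} ({best[2]} MB/s)")
print(f"latency at lightest load: {ROWS[-1][1]} ns")
print(f"latency rises again at low load: {rises_after_minimum(ROWS)}")
```

On this data the minimum sits at an inject delay of 2500, and the latency at a delay of 20000 is more than twice that minimum.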

I also observed this high latency when running my own custom microbenchmarks. On these benchmarks, I noticed that the latency for fully random accesses that cross pages is almost one microsecond.

This happens on two of the dual-socket Xeon systems to which I have access (4114 and 4214). Both systems are from the same supplier, and I can't observe this on the other systems, which were bought through other vendors.

I've attached the full outputs for the latency checker (mlc) and dmidecode.

Does anyone have a clue about what's going on here? I guess it's a configuration issue, but I don't have access to the BIOS of these systems at the moment and can't check the memory timings. It feels like the memory goes to sleep after one access and stays awake only if more accesses keep happening, but I'm not aware of what may be causing this behaviour.


1 Reply
Honored Contributor III

From the dmidecode output, this system is very poorly configured for memory bandwidth.  Each of the two sockets supports 6 DDR4 DRAM channels, but in this system only 2 of the 6 channels in each socket have installed DRAM.  This limits the system to approximately 1/3 of the potential memory bandwidth.
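The "approximately 1/3" figure follows directly from the channel counts. A rough sketch of the arithmetic is below; the channel counts come from the paragraph above, while the DDR4-2400 transfer rate is only an assumption for illustration (the actual DIMM speed should be read from the dmidecode output).

```python
from fractions import Fraction

CHANNELS_PER_SOCKET = 6    # supported channels per socket (from the reply)
POPULATED_PER_SOCKET = 2   # channels with DRAM installed (from the reply)
ASSUMED_MT_PER_S = 2400    # ASSUMPTION: DDR4-2400; check dmidecode for the real speed
BYTES_PER_TRANSFER = 8     # a DDR4 channel has a 64-bit data bus


def peak_bandwidth_gb_s(channels, mt_per_s=ASSUMED_MT_PER_S):
    """Theoretical peak bandwidth in GB/s for a given number of channels."""
    return channels * mt_per_s * BYTES_PER_TRANSFER / 1000


fraction = Fraction(POPULATED_PER_SOCKET, CHANNELS_PER_SOCKET)
as_installed = peak_bandwidth_gb_s(POPULATED_PER_SOCKET)
fully_populated = peak_bandwidth_gb_s(CHANNELS_PER_SOCKET)

print(f"fraction of channels populated: {fraction}")
print(f"per-socket peak as installed:     {as_installed:.1f} GB/s")
print(f"per-socket peak, all 6 channels:  {fully_populated:.1f} GB/s")
```

The fraction itself does not depend on the assumed DIMM speed; only the absolute GB/s figures do.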

The specific observation of high latency at low load looks like the sort of result I have seen from overly aggressive power-saving settings.   I have never tried to dig into the details in such cases -- fiddling with BIOS energy-saving and performance options has always been enough to make it go away....

The MLC output shows ridiculous remote memory latency and ridiculous remote HitM latency when data is homed in the Writer's socket,  but very reasonable remote HitM latency when data is homed in the Reader's socket (essentially identical to what I see on a 2-socket Xeon Gold 6142 or a 2s Xeon Gold 5120 system).

I suppose it is possible that at least part of the weird behavior is due to the partial memory configuration?
