
Haswell memory bandwidth

TPtac
Beginner
Hi,

Before measuring memory bandwidth with PCM, I think I need to understand the maximum (theoretical) memory bandwidth. I thought I had it figured out, but now I have a processor where I don't understand how the maximum numbers make sense.

Here's an example I think I understand: Xeon E5-2630 v3 (Haswell-EP). The maximum memory bandwidth (according to ARK) is 59 GB/s. It has 4 memory channels and supports up to DDR4-1866 DIMMs. The peak transfer rate of a DDR4-1866 DIMM is 14933 MB/s, and 14933 * 4 = 59732 MB/s, so this adds up.

What I don't understand: Xeon E7-4830 v3 (Haswell-EX). The maximum memory bandwidth is 102 GB/s. But it also supports up to DDR4-1866 and has 4 memory channels! So how does it get 102 GB/s? One theory is that the E7-4830 v3 has two memory controllers. While cpu-world confirms this, it also says that each controller has 2 memory channels, so it still doesn't add up.

I'd appreciate any help from the experts over here. Is the number of memory controllers documented by Intel anywhere? I couldn't find it. Thanks in advance!
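
The E5-2630 v3 arithmetic in this post can be sketched in Python. This is only a rough illustration: the function name `peak_bandwidth_mb_s` and its parameters are made up for the example, and it assumes a 64-bit (8-byte) data path per channel and the nominal DDR4-1866 rate of 1866.67 MT/s.

```python
def peak_bandwidth_mb_s(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak in MB/s: channels x transfer rate (MT/s) x bytes per transfer.

    Hypothetical helper for illustration; assumes an 8-byte data path per channel.
    """
    return channels * mega_transfers_per_s * bytes_per_transfer


# Assumption: DDR4-1866 runs at a nominal 1866.67 MT/s (5600/3).
DDR4_1866_MT_S = 5600 / 3

# Xeon E5-2630 v3: 4 memory channels of DDR4-1866.
e5_2630_v3_mb_s = peak_bandwidth_mb_s(channels=4, mega_transfers_per_s=DDR4_1866_MT_S)
print(f"E5-2630 v3 theoretical peak: {e5_2630_v3_mb_s:.0f} MB/s")
```

Using the rounded per-DIMM figure of 14933 MB/s instead gives the 59732 MB/s quoted above; the two differ only by rounding.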
McCalpinJohn
Black Belt

The Xeon E7 processors use a buffer chip between the processor and the DIMMs. This buffer chip has two channels on the DIMM side and one interface on the processor side. Under some circumstances, the buffer-to-processor interface can run at 2x the frequency of the buffer-to-DIMM interface.

In this case the 102 GB/s figure comes from running the DIMMs at a slightly slower speed (1.6 GT/s rather than 1.866 GT/s), which then allows the buffer-to-processor interface to run at the 2x rate. It looks like the numbers work out as follows:

  • Buffer-to-processor: 4 channels * (2 * 1.6 GT/s) * 8 B = 102.4 GB/s
  • Buffer-to-DIMM: 8 channels * 1.6 GT/s * 8 B = 102.4 GB/s
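
The two products above can be checked with a short Python sketch. The names here (`peak_gb_s`, `DIMM_RATE_GT_S`) are hypothetical and only for illustration; the 1.6 GT/s DIMM rate, the 2x processor-side ratio, the link counts, and the 8-byte transfer width are taken from the figures above.

```python
def peak_gb_s(links, giga_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak in GB/s: links x transfer rate (GT/s) x bytes per transfer.

    Hypothetical helper for illustration only.
    """
    return links * giga_transfers_per_s * bytes_per_transfer


DIMM_RATE_GT_S = 1.6  # DIMM-side rate used in the figures above

# Processor side: 4 buffer links, each running at 2x the DIMM-side rate.
buffer_to_processor = peak_gb_s(links=4, giga_transfers_per_s=2 * DIMM_RATE_GT_S)

# DIMM side: each of the 4 buffers drives 2 DDR channels, so 8 channels total.
buffer_to_dimm = peak_gb_s(links=4 * 2, giga_transfers_per_s=DIMM_RATE_GT_S)

print(f"buffer-to-processor: {buffer_to_processor:.1f} GB/s")
print(f"buffer-to-DIMM:      {buffer_to_dimm:.1f} GB/s")
```

Both sides come out equal, which is the point of the 2x ratio: half as many processor-side links at twice the rate carry the same peak traffic as the DIMM-side channels.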

TPtac
Beginner

Hi John,

Thanks, that explains it! Do you know if the existence of this memory buffer is documented anywhere? It looks like if you know it exists, you can Google some presentations and articles discussing it, but I haven't really seen it mentioned in Intel datasheets or the optimization manuals.

Thomas_W_Intel
Employee

Yes, I agree that the memory buffers are often not discussed as prominently as other features of the platform. The datasheet for the C112 and C114 memory buffers is located here: http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/c112-c114-scalable-memory-buf...

They are also listed on ARK: http://ark.intel.com/products/series/99059/Intel-Scalable-Memory-Buffers