QPI over DRAM speed

idata
Employee
6,132 Views

I have a question about QPI and DRAM speed.

Let's take the Intel Xeon X5690 as an example.

According to Intel, the QPI speed is 6.4GT/s, and the maximum memory bandwidth supported is 32GB/s.

http://ark.intel.com/Product.aspx?id=52576

If we convert the QPI speed to GB/s, according to Wikipedia, it should be 25.6 GB/s. So the maximum memory bandwidth supported by the X5690 is larger than the QPI bandwidth.

With three memory channels, DDR3-1333 = 32 GB/s and DDR3-1066 = 25.6 GB/s.
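For reference, the arithmetic behind these figures can be sketched as below. This is only a back-of-the-envelope calculation of theoretical peaks; the function names are made up for illustration, and it assumes 2 bytes of data per QPI transfer in each of two directions, and three 64-bit DDR3 channels.

```python
# Theoretical peak bandwidth figures only; real-world throughput is lower.

def qpi_bandwidth_gbs(gt_per_s, bytes_per_transfer=2, directions=2):
    """Peak QPI bandwidth in GB/s, counting both directions of the link."""
    return gt_per_s * bytes_per_transfer * directions

def ddr3_bandwidth_gbs(mt_per_s, channels=3, bytes_per_transfer=8):
    """Peak DDR3 bandwidth in GB/s; each channel is 64 bits (8 bytes) wide."""
    return mt_per_s * bytes_per_transfer * channels / 1000

print(qpi_bandwidth_gbs(6.4))              # QPI at 6.4 GT/s
print(round(ddr3_bandwidth_gbs(1333), 1))  # three channels of DDR3-1333
print(round(ddr3_bandwidth_gbs(1066), 1))  # three channels of DDR3-1066
```

Note that the 25.6 GB/s QPI figure adds up both directions, so a single direction peaks at half of that.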

According to the processor specification, the memory bandwidth is larger than the QPI bandwidth, so what RAM configuration should I use? Will performance increase if I choose DDR3-1333, or is DDR3-1066 the optimum? If the optimum RAM speed is 1066, why is the processor designed to support up to 1333?

Thank you.

5 Replies
idata
Employee
3,623 Views

When you consider and compare the QPI and DRAM bandwidths and speeds, don't forget that these figures are the theoretical maximum performance that the link or device will provide when operated at their specified parameters under special testing conditions. The actual performance in the real world is affected by many different things, and the result is dependent upon what tasks individual programs/processes are performing.

While the bandwidth of a Xeon X5690's QPI link on a server motherboard (between a CPU, an IOH, and possibly a second CPU) is less than the bandwidth between the Xeon's Integrated Memory Controller and its three-channel memory, is it correct to conclude that all the contents of DRAM memory will be passing through the QPI link at all times?

Imagine that one billion floating point numbers are in DRAM memory, which are then added together by the Xeon CPU, and the single result is displayed on the screen. A huge amount of data moved from DRAM memory to the CPU, but only one single value was sent over the QPI link to the IOH, and from there eventually sent to the monitor. So while the bandwidth of the QPI link is less than that of the DRAM-to-CPU connection, data read from DRAM does not automatically have to pass through the QPI link. The DRAM and QPI bandwidths are more independent than dependent on each other.
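To put rough numbers on that example, here is a minimal sketch. It assumes each value is an 8-byte double and ignores caches, cache-line granularity, and protocol overhead; the function name is hypothetical.

```python
# Illustrative only: bytes read from local DRAM vs. bytes sent toward the IOH
# when summing a large array and reporting a single result.

BYTES_PER_VALUE = 8  # assumption: 64-bit floating point values

def traffic_for_sum(n_values, bytes_per_value=BYTES_PER_VALUE):
    dram_bytes = n_values * bytes_per_value  # every input is read from DRAM via the IMC
    qpi_bytes = bytes_per_value              # only the final sum heads out over QPI
    return dram_bytes, qpi_bytes

dram, qpi = traffic_for_sum(1_000_000_000)
print(dram / 1e9, "GB read from DRAM")
print(qpi, "bytes sent over the QPI link")
```

Under these assumptions, gigabytes come in from DRAM while only a handful of bytes leave over QPI.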

The overall throughput of the QPI link is much greater than that of the older Front Side Bus (FSB), while DRAM memory bandwidth has not changed very much between FSB and QPI architecture hardware, so the overall performance will be much greater with QPI, despite the DRAM and QPI bandwidth differences.

Given the above, the memory speed you choose does not need to be based on the QPI link bandwidth. The tests with different memory speeds that I have seen show that memory at 1333 MT/s (MT/s is really the correct unit, not MHz, although it is common to see MHz used) or 1600 MT/s is optimal, and speeds beyond that offer little additional performance, or actually lower performance.

idata
Employee
3,623 Views

Thanks for giving the explanation; I have a better understanding now. Anyway, I still have something to confirm.

As we know, Intel's new CPU architecture has integrated the memory controller into the CPU chip.

So this means that the RAM and CPU are connected directly through the QPI link (previously it was the FSB), and the speed of this connection between RAM and CPU is determined by the speed of the RAM and is controlled by the integrated memory controller inside the CPU chip.

The speed of this QPI link is different from that of the other QPI links, such as the connection between CPU and CPU (dual-processor configuration) and between CPU and IOH.

So, in conclusion:

QPI link speed between CPU and RAM = controlled by the integrated memory controller (depending on the CPU and RAM specifications, and up to the maximum speed that the memory controller can support, which is 32 GB/s in the X5690). That means it will be 25.6 GB/s if I use 1066 RAM, and 32 GB/s if I use 1333 RAM.

QPI link speed between CPU-CPU and CPU-IOH = Intel designed these for up to 25.6 GB/s on the X5690.

Is my understanding correct? I understand that the above specs are based on maximum theoretical values, and that real-world operation is different.

For your information, my company is going to purchase a workstation for FEA simulation, and we are now struggling to choose the best RAM configuration.

We need a large amount of memory. The motherboard can support up to 192 GB of DDR3-1066, but only up to 96 GB of DDR3-1333. Anyway, we will try to run a benchmark on these two configurations and see what the result is. Hopefully it will not make much difference, so we can install more RAM in the machine.

Anyway, this is a good chance to learn something new...

Thanks a lot!

Edward_Z_Intel
Employee
3,623 Views

Memory bandwidth is also limited by the number of DIMMs in each memory channel. For example, with a Xeon 5500 series processor, you can have only one DPC (DIMM per channel) and still run at 1333 MHz. With the Xeon 5600 series, you can have 2 DPC at 1333 MHz. You may find this tool useful:

http://imct.intel.com/
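As a quick way to sanity-check a planned configuration, the rule above could be written as a small lookup. This is only a sketch: the table holds just the two cases mentioned in this post, the names are hypothetical, and the real limits may also depend on the board and the DIMMs, so the tool linked above remains the authority.

```python
# Maximum DIMMs per channel (DPC) that can still run at 1333, as stated above.
# Only these two series are covered; anything else is unknown to this sketch.
MAX_DPC_AT_1333 = {"5500": 1, "5600": 2}

def can_run_1333(series, dimms_per_channel):
    """Return True or False for a known series, None if the series is not listed."""
    limit = MAX_DPC_AT_1333.get(series)
    if limit is None:
        return None
    return 1 <= dimms_per_channel <= limit

print(can_run_1333("5600", 2))  # Xeon 5600 series with 2 DPC
print(can_run_1333("5500", 2))  # Xeon 5500 series with 2 DPC
```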

idata
Employee
3,623 Views

exryu, thanks for your comments. I should have said the following in my last post, as it would have helped answer your question. First, examine the picture below showing the basic architecture of Nehalem-based CPU server motherboards:

Note: This diagram is provided with the information on the Xeon X6550 CPU; the X5690 CPU page does not include a block diagram.

In this diagram, the QPI links are shown in blue, and are the communication links between multiple CPUs on a server board and one or more IOHs. Notice that this QPI link does not provide the link between the CPU and its memory. Frankly, the link used by the IMC is rather ambiguous, given what information I've been able to find. The IMC's link is described as NUMA (Non-Uniform Memory Access), and also as a Quick Path link. The best article I have found is this one:

http://software.intel.com/en-us/blogs/2009/03/11/learning-experience-of-numa-and-intels-next-generation-xeon-processor-i/

Regardless, the QPI link used between CPUs and IOH is physically not the same link used by the IMC. Whatever connection type the IMC uses, the bandwidth figures for DRAM memory are accurate, and not limited by the QPI link used between CPUs and IOHs.
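One way to picture the distinction is a toy model of which links a memory request crosses in a two-socket system. This is a simplified sketch, not a description of the actual hardware protocol; the function name and socket numbering are made up for illustration.

```python
def links_traversed(cpu_socket, memory_socket):
    """Links a memory request crosses in a hypothetical two-socket NUMA system."""
    if cpu_socket == memory_socket:
        return ["IMC"]         # local DRAM: the CPU's own integrated memory controller
    return ["QPI", "IMC"]      # remote DRAM: QPI hop to the other CPU, then its IMC

print(links_traversed(0, 0))   # CPU 0 reading its own memory
print(links_traversed(0, 1))   # CPU 0 reading memory attached to CPU 1
```

In this model QPI only comes into play when a CPU reaches for memory attached to the other socket, which is the "non-uniform" part of NUMA.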

idata
Employee
3,623 Views

Thanks for providing such good information. (Anyway, sorry for the late reply.)

It seems clear to me now. I was quite confused about QPI and the CPU-to-memory bus; I thought they were the same thing. This is the first time I have heard about NUMA, so I should spend some time looking into it.
