Hello, I was looking at the Stratix 10 MX overview page (https://www.altera.com/products/sip/memory/stratix-10-mx/overview.html?utm_source=altera&utm_medium=...) and have a question about its HBM2 DRAM memory. Reading the device overview, it looks as if the DRAM has been integrated into the FPGA. I looked at the Stratix 10 MX product table, and the HBM2 high-bandwidth DRAM memory row has units of GBytes, with numbers such as 3.25, 8, and 16. That's a pretty impressive amount of memory, and I wonder whether I'm understanding things correctly - I must be missing something here! Does the Stratix 10 MX really have GBytes of on-chip memory? If anyone has some information on "DRAM System-in-Package" and what it is, let me know. Joe
The key part is "DRAM System-in-Package". They've basically included the HBM memory in the package itself - see this picture: https://simplecore.intel.com/newsroom/wp-content/uploads/sites/11/2017/12/intel-stratix-10-internalw.... They are physically discrete memory chips, but they are attached to the interposer that breaks out the pins of the FPGA. So when you receive the fully packaged FPGA, the memory is in the same package, under the heat spreader.
TCWORLD, hello, and thank you for responding to my post. So my assumption is correct, right? Well then, isn't this fantastic! Having an FPGA with 16 GBytes of on-chip memory is wonderful! Say, do you know where I might find the specs on access times for this memory? There must be some "cons" to this that I'm missing? Thanks, Joe
The con is mostly cost. Putting the memory on the interposer is harder to do (hence the cost), but everything is closer to the chip, which helps performance thanks to the shorter distances and better signal integrity.

Speed-wise, if the overview is to be believed, it's roughly 10x faster than regular DDR: "Traditional DDR4 DIMMs provide ~21 GBps bandwidth while 1 HBM2 tile provides up to 256 GBps." From what I can gather from the numbers, the ~21 GBps figure for DDR4 comes from a 72-bit bus (64-bit data + 8-bit ECC) at 2400 MT/s. The 256 GBps comes from the HBM2 stack's very wide interface: each stack exposes 8 independent 128-bit channels, i.e. a 1024-bit interface in total, with each pin running at up to 2 Gbps. So the cube runs at a comparatively modest per-pin rate, but it is 1024 bits wide - the opposite trade-off to DDR4's narrow, fast bus.

From what I can gather, the memory cubes connect to the FPGA fabric directly - i.e. not through the HPS bridge. You could probably route one through to the ARM processor via the HPS bridge, though; the processor itself has its own DDR3/4 EMIF.
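If it helps, here's a quick sanity check of those two bandwidth figures. This is just back-of-the-envelope arithmetic, assuming a DDR4-2400 DIMM with a 72-bit bus and an HBM2 stack with a 1024-bit interface at 2 Gbps per pin (the function name and numbers are my own, not from the datasheet):

```python
def bandwidth_gbps(bus_width_bits, rate_gbps_per_pin):
    """Peak bandwidth in GB/s = bus width (bits) * per-pin rate (Gbps) / 8."""
    return bus_width_bits * rate_gbps_per_pin / 8

# DDR4-2400 DIMM: 72-bit bus (64 data + 8 ECC) at 2400 MT/s = 2.4 Gbps/pin
ddr4 = bandwidth_gbps(72, 2.4)

# HBM2 stack: 8 channels x 128 bits = 1024-bit interface at 2 Gbps/pin
hbm2 = bandwidth_gbps(1024, 2.0)

print(f"DDR4-2400 (72-bit): {ddr4:.1f} GB/s")  # ~21.6 GB/s
print(f"HBM2 stack:         {hbm2:.1f} GB/s")  # 256.0 GB/s
print(f"Ratio:              {hbm2 / ddr4:.1f}x")
```

That lines up with the "~21 GBps vs 256 GBps" claim in the overview, and the ratio comes out close to the 10x headline figure.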