FPGA Intellectual Property
PCI Express*, Networking and Connectivity, Memory Interfaces, DSP IP, and Video IP

DDR3 UniPHY 10.1 Core Burst of 8 vs. 4 Behavior?

Honored Contributor II

I am using the DDR3 UniPHY SDRAM Controller from Quartus 10.1 with a half-rate PHY. I want to use as much of the DDR memory interface bandwidth as possible, so I am trying to get burst-of-8 accesses on the memory I/O wherever I can. Most of my accesses will be long runs of sequentially increasing addresses.

I am having trouble understanding the behavior I see in simulation between the local interface and the memory I/O interface. I am performing a series of bursts with my local burst length always set to 2. Sometimes I see a series of bursts go out as burst-of-8 accesses on the I/O, as I would expect. Other times I see a series of bursts performed with a burst chop of 4 instead. The burst chop of 4 series seems to occur when the first access is only a partial write. It makes sense to me that the partial write/read uses a burst chop of 4, but all the subsequent accesses in that group of sequentially increasing addresses also use a burst chop of 4. I would have expected them to go back to bursts of 8.


Could someone, maybe from Altera, help me understand the behavior of this core better and why I might be seeing the above behavior? In other words, what causes the core to choose a burst chop of 4 over a burst of 8?


Also, since I have local_burst_size always equal to 2, should I set the Max Avalon-MM Burst Length to 2? What changes in the core if this is set to 16 instead of 2?


Could the command queue look-ahead depth have an effect on what I am seeing? How would increasing this value affect the behavior of the core? Would it change how many accesses occur before local_ready goes low?


Thanks for any help you can provide!
1 Reply
Honored Contributor II

You may be seeing the effects of burst wrapping. SDRAM has the concept of a burst that crosses a burst boundary wrapping back to earlier addresses rather than progressing linearly. Let's say your memory local interface has a max burst count of 2 and is 32 bits wide (x32). The burst boundary then falls on multiples of 8 bytes (2 words of 4 bytes each). If you start your bursts at 0x0, 0x8, 0x10, etc., you shouldn't see this chopping. If you start at address 0x4, 0xC, etc., then unless your master supports burst wrapping (I don't know of many that do), the burst needs to get chopped up into bursts of 1 to avoid crossing the boundary.


So, using the example above, if my master wrote 0xAAAAAAAA followed by 0xBBBBBBBB to address 0x4 using a burst of 2, I would expect the following data to be stored:


wrapping master: address 0x4 contains 0xAAAAAAAA, address 0x0 contains 0xBBBBBBBB 

nonwrapping master: address 0x4 contains 0xAAAAAAAA, address 0x8 contains 0xBBBBBBBB 
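To make the two orderings concrete, here is a minimal Python sketch of the address each beat of a burst lands on. The function name and parameters are my own for illustration and are not part of any Altera tool or API; it assumes the start address is word-aligned and that a wrapping burst wraps inside the aligned window of burst_count * word_bytes bytes.

```python
def beat_addresses(start, burst_count, word_bytes, wrapping):
    """Return the byte address hit by each beat of one burst.

    Hypothetical helper: assumes `start` is word-aligned and that a
    wrapping burst stays inside its aligned burst window.
    """
    burst_bytes = burst_count * word_bytes
    base = start - (start % burst_bytes)  # start of the aligned burst window
    addrs = []
    for beat in range(burst_count):
        linear = start + beat * word_bytes
        if wrapping:
            # Wrap back to the start of the window instead of crossing it.
            addrs.append(base + (linear - base) % burst_bytes)
        else:
            addrs.append(linear)
    return addrs


# The example from the post: burst of 2, 4-byte words, starting at 0x4.
wrapping_order = beat_addresses(0x4, 2, 4, wrapping=True)
linear_order = beat_addresses(0x4, 2, 4, wrapping=False)
print([hex(a) for a in wrapping_order])
print([hex(a) for a in linear_order])
```

With an aligned start such as 0x8, both orderings come out the same, which is why aligned bursts do not need to be chopped.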


So when there is a mismatch in wrapping capabilities, the system needs to honor the burst behavior of the master. When a non-wrapping master talks to a burst-wrapping slave, this means that any burst that crosses a burst boundary needs to be divided into smaller bursts.
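A rough Python sketch of that division, assuming word-aligned addresses. This only illustrates the idea and is not how the interconnect or the controller actually implements it; the function name and parameters are made up for the example.

```python
def split_at_boundaries(start, n_words, burst_count, word_bytes):
    """Split a linear transfer into (address, words) pieces so that
    no piece crosses a burst boundary.

    Hypothetical helper: assumes `start` is word-aligned.
    """
    burst_bytes = burst_count * word_bytes
    pieces = []
    addr, remaining = start, n_words
    while remaining > 0:
        # Words left before the next burst boundary.
        room = (burst_bytes - addr % burst_bytes) // word_bytes
        words = min(room, remaining)
        pieces.append((addr, words))
        addr += words * word_bytes
        remaining -= words
    return pieces


# Burst of 2 starting at 0x4 with an 8-byte boundary: two bursts of 1.
unaligned = split_at_boundaries(0x4, 2, 2, 4)
# The same burst starting at 0x8 stays whole.
aligned = split_at_boundaries(0x8, 2, 2, 4)
print(unaligned, aligned)
```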


So to avoid it, you may be able to make sure all accesses are aligned to a burst boundary or a multiple of it. In the modular SGDMA design example, I turn on an option that makes sure the master gets back into burst alignment, and I set a property that tells SOPC Builder/Qsys that all bursts that follow will be aligned on the burst boundary. The easiest way to get there is to do a bunch of bursts of 1 to get back into alignment and then burst the maximum amount multiple times back to back.
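The realignment approach can be sketched in Python like this. It is a simplified illustration rather than the actual SGDMA logic: the function name and arguments are hypothetical, addresses are assumed to be word-aligned, and leftover words at the end are issued as one shorter burst (my own choice, which is safe because that burst starts on a boundary).

```python
def plan_aligned_transfer(start, n_words, max_burst, word_bytes):
    """Return a list of (address, words) bursts: single-word bursts until
    the address is aligned, then maximum-length bursts, then any leftover.

    Hypothetical helper: assumes `start` is word-aligned.
    """
    burst_bytes = max_burst * word_bytes
    plan = []
    addr, remaining = start, n_words
    # Head: bursts of 1 until we reach a burst boundary.
    while remaining > 0 and addr % burst_bytes != 0:
        plan.append((addr, 1))
        addr += word_bytes
        remaining -= 1
    # Body: back-to-back maximum-length bursts, all aligned.
    while remaining >= max_burst:
        plan.append((addr, max_burst))
        addr += burst_bytes
        remaining -= max_burst
    # Tail: one shorter burst that starts aligned, so it cannot cross.
    if remaining > 0:
        plan.append((addr, remaining))
    return plan


# 7 words starting at 0x4, max burst of 4 words, 4-byte words.
plan = plan_aligned_transfer(0x4, 7, 4, 4)
print([(hex(a), n) for a, n in plan])
```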


Let me know if that makes sense. This confused me a while back when the memory controller would do this but not tell the tools it was doing it, which led to a functional failure.