Slow DDR2 reading

Altera_Forum
Honored Contributor II
1,219 Views

Hello, 

I'm implementing a FIFO using the DDR2 RAM on a 3C120 development board. Basically, I have two memory pointers, one for writing and one for reading, and I sequentially write 255 64-bit words and then sequentially read 255 64-bit words. The write process is quite fast (it takes a total of 4.2 us for the 255 words), but the read process is a lot slower, taking almost 14 times as long (a total of 58.4 us). I'm not using bursts, but the addresses are incremented sequentially during both writes and reads. During reads, the waitrequest signal stays asserted for 24 cycles, while during writes it is asserted for only one cycle. 

Is this behaviour normal and expected, or am I doing something really wrong? 

Thanks! Regards, 

Javier
11 Replies
Altera_Forum
Honored Contributor II
537 Views

It depends on how you assert the other signals. For example, I use simple writes but burst reads in my DMA. 

Add some code and SignalTap screenshots if possible.
Altera_Forum
Honored Contributor II
537 Views

Hello, 

The DDR2 controller is configured for full rate and full-rate bridge, with a 64-bit local interface. It is connected to only one master, which checks whether 255 64-bit words are available in an on-chip input FIFO; if so, it reads them sequentially and writes them to DDR2, incrementing the write address and a data-available counter. Then it checks whether there is room for 255 words in an on-chip output FIFO; if there is, and the data-available counter is 255 or more, it reads 255 words from DDR2, writes them to the output FIFO, and repeats the cycle. 

Instead of SignalTap captures, I have logic analyzer captures. One shows the complete write process and the beginning of the read cycle, and the other shows the end of the write process and the start of the read process. The relevant signals are: 
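The control flow described above can be sketched as a small software model. This is only an illustration, not the poster's HDL: the class name, the field names, and the check for free space in DDR2 before writing are assumptions added here.

```python
from collections import deque

BLOCK = 255  # words moved per phase, as described in the post


class Ddr2Fifo:
    """Software model of the block-transfer FIFO master (hypothetical names)."""

    def __init__(self, depth_words, out_capacity):
        self.mem = [0] * depth_words   # stands in for the DDR2 array
        self.wr_ptr = 0                # next DDR2 address to write
        self.rd_ptr = 0                # next DDR2 address to read
        self.available = 0             # data-available counter (words in DDR2)
        self.in_fifo = deque()         # on-chip input FIFO
        self.out_fifo = deque()        # on-chip output FIFO
        self.out_capacity = out_capacity

    def step(self):
        """Run one write phase, then one read phase, each only if allowed."""
        depth = len(self.mem)
        # Write phase: needs a full block in the input FIFO.
        # The free-space check on DDR2 is an assumption, not from the post.
        if len(self.in_fifo) >= BLOCK and depth - self.available >= BLOCK:
            for _ in range(BLOCK):
                self.mem[self.wr_ptr] = self.in_fifo.popleft()
                self.wr_ptr = (self.wr_ptr + 1) % depth
            self.available += BLOCK
        # Read phase: needs room in the output FIFO and a full block stored.
        room = self.out_capacity - len(self.out_fifo)
        if room >= BLOCK and self.available >= BLOCK:
            for _ in range(BLOCK):
                self.out_fifo.append(self.mem[self.rd_ptr])
                self.rd_ptr = (self.rd_ptr + 1) % depth
            self.available -= BLOCK
```

Each loop iteration here stands for one single-word Avalon transfer, which is where the per-transfer wait states discussed in this thread add up.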

D1(2): avalon read 

D1(3): avalon write 

D1(4): avalon waitrequest 

Regards, 

Javier
Altera_Forum
Honored Contributor II
537 Views

The waveform is damn hard to read; I strongly suggest SignalTap. 

Anyway, what are the clock frequencies of the memory controller and of the rest of the logic? I'd guess they differ, in which case SOPC Builder inserts a slow handshaking adapter between the two clock domains.
Altera_Forum
Honored Contributor II
537 Views

I will try to get some with SignalTap. The memory controller is set to run at 125 MHz, derived from a 50 MHz clock, and the logic that manages the memory is clocked by the sysclk output of the memory controller.
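As a quick sanity check, the totals reported in the first post can be converted into clock cycles per word at this 125 MHz clock. The sketch below assumes the 4.2 us and 58.4 us figures cover exactly 255 transfers each; the function and constant names are made up for illustration.

```python
CLOCK_HZ = 125e6         # sysclk frequency stated in this thread
WORDS = 255              # 64-bit words per block
WRITE_TOTAL_S = 4.2e-6   # measured write time from the first post
READ_TOTAL_S = 58.4e-6   # measured read time from the first post


def cycles_per_word(total_s, words=WORDS, clock_hz=CLOCK_HZ):
    """Average number of clock cycles spent on each word."""
    return total_s * clock_hz / words


write_cycles = cycles_per_word(WRITE_TOTAL_S)
read_cycles = cycles_per_word(READ_TOTAL_S)

print(f"write: {write_cycles:.2f} cycles/word")
print(f"read:  {read_cycles:.2f} cycles/word")
print(f"ratio: {read_cycles / write_cycles:.1f}x")
```

This works out to roughly 2 cycles per written word and about 28.6 cycles per read word, which looks consistent with one wait cycle per write and 24 wait cycles per read plus a few cycles of overhead in the master.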

Altera_Forum
Honored Contributor II
537 Views

So the whole system runs on the memory controller's sysclk, which is 125 MHz?

Altera_Forum
Honored Contributor II
537 Views

Yes, that's right.

Altera_Forum
Honored Contributor II
537 Views

Hmm, strange then. Are no other components connected to that memory controller, e.g. a Nios CPU?

Altera_Forum
Honored Contributor II
537 Views

It might be worth trying to find out whether the memory controller buffers writes in order to do a burst write to the DDR memory, but always performs an actual read transfer for each read. 

Some simple timing calculations I did with the SDRAM interface implied that it initially latched a write request, then moved it into a burst-sized 'write assembly buffer' before finally writing out the accumulated burst. (It would definitely accept 3 random writes without additional waits.) 

Read requests, of course, cannot be latched the same way! But they could be answered quickly from a single 'burst read' buffer. 

I was only interested in the timing for random reads and writes.
Altera_Forum
Honored Contributor II
537 Views

Yes, but on a different slave interface, and it currently has no functionality implemented; only the signals are declared. Anyway, I will try removing it completely and testing again (I will only need it to check the amount of data stored). I've also just checked that the input and output on-chip FIFOs (which are dual-clock ones) are driven by the same DDR2 controller sysclk.

Altera_Forum
Honored Contributor II
537 Views

I'm not 100% sure, but I think the behaviour is pretty normal if you don't use burst reads. 

Each read requires a complete command on the DDR2 side (including precharge, row selection and column selection), with its associated delays and latencies. Writes, on the other hand, are pipelined, so there is no need to wait for a ready status. 

Refer to the timing diagrams on pages 3-20 and 3-22 of the sdram-ddr2 controller core user guide. 

For the read I counted exactly 14 clock cycles between the start of the read command and the assertion of the ready signal. 

If you enable burst transfers you can greatly improve the performance, since you are reading sequential addresses.
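To see why bursts help, here is a deliberately simplified cost model. It assumes every read command pays a fixed latency and then delivers one word per cycle, with no overlap between commands; real controllers pipeline more aggressively, so treat the numbers as illustrative only. The 14-cycle latency is the count quoted above, and the burst lengths are hypothetical.

```python
def read_cycles(words, latency, burst_len=1):
    """Estimated cycles to read `words` words.

    Assumed model: each command costs `latency` cycles of overhead,
    then returns up to `burst_len` words at one word per cycle.
    """
    bursts = -(-words // burst_len)  # ceiling division: commands issued
    return bursts * latency + words


# Compare single-word reads with two hypothetical burst lengths.
for burst_len in (1, 4, 64):
    total = read_cycles(255, latency=14, burst_len=burst_len)
    print(f"burst_len={burst_len:2d}: {total} cycles")
```

Under this model the per-command latency is paid 255 times for single-word reads but only a handful of times for long bursts, so the total approaches one cycle per word as the burst length grows.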
Altera_Forum
Honored Contributor II
537 Views

That is what I was starting to suspect: that this is the normal behaviour, and that the best idea is to use bursts. I was trying to avoid them for simplicity, and because of the odd number of words... but well, I will have to read 256 words and ignore the last one (and not count it when updating the counters), or do burst reads for everything except the last words. Perhaps it is even better to also write a dummy word, so that all addresses are always aligned to power-of-two boundaries. 
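The two options mentioned here can be compared with a few lines of arithmetic. This is a sketch only: the maximum burst length of 64 is a hypothetical value chosen for illustration, and the helper names are not from any Altera tool.

```python
def split_bursts(total_words, max_burst):
    """Option A: full bursts, then single-word reads for the leftover.

    Returns (number_of_full_bursts, leftover_single_reads).
    """
    return total_words // max_burst, total_words % max_burst


def pad_to_bursts(total_words, max_burst):
    """Option B: round up to whole bursts and discard the extra words.

    Returns (words_actually_read, words_to_discard).
    """
    padded = -(-total_words // max_burst) * max_burst  # round up
    return padded, padded - total_words


MAX_BURST = 64  # hypothetical maximum burst length

print(split_bursts(255, MAX_BURST))   # full bursts and leftover reads
print(pad_to_bursts(255, MAX_BURST))  # padded length and discarded words
```

With these assumed numbers, padding costs a single wasted word, whereas splitting leaves a long tail of slow single-word reads, which supports the idea of storing a dummy word so that every block is a whole number of bursts.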

Thank you very much to all for your help!