Honored Contributor I

MODULAR SGDMA burst read issue

Hi everybody, 


I'm using the PCI Compiler 32-bit master/slave core combined with the modular SGDMA, and I'm simulating the system with ModelSim 6.5. 

I'm using the Modular SGDMA dispatcher associated with the SGDMA read master module (memory mapped to stream).  

When I try to do a burst read on the PCI bus it doesn't work: only single-cycle reads are performed. 

SGDMA read master settings: burst enabled, max burst count = 128, force burst alignment enabled. 


Is this a known issue? 


After a small change in read_master.vho (changing the master_burstcount value) it performs burst reads with the correct burst count, but the address increments incorrectly. 


With the same change in write_master.vho (Dispatcher + SGDMA write master) it seems to work. 


Thanks a lot.
7 Replies
Honored Contributor I

The force burst alignment option is causing this. Basically, that option makes sure that the master doesn't cross a burst boundary in the middle of a burst. You can turn it off, and the master will just present large bursts that cross burst boundaries, or you can align your data to a burst boundary (burst boundary size = data_width_in_bytes * max_burst_count). 
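
To make the boundary arithmetic concrete, here is a small behavioral sketch in Python. It is not the read master's RTL, only a model of what "don't cross a burst boundary" implies for a transfer that starts off-boundary. The function names are made up for illustration, and it assumes the start address and length are whole multiples of the data width.

```python
def burst_boundary_bytes(data_width_bytes: int, max_burst_count: int) -> int:
    """Burst boundary size as given above: data width in bytes * max burst count."""
    return data_width_bytes * max_burst_count


def split_into_aligned_bursts(start, length, data_width_bytes, max_burst_count):
    """Model (hypothetical) of forced burst alignment.

    Returns a list of (address, burst_count) pairs in which no burst
    crosses a burst boundary. A short first burst brings the address
    back onto a boundary so that full-size bursts can follow.
    """
    if start % data_width_bytes or length % data_width_bytes:
        raise ValueError("start and length must be multiples of the data width")

    boundary = burst_boundary_bytes(data_width_bytes, max_burst_count)
    bursts = []
    addr, remaining = start, length
    while remaining > 0:
        # Bytes left before the next burst boundary.
        to_boundary = boundary - (addr % boundary)
        chunk = min(to_boundary, remaining)
        bursts.append((addr, chunk // data_width_bytes))
        addr += chunk
        remaining -= chunk
    return bursts
```

With a 32-bit master (4 bytes) and a max burst count of 128, the boundary is 512 bytes. A 1024-byte transfer starting at 0x1F0 would then be issued as a 4-word burst, a 128-word burst, and a 124-word burst, whereas the same transfer starting at 0x200 needs only two full bursts.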


The burst alignment option is really only useful for SDRAMs since they have burst boundaries. I won't get into the details but this option is to make sure you can get back into alignment as soon as possible in order to start posting large bursts and getting the maximum memory efficiency possible. With the HP2 SDRAM controller you can set the local burst length to 1 and avoid all this burst boundary stuff altogether.
Honored Contributor I

Thank you for your response, 


Even with force burst alignment disabled, I have the same problem. 

There is one strange thing: I need to change the burstcount value of the read and write master cores (in the VHDL sources). This value stays at '1' no matter what the "Maximum burst count" setting is. Is that normal? 

But with the read master core (after changing the burstcount value) I still have the address incrementing issue. On the other hand, the write master core works well in burst mode with the same settings. 

So, if it's not a bug, what could be the reason (settings, word alignment...)? I tried different settings in SOPC Builder, but got the same problem... 


So if you have any suggestion... 


Thank you
Honored Contributor I

This sounds like a bug I fixed a while back, make sure you are using the updated version from here: 


If the one you are using has a 9.0 directory in the zip file then you have the old one that had a burst truncation issue in both masters. Make sure you pass start addresses that are compatible with the word alignment settings you are using. For example, don't select aligned addressing and then pass in a location that starts on a byte boundary. 
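
As a quick sanity check on the software side, something like the following could be run on each start address before the descriptor is written. This is only a sketch: the helper names are hypothetical, and it assumes "aligned" means aligned to the master's data width in bytes.

```python
def is_word_aligned(address: int, data_width_bytes: int) -> bool:
    """True if the address sits on a word boundary for the given data width."""
    return address % data_width_bytes == 0


def align_down(address: int, data_width_bytes: int) -> int:
    """Round an address down to the previous word boundary."""
    return address - (address % data_width_bytes)
```

For a 32-bit master, 0x1000 passes the check while 0x1002 does not, so 0x1002 should not be used as a start address when aligned addressing is selected.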


Hacking the burst count is not recommended as there is a lot of logic dependent on the burst count (like the address incrementing logic). 


If it is still failing I recommend capturing a transfer with all the read master internal signals captured.
Honored Contributor I

OK, I had the previous version. Now, with IP core 9.1, burst transactions work well. 

I'm also trying the Altera SGDMA IP in parallel, but I noticed some problems. Some things are ambiguous, particularly the use of the "busy" bit. 

What is your opinion of this IP? 

Do you recommend using yours instead? 


Thanks a lot
Honored Contributor I

The modular SGDMA is built in such a way that it can become a drop-in replacement for the SGDMA on the Altera ACDS installation (software API is different though). The thing that is missing is descriptor pre-fetching. One of these days I'll build a prefetching block that sits in front of the dispatcher; however, there are other ways to implement this sort of thing as well. 


I don't know much about PCIe but one of my ideas was to use a Nios II core to coordinate the descriptor prefetching. So the CPU would be responsible for shoveling the descriptors into the modular SGDMA dispatcher block. It would require some software based protocol to be in place so that the CPU knows how to find all the descriptors in main memory on the other side of the PCIe link. Once the chain length is known, the CPU could just do a single DMA transfer to pull the descriptors into the local memory hooked up to the FPGA in order to shoot them off to the SGDMA dispatcher really fast. Then you could do all kinds of things like implement virtual channels and whatever else people do with PCIe links. 
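
A rough software model of that idea, in Python, might look like the sketch below. Everything in it is an assumption made for illustration: the 12-byte descriptor layout (read address, write address, length), the contiguous descriptor block in host memory, and the function names. It is not the dispatcher's real descriptor format or API.

```python
import struct
from collections import deque

# Hypothetical descriptor layout: read address, write address, length in bytes.
DESC_FORMAT = "<III"
DESC_SIZE = struct.calcsize(DESC_FORMAT)


def fetch_descriptor_block(host_memory: bytes, base: int, count: int) -> bytes:
    """Stand-in for the single DMA transfer that copies `count` contiguous
    descriptors from host memory into local memory."""
    end = base + count * DESC_SIZE
    if base < 0 or end > len(host_memory):
        raise ValueError("descriptor block lies outside host memory")
    return host_memory[base:end]


def push_to_dispatcher(block: bytes, fifo: deque, fifo_depth: int) -> int:
    """Push descriptors from the local copy into a model of the dispatcher
    FIFO, stopping when the FIFO is full. Returns how many were pushed."""
    pushed = 0
    for desc in struct.iter_unpack(DESC_FORMAT, block):
        if len(fifo) >= fifo_depth:
            break
        fifo.append(desc)
        pushed += 1
    return pushed
```

The "software based protocol" here is reduced to the host telling the CPU a base address and a descriptor count. A real design would also need a way to resume pushing once the dispatcher FIFO drains.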


I find that the modular SGDMA is a lot easier to use from the software side so you might find that important enough to use it over the SGDMA on the ACDS. When I designed it I had three main applications in mind: 


- PCIe, SRIO, networking, etc... 

- Video frame buffering 

- Hardware accelerator DMA frontend 


Also since it's modular you could design your own controller to replace the dispatcher if you wanted something highly tuned to PCIe. That's the tricky thing with DMA design, it's pretty tough coming up with a 'one size fits all' solution. So by breaking it down into smaller parts I tried to give something generic enough to be useful and when that falls short the simplest block (dispatcher) can be replaced without having to redesign the data plane side of the DMA (masters) which is the hard part. 


I hope that helps explain what the modular SGDMA should be used for. In terms of which I recommend.... that boils down to whether you want to use IP bundled on the ACDS or a design example. As a heads up I had the bug list for the SGDMA on the ACDS in front of me the entire time I was implementing the modular SGDMA and I probably addressed 90% of them so that may give you a 'warm fuzzy' about the modular SGDMA :)
Honored Contributor I



Descriptor prefetching is not essential; with our software descriptor processing, it's better without prefetching. 

I don't want to add a Nios soft core because on the other side of the PCI bus there is our main processor, which handles the descriptor processing and the higher-layer network protocol stacks. I would also like to minimize FPGA resources... 


We have already done quite a lot of work with the ACDS SGDMA. Our application (a NIC board) is working but we have some random problems (perhaps a software issue...). However, our descriptor processing with the ACDS SGDMA is not efficient because of the SGDMA core specifications. 


If we can't get better efficiency, we will start using your SGDMA. In that case, I could give you our feedback if you are interested. 


In your last message you mention an SGDMA bug list. Is this an official list? It would be very interesting to see it! 


Honored Contributor I

I'm always open to feedback, if you have any go ahead and send me a PM with the details. 


The list I was looking at was an internal list but when I did a search for "SGDMA" on a lot of the items returned looked familiar to me.