Modular SGDMA - Streaming to Memory Mapped Random first write.

Altera_Forum
Honored Contributor II
3,911 Views

Hi, 

 

I am using the Modular SGDMA to write data into DDR3 memory, and then a second one to read it back out again. The reading is all working fine and writing into the memory works as well, except for one minor glitch. 

 

Say I issue a descriptor to write a given number of symbols to address 0 (the burst length is 16 and I'm using a 512-bit-wide data bus). 

Symbol 0 arrives via Avalon-ST and is immediately written by the SGDMA to address 0xFFFFFFFF. 

The next symbol arrives and is queued in the FIFO. Once symbol 16 arrives, there is a burst write of 16 symbols to address 0x0. 

 

In effect, the first symbol to arrive for any descriptor appears to get written to 0xFFFFFFFF, and the remaining symbols get written to memory one address earlier than they should. The first symbol is lost and the rest are shifted from where they should be. 

 

Any thoughts on why this is happening? 

 

Thanks. 

 

EDIT: 

I'm using a Stratix V DSP board and Quartus 14 Subscription Edition, and I have SignalTap set up to look at important parts of the SGDMA controller and data input. The design also meets timing comfortably.
Altera_Forum

I've been doing some tests and can see that the following steps work: 

 

(1) NIOS generates descriptor (not using supplied driver). 

(2a) The first 32-bit word is written to the controller (read address). This is set to zero because this is a write controller, so the value doesn't matter. Byte Enable = 0x000F 

(2b) The second 32-bit word of the descriptor is written (write address). I'm trying, say, 64 (which is an aligned address). Byte Enable = 0x00F0 

(2c) The third 32-bit word is written (length). This is set to 0x480000, which is how much I want to transfer. Byte Enable = 0x0F00 

(2d) The fourth 32-bit word is written. I have bits 8, 9, 12, 24 and 31 set. Byte Enable = 0xF000 

- All of those transactions have been checked in SignalTap, and I can see the correct data and byte enables being sent along with the write signal being asserted correctly. 

(3) I can see the SGDMA dispatcher issue the write descriptor, but it seems that the data that gets sent to the write master via the write commands source port is basically just a whole lot of 1's (apart from the 'park writes' and 'transfer complete IRQ' bits which are 0). 
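
The four descriptor writes in steps (2a) to (2d) can be sketched in C as below. This is only an illustration of the word order and byte-enable pattern described in this post, assuming a 128-bit descriptor port written one 32-bit lane at a time. The struct and function names are made up for the example and are not part of any Altera driver.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit write into a 128-bit-wide descriptor port (hypothetical type). */
struct descriptor_write {
    uint32_t data;        /* value driven on the selected 32-bit lane */
    uint16_t byte_enable; /* one bit per byte of the 128-bit word */
};

/* Byte enables selecting 32-bit lane `index` (0..3) of the 128-bit port. */
static uint16_t lane_byte_enable(unsigned index)
{
    assert(index < 4u);
    return (uint16_t)(0xFu << (4u * index));
}

/* Control word with bits 8, 9, 12, 24 and 31 set, as in step (2d). */
static uint32_t control_word(void)
{
    return (1u << 8) | (1u << 9) | (1u << 12) | (1u << 24) | (1u << 31);
}

/* Fill `out` with the four writes of steps (2a)-(2d), in order. */
static void build_descriptor_writes(uint32_t write_address, uint32_t length,
                                    struct descriptor_write out[4])
{
    /* Lane 0 is the read address, which is left at zero for a write master. */
    const uint32_t words[4] = { 0u, write_address, length, control_word() };
    for (unsigned i = 0; i < 4u; i++) {
        out[i].data = words[i];
        out[i].byte_enable = lane_byte_enable(i);
    }
}
```

Calling build_descriptor_writes(64, 0x480000, writes) gives byte enables 0x000F, 0x00F0, 0x0F00 and 0xF000, and a control word of 0x81001300 for the bits listed in (2d).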

 

For clarification I have the following settings for the IP core: 

 

Streaming to MM 

Packet Support = Disabled 

Max Write Length = 512MB 

Descriptor FIFO = 8 

Data FIFO = 64 

Burst Count = 16 

Forced Burst Alignment = True 

Burst Enable = True 

Transfer Type = Full Word Accesses Only 

Data Width = 512 

Response Port = Disabled. 

 

 

No idea why the dispatcher is turning my descriptors into garbage when the descriptors bus at the input is presenting the correct data/control signals.
Altera_Forum

Interestingly, despite the fact I have the 'Length' parameter set to 512MB, Quartus synthesizes away bits [31:16] of the length signal in the descriptor FIFOs. 

 

EDIT: 

QSys for some reason put a length signal width of 16 in the generated source file instead of 30. I've tried regenerating the QSys system, but each time it sets the 'LENGTH_WIDTH' parameter on the write_master to 16?!? Is this a glitch in QSys?
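
To see why a 16-bit length field matters here, the arithmetic can be checked with a small C sketch. It assumes the upper bits of the length are simply dropped, which is what the synthesized-away bits [31:16] suggest. The helper names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to represent `value` (a value of 0 counts as 1 bit). */
static unsigned bits_needed(uint32_t value)
{
    unsigned bits = 1u;
    while (value >>= 1) {
        bits++;
    }
    return bits;
}

/* Length as seen by a master whose length field is only `width` bits wide,
 * assuming the upper bits are discarded. */
static uint32_t truncate_length(uint32_t length, unsigned width)
{
    assert(width >= 1u && width <= 32u);
    if (width == 32u) {
        return length;
    }
    return length & ((1u << width) - 1u);
}
```

Holding the value 512MB itself takes 30 bits, and the 0x480000 length from the earlier post takes 23 bits. Truncated to 16 bits, 0x480000 becomes 0, so under this assumption a LENGTH_WIDTH of 16 could not carry that descriptor's length.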
Altera_Forum

I've also noticed from the SignalTap traces that, for some reason, each time I write a 32-bit word to the descriptor (with the byte enables for the other words correctly set to 0), the other bytes in the descriptor get overwritten with 0xFFFFFFFF, even though the byte enables shouldn't allow that to happen. As a result, the descriptor is corrupted by the time it leaves the dispatcher. 

Any suggestions why this would be happening?
Altera_Forum

I think I have worked out the problem. In the fifo_with_byteenables module in the dispatcher, there are MLAB cells inferred for the FIFO. Each of these has a byteenable, but they are registered inputs. If the Avalon-MM interconnect to the descriptor input sets up the byte enable signal at the same time as the write signal, the write is going to reach the MLAB cell one clock cycle before the byte enables reach it (because they are pipelined). If I am not mistaken this means that the MLABs will write the data before it is properly masked causing corruption. 

Or is there something here I am missing?
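
A toy C model of this hypothesis is sketched below. It is not the real MLAB behaviour and is not taken from the dispatcher source. It assumes back-to-back 32-bit writes, 0xFF driven on every unselected byte lane, and a byte-enable mask that arrives one write late. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DESC_BYTES 16u

/* Toy model of one 128-bit FIFO word with per-byte write enables. */
struct desc_word {
    uint8_t bytes[DESC_BYTES];
    uint16_t registered_be; /* byte enables captured on the previous write */
};

static void desc_word_reset(struct desc_word *w)
{
    memset(w->bytes, 0, sizeof w->bytes);
    w->registered_be = 0u;
}

/* Write `value` to 32-bit lane `lane`. With `be_is_late` non-zero, the mask
 * applied is the one captured on the previous write (the hypothesised skew);
 * otherwise the mask presented with this write is applied. */
static void write_lane(struct desc_word *w, unsigned lane, uint32_t value,
                       int be_is_late)
{
    uint8_t data[DESC_BYTES];
    uint16_t be, mask;

    assert(lane < 4u);
    memset(data, 0xFF, sizeof data); /* assumed value on unselected lanes */
    for (unsigned i = 0; i < 4u; i++) {
        data[4u * lane + i] = (uint8_t)(value >> (8u * i));
    }
    be = (uint16_t)(0xFu << (4u * lane));
    mask = be_is_late ? w->registered_be : be;
    for (unsigned i = 0; i < DESC_BYTES; i++) {
        if (mask & (1u << i)) {
            w->bytes[i] = data[i];
        }
    }
    w->registered_be = be;
}

/* Read back 32-bit lane `lane` (little-endian byte order). */
static uint32_t read_lane(const struct desc_word *w, unsigned lane)
{
    uint32_t value = 0u;
    assert(lane < 4u);
    for (unsigned i = 0; i < 4u; i++) {
        value |= (uint32_t)w->bytes[4u * lane + i] << (8u * i);
    }
    return value;
}
```

In this model, writing lane 0 and then lane 1 with the late mask leaves lane 0 holding 0xFFFFFFFF and lane 1 unwritten, while the same sequence with an on-time mask stores both values correctly. It only shows that a mask skew could produce all-ones fields; it does not claim to reproduce the exact trace.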
Altera_Forum

Hi TCWORLD, 

 

I haven't used the Modular-SGDMA controller, so cannot comment on what you are seeing, however, I can recommend that you create a Qsys system with this component and an Avalon-MM BFM, and then create a testbench that contains exactly the transactions you are debugging, so you can reproduce this "error". At that point you can submit a Service Request to Altera directly, or if you post a zip file with a complete simulation here, someone familiar with the IP core may take a look at the problem. 

 

If you have never used the BFM before, I've posted examples; e.g., see post #25 in this thread: 

 

http://www.alteraforum.com/forum/showthread.php?t=32952&page=3 

 

Cheers, 

Dave
Altera_Forum

Sounds like a good idea. I'll build a testbench tomorrow (I'm familiar with the BFMs, so shouldn't take too long to test).

Altera_Forum

Hmm, it seems to work fine (exact same setup) in simulation, but not in implementation.

Altera_Forum

 

--- Quote Start ---  

Hmm, it seems to work fine (exact same setup) in simulation, but not in implementation. 

--- Quote End ---  

 

At least that tells you something ... :) 

 

Are you able to compare all the Avalon-MM signals from SignalTap II traces with exactly the same traces from the simulation, i.e., probe the bus signals at the SGDMA controller, rather than at the BFM, just in case the Avalon-MM fabric has the bug ... but it's not affecting your simulation for some as-yet-undetermined reason? 

 

Cheers, 

Dave
Altera_Forum

I've been probing the signals directly at the SGDMA block in both SignalTap and ModelSim. They look the same where it matters (i.e. there are don't-cares in the simulation which have some value in SignalTap). 

 

The problem seems to occur when the MLAB cells for the 'fifo with byte enable' block have their ena signal high but their portabyteenamasks signal low: for some reason the contents get overwritten with 0xFFFFFFFF. The equivalent cells in a second SGDMA controller in the design (the MM->ST one) are the same MLAB type, but work fine.
Altera_Forum

Interesting, I increased the command FIFO depth from 8 to 32, and the problem appears to have miraculously fixed itself. Will do some more testing to check it is definitely working.

Altera_Forum

TCWORLD, 

 

Can you provide some of the C code you used to control the dispatcher? I am trying to save streaming data to SDRAM on the DE0-Nano board, but even though I can write the descriptors, the dispatcher doesn't execute them, so I'm wondering what is wrong. I followed the same code procedure seen in the Modular SGDMA demo on alterawiki. Are you using the standard or extended dispatcher? I've tried both, but neither worked for me. I'm not sure where the error is. 

 

Thanks. 

 

My qsys set up is as follows: 

 

Transfer mode: streaming to memory 

Descriptor fifo: 32 

 

Data width: 32 

Length width: 20 

FIFO depth: 64 

Burst Enable: on 

Maximum Burst Count: 16 

Force Burst Alignment: on 

Full Word Access Only
Altera_Forum

I had all sorts of fun trying to get those controllers to work. I ended up writing my own Avalon-MM width adapter for the dispatcher, because the Qsys-generated interconnect seemed to be causing all sorts of problems (maybe a glitch in Qsys?). That got it to sort of work, but in the end I just ditched them as they were proving quite flaky: sometimes working, but then locking up for some unknown reason. (I've been slowly ditching most of the Altera IP cores from my designs as they seem to be more trouble than they are worth.) 

In the end I wrote my own DMA controller, which to be honest I should have done in the first place, I was just being lazy at the time.
Altera_Forum

That is sort of what I expected... Although this is bad news for me, since I'm pretty much a beginner with Verilog. Do you have any guides or tips on how to write a DMA controller?

Altera_Forum

I've attached the Nios code I was using. 

 

In your main C file, #include "ScatterGatherController.h", then declare two global variables: 

 

PSGC_CONTROLREGS const dma_csr_base = (PSGC_CONTROLREGS)(<Base address of the CSR, e.g. from system.h>); 

PSGC_DESCRIPTOR_FORMAT const dma_descriptor_base = (PSGC_DESCRIPTOR_FORMAT)(<Base address of the descriptors, e.g. from system.h>); 

 

You can then pass those pointers to the functions in the .h/.c files as the 'base' variable. 

 

You'll need to initialise the controller. This can be done using sgc_initialiser(), sgc_enabler(), and sgc_enableDispatcher(). See the .h file for the function declarations. 

 

If you want to send a descriptor, you can create the following: 

SGC_DESCRIPTOR_FORMAT descriptor; 

descriptor.write_address = ...; 

descriptor.read_address = ...; 

descriptor.transfer_length = ...; 

descriptor.control.bits.<some field, see .h> = ...; 

 

sgc_issue_descriptor(dma_descriptor_base, &descriptor); 

 

 

 

 

Hope the SGDMA controller proves more reliable for you than it did for me.
Altera_Forum

Thank you! I'll let you know if it works in my application.

Altera_Forum

TCWORLD, 

 

So I was able to implement your code, however I still get the same result: The DMA descriptor FIFO fills up and then stays filled - so no descriptors are actually being executed. 

 

Let me outline my system (maybe I'm making a fundamental error): 

 

The goal is to transfer streaming data into the SDRAM chip on the DE0-Nano board. I want the code (.text, .heap, .stack, all of it) and the descriptors to be held in on-chip memory, while the streaming data is saved to SDRAM. 

Therefore, in my linker script in the BSP setup, everything is set to onchip memory. 

In qsys, the data_write_master from the dma_write_master module is connected to the s1 port on SDRAM. The nios data and instruction masters are connected to onchip memory, and the data master is also connected to sdram s1 port for a way to read data for testing purposes. 

 

So essentially, code and descriptors are in on-chip memory, and data should be saved to SDRAM via the mSGDMA. 

 

For some reason, both when I used the C functions provided with the mSGDMA and when I used your ScatterGatherController files, I got the same result: the descriptor FIFO fills up, but no data is actually saved. 

 

My data (for testing purposes) is currently a counter that increments every N number of clock cycles (N = 1000). So the data rate is 50MHz (base clock) / 1000 = 50kHz. Every N clock cycles a valid signal is triggered. 

 

I can't find an issue. I've tried various combinations of burst vs. non-burst, full word vs. aligned options, etc., with no change in results. Let me know if I'm missing something really obvious, or if you can think of some reason that could be causing this issue. It would help a lot. 

 

Thanks!!
Altera_Forum

As I said, I gave up on the Altera IP Core. It seems far too glitchy and unstable to be used in any serious system. Better off finding a different one, or writing your own.

Altera_Forum

Good news, TCWORLD. It seems I was able to fix this dispatcher issue. Apparently I had a typo in some of the custom Verilog code I wrote, and it was preventing data from being sent to the SGDMA. The descriptors therefore weren't executed, since the stream data was never valid. I've only done some preliminary tests so far.... 

 

My next operation would be to set up a MM - MM sgdma to transmit the data in SDRAM to external flash. Any suggestions on an open core flash memory controller (SPI not CFI)? 

Also I'm concerned with writing and reading to SDRAM - this may cause collisions and/or corruption of data, but I will have to see what happens. 

 

Thanks for the ScatterGatherController code.
Altera_Forum

Hello, guys. I'm also trying to use the mSGDMA IP. 

So I have a question regarding data length. 

Did you try a data length that is not a multiple of the data bus width? 

My configuration is as follows  

 

Transfer mode: MM->ST 

Data width: 32 

Data FIFO depth: 256 

Descriptor fifo: 128 

Maximum Transfer Length: 1 KB 

Length width: 20 

Transfer Type: Unaligned access 

Burst Enable: on 

Maximum Burst Count: 64 

Force Burst Alignment: off 

Packet Enable: On 

Channel Enable: On, 4 

Error Enable: On, 8 

 

From what I see in simulations: 

If I set Descriptor as follows: 

Start addr: whatever (let's take 0x0) 

Length: 51 (anything that is not a multiple of 4 bytes) 

Flags: Go,SOP,ErrDonEn,Channel,Error,!EOP 

 

The mSGDMA then initiates two transfers: 

1. A read burst request for 12 * 4 bytes 

2. A read request for the last 3 bytes 

 

But on the ST interface we get 52 bytes, with nothing to indicate that the last byte is invalid (because there is no EOP, there is no empty signal). 

 

Does anyone know a way to transfer a number of bytes that is not a multiple of the bus width via the MM->ST interface?
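
The numbers in this post can be checked with a short C sketch. It only models the arithmetic of packing a byte count into bus-wide beats. The struct and function names are made up for the example, and the `empty` field is the count that Avalon-ST would carry on the final beat of a packet, which is only meaningful when end-of-packet is asserted.

```c
#include <assert.h>
#include <stdint.h>

/* Shape of a transfer once it is packed into bus-wide beats (hypothetical). */
struct stream_shape {
    uint32_t beats;        /* bus-wide words presented on the stream */
    uint32_t bytes_on_bus; /* beats * bytes per beat */
    uint32_t empty;        /* unused bytes in the final beat */
};

/* Compute how `length_bytes` maps onto beats of `bytes_per_beat` bytes. */
static struct stream_shape stream_shape_for(uint32_t length_bytes,
                                            uint32_t bytes_per_beat)
{
    struct stream_shape s;
    assert(bytes_per_beat > 0u);
    /* Round up to a whole number of beats. */
    s.beats = (length_bytes + bytes_per_beat - 1u) / bytes_per_beat;
    s.bytes_on_bus = s.beats * bytes_per_beat;
    s.empty = s.bytes_on_bus - length_bytes;
    return s;
}
```

For a length of 51 on a 32-bit bus this gives 13 beats, 52 bytes on the bus and 1 empty byte in the last beat. Without EOP on that descriptor there is no beat on which that empty count could be reported, which matches what is seen in the simulation.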