DMA & FIR Filter, some questions.

Altera_Forum
Honored Contributor II
1,552 Views

Hello everyone, 

 

I'd like to implement a FIR filter (because it's easy to start with) in DSPBuilder. It should make use of DMA. Eventually I want to integrate it into a Nios II based system using SOPC-Builder. I'm struggling with how to integrate this kind of signal processing into an SOPC component. 

 

I had a look at several tutorials regarding Avalon interfaces, but I still don't know what approach would be the best. 

 

I see mainly the following solutions: 

 

- Building the complete SOPC component in DSPBuilder, using one Avalon MM Slave block (for control) and two Avalon MM Master blocks (read/write, for doing the DMA). That seems like a lot of work for such a simple task, especially if you want pipelined/burst masters. 

 

- Making use of master templates like this one: altera.com.cn/support/examples/nios2/exm-avalon-mm.html. Using HDL Import I can bring them into DSPBuilder and connect some logic around them, but what would be the correct way to bring the whole system into SOPCBuilder? Compile and add it via the generated *.tcl script, or export the HDL and use SOPCBuilder's 'new component' dialog? 

 

- Could I use the DMA controllers that are integrated in SOPCBuilder (DMA, SGDMA) to connect to some DSP logic? What kind of interface would this logic need? I can imagine how to use these controllers to copy, e.g., from the SRAM to the SDRAM, but again, how do I get some signal processing in between? 

 

Any hints and tips appreciated. 

 

Best Regards, 

Sebastian

Altera_Forum

Here is one way: http://www.altera.com/support/examples/nios2/exm-accelerated-fir.html It's based on the master templates you have found already. 

 

But if it were me, I would build a FIR filter with a streaming input, a streaming output, and a slave port so that the coefficients can be updated. Then I would take the filter and wedge it between the read and write masters of this DMA engine: http://www.altera.com/support/examples/nios2/exm-modular-scatter-gather-dma.html I'm referring to the source port on the left (read master) that wires directly to the sink port on the right (write master). That's a plain Avalon-ST connection, so if your filter supports ST it should drop in pretty easily. You can configure the DMA engine with the bare minimum of features you need, which should keep the resource utilization down to roughly 600 LEs or less and probably 6 M9K blocks. If you rip the dispatcher frontend off and build a non-buffering one, you can probably reduce the memory utilization to two on-chip memory blocks.
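
To make the "streaming in, streaming out, coefficients via a slave port" idea concrete, here is a minimal software reference model in C. It is only a sketch for generating expected output vectors, not the HDL and not anything shipped with the design example: the names fir_model_t, fir_set_coeffs, fir_push, fir_run and the FIR_MAX_TAPS limit are made up for illustration, and Avalon-ST handshaking (ready/valid) is not modeled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FIR_MAX_TAPS 16  /* hypothetical upper bound for this sketch */

typedef struct {
    int32_t coeff[FIR_MAX_TAPS];  /* what software would write via the slave port */
    int32_t delay[FIR_MAX_TAPS];  /* delay line, newest sample at index 0 */
    size_t  ntaps;
} fir_model_t;

/* Models a coefficient update through the slave port; also clears the delay line. */
static void fir_set_coeffs(fir_model_t *f, const int32_t *c, size_t ntaps)
{
    assert(ntaps > 0 && ntaps <= FIR_MAX_TAPS);
    memset(f->coeff, 0, sizeof f->coeff);
    memset(f->delay, 0, sizeof f->delay);
    memcpy(f->coeff, c, ntaps * sizeof c[0]);
    f->ntaps = ntaps;
}

/* One streaming beat: take one sample on the sink side, produce one on the source side. */
static int32_t fir_push(fir_model_t *f, int32_t sample)
{
    int64_t acc = 0;
    size_t i;

    /* Shift the delay line by one position and insert the new sample. */
    for (i = f->ntaps - 1; i > 0; i--)
        f->delay[i] = f->delay[i - 1];
    f->delay[0] = sample;

    /* Multiply-accumulate over all taps; the result is truncated to 32 bits. */
    for (i = 0; i < f->ntaps; i++)
        acc += (int64_t)f->coeff[i] * f->delay[i];
    return (int32_t)acc;
}

/* Filters a whole buffer, i.e. what read master -> filter -> write master would do. */
static void fir_run(fir_model_t *f, const int32_t *in, int32_t *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = fir_push(f, in[i]);
}
```

Feeding an impulse (a single 1 followed by zeros) through fir_run should return the coefficients in order, which is a quick sanity check before comparing against the hardware.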

Altera_Forum

I'll keep that modular SGDMA design in mind, but for now I really want to keep things as easy as possible. I also knew about the Accelerated FIR design, but because I'm rather inexperienced in HDL I wanted to do the same thing just in DSP Builder. 

 

Can you give me any hints here?  

 

As a simple proof of concept/workflow I made a DMA test model in Simulink which should do memory-to-memory transfers using these MM Master templates within HDL Import blocks (integrating the FIR in Simulink afterwards shouldn't be a problem at all). But either I connected the blocks the wrong way or I'm not using the right 'export' method out of Simulink, because a short test within a small Nios-driven system didn't work. 

I'm at home now, but tomorrow I will attach the simulink model. For an experienced person it is probably quite obvious what might be wrong. 

 

Regards, 

Sebastian

Altera_Forum

The accelerated FIR design is pretty old so I would only use it as a reference if you decide to build a FIR that has masters sticking out of it. I also don't recommend this as it makes doing verification more difficult (easier when each block is separate). 

 

The master templates are old as well and use custom handshaking between the user logic and the master. It could be that the handshake between your logic and the master template is not timed correctly. As an FYI, one of these days I plan on removing those master templates from the web and replacing them with the masters from the modular SGDMA plus some simple handshake blocks that you can export to the top. So if you can't find them, check alterawiki.com, since that is where I'll put them. 

 

I don't use DSP Builder myself since I'm an embedded guy, but I would think you want to verify the filter logic using the DSP tools and DMA logic using the SOPC Builder tools separately. Again this is easier to do when the functionality is separated using multiple components that use standard interfaces. SOPC Builder is also capable of outputting simulation files so that you can simulate the system (including Nios II code running on the processor).

Altera_Forum

You're right, what I actually want is only the DSP logic in DSPBuilder. But with the ordinary SOPC DMA controllers I don't get the DSP logic in between.  

 

So using the modular SGDMA the following steps should work, shouldn't they? 

 

- Use SOPCBuilder to build a nios system 

- integrate the modular SGDMA as three separate components into SOPCBuilder (via the 'new component' dialog and setting the interfaces right) 

- connect the dispatcher and the read and write masters through the command&response sinks/sources 

- connect the Streaming FIR/DSP logic with the datapath by using the data source of the read master and the data sink of the write master 

 

I wouldn't need to write any HDL, would I? 

 

Regards, 

Sebastian

Altera_Forum

That's correct, you shouldn't need to manually write any HDL. Once you are done verifying your filter using the DSP tools, you can take the HDL for the filter and import it into Component Editor from within SOPC Builder. Component Editor creates a .tcl file that describes your component with information like interfaces, signal direction/width, timing characteristics, etc. Once that .tcl file is created, your filter will show up in SOPC Builder just like any other component in there. 

 

For something like this, after you have validated that the filter is working correctly, I would use the same test vectors when you validate the entire system with the DMA and filter included. So you would populate the input vector in memory, trigger a DMA transfer, and inspect the output vector that the DMA wrote out. You can use Nios II for this if the vectors are small enough to fit into main memory; otherwise you could use something like System Console to populate the vectors in memory (and even trigger the DMA, since Nios II would be optional at that point).
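
For the "inspect the output vector" step, a small comparison helper along these lines is usually enough. This is just a sketch: first_mismatch is a hypothetical name, and the buffer the DMA wrote is read through a volatile pointer on the assumption that the compiler should not cache those reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compares the buffer written by the DMA against the expected vector.
 * Returns the index of the first differing element, or -1 if all n match.
 */
static long first_mismatch(const volatile uint32_t *got,
                           const uint32_t *expected, size_t n)
{
    size_t i;

    assert(got != NULL && expected != NULL);
    for (i = 0; i < n; i++) {
        if (got[i] != expected[i])
            return (long)i;
    }
    return -1;
}
```

Reporting the first failing index rather than a plain pass/fail makes it easier to see whether the transfer stopped early, started late, or corrupted a single word.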

Altera_Forum

Until now I added the modular SGDMA to SOPC Builder by using the supplied *.tcl scripts. However I cannot connect the datapath sink/source without errors. 

 

If I wire the read source directly to the write sink, I get the following error: 

 

error: dma_read_master_0.data_source/dma_write_master_0.data_sink: the source has a channel signal of 8 bits, but the sink does not. 

 

But I already disabled 'Channels' in the configuration for the read master. 

 

When I put in a custom 'Direct-Through' interface built with DSPBuilder, I get the following errors (for now it basically just routes the signals directly from a sink block to a source block): 

 

error: dma_read_master_0.data_source/streaming_fir_interface_0.avalon_st_sink: the source has a channel signal of 8 bits, but the sink does not. 

error: dma_read_master_0.data_source/streaming_fir_interface_0.avalon_st_sink: the source has a error signal of 8 bits, but the sink does not. 

error: dma_read_master_0.data_source/streaming_fir_interface_0.avalon_st_sink: the source has a startofpacket signal of 1 bits, but the sink does not. 

error: dma_read_master_0.data_source/streaming_fir_interface_0.avalon_st_sink: the source has a endofpacket signal of 1 bits, but the sink does not. 

error: dma_read_master_0.data_source/streaming_fir_interface_0.avalon_st_sink: the source has a empty signal of 2 bits, but the sink does not. 

error: streaming_fir_interface_0.avalon_st_source/dma_write_master_0.data_sink: the sink has a error signal of 8 bits, but the source does not. 

error: streaming_fir_interface_0.avalon_st_source/dma_write_master_0.data_sink: the sink has a startofpacket signal of 1 bits, but the source does not. 

error: streaming_fir_interface_0.avalon_st_source/dma_write_master_0.data_sink: the sink has a endofpacket signal of 1 bits, but the source does not. 

error: streaming_fir_interface_0.avalon_st_source/dma_write_master_0.data_sink: the sink has a empty signal of 2 bits, but the source does not. 

 

 

In both cases however the following error occurs: 

error: modular_sgdma_dispatcher_0.response_source: "modular_sgdma_dispatcher_0.response_source" must be connected to an avalon-st sink 

 

I'm almost sure the last error occurs because the modular SGDMA lets you choose between an MM and a Streaming interface, and I can't find a way to disable the streaming ports from within SOPC Builder. Probably it's the same problem with the other signals as well. 

 

As far as I understand, packets, error signaling, channels, etc. are optional in Avalon Streaming interfaces, but because the modular SGDMA provides these ports, SOPC Builder wants to make sure that connected components have them as well. 

 

How is the SGDMA meant to be configured, when used within SOPC Builder? Disabling the respective checkboxes within the component configuration dialog of the masters (double-click on the component in SOPC Builder) didn't change anything. 

 

I also attached some screenshots. 

 

Regards, 

Sebastian

Altera_Forum

Did you edit the modular SGDMA components in Component Editor? Those components have handwritten .tcl files which do things like disable unused ports and prevent you from using a parameterization that is not supported. If you use the read master component as is, you should see something like the attached image. 

 

So just copy the /ip directory from the design example, paste it into your own hardware design, and open SOPC Builder. The cores should show up in the component pool under "Modular SGDMA".

Altera_Forum

I don't think I changed the tcl scripts, but I will make a clean check tomorrow. 

What Quartus version are you using? I don't know exactly right now but I'm using either 10.0 or 10.1. 

 

Regards, 

Sebastian

Altera_Forum

That screenshot was from 10.1 but the modular SGDMA is tested with versions 9.1 and above. The folks that I know are using it typically use 10.0 or 10.1 

 

Also as a heads up the write master doesn't support channels so make sure the output of your filter doesn't have a channel number. The reason for this is since the data is getting stuffed into memory there isn't really anything the DMA can do with the channel information. It still would have a use if you had multiple FIR filters and wanted to use a single DMA since you would just multiplex the various channels into a single data stream and then feed that into the write master (food for thought in case you head down that road). 

 

The read master supports up to 256 channels and 256 errors. Sometimes it is handy to hijack the channel/error signals and use them as sideband signals that mean something else downstream. 

 

Also, I forgot to mention: if you are trying to hit really high throughput, you should align your data buffers on word boundaries, avoid bursting, turn off stride, and turn on the "full word access only" feature. On a fast device I've managed to get it running at over 400MHz with a feature set like this. There are pdf files in each component directory under the /ip directory that you can read to find out what all those GUI options do. 
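
If you want the software to catch misaligned buffers before handing them to the DMA, a check like the following would do. It is only a sketch: the function names are hypothetical, "word" is assumed to mean the data width of the masters in bytes (e.g. 4 for a 32-bit datapath), and requiring the length to be a whole number of words is my assumption about what "full word access only" needs, so check the component pdf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of word_bytes (which must be a power of two). */
static int is_word_aligned(uintptr_t addr, size_t word_bytes)
{
    assert(word_bytes != 0 && (word_bytes & (word_bytes - 1)) == 0);
    return (addr & (uintptr_t)(word_bytes - 1)) == 0;
}

/*
 * Returns 1 if a buffer starts on a word boundary and its length is a
 * whole number of words (the second condition is an assumption, see above).
 */
static int buffer_is_word_aligned(uintptr_t addr, size_t len_bytes,
                                  size_t word_bytes)
{
    return is_word_aligned(addr, word_bytes) && (len_bytes % word_bytes) == 0;
}
```

On the Nios II side you would pass (uintptr_t)buffer and the transfer length in bytes, and refuse to build the descriptor if the check fails.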

 

Cheers, 

JCJB

Altera_Forum

Okay, now everything works fine in SOPCBuilder. But I experience a strange behaviour when I'm trying to use the DMA from C-Code through the NIOS Eclipse SBT. 

 

For the hardware design I used the SOPC Builder reference design example which was provided on one of the CDs that came with the board (DE2 Dev board). I just disabled and disconnected the devices I don't need yet (e.g. VGA or Audio). I integrated the SGDMA, then generated, compiled and loaded the system. 

On the software side, I'm using a simple DMA test program, which should 'DMA'-copy an array of 8 'alt_u32'. I appended the source code. The output of my test program is the following.  

 

-- DMA Test --. 
Read-address: 8455696 
Write-address: 8455736 
Preparing memory... 
Memory before transfer: 
From value 0: 1 
From value 1: 4 
From value 2: 9 
From value 3: 16 
From value 4: 25 
From value 5: 36 
From value 6: 49 
From value 7: 64 
To value 0: 0 
To value 1: 0 
To value 2: 0 
To value 3: 0 
To value 4: 0 
To value 5: 0 
To value 6: 0 
To value 7: 0 
Preparing descriptor... 
Writing descriptor, Starting DMA... 
DMA done... 
To value 0: 1 
To value 1: 4 
To value 2: 9 
To value 3: 16 
To value 4: 25 
To value 5: 0 
To value 6: 0 
To value 7: 0 
Test read, a few seconds later: 
To value 0: 1 
To value 1: 4 
To value 2: 9 
To value 3: 16 
To value 4: 25 
To value 5: 0 
To value 6: 0 
To value 7: 0 

 

As you can see, not all values get copied. Which values are copied changes if I change something in the code (e.g. commenting or uncommenting some printfs), but for a given version of the code the result is the same on every program execution. Any idea what causes this behaviour? 

 

Regards, 

Sebastian

Altera_Forum

Looks like a cache coherency problem to me. If you have a data cache enabled for the CPU then when you write to the "From value" it might be in the data cache. Nios II implements a write back cache which means the data will only arrive at the memory when the line is evicted as a result of a miss or it is flushed. Either remap those pointers to be uncacheable or flush the data cache before triggering the DMA so that the contents will be written out to memory before the DMA attempts to move the data. 

 

http://www.altera.com/literature/hb/nios2/n2sw_nii52007.pdf
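
A minimal sketch of the "flush before triggering the DMA" option is below. alt_dcache_flush() is the Nios II HAL call from sys/alt_cache.h; everything else is made up for illustration: dma_start_fn and dma_copy_coherent are hypothetical names, the start function stands in for whatever driver call writes the descriptor, and the __NIOS2__ check with a do-nothing stub only exists so the sketch compiles on a host PC as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef __NIOS2__
#include <sys/alt_cache.h>  /* Nios II HAL: alt_dcache_flush() */
#else
/* Host stub (assumption: only used off-target); it does nothing. */
static void alt_dcache_flush(void *start, uint32_t len)
{
    (void)start;
    (void)len;
}
#endif

/* Hypothetical hook: starts a transfer and returns when it is complete. */
typedef void (*dma_start_fn)(const void *src, void *dst, uint32_t len);

/* Flush both buffers from the data cache, then hand them to the DMA. */
static void dma_copy_coherent(const void *src, void *dst, uint32_t len,
                              dma_start_fn start_dma)
{
    assert(src != NULL && dst != NULL && start_dma != NULL);

    /* Source: make sure the values the CPU wrote have reached memory. */
    alt_dcache_flush((void *)src, len);

    /* Destination: write back and drop any cached lines so a later
     * eviction cannot overwrite what the DMA stored, and so CPU reads
     * after the transfer miss the cache and fetch from memory. */
    alt_dcache_flush(dst, len);

    start_dma(src, dst, len);
}

/* Host-only stand-in for the DMA so the sketch can be exercised on a PC. */
static void host_fake_dma(const void *src, void *dst, uint32_t len)
{
    memcpy(dst, src, len);
}
```

The other option mentioned above, remapping the pointers to be uncacheable, avoids the flushes entirely at the cost of slower CPU access to those buffers.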

Altera_Forum

That didn't solve the problem. I tried flushing the cache, making the pointers volatile, and finally disabling the data cache in the hardware design. Any other ideas? 

 

I originally thought it might be a SDRAM timing issue, but manually copying the array value for value works every time. 

 

Regards, 

Sebastian

Altera_Forum

Copying data one at a time won't stress the memory out as much as a DMA. I recommend running this through a simulation to see if you can spot the same failure and if so look at the data moving by to see where the problem may be occurring.

Altera_Forum

Okay, now it works. I had ticked the "extended feature" checkbox in the SOPC configuration of the dispatcher core, but on the software side I used the standard descriptor. It turns out that you MUST use the extended descriptor as soon as you enable the extended feature checkbox. The documentation was not 100% clear about this; it only says: "In order to use the extended format you must select 'enhanced features' in the dispatcher...". I had assumed that the standard descriptor would simply default to certain values when used in "extended" mode. 

 

Second thing I noticed is that when you open the included quartus example project, the DMA is wired incorrectly. E.g. the Dispatcher-READ-Response-Sink is connected to the WRITE-master-response-port. Can you confirm this? I attached a picture... there I just opened the freshly unzipped example project. 

 

Regards, 

Sebastian

Altera_Forum

I'm going to generate a new pdf for the user guide, so I can improve it to make it clearer that extended descriptors must be used whenever the extended features are enabled. 

 

I just found out yesterday the response ports are wired up backwards. Nobody noticed since the software still works because by the time the CPU goes to validate the data the DMA is already complete. There will be an update to the files soon (probably next week) which will correct this issue as well as add these features: 

 

- Stop on descriptor (instead of just stopping in the middle of a transfer) 

- Early done enable for reads (so that the read master doesn't have to wait for all the data to return before moving on to the next descriptor) 

 

Both of these new features will have added code in the drivers but you shouldn't have to update the application code unless you want to make use of them.