Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)

What is/are the right SOPC component(s) for my project?

Altera_Forum
Honored Contributor II
3,022 Views

Hi, 

 

what I'm trying to do is the following: 

 

I have an ADC which runs at, say, 80 MSPS. I have a Nios II that runs at 50 MHz (or less). 

 

I would like to have one or more Avalon components that connect to my custom ADC component and transfer the ADC data sample by sample to an onboard SRAM.  

 

In the Nios II software I would like to set up the number of samples I would like to have transferred from the ADC to the SRAM. After the transfer is done I would like to get an interrupt in the Nios II. 

 

After that the Nios II can slowly get the data out of the SRAM and can process it. 

 

I think I should use something like scatter-gather DMA and streaming components. But I don't know exactly whether they really suit my needs. 

 

I imagine that my ADC Avalon component is the data stream source. The scatter-gather DMA reads this source and copies the data to a memory block in the SRAM. The scatter-gather DMA is configured by the Nios II and interrupts it after the transfer is done. 

 

Am I going in the right direction or is there a better way to implement this functionality? 

 

It would be great if somebody can point me to the exact SOPC components and avalon interface specification I need for this task. 

 

Thanks in advance, 

Maik
Altera_Forum
Honored Contributor II
1,288 Views

That's pretty much how I would do it. The only component you are missing is the ADC. You can probably make a component comprised of the dcfifo megafunction which is clocked at whatever your sample rate is offchip and your system speed on the other side. You would export the sampling side using a conduit and map the other side to a streaming source port that you would connect to the SGDMA in ST-->MM mode. 

 

To do the flow control on the ST side of the FIFO you can drive the ST valid signal using 'fifo not empty'. The ST ready coming into the interface you would logically AND with 'fifo not empty' and use that to drive the readack signal of the FIFO. Make sure to put the FIFO into lookahead mode so that you can keep the ST ready latency down to 0. 
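The flow control described above can be modelled in a few lines of C, purely to illustrate the logic. The real thing would be HDL glue around the dcfifo, and the struct and function names below are made up for this sketch.

```c
#include <stdbool.h>

/* Hypothetical model of the glue between a lookahead (show-ahead) FIFO and an
 * Avalon-ST source port. Names are illustrative, not taken from any Altera IP. */
typedef struct {
    bool st_valid;     /* driven towards the SGDMA */
    bool fifo_readack; /* driven towards the FIFO read side */
} st_glue_outputs;

st_glue_outputs st_glue(bool fifo_empty, bool st_ready)
{
    st_glue_outputs out;
    bool fifo_not_empty = !fifo_empty;

    out.st_valid = fifo_not_empty;                 /* valid = 'fifo not empty' */
    out.fifo_readack = st_ready && fifo_not_empty; /* pop only when a word is accepted */
    return out;
}
```

With the FIFO in lookahead mode the word at the head is already on the data bus, so a word is consumed in exactly the cycles where both valid and ready are high.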

 

Here is the Avalon spec for more details: http://www.altera.com/literature/manual/mnl_avalon_spec.pdf 

 

The SGDMA can trigger the interrupt at the end of each descriptor chain. Each descriptor in the chain can describe up to 64kB in payload size. If you want to send more data than that then just chain a bunch of descriptors together. Alternatively you can use this which has a simpler programming model: http://www.altera.com/support/examples/nios2/exm-modular-scatter-gather-dma.html?gsa_pos=1&wt.oss_r=1&wt.oss=modular
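As a rough sketch of the chaining arithmetic: the number of descriptors is the total byte count divided by the per-descriptor limit, rounded up. The exact limit of the core is not stated in this thread beyond "up to 64kB", so it is left as a parameter here, and the function name is hypothetical.

```c
#include <stdint.h>

/* Number of descriptors needed to move total_bytes when one descriptor can
 * carry at most max_bytes_per_descriptor. The limit is an assumption supplied
 * by the caller; check the documentation of the DMA core you actually use. */
uint32_t descriptors_needed(uint32_t total_bytes, uint32_t max_bytes_per_descriptor)
{
    if (max_bytes_per_descriptor == 0) {
        return 0; /* no valid limit given */
    }
    /* ceiling division, written so that it cannot overflow */
    return total_bytes / max_bytes_per_descriptor
         + (total_bytes % max_bytes_per_descriptor != 0 ? 1u : 0u);
}
```

For example, with an assumed limit of 65532 bytes (the largest multiple of 4 below 64 KiB), a 200000-byte capture would need four descriptors.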
Altera_Forum
Honored Contributor II
1,288 Views

Hi BadOmen, 

 

thanks for your reply. The dcfifo hint is great and will be implemented. I just can't find a suitable ST component to use for my ADC, so I will implement the Avalon-ST interface myself and import it as a user component into my SOPC system. 

 

The SGDMA example you pointed me to looks very interesting. It seems as if this isn't using the standard SGDMA which I find in my SOPC library (I don't have this "Modular SGDMA Dispatcher" module). 

I found nothing on the website or in the downloaded documentation saying that I'm not allowed to use these components. What do you think? Is it okay to use them in commercial projects? 

 

Maik
Altera_Forum
Honored Contributor II
1,288 Views

Okay, the next thing I'm thinking about is that my Nios runs at <= 50 MHz and the ADC/SGDMA/SRAM part of the system runs at a different clock speed (here 80 MHz). 

 

I think I need a clock crossing component in my SOPC design.  

 

Where do I have to connect this? Between the SRAM component and the CPU or between the Tristate Bridge and the CPU? I haven't found any corresponding SRAM examples on the net . . .  

 

Maik
Altera_Forum
Honored Contributor II
1,288 Views

Since ADCs have many different interfaces it is not included as a standard component in SOPC Builder. Creating a new one is pretty easy though since it's just a FIFO for the most part. 

 

The modular SGDMA can be used and modified in any project. It's offered as a design example so that users have an alternative to the SGDMA that's shipped with SOPC Builder. Since it's modular you can implement variant DMAs easily by re-using the master logic and just replacing the controller (dispatcher). The master blocks are the difficult ones to implement due to all the features and the tuning that's involved so usually the customization is left in the controller portion of the DMA. If you copy the IP directory from the design example into your own project then the cores will appear the next time you open SOPC Builder in the category "Modular SGDMA". 

 

The SRAM component must be on the same clock domain as the tri-state bridge so if you want to operate the Nios II core on a different clock domain then the clock crossing bridge would need to be between the tri-state bridge and CPU (at a minimum). One question, does the CPU need to access the sampled data in memory? I ask because any time you introduce clock crossing you increase the memory read latency by around 12 clock cycles so if you are looking to process the samples you would probably be better off keeping the CPU on the same domain. Examples will be hard to find since cranking up the memory latency is normally bad for embedded systems.
Altera_Forum
Honored Contributor II
1,288 Views

Thanks again for the helpful conversation! 

 

Well, yes, I would like the sampled data in memory to be processed by the CPU. I agree that at 80 MHz the CPU can keep that pace and it would be possible to run the whole system at one speed. 

 

But what if I go a step further with my ADC/SGDMA/SRAM clock and run it at say 125MHz? I can imagine that the Nios CPU won't take that high speed at some point (in a Cyclone III device).  

 

I know that I can't process the ADC data continuously with the CPU when the CPU is slower, but I can take a fast shot of a couple of thousand samples and after that I can process the data at a slower pace with the CPU (maybe an FFT implemented in software in order to save the LEs that I would need for a hardware FFT module).  

 

I have relatively fast signals to process, but the results of the processing do not have to come faster than, say, 10 Hz. So a combination of a fast sample shot and a relatively slow processing of the data after that would be what I need. Right now I'm trying to test whether this works as I imagine it. 

 

Please feel free to give me advice if, in your opinion, I am not thinking in the right direction. 

 

Regards, 

Maik
Altera_Forum
Honored Contributor II
1,288 Views

125MHz is doable on Cyclone III, especially with a fast speed grade device (I've hit around 180MHz using a -6 speed grade). Since your samples come in at 80MHz, that's the maximum fill rate to the memory. So bumping up the ADC and everything else isn't really going to speed up how fast you can fill the memory since the ADC won't be able to keep up. Since you plan on doing FFT in software you'll need all the performance you can get as it's really slow that way. 

 

So if you keep the CPU and SRAM on the same domain and clock it as high as possible, that will give you the most software computational throughput. Things like streaming data, DMAs, etc... usually are not affected by latency, so if you do your clock crossing there then you will be able to keep the CPU clocked fast and with low latency access to SRAM as well as isolating the other logic to some other clock domain. If you have a lot of stuff mastering SRAM you may find the fanin-out reduces the Fmax and that's where inserting pipeline bridges may be needed. Here is a doc you can take a look at for more details: http://www.altera.com/literature/hb/nios2/edh_ed5v1_03.pdf
Altera_Forum
Honored Contributor II
1,288 Views

Hi, 

 

okay, the show goes on . . .  

 

Today I implemented the SOPC system shown in the attachment. I designed my own ADC streaming component (ads62_avalon_st) and connected it with the SGDMA components provided in the example that BadOmen gave to me. 

 

I use the following code to try to start the SGDMA: 

#include <io.h>
#include <stdio.h>
#include <stdlib.h>
#include "sys/alt_dma.h"
#include "system.h"
#include <unistd.h>
#include "sys/alt_irq.h"
#include "altera_avalon_spi.h"
#include "altera_avalon_spi_regs.h"
#include "altera_avalon_pio_regs.h"
#include "i2c_defs.h"
#include "e_adc_0100_firmware_defs.h"
#include "descriptor_regs.h"
#include "csr_regs.h"
#include "response_regs.h"
#include "sgdma_dispatcher.h"

static volatile int rx_done = 0;
char* OK = "-> OK!\n";
char* FAILED = "-> FAILED!\n";

// flag used to determine when all the transfers have completed
volatile int sgdma_interrupt_fired = 0;

static void sgdma_complete_isr(void *context, alt_u32 id)
{
    sgdma_interrupt_fired = 1;
    clear_irq(SGDMA_DISPATCHER_CSR_BASE);
}

int main()
{
    int total_error_counter = 0;
    int current_error_counter = 0;
    int testNumber = 0;
    unsigned int potValue = 0;
    unsigned int pioIn = 0;
    unsigned int pioOut = 0;
    char c = 0x00;
    int i = 0;

    sgdma_standard_descriptor a_descriptor;
    sgdma_standard_descriptor * a_descriptor_ptr = &a_descriptor; // using this instead of 'a_descriptor' throughout the code
    unsigned long control_bits = DESCRIPTOR_CONTROL_TRANSFER_COMPLETE_IRQ_MASK;

    alt_irq_register(SGDMA_DISPATCHER_CSR_IRQ, NULL, sgdma_complete_isr); // register the ISR
    enable_global_interrupt_mask(SGDMA_DISPATCHER_CSR_BASE); // turn on the global interrupt mask in the SGDMA

    construct_standard_st_to_mm_descriptor(a_descriptor_ptr, (alt_u32 *)SRAM_BASE, 0xff, control_bits);

    while (1)
    {
        c = getchar();

        if (c == START_SGDMA_TRANSFER)
        {
            if (write_standard_descriptor(SGDMA_DISPATCHER_CSR_BASE, SGDMA_DISPATCHER_DESCRIPTOR_SLAVE_BASE, a_descriptor_ptr) != 0)
            {
                printf("Failed to write descriptor to the descriptor SGDMA port.");
            }
        }

        if (sgdma_interrupt_fired == 1)
        {
            for (i = 0; i < 0xff; i++)
            {
                printf("%x\n", IORD_32DIRECT(SRAM_BASE, i));
            }
            sgdma_interrupt_fired = 0;
        }

        if (c == OUTPUT_RAMP_PATTERN)
        {
            printf("ADS62 outputs digital ramp.\n");
            IOWR_ALTERA_AVALON_SPI_CONTROL(ADS62_SPI_BASE, ALTERA_AVALON_SPI_CONTROL_SSO_MSK);
            IOWR_ALTERA_AVALON_SPI_TXDATA(ADS62_SPI_BASE, 0x0002);
            IOWR_ALTERA_AVALON_SPI_CONTROL(ADS62_SPI_BASE, 0x0);
            usleep(10);
            IOWR_ALTERA_AVALON_SPI_CONTROL(ADS62_SPI_BASE, ALTERA_AVALON_SPI_CONTROL_SSO_MSK);
            IOWR_ALTERA_AVALON_SPI_TXDATA(ADS62_SPI_BASE, 0x1614);
            IOWR_ALTERA_AVALON_SPI_CONTROL(ADS62_SPI_BASE, 0x0);
            usleep(10);
        }

        if (c == SRAM_TEST)
        {
            // Test SRAM.
            printf("\n\nTesting SRAM with Counter 0x0 -> 0xFFFF.\n");

            // Write testpattern into memory.
            for (testNumber = 0; testNumber < 0xffff; testNumber++)
            {
                IOWR_32DIRECT(SRAM_BASE, testNumber, testNumber);
            }

            // Read test pattern from memory.
            for (testNumber = 0; testNumber < 0xffff; testNumber++)
            {
                if (IORD_32DIRECT(SRAM_BASE, testNumber) != testNumber)
                {
                    current_error_counter++;
                    total_error_counter++;
                    printf("SRAM Read Error at Address: %x\n", testNumber);
                }
            }

            if (current_error_counter == 0)
            {
                printf("Counter Pattern %s\n\n", OK);
            }
            else
            {
                printf("Counter Pattern with %d Errors %s\n\n", current_error_counter, FAILED);
            }
        }

        c = 0x00;
    }
}

 

Unfortunately this is not working yet (it seems that the interrupt after the SGDMA transfer is not triggered). However, the SRAM test is working as expected.  

 

I will continue tomorrow but maybe, in the meantime, somebody has an idea that will help me to get this running. 

 

Thanks,  

Maik
Altera_Forum
Honored Contributor II
1,288 Views

Unless you need to readback the error/channel numbers then you probably don't need the response port enabled on the dispatcher block. It contains a FIFO so if it fills up and you don't empty it the DMA will become blocking. 

 

It looks like you are working with 32-bit data. If you set up the write master for "Full word access only" mode and passed in a transfer length of 255 bytes I could see that being problematic. That mode requires that you pass in a transfer length that is a multiple of the word size (I'm assuming 4 bytes per word in your case, so 256 would be the closest valid length). Some of your code looks like you are using 0xFF as if the transfer length is in terms of words; it's actually bytes, so if you wanted 0xFF transfers and the data is 32-bit, use 0x3FC instead.
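The length arithmetic from this reply, as a minimal sketch. The helper names are hypothetical and the 4 bytes per sample follow the 32-bit assumption made above.

```c
#include <stdint.h>

#define BYTES_PER_SAMPLE 4u /* 32-bit ADC words, as assumed in the reply */

/* Convert a sample count into the byte length the descriptor expects. */
uint32_t transfer_length_bytes(uint32_t num_samples)
{
    return num_samples * BYTES_PER_SAMPLE;
}

/* In "Full word access only" mode the length must be a whole number of words. */
int is_valid_full_word_length(uint32_t length_bytes)
{
    return length_bytes != 0 && (length_bytes % BYTES_PER_SAMPLE) == 0;
}
```

So 0xFF samples give a length of 0x3FC bytes, while a length of 0xFF (255) bytes is not a whole number of 32-bit words.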
Altera_Forum
Honored Contributor II
1,288 Views

Hi, 

 

I think I'm on a good way to finally get what I need. 

 

The tip to turn the response port "off" solved the problem that nothing happened. But now, when I read the SRAM memory content triggered by the interrupt of the SGDMA, I can verify that the ADC data is written to the SRAM but with a "stride" of 4. 

 

So the ADC content (32 bit wide) is written to address 0x00, 0x04, 0x08, . . . . 

 

In between, the memory is untouched. At addresses 0x01 - 0x03, 0x05 - 0x07, ... I can still read the contents that my SRAM test routine has written to the memory. 

 

I played a little bit with the parameters of the SOPC blocks, but I had no success in solving this issue. In the attachments you can see the settings of each block. 

 

Maik.
Altera_Forum
Honored Contributor II
1,288 Views

In the part where you read back the data you are using IORD_32DIRECT, which means you have to pass in byte offsets (and they must be a multiple of 4 since Nios II does not support unaligned accesses). i.e. use this instead: 

 

if (IORD_32DIRECT(SRAM_BASE, 4*testNumber) != testNumber) 

 

I saw some other spots where you were using IOWR_32DIRECT, you probably will have to do the same 4* there too. 

 

For example if I wanted to read back words from 'SSRAM_BASE' I would do this: 

 

IORD_32DIRECT(SSRAM_BASE, 0) 

IORD_32DIRECT(SSRAM_BASE, 4) 

IORD_32DIRECT(SSRAM_BASE, 8) 

etc...
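A read-back loop following this pattern might look like the sketch below. To keep it runnable on a PC, IORD_32DIRECT is replaced by a stand-in function that reads from an ordinary buffer; on the target you would include <io.h> and call the real macro with SRAM_BASE instead. Both function names are made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Host-side stand-in for IORD_32DIRECT(base, byte_offset). This is NOT the HAL
 * macro, only a mock so that the loop can be tried without hardware. */
static uint32_t mock_iord_32direct(const uint8_t *base, uint32_t byte_offset)
{
    uint32_t value;
    memcpy(&value, base + byte_offset, sizeof value);
    return value;
}

/* Copy num_words 32-bit samples out of the buffer, stepping 4 bytes per word. */
void read_samples(const uint8_t *base, uint32_t *dest, uint32_t num_words)
{
    for (uint32_t i = 0; i < num_words; i++) {
        dest[i] = mock_iord_32direct(base, 4u * i); /* byte offset, not word index */
    }
}
```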
Altera_Forum
Honored Contributor II
1,288 Views

So if I understand you correctly, it is not the SGDMA that is addressing the SRAM memory in a "strange" way but the read-back in my routine? 

 

So the SGDMA placed the 32-bit ADC data in memory starting at address 0x00 and running up to 0x00 + length? 

 

Regards, 

Maik
Altera_Forum
Honored Contributor II
1,288 Views

I believe so. If you pass in offsets 0, 1, 2, 3, 4, 5, etc... to the cache bypassing _32DIRECT macros what I would expect is aliasing/undefined behavior since the CPU and fabric do not support unaligned accesses. So the DMA in x32 bit configuration with stride turned off (stride = 1) is supposed to walk through the memory space starting at byte offset 0, then 4, 8, 12, 16, etc... 

 

The first word from your ADC will be placed at address 0x0-0x3 

The second word from your ADC will be placed at address 0x4-0x7 

The third word from your ADC will be placed at address 0x8-0xB 

etc... 

 

The last access would be at address "length-4" - "length -1" assuming the entire transfer started at address 0x0
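The byte range occupied by each word can be written down as a small helper. This is only an illustration of the arithmetic above for 32-bit words with stride 1; the type and function names are invented for the sketch.

```c
#include <stdint.h>

typedef struct {
    uint32_t first; /* first byte address of the word */
    uint32_t last;  /* last byte address of the word (inclusive) */
} byte_range;

/* Byte range of the n-th 32-bit word (n counted from 0) of a transfer that
 * starts at byte address start. */
byte_range word_byte_range(uint32_t start, uint32_t n)
{
    byte_range r;
    r.first = start + 4u * n;
    r.last = r.first + 3u;
    return r;
}
```

For a transfer of 0x3FC bytes starting at 0x0, the last word is n = 254, which lands at 0x3F8-0x3FB, i.e. "length-4" to "length-1".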
Altera_Forum
Honored Contributor II
1,288 Views

This topic bothers me so much that I have to discuss it a little bit further even if it's a quarter past 9 in the evening . . . ;-) 

 

I still don't understand this behavior. 

 

I have an SRAM memory that has 19 address bits, which define a memory address space from 0 to 524287. My memory data sheet tells me that at each address I can put 32 bits of data (+ some bits for parity, etc). 

 

This leads to a total memory size of 524288 * 32 bits -> 16.7 Mbit, which is 2 MByte, which is exactly what I told my hardware designer I wanted . . . 

 

So when I use my _32DIRECT macros I still believe that when I write at address SRAM_BASE + offset, I write 32 bits of data at this very address. And this I think I can do up to 524287. If I increase the address by 1 I really believe that I'm on the next address to write the next 32 bits. But maybe I'm totally wrong on this, which I will try to confirm tomorrow. 

 

I also don't understand, why I can read like this from the SRAM: 

 

IORD_32DIRECT(SRAM_BASE, 0x01) 

 

and get the data that I have written with IOWR_32DIRECT(SRAM_BASE, 0x01, 0x01), whereas if I write IORD_32DIRECT(SRAM_BASE, 0x00) or IORD_32DIRECT(SRAM_BASE, 0x04) I get the full 32 bits of data the SGDMA has written.  

 

Somehow this is not clear to me . . . .
Altera_Forum
Honored Contributor II
1,288 Views

A x32 SRAM that has 19 address bits should cover 2MB of data storage space. They only provide 19 bits instead of 21 because they provide byte enables as well. So when the access goes off-chip to the memory the address is supplied as a word address and the individual byte lanes for each word are accessed using the byte enable signal to qualify the writes. 
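The 19-versus-21 bit relationship is just arithmetic, sketched here with hypothetical helper names: the word address bits select one of 2^19 words, and the byte enables stand in for the two extra address bits a byte address would need.

```c
#include <stdint.h>

/* Total capacity in bytes of a memory with the given number of word address
 * bits and data width in bits. */
uint32_t sram_size_bytes(unsigned word_address_bits, unsigned data_width_bits)
{
    return ((uint32_t)1 << word_address_bits) * (data_width_bits / 8u);
}

/* Number of address bits needed to address the same memory byte by byte. */
unsigned byte_address_bits(unsigned word_address_bits, unsigned data_width_bits)
{
    unsigned bytes_per_word = data_width_bits / 8u;
    unsigned extra_bits = 0;

    /* log2 of the bytes per word, found by counting up */
    while ((1u << extra_bits) < bytes_per_word) {
        extra_bits++;
    }
    return word_address_bits + extra_bits;
}
```

For 19 word address bits and a 32-bit data bus this gives 2097152 bytes (2 MB) and 21 byte address bits.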

 

So the behavior in SOPC Builder is that all addresses provided by a master are byte addressed. All addresses into a slave are word addressed with byte lanes controlled by the byte enable signal. This means that the access Nios II performs to the fabric should always be an address that is a multiple of 4 (i.e. the two LSBs are always low) and the byte enables control the lanes being accessed. Since the memory uses word addressing what the fabric does is takes the Nios II address line and performs a right shift of two bits and sends that into the slave port of the memory. The idea of the Avalon specification is you just need to follow it for masters and slaves and not have to care about what it is doing internally. 

 

The reason why the master needs to use byte addressing is because you can have different width masters in your system so if everything was word addressed the increments necessary for each master would map to different locations in memory. 

 

This read "IORD_32DIRECT(SRAM_BASE, 0x01)" will most likely alias into offset 0 of the SRAM because the fabric will take the 0x1 and shift it right two bits resulting in an SRAM word address offset of 0. The same aliasing would happen if you use IORD_32DIRECT for offsets 2 and 3 as well, both of those should alias into address 0 which is not what your intention is. Really at the end of the day if you stick to byte addresses in your C code, the pointers you send to the DMA, specify the transfer length in bytes (four bytes for every ADC word) you should end up with a working system.
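The aliasing described here comes down to a right shift by two bits. A minimal model, assuming a 32-bit slave as in this thread (the function name is made up):

```c
#include <stdint.h>

/* Word address seen by a 32-bit slave for a given master byte offset:
 * the fabric drops the two least significant bits. */
uint32_t slave_word_address(uint32_t byte_offset)
{
    return byte_offset >> 2;
}
```

Byte offsets 0, 1, 2 and 3 all map to word 0, and only offset 4 reaches word 1, which is why stepping the offset by 1 in the C code does not step to the next 32-bit word.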
Altera_Forum
Honored Contributor II
1,288 Views

Thanks again BadOmen, but honestly you got me now at a point where I have to stop and continue tomorrow . . . ;) 

 

Tomorrow, I will analyse this last post of yours line by line and at the end of the day I will have understood (hopefully) what they mean. I'm sure you are right! 

 

Maik
Altera_Forum
Honored Contributor II
1,288 Views

Hi, 

 

in order to understand what is really going on on the address lines to the SRAM, I made some Signal Tap measurements. The results are a basis for a new discussion: 

 

In the attachment I put three screenshots. One shows the SRAM test with the IOWR_32DIRECT macro used. I write 32-bit test data to the first 16 addresses of the SRAM (0x00 - 0x0F). 

Everything works as expected. The SRAM is addressed from 0x00 to 0x0F and the data is successfully written. 

 

The second screenshot shows the SGDMA transfer, also of 16 write cycles. It is very impressive to see that, in the view with the same scale as the IOWR_32DIRECT view, the performance is very much improved. 

 

The third screenshot is zoomed into the SGDMA transfer. There, one can see that the first two address bits are unused, which leads exactly to the behaviour that I described earlier when I read the SRAM back with IORD_32DIRECT. 

 

BUT, I don't understand why the SGDMA behaves this way (even if I look at your last post from yesterday). My SRAM has a size of 512kx32, which I can never fully use with the kind of addressing the SGDMA is doing. However, if I use the _32DIRECT macros, I can use the whole memory space. 

 

You are right that if I do the addressing in the IORD_32DIRECT macro like this: (SRAM_BASE, 4*offset), I can read the correct values that the SGDMA has written to the memory, but all in all I can only use a quarter of the available memory of my system. 

 

If I'm wrong at some point or if there is a solution for this problem, please let me know. 

 

Regards, 

Maik
Altera_Forum
Honored Contributor II
1,288 Views

You hooked up the SSRAM off-chip address bits 18..0, it seems. The tri-state address bus is byte addressed, but since you have a x32 SSRAM you need to connect address bits 20..2 to your 19 off-chip address lines. It happens to be working with IOWR/RD_32DIRECT because there are two issues cancelling out. 

 

The tri-state addresses are byte addressed because it needs to work with multiple off-chip devices with varying widths. See page 54 in this doc for more details: http://www.altera.com/literature/manual/mnl_avalon_spec.pdf 

 

In a nutshell you probably need to take the 21 bit SSRAM address coming out of SOPC builder and right shift it 2 bits and connect it to the SSRAM. Then you would have to fix the code performing the IORD/WR_32DIRECT accesses. You can take a look at this design to see what to do at the top level to ensure you are sending the correct address bits off-chip: http://www.altera.com/support/examples/nios2/exm-high-perf-bridge.html
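The two hookups can be compared with a small model. This is only a sketch of the explanation in this reply, assuming the tri-state bridge puts the byte address on its address bus unchanged (which matches the Signal Tap observation earlier in the thread); the function names are invented.

```c
#include <stdint.h>

#define SRAM_ADDR_MASK 0x7FFFFu /* 19 off-chip address lines */

/* Off-chip lines wired to byte-address bits 18..0 (the suspected hookup). */
uint32_t sram_word_miswired(uint32_t byte_address)
{
    return byte_address & SRAM_ADDR_MASK;
}

/* Off-chip lines wired to byte-address bits 20..2 (the recommended hookup). */
uint32_t sram_word_correct(uint32_t byte_address)
{
    return (byte_address >> 2) & SRAM_ADDR_MASK;
}
```

With the first hookup, DMA writes to byte addresses 0, 4, 8 land in SRAM words 0, 4, 8, which is the observed stride of 4, and byte addresses above 0x7FFFF wrap around. With the second, the same writes land in words 0, 1, 2 and the top byte address 0x1FFFFC reaches the last word, 0x7FFFF.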
Altera_Forum
Honored Contributor II
1,288 Views

Ah, okay, finally a solution is in sight. 

 

Honestly, I still have problems finding my way through the internal logic of the SOPC system, but by reading the provided documentation I think I can handle it. 

 

I also found this document which seems to be very helpful for this issue: http://www.altera.com/literature/ug/ug_sopc_builder.pdf

 

Thanks again, 

Maik