use descriptor chain in SGDMA

Altera_Forum

Hi all 

 

My current goal is to use SGDMA (ST-to-MM) to transfer video data to a memory. 

Later on, I'll replace the memory with the PCIe IP.

 

My problem is how to use a descriptor chain in SGDMA.

In Qsys, the descriptor read and write masters of the SGDMA are connected to an on-chip memory (mem A).

The Avalon-ST sink is connected to a test pattern generator.

The m_write port, which is an Avalon-MM master, is connected to another on-chip memory (mem B).

Basically, my idea is to transfer data from the test pattern generator to mem B.

 

In Nios II, I write a simple C program doing the following things: 

1) open SGDMA device with the function "alt_avalon_sgdma_open" 

2) reset SGDMA with the macro IOWR_ALTERA_AVALON_SGDMA_CONTROL 

3) register callback function with the function "alt_avalon_sgdma_register_callback" 

4) write descriptor with the function "alt_avalon_sgdma_construct_stream_to_mem_desc" 

alt_avalon_sgdma_construct_stream_to_mem_desc(&sgdma_desc[0], &sgdma_desc[1], buf, 0, 0);

5) start sgdma transfer by calling the function "alt_avalon_sgdma_do_async_transfer" 

 

This simple program works well. 

I see a control packet on the Avalon-ST interface, and this packet is copied to mem B successfully.

However, if I try to add a second descriptor, the program gets stuck.

alt_avalon_sgdma_construct_stream_to_mem_desc(&sgdma_desc[0], &sgdma_desc[1], buf, 0, 0);
alt_avalon_sgdma_construct_stream_to_mem_desc(&sgdma_desc[1], &sgdma_desc[2], buf2, 0, 0);

My program seems to be trapped when I call "alt_avalon_sgdma_construct_stream_to_mem_desc" twice. 

 

Does anyone know how to use a descriptor chain in SGDMA correctly?

 

Thanks.
Altera_Forum

How did you declare your sgdma_desc table? 

It needs to be big enough to hold all your descriptors (3 in your case, because the second one needs to point to a third one that will stop the SGDMA). 

It needs to be in a memory connected to the SGDMA descriptor read and write masters (mem A) 

It needs to be in a memory that the Nios CPU can write to (i.e. the Nios' data master must be connected to mem A). 
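
The "one extra descriptor" rule above can be sketched on a host PC with a simplified model. The struct below is a hypothetical stand-in, not the real alt_sgdma_descriptor layout, and build_chain and run_chain are made-up helper names. The model only assumes that the hardware keeps following next pointers while a descriptor is marked as owned by hardware, and stops at the first one that is not.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a descriptor; NOT the real alt_sgdma_descriptor. */
typedef struct model_desc {
    struct model_desc *next;  /* next descriptor in the chain */
    uint32_t *write_addr;     /* destination buffer for this transfer */
    int owned_by_hw;          /* 1: hardware may process it, 0: chain stops */
} model_desc;

/* Build a chain for n_bufs buffers. The table must hold n_bufs + 1 entries:
 * the last entry is the terminator that stops the hardware. */
void build_chain(model_desc *table, uint32_t **bufs, size_t n_bufs)
{
    for (size_t i = 0; i < n_bufs; i++) {
        table[i].next = &table[i + 1];
        table[i].write_addr = bufs[i];
        table[i].owned_by_hw = 1;
    }
    /* Terminator: never handed to the hardware. */
    table[n_bufs].next = NULL;
    table[n_bufs].write_addr = NULL;
    table[n_bufs].owned_by_hw = 0;
}

/* Walk the chain the way the modelled hardware would.
 * Returns the number of descriptors processed. */
size_t run_chain(model_desc *first)
{
    size_t done = 0;
    for (model_desc *d = first; d != NULL && d->owned_by_hw; d = d->next) {
        d->owned_by_hw = 0;  /* hand the descriptor back to software */
        done++;
    }
    return done;
}
```

With two buffers the table needs three entries: table[0] points to table[1], table[1] points to table[2], and table[2] is the terminator. If every call passes the same descriptor as both "current" and "next", the chain never advances past the first entry.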

 

What is the memory layout for your software? Which memory is used for text, data, heap, stack and bss?
Altera_Forum

Hi Daixiwen 

Thank you for the reply. 

 

I use an on-chip memory for storing the descriptor. 

The address of the on-chip memory is from 0x00040000 to 0x0004ffff. 

The Nios II (data_master) and SGDMA (descriptor_read and descriptor_write) are connected to the on-chip memory.  

 

I declare my descriptor table in the following code. 

sgdma_desc = (alt_sgdma_descriptor*)ONCHIP_DESC_MEM_BASE;  

ONCHIP_DESC_MEM_BASE is defined in system.h as follows: 

#define ONCHIP_DESC_MEM_BASE 0x40000

 

As for your last question ("which memory is used for text, data, heap, stack and bss?"), I don't know how to find this information for my program. Could you please give me some pointers?

 

Thanks.
Altera_Forum

You need to make sure the data read/written by the nios isn't just coming from the data cache. 

For the descriptors it may be easiest to dual-port the internal memory to a tightly-coupled data port (from the Nios) and the Avalon bus (for everything else).
Altera_Forum

IIRC the alt_avalon_sgdma_construct_stream_to_mem_desc function ensures that the data cache is flushed correctly. 

Your settings look fine. The different memory regions (text, data, heap, stack, bss) are defined in the BSP settings, but I was only asking in case you were using a table defined as a global variable, or a buffer allocated with malloc(). As you are using a pointer set directly to the correct address as defined in system.h, it should be fine.

Could you nevertheless have a look at your BSP settings and check that no memory region is currently assigned to your on-chip memory? Maybe your software is overwriting the descriptors.
Altera_Forum

Hi dsl 

 

To rule out problems caused by the data cache, I changed the CPU from Nios II/f to Nios II/e.

With the Nios II/f, I also tried setting bit 31 of the address, and using tightly coupled memory, in order to bypass the cache.

However, the problem is still the same.

(I'm not sure that what I've done with the tightly coupled memory is correct. I did exactly what this document describes: http://www.altera.com/literature/tt/tt_nios2_tightly_coupled_memory_tutorial.pdf)

Anyway, I learned a lot about how to bypass the data cache. Thank you.
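
For readers unfamiliar with the bit 31 trick mentioned above: on a Nios II/f without an MMU, a load or store whose address has bit 31 set bypasses the data cache and reaches the same physical location as the address with bit 31 clear. A minimal sketch of the address arithmetic follows; the macro and function names are made up for illustration. I believe the HAL's alt_remap_uncached() does the equivalent for you, but check your BSP headers.

```c
#include <stdint.h>

/* Hypothetical name; bit 31 selects the cache-bypassing alias. */
#define NIOS2_BYPASS_DCACHE_BIT 0x80000000u

/* Address the CPU should use to access 'addr' without the data cache. */
uint32_t uncached_alias(uint32_t addr)
{
    return addr | NIOS2_BYPASS_DCACHE_BIT;
}

/* Plain address, e.g. the one to write into a descriptor for the SGDMA. */
uint32_t cached_alias(uint32_t addr)
{
    return addr & ~NIOS2_BYPASS_DCACHE_BIT;
}
```

For the descriptor memory at 0x00040000, the CPU-side uncached alias would be 0x80040000. The alias is only meaningful to the CPU, so it is probably safest to give the SGDMA the address with bit 31 clear.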
Altera_Forum

The problem is solved. I added a delay after starting the SGDMA transfer, and the descriptor chain now works. My guess is that starting SGDMA transfers too frequently causes this problem.

 

The modified code is as follows: 

while(1) {
    // clear buf
    memset(buf, 0, sizeof(buf));
    // write single descriptor
    alt_avalon_sgdma_construct_stream_to_mem_desc(&sgdma_desc[0], &sgdma_desc[1], buf, 0, 0);
    alt_avalon_sgdma_construct_stream_to_mem_desc(&sgdma_desc[1], &sgdma_desc[2], &buf, 0, 0);
    // enable SGDMA
    alt_avalon_sgdma_do_async_transfer(sgdma_dev, &sgdma_desc[0]);
    usleep(10000); // must add this sleep
}
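
A fixed sleep works, but it only guesses how long the transfer takes. An alternative is to poll until the controller reports idle before rebuilding the descriptors. The sketch below is host-testable and deliberately does not call the HAL: the status source is passed in as a function pointer, the busy mask value 0x10 is an assumption about the SGDMA status register, and fake_status is a made-up test double.

```c
#include <stdint.h>

/* Assumed position of the busy bit in the status word. */
#define MODEL_STATUS_BUSY 0x10u

/* Callback that returns the current status word. */
typedef uint32_t (*status_reader)(void *ctx);

/* Poll until the busy bit clears.
 * Returns 0 when idle, -1 if still busy after max_polls reads. */
int wait_until_idle(status_reader read_status, void *ctx, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if ((read_status(ctx) & MODEL_STATUS_BUSY) == 0)
            return 0;
    }
    return -1;
}

/* Test double: reports busy for the next *ctx reads, then idle. */
uint32_t fake_status(void *ctx)
{
    unsigned *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return MODEL_STATUS_BUSY;
    }
    return 0;
}
```

On the target, the reader callback would wrap a status register read of the SGDMA, and the call would replace the usleep() at the end of the loop. The callback registered with alt_avalon_sgdma_register_callback is another place to signal completion.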
Altera_Forum

For this design, I have a last question about the data cache. 

 

Let me describe this question first. 

When I use NiosII/e (without data cache), my SGDMA program works fine. The program copies a control packet and a video data packet to an on-chip memory by using SGDMA and descriptor chain. I can see the correct data with the following code in NiosII console. 

printf("ctrl_buf:\n");
printf("%08x %08x %08x\n", IORD(ctrl_buf, 0), IORD(ctrl_buf, 1), IORD(ctrl_buf, 2));
printf("video_buf:\n");
for(int n = 0; n < 769; n++) {
    if(n % 10 == 0)
        printf("", n);
    printf("%08x ", IORD(video_buf, n));
    if(n % 10 == 9) {
        printf("\n");
    }
}

 

This is how I declare variables for ctrl_buf and video_buf 

alt_u32 ctrl_buf[3] = {0};
alt_u32 video_buf[769] = {0};

 

However, when I run the same program on a Nios II/f (with data cache), the data printed in the Nios II console is partly different. (The SGDMA transfers the same number of bytes whether Nios II/f or Nios II/e is used, so I believe the SGDMA transfer itself still works.)

I already use the IORD macro to bypass the data cache, so I don't understand how this happens.
Altera_Forum

Are your ctrl_buf and video_buf tables local or global variables? If they are local, they are on the stack. In that case, because the stack is also used for other local variables, it is possible that the data cache, which holds the memory contents from before the SGDMA transfer, is flushed back to memory after the transfer, overwriting the new contents with the old.

Using the stack for uncached memory areas can be very tricky and should be avoided if possible. Try to use either global variables or buffers allocated with malloc() (or, even better, use alt_uncached_malloc(), and then you can access the table contents directly without having to use IORD).
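
The overwrite described above can be reproduced on a host PC with a toy model of a single write-back cache line. Everything below is a made-up illustration (model_line, cpu_write, dma_write, flush_line), assuming a 32-byte line and a cache that is not informed about DMA writes.

```c
#include <stdint.h>
#include <string.h>

#define LINE_WORDS 8  /* one 32-byte line of 32-bit words */

/* Toy model of one write-back cache line. */
typedef struct {
    uint32_t data[LINE_WORDS];
    int valid;
    int dirty;
} model_line;

/* CPU store through the cache (write-allocate, write-back). */
void cpu_write(model_line *line, const uint32_t *mem, int idx, uint32_t value)
{
    if (!line->valid) {
        memcpy(line->data, mem, sizeof line->data);  /* line fill */
        line->valid = 1;
    }
    line->data[idx] = value;
    line->dirty = 1;
}

/* DMA writes straight to memory; the cache does not see it. */
void dma_write(uint32_t *mem, const uint32_t *src)
{
    memcpy(mem, src, LINE_WORDS * sizeof *mem);
}

/* Flush: write the line back if dirty, then invalidate it. */
void flush_line(model_line *line, uint32_t *mem)
{
    if (line->valid && line->dirty)
        memcpy(mem, line->data, sizeof line->data);
    line->valid = 0;
    line->dirty = 0;
}
```

If the CPU writes one word, the DMA then fills the line in memory, and the dirty line is written back afterwards, the whole DMA payload for that line is replaced by the stale cached copy. Flushing the line before the DMA starts leaves nothing to write back later, so the payload survives.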
Altera_Forum

Hi Daixiwen 

Thank you for the suggestions. 

 

My design works perfectly when I use the function alt_uncached_malloc() to allocate the two buffers. The correct data is printed even if I access the buffers without using IORD.

By the way, in my previous program I declared these two buffers outside all functions, so I think they were global variables.

So I guess using global variables alone may not solve this problem.
Altera_Forum

IIRC (from other posts here) even alt_uncached_malloc() isn't guaranteed to return memory that doesn't share a cache line with other malloc'ed memory (or the malloc system's red tape), nor does it ensure that the cache doesn't already contain data for the address range (which might be written out later).

Using globals would work, provided you mark them with __attribute__((aligned(32))), assuming 32-byte cache lines.
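
One detail worth checking with the aligned attribute: it fixes where a buffer starts, not where it ends. A buffer of 769 32-bit words is 3076 bytes, which is not a multiple of 32, so its last cache line can still be shared with whatever the linker places next. The sketch below uses made-up helper names and assumes a 32-byte line and non-zero lengths.

```c
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE 32u  /* assumed data cache line size in bytes */

/* Round a size up to a whole number of cache lines. */
size_t round_up_to_line(size_t nbytes)
{
    return (nbytes + DCACHE_LINE - 1u) / DCACHE_LINE * DCACHE_LINE;
}

/* Returns 1 if byte ranges [a, a+a_len) and [b, b+b_len) touch a common
 * cache line, 0 otherwise. Both lengths must be non-zero. */
int shares_cache_line(uintptr_t a, size_t a_len, uintptr_t b, size_t b_len)
{
    uintptr_t a_first = a / DCACHE_LINE;
    uintptr_t a_last = (a + a_len - 1u) / DCACHE_LINE;
    uintptr_t b_first = b / DCACHE_LINE;
    uintptr_t b_last = (b + b_len - 1u) / DCACHE_LINE;
    return a_first <= b_last && b_first <= a_last;
}
```

Padding the buffer to round_up_to_line(769 * 4) = 3104 bytes, in addition to aligning its start, keeps the neighbouring variable off the buffer's last line.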
Altera_Forum

Hi dsl 

Thank you for your reply. 

 

I modified my declarations according to your suggestion.

alt_u32 __attribute__((aligned(32))) ctrl_buf[3];
alt_u32 __attribute__((aligned(32))) video_buf[769];

I've also checked my NiosII CPU setting and found that the data cache line size is 32 bytes. 

However, in my experiment, the data in video_buf after SGDMA transfer are not completely correct...
Altera_Forum

The aligned attribute ensures that the different buffers do not end up on the same cache line, but some parts of video_buf could still be cached, and you can get the wrong contents.

Try calling alt_dcache_flush((void*)video_buf, 769*sizeof(alt_u32)); before the SGDMA transfer. This writes the cache contents back to video_buf but, more importantly, invalidates the cache for this area. That way you can be sure the cache won't interfere with the data.

AFAIK Altera doesn't provide a macro to just invalidate the cache without flushing it, even though the instruction exists in assembly.
Altera_Forum

Last time I looked I didn't see an instruction to invalidate a cache line without writing it out, which is rather an omission.

I also remember someone porting NetBSD saying that there was a missing cache op, but that might have been a different one!
Altera_Forum

It is the initda instruction. A very bad choice of name, IMHO.

Altera_Forum

The descriptions of the cache op instructions aren't that clear either. 

 

A missing op is the one to mark a line valid without reading from memory.
Altera_Forum

Could any of you post sample code for the SGDMA that checks a data transfer from on-chip memory to SDRAM?

I am a newbie and have been trying to figure out how it can be done, but I have not managed to.
Altera_Forum

Hi Yufu, 

 

I am having trouble creating a descriptor chain to receive data from the network.

Could you help me solve the problem?

 

Regards, 

Hitesh
Altera_Forum

Hi Hitesh, 

 

I am now using modular SGDMA from Altera Wiki for my current project. 

So I may not be able to help you in detail.

You may check my attached zip file for SGDMA example code.

I'm not sure whether the example code works, but I studied it to understand the Altera SGDMA.
Altera_Forum

Hi all, 

 

I am trying to build an embedded device that captures Ethernet packets and copies them to DDR through a PCIe interface, and vice versa. As a first step, I am trying to develop code on a Cyclone V development board with a Nios II Gen2 core. As I am still learning, I would like to ask you all to share your knowledge on how I should proceed with my development. Please suggest any good articles, documentation on similar topics, or reference examples. I am working with the Altera 15.0.0.145 SDK. Any help would be much appreciated.