Stream-to-memory SGDMA does not write last QWord during a transfer

Altera_Forum

Hi all!

I am trying to perform a very simple transfer from an Avalon-ST bus to memory.

For this, I use the stream-to-memory SGDMA.
I build a descriptor chain that contains only one descriptor.
The descriptor and the destination data buffer are statically allocated (declared as variables).
I took care to align the destination data buffer on a 64-bit boundary.
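
The rounding used to obtain a 64-bit aligned address can be sketched in isolation. This is a minimal, host-testable sketch: `align_up` and `align_buffer_to_qword` are hypothetical helper names (not part of the Altera HAL), and `uintptr_t` is used in place of the 32-bit cast that is sufficient on a Nios II.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Altera HAL): round addr up to the
 * next multiple of alignment, which must be a non-zero power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}

/* Return the first 8-byte-aligned address inside raw. The caller must
 * over-allocate raw by 7 bytes so that the aligned region keeps the
 * requested size. */
static uint8_t *align_buffer_to_qword(uint8_t *raw)
{
    return (uint8_t *)align_up((uintptr_t)raw, 8);
}
```

The aligned pointer is at most 7 bytes past the start of the raw buffer, which is why a buffer declared with 7 spare bytes always leaves the full payload size available.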

 

Here is the software: 

 

 

--- Quote Start ---

#include <stdio.h>
#include <errno.h>
#include <string.h>

// Headers for accessing and configuring SGDMA devices and descriptors
#include <altera_avalon_sgdma.h>
#include <altera_avalon_sgdma_descriptor.h>
#include <altera_avalon_sgdma_regs.h>

// Copied from altera_avalon_tse.h
#define IORD_ALTERA_SGDMA_DESC_STATUS(base) (((IORD(base, 0x7)) >> 16) & 0xFF)
#define IORD_ALTERA_SGDMA_DESC_WRITE_ADDR(base) (IORD(base, 0x2) & 0xFFFFFFFF)
#define IORD_ALTERA_SGDMA_DESC_ACTUAL_BYTES_TRANSFERRED(base) (IORD(base, 0x7) & 0xFFFF)

#define BUFFER_SIZE 4096

// 7 spare bytes so that an 8-byte-aligned region of BUFFER_SIZE bytes always fits
alt_u8 buffer[BUFFER_SIZE + 7];
alt_sgdma_descriptor descriptor;

void SGDMA_isr(void * context)
{
    // Read SGDMA status
    fprintf(stdout, "SGDMA status: 0x%02X\n", IORD_ALTERA_AVALON_SGDMA_STATUS(SGDMA_BASE));

    // Read descriptor status
    fprintf(stdout, "descriptor status: 0x%02X\n", IORD_ALTERA_SGDMA_DESC_STATUS(&descriptor));

    // Read write address
    fprintf(stdout, "write address: 0x%08X\n", IORD_ALTERA_SGDMA_DESC_WRITE_ADDR(&descriptor));

    // Read actual bytes transferred
    fprintf(stdout, "actual bytes transfered: %i\n", IORD_ALTERA_SGDMA_DESC_ACTUAL_BYTES_TRANSFERRED(&descriptor));
}

int main(int argc, char ** argv)
{
    // Round the buffer address up to the next 8-byte boundary
    alt_u8 * aligned_buffer = (alt_u8 *) (8 * ((((alt_u32) buffer) + 7) / 8));
    alt_sgdma_dev * SGDMA_device;
    alt_u8 error_code;

    // Open SGDMA device
    SGDMA_device = alt_avalon_sgdma_open(SGDMA_NAME);
    if (SGDMA_device == NULL)
    {
        fprintf(stderr, "ERROR: Cannot open '%s' SGDMA device (error code %d: %s)\n", SGDMA_NAME, errno, strerror(errno));
        return -1;
    }

    // Register callback
    alt_avalon_sgdma_register_callback(
        SGDMA_device,
        (alt_avalon_sgdma_callback) &SGDMA_isr,
        (ALTERA_AVALON_SGDMA_CONTROL_IE_ERROR_MSK
         | ALTERA_AVALON_SGDMA_CONTROL_IE_EOP_ENCOUNTERED_MSK
         | ALTERA_AVALON_SGDMA_CONTROL_IE_DESC_COMPLETED_MSK
         | ALTERA_AVALON_SGDMA_CONTROL_IE_CHAIN_COMPLETED_MSK
         | ALTERA_AVALON_SGDMA_CONTROL_IE_GLOBAL_MSK),
        NULL
    );

    // Build descriptor chain
    alt_avalon_sgdma_construct_stream_to_mem_desc(&descriptor, NULL, (alt_u32 *) aligned_buffer, 0, 0);

    // Start SGDMA
    error_code = -alt_avalon_sgdma_do_async_transfer(SGDMA_device, &descriptor);
    if (error_code)
    {
        fprintf(stderr, "ERROR: Cannot start SGDMA device (error code %d: %s)\n", error_code, strerror(error_code));
        return -1;
    }

    // Infinite loop
    while (1);

    fprintf(stderr, "ERROR: This code should never be executed\n");
    return 0;
}

--- Quote End ---

 

 

This system has several shortcomings:
- It only receives one packet from the Avalon-ST bus, since the SGDMA is not restarted in the ISR
- I call printf in the ISR
- I use global variables so that they are accessible from both main and the ISR, instead of using a context
but this is not a problem since I am still in the process of debugging.

When I run the system and send a packet through the Avalon-ST interface, I get the following trace:

 

 

--- Quote Start ---
nios2-terminal: connected to hardware target using JTAG UART on cable
nios2-terminal: "USB-Blaster [USB 6-1.2]", device 1, instance 0
nios2-terminal: (Use the IDE stop button or Ctrl-C to terminate)

SGDMA status: 0x0E
descriptor status: 0x80
write address: 0x0800F2A8
actual bytes transfered: 40
--- Quote End ---

 

 

which is as expected:
- the bits set in the SGDMA status correspond to EOP_ENCOUNTERED, DESCRIPTOR_COMPLETED and CHAIN_COMPLETED
- the bit set in the descriptor status corresponds to TERMINATED_BY_EOP
- actual_bytes_transferred = 40 is the correct size of the packet I sent on the Avalon-ST bus
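
The decoding of the SGDMA status value can be reproduced with a small host-side sketch. The mask values are assumptions inferred from the reported value 0x0E and the three flag names given in this post; check them against `altera_avalon_sgdma_regs.h` in your BSP before relying on them. The `SKETCH_` names and `describe_sgdma_status` are hypothetical and are not HAL identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed bit positions; verify against the HAL headers of your BSP. */
#define SKETCH_STATUS_EOP_ENCOUNTERED      0x02u
#define SKETCH_STATUS_DESCRIPTOR_COMPLETED 0x04u
#define SKETCH_STATUS_CHAIN_COMPLETED      0x08u

/* Write a comma-separated list of the flag names set in status into out.
 * Names that do not fit in out_size bytes are truncated. */
static void describe_sgdma_status(unsigned status, char *out, size_t out_size)
{
    static const struct { unsigned mask; const char *name; } flags[] = {
        { SKETCH_STATUS_EOP_ENCOUNTERED,      "EOP_ENCOUNTERED" },
        { SKETCH_STATUS_DESCRIPTOR_COMPLETED, "DESCRIPTOR_COMPLETED" },
        { SKETCH_STATUS_CHAIN_COMPLETED,      "CHAIN_COMPLETED" },
    };
    size_t i;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if (status & flags[i].mask) {
            if (out[0] != '\0') {
                strncat(out, ", ", out_size - strlen(out) - 1);
            }
            strncat(out, flags[i].name, out_size - strlen(out) - 1);
        }
    }
}
```

Under these assumed masks, 0x0E decodes to exactly the three flags listed above, and a status of 0x00 yields an empty string.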

 

 

 

Unfortunately, when I dump the destination data buffer, I see that the last 8-byte word of the packet was not written.
So I used SignalTap to look at the SGDMA signals.
Here are the waves:

http://img8.imageshack.us/img8/396/sgdmahwbug.png

As can be seen, after each valid 8-byte word on the Avalon-ST bus, the SGDMA performs a write on its DMA interface with the corresponding data (except that the bytes have been reordered) and byte-enable = "11111111".
The address starts at 0x0800F2A8 (which I checked to be the address of the destination data buffer) and is then incremented by 8 after each write.
After the last word of the packet is received, ready is deasserted because the SGDMA has some work to do, starting with updating the descriptor status. So on cycle 28, the SGDMA performs a write on its descriptor interface, at address 0x0800F2A4 (which I checked to be the address of the descriptor "control", "status" and "actual_bytes_transferred" fields). So we have status = 0x80 and actual_bytes_transferred = 0x0028 = 40, which is correct.
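
The descriptor word written back by the SGDMA can be unpacked the same way the macros copied from altera_avalon_tse.h do it. A minimal sketch, assuming the layout implied by those macros (bytes transferred in bits 15..0, status in bits 23..16). The control byte in bits 31..24 is not shown in this post, so any sample word has to use a placeholder there. All helper names and the `SKETCH_` constant are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: TERMINATED_BY_EOP is bit 7 of the status byte (status = 0x80). */
#define SKETCH_DESC_STATUS_TERMINATED_BY_EOP 0x80u

/* Status byte: bits 23..16 of the last descriptor word. */
static uint8_t desc_status(uint32_t word)
{
    return (uint8_t)((word >> 16) & 0xFFu);
}

/* Actual bytes transferred: bits 15..0 of the last descriptor word. */
static uint16_t desc_actual_bytes(uint32_t word)
{
    return (uint16_t)(word & 0xFFFFu);
}

static int desc_terminated_by_eop(uint32_t word)
{
    return (desc_status(word) & SKETCH_DESC_STATUS_TERMINATED_BY_EOP) != 0;
}

/* Number of memory writes needed for actual_bytes on a bus that is
 * bus_width_bytes wide (rounded up). */
static unsigned expected_write_beats(uint16_t actual_bytes, unsigned bus_width_bytes)
{
    assert(bus_width_bytes != 0);
    return (actual_bytes + bus_width_bytes - 1u) / bus_width_bytes;
}
```

For a 40-byte packet on a 64-bit data path this gives five writes, so the buffer should contain five complete 8-byte words once the descriptor reports completion.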

 

The problem is that when the SGDMA writes the last word of the packet (cycle 21), the one flagged with EOP, the byte-enable signal is not set to "11111111" as for the previous words; it remains at "00000000". That is why the last word is not written into the memory.

This looks like a bug in the SGDMA to me, because I think I did everything right, and the SGDMA itself thinks it has transferred 40 bytes since it sets actual_bytes_transferred to 40.

Has anyone run into this?
Is it because there are cycles with valid = 0 in the middle of the packet?
Has it been fixed?
Am I wrong?

Thanks a lot!

 

 

 

- Julien
8 Replies
Altera_Forum

First you should check that you don't have any problems related to the CPU data cache when you read back the memory data. Ensure that you are using uncached accesses or that you invalidate the data cache for that memory area before you read.

Other than that, I don't see anything irregular in your waves that would explain why the last word is not written. The only strange thing I see in your code is the use of a NULL pointer instead of a second descriptor to complete the chain. In most of the example code, a second descriptor is chained to the first, even if it is only there to stop the SGDMA. I don't know how the DMA reacts to a NULL pointer as the next descriptor.
Altera_Forum

Thanks for the answer. 

I use a Nios II/e CPU, so there is no cache. 

I will try to use a second descriptor as plug.
Altera_Forum

Unfortunately, adding a plug descriptor did not change anything...

Any other ideas?
Altera_Forum

Problem solved.

The SGDMA input port "in_empty[2:0]" came from a module where it was actually never assigned.

In my opinion, it should have been stuck at "000". BTW, this is what it looked like in the waves.

But, for reasons I still don't understand, there was a side effect.

Here are the waves before the correction:

http://img233.imageshack.us/img233/9348/bugma.png

And the waves after the correction:

http://img845.imageshack.us/img845/7775/nobug.png

These waves display the signal m_write_byteenable_reg.

According to sgdma.vhd, line 1946, this signal only depends on the signals sink_stream_empty and shifti:

 

m_write_byteenable_reg <= A_EXT( ( ( ( ( ( ( (
    (A_WE_StdLogicVector((((std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000111"))), (std_logic_vector'("000000000000000000000000") & (shift7)), std_logic_vector'("00000000000000000000000000000000")))
    OR (A_WE_StdLogicVector((((std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000110"))), (std_logic_vector'("000000000000000000000000") & (shift6)), std_logic_vector'("00000000000000000000000000000000"))) )
    OR (A_WE_StdLogicVector((((std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000101"))), (std_logic_vector'("000000000000000000000000") & (shift5)), std_logic_vector'("00000000000000000000000000000000"))) )
    OR (A_WE_StdLogicVector((((std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000100"))), (std_logic_vector'("000000000000000000000000") & (shift4)), std_logic_vector'("00000000000000000000000000000000"))) )
    OR (A_WE_StdLogicVector((((std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000011"))), (std_logic_vector'("000000000000000000000000") & (shift3)), std_logic_vector'("00000000000000000000000000000000"))) )
    OR (A_WE_StdLogicVector((((std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000010"))), (std_logic_vector'("000000000000000000000000") & (shift2)), std_logic_vector'("00000000000000000000000000000000"))) )
    OR (A_WE_StdLogicVector((((std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000001"))), (std_logic_vector'("000000000000000000000000") & (shift1)), std_logic_vector'("00000000000000000000000000000000"))) )
    OR (A_WE_StdLogicVector((((std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000000"))), (std_logic_vector'("000000000000000000000000") & (shift0)), std_logic_vector'("00000000000000000000000000000000"))) ), 8);

 

The code is ugly, but it corresponds to:

if    (sink_stream_empty = "000") then m_write_byteenable_reg <= "11111111";
elsif (sink_stream_empty = "001") then m_write_byteenable_reg <= "01111111";
elsif (sink_stream_empty = "010") then m_write_byteenable_reg <= "00111111";
elsif (sink_stream_empty = "011") then m_write_byteenable_reg <= "00011111";
elsif (sink_stream_empty = "100") then m_write_byteenable_reg <= "00001111";
elsif (sink_stream_empty = "101") then m_write_byteenable_reg <= "00000111";
elsif (sink_stream_empty = "110") then m_write_byteenable_reg <= "00000011";
else                                   m_write_byteenable_reg <= "00000001";
end if;

 

For some reason, the shifti signals are not displayed correctly (wrong values and signal names in red), but still according to sgdma.vhd, lines 1931 to 1952:

all_one <= std_logic_vector'("11111111");
shift7 <= Std_Logic_Vector'(std_logic_vector'("0000000") & A_ToStdLogicVector(all_one(0)));
shift6 <= Std_Logic_Vector'(std_logic_vector'("000000") & all_one(1 DOWNTO 0));
shift5 <= Std_Logic_Vector'(std_logic_vector'("00000") & all_one(2 DOWNTO 0));
shift4 <= Std_Logic_Vector'(std_logic_vector'("0000") & all_one(3 DOWNTO 0));
shift3 <= Std_Logic_Vector'(std_logic_vector'("000") & all_one(4 DOWNTO 0));
shift2 <= Std_Logic_Vector'(std_logic_vector'("00") & all_one(5 DOWNTO 0));
shift1 <= Std_Logic_Vector'(A_ToStdLogicVector(std_logic'('0')) & all_one(6 DOWNTO 0));
shift0 <= all_one;

which gives:

shift0 = "11111111",
shift1 = "01111111",
shift2 = "00111111",
...
shift7 = "00000001"
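
As a cross-check of this reading, the generated expression can be modelled in C as an OR of eight terms, each one selected by a comparison against sink_stream_empty. This is only a behavioural sketch, assuming the shift constants are as listed (shiftN = 0xFF shifted right by N); `shift_mask` and `byteenable_from_empty` are hypothetical names, not signals or functions from sgdma.vhd.

```c
#include <assert.h>
#include <stdint.h>

/* shiftN as listed in the post: N leading zeros followed by ones. */
static uint8_t shift_mask(unsigned n)
{
    assert(n < 8);
    return (uint8_t)(0xFFu >> n);
}

/* Model of the OR-of-selected-terms structure: each term contributes
 * shiftN when empty == N and all zeros otherwise. */
static uint8_t byteenable_from_empty(unsigned empty)
{
    uint8_t result = 0;
    unsigned n;

    assert(empty < 8);
    for (n = 0; n < 8; n++) {
        uint8_t term = (empty == n) ? shift_mask(n) : 0;
        result |= term;
    }
    return result;
}
```

For every value from 0 to 7 exactly one term is selected, so this model never returns 0x00. That is what makes a byte-enable of "00000000" surprising if the comparisons are evaluated on a real 3-bit value.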

 

 

 

So I don't understand why the m_write_byteenable_reg signal behaves differently, since in both cases the signals in its equation behave the same...

m_write_byteenable_reg = "00000000" is only possible if the following conditions are all false:

(std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000111")
(std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000110")
(std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000101")
(std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000100")
(std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000011")
(std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000010")
(std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000001")
(std_logic_vector'("00000000000000000000000000000") & (sink_stream_empty)) = std_logic_vector'("00000000000000000000000000000000")

which I don't think is possible...

Does anyone have any idea about that?

 

 

 

- Julien
Altera_Forum

 

--- Quote Start ---
In my opinion, it should have been stuck at "000".
--- Quote End ---

Actually this isn't true. The signal will be set to "UUU" in a simulator, which means "uninitialized", and a synthesizer may choose whatever value or expression it wants in order to optimize the synthesis, potentially giving strange results. I don't know how Quartus handles this case, but looking at the RTL viewer could help find out what it did. You may get a warning message somewhere about it. Anyway, if you try to simulate your SOPC system with this uninitialized vector, you'll probably end up with some 'X's on several control signals from the DMA.

--- Quote Start ---
BTW, this is what it looked like in the waves.
--- Quote End ---

I don't know how SignalTap works in this case, but if the whole part that processes in_empty was optimized away by Quartus, it may just have fed a default value to SignalTap, unrelated to what the DMA does. Again, the RTL viewer may shed some light on this.
Altera_Forum

 

--- Quote Start ---

--- Quote Start ---
In my opinion, it should have been stuck at "000".
--- Quote End ---

Actually this isn't true. The signal will be set to "UUU" in a simulator, which means "uninitialized", and a synthesizer may choose whatever value or expression it wants in order to optimize the synthesis, potentially giving strange results. I don't know how Quartus handles this case, but looking at the RTL viewer could help find out what it did. You may get a warning message somewhere about it. Anyway, if you try to simulate your SOPC system with this uninitialized vector, you'll probably end up with some 'X's on several control signals from the DMA.
--- Quote End ---

'U', 'X', etc. are for simulation.
In the 'real', 'physical' world, whatever the synthesizer chose to do, the signal in_empty[2:0] can only be "000", "001", "010", "011", "100", "101", "110" or "111".
So one of the 8 conditions from sgdma.vhd, line 1946, has to be true, so m_write_byteenable_reg should not be "00000000".

Using the RTL viewer is a good idea, I'll do that.

--- Quote Start ---

--- Quote Start ---
BTW, this is what it looked like in the waves.
--- Quote End ---

I don't know how SignalTap works in this case, but if the whole part that processes in_empty was optimized away by Quartus, it may just have fed a default value to SignalTap, unrelated to what the DMA does. Again, the RTL viewer may shed some light on this.
--- Quote End ---

It is indeed possible that in_empty[2:0] has been synthesized as "000" for SignalTap and differently somewhere else.
Altera_Forum

 

--- Quote Start ---
In the 'real', 'physical' world, whatever the synthesizer chose to do, the signal in_empty[2:0] can only be "000", "001", "010", "011", "100", "101", "110" or "111".
So one of the 8 conditions from sgdma.vhd, line 1946, has to be true, so m_write_byteenable_reg should not be "00000000".
--- Quote End ---

I'm pretty sure the synthesizer won't choose a value for you, but will just optimise away anything that depends only on this vector if you didn't initialize it. The actual outputs that depend on this vector can therefore take any value, depending on the optimisations that have been done.
Altera_Forum

Yeah, I think that since in_empty[2:0] is not driven:

- the synthesizer considers each of the 8 conditions false, which explains why I eventually get byteenable = "00000000"
- the synthesizer sets signaltap.in_empty[2:0] = "000"

independently...

Thank you for your help :)