Nios® V/II Embedded Design Suite (EDS)

TSE (Triple-Speed Ethernet) and SGDMA (Scatter-Gather DMA) do not seem fully connected

Altera_Forum

So I've searched, and nothing seems to quite fit my case. 

 

I am using SGDMA and TSE with InterNiche Portable TCP/IP v3.1, the Simple Socket Server (SSS) demo, and Altera 10.0 (or very close, maybe 10.1). 

 

The interface device is apparently detected just fine (I am not including that info here; the info below should make clear why I think it's less than helpful), and I have hard-coded the IP address to something easy (1.1.1.1, netmask 255.0.0.0). 

 

If I hard-code the system's TCP/IP MAC address into a Linux box and send a ping to the board, the ping packets indeed show up inside the TSE buffer (as evidenced by the statistics counters), but nobody seems to be able to take them out. Indeed when the SSS tries to read from the TSE, it looks to me like SGDMA is not noticing that there is data in the buffers. Or maybe it cannot write or read from the source or destination.... In any case, the SSS never notices any packets, and to all intents and purposes the system is dead. 

 

 

Some info dumped from my gdb activity is attached as the txt file. Salient portions, along with comments, below: 

 

Note that I believe that 0x30000000 is the base address of the TSE MAC register space, 0x30100000 is the base address of the SGDMA control, and 0x30300000 is the base address of the SGDMA descriptor, as shown below: 

(gdb) p tse_ptr->currdescriptor_ptr 

$33 = (alt_sgdma_descriptor *) 0x30300040 

 

Now, just before pinging the board: 

(gdb) x/100 0x30000000 

0x30000000: ...same... 

0x30000060: 0xffed0700 0x0000ffff 0x00000000 0x00000009 (*) 

0x30000070: 0x00000000 0x00000000 0x00000000 0x000002e5 

0x30000080: ...same... 

0x30000090: 0x00000000 0x00000009 0x00000000 0x00000000 

0x300000a0: ...same... 

0x300000b0: 0x00000387 0x00000009 0x00000000 0x00000000 

0x300000c0: 0x00000000 0x00000009 0x00000000 0x00000000 

0x300000d0: ...same from there down... 

 

just after pinging the board: 

(gdb) x/100 0x30000000 

0x30000000: ...same... 

0x30000060: 0xffed0700 0x0000ffff 0x00000000 0x0000000d (*) 

0x30000070: 0x00000000 0x00000000 0x00000000 0x00000435 

0x30000080: ...same... 

0x30000090: 0x00000004 0x00000009 0x00000000 0x00000000 

0x300000a0: ...same... 

0x300000b0: 0x0000051f 0x0000000d 0x00000000 0x00000000 

0x300000c0: 0x00000000 0x0000000d 0x00000000 0x00000000 

0x300000d0: ...same from there down.... 

 

Note the differences are highlighted above. In the past I have dug around in the documentation and convinced myself that at least one of those changes was due to the receipt of packets inside the TSE. (In fact, I believe that the 9 which changes to d (see '(*)' above) is 'aFramesReceivedOK', and indeed I sent 4 ping packets to the board) 
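
Comparing two dumps by eye is error-prone, so here is a minimal C sketch that diffs two register snapshots and labels the words that changed. Only aFramesReceivedOK at offset 0x6c is confirmed in this thread; the other counter names in the table are assumptions based on the TSE register map and should be checked against the user guide for your core version. The function and struct names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Byte offsets from the TSE MAC base address.
 * ASSUMPTION: only aFramesReceivedOK is confirmed in this thread; the other
 * names are a reading of the register map and must be verified. */
struct tse_counter {
    uint32_t offset;
    const char *name;
};

static const struct tse_counter tse_counters[] = {
    { 0x6c, "aFramesReceivedOK" },
    { 0x7c, "aOctetsReceivedOK" },   /* assumed */
    { 0x90, "ifInUcastPkts" },       /* assumed */
    { 0xb0, "etherStatsOctets" },    /* assumed */
    { 0xb4, "etherStatsPkts" },      /* assumed */
};

/* Look up a counter name by byte offset; returns "unknown" if not listed. */
const char *tse_counter_name(uint32_t offset)
{
    size_t n = sizeof tse_counters / sizeof tse_counters[0];
    for (size_t i = 0; i < n; ++i) {
        if (tse_counters[i].offset == offset) {
            return tse_counters[i].name;
        }
    }
    return "unknown";
}

/* Compare two snapshots of nwords 32-bit registers, where word 0 sits at
 * byte offset first_offset. Prints each changed word and returns how many
 * words changed. */
size_t tse_diff_snapshots(const uint32_t *before, const uint32_t *after,
                          size_t nwords, uint32_t first_offset)
{
    assert(before != NULL && after != NULL);
    size_t changed = 0;
    for (size_t i = 0; i < nwords; ++i) {
        if (before[i] != after[i]) {
            uint32_t off = first_offset + (uint32_t)(i * 4);
            printf("+0x%02x %-18s 0x%08x -> 0x%08x\n", (unsigned)off,
                   tse_counter_name(off), (unsigned)before[i],
                   (unsigned)after[i]);
            ++changed;
        }
    }
    return changed;
}
```

Fed the 0x30000060 row from the two dumps above (with first_offset set to 0x60), this reports a single changed word at +0x6c going from 9 to 0xd, which matches the four pings.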

 

In the attached file, I also dump what I think are the SGDMA descriptor and SGDMA control areas. 

 

My current theory is that the SGDMA descriptor is somehow not set up right. But it all looks ok to me. 

 

Is the above enough to confirm or deny? Any suggestions for more data gathering or pointers to what I've got wrong? (even pointers to the Fine Manuals would be appreciated at this point!) 

 

Thanks! 

 

Rusty
Altera_Forum

You can check with the debugger that the DMA driver function is called properly. 

Check also that the SGDMA is connected to the memory buffer it is supposed to write the packet to in SOPC builder. 

I think it's easier to debug the SGDMA with SignalTap. At least you see what it is trying to do and can get an idea of why the packet contents aren't read.
Altera_Forum

Ok, so after spending half an hour carefully composing my reply to myself and Daixiwen, the web page nicely says 'your token has expired, press the back button and try again after reloading' (not exact, but close)... And, of course, ALL that I had written is GONE! Bah, humbug! 

 

So, here is try two, but it will be shorter (but hopefully not much grumpier) than the one the forum decided nobody wanted to see. 

 

After much quality time with GDB, I think that the problem is a lack of connection between the SGDMA and the TSE stream. From what I can tell, SGDMA is set to use stream 0. I bet that's wrong. 

 

So my current theory is that I need to get the "core's streaming port" connected from the SGDMA to the TSE. How is that tie-in done? 

 

Thanks! (Sorry for the shortness of this and the lack of as much detail as before. If more info would help, or I'm asking it wrong, please poke me (gently) in the right direction and I'll work on doing better). 

 

Rusty
Altera_Forum

Looking at a thing (sort of like a memory/irq/connection map) the hardware person gave me, under the 'triple_speed_ethernet_0' module I see 'transmit Avalon Streaming Sink clk_0' and 'receive Avalon Streaming Source clk_0' (as well as 'control_port Avalon Memory Mapped Slave clk_0'), so I've got the required streaming Sink/Source - I just need to know how to connect them to the SGDMA 'stream to memory' and 'memory to stream' descriptors...

Altera_Forum

Those connections aren't controlled in software; they are made in hardware and are fixed. 

What you need to find out is the memory mapping seen from the SGDMA memory mapped masters (m_read and m_write). That should tell you what memories the SGDMAs are connected to, and at what addresses. Then in the debugger, see if the driver is filling the descriptors with a memory address in the correct range. 
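
As a rough illustration of that last check, here is a small C sketch that tests whether a buffer address taken from a descriptor falls inside a memory window the SGDMA master can reach. The struct, the function name, and any base/span values you plug in are hypothetical; the real numbers have to come from your own system's address map as seen from m_read and m_write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One slave memory as seen from an SGDMA master.
 * HYPOTHETICAL: fill base and span from your own system's address map. */
struct mem_window {
    uint32_t base;  /* first byte address of the memory */
    uint32_t span;  /* size of the memory in bytes */
};

/* Returns true when the buffer [addr, addr + len) lies entirely inside w.
 * The comparison is arranged so it cannot overflow 32-bit arithmetic. */
bool addr_in_window(uint32_t addr, uint32_t len, struct mem_window w)
{
    assert(w.span != 0);
    if (len == 0 || len > w.span) {
        return false;
    }
    return addr >= w.base && (addr - w.base) <= (w.span - len);
}
```

In the debugger you would read the buffer address out of the descriptor the driver just filled and compare it against the window by hand, or call a helper like this from a temporary debug print in the driver.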

The SGDMAs should send an IRQ when the transfer has finished. Maybe you can test whether this interrupt is fired or not. 

And by the way you were right in your first message, the value that you see changing is indeed the received frames counter. But are you sure you are using the correct MAC address?
Altera_Forum

Update - indeed I was using a sof file which had the sgdma hooked to 'packet memory' and not DDR. For 'fun' I hacked in the debugger many of the packet mallocs into the packet memory (instead of DDR), and thus managed to get ip.c to panic with a 'not interrupt safe buffer'. 

 

Unfortunately, when I try to run the same program using my supposedly-correct sof which ties DDR to the sgdma which is associated with the tse --- the program never seems to get out of _start. 

 

When my hardware guy gets in tomorrow I'm going to try to get his most recent (or at least a better) version... Will report back here when I get it working, but for now it looks like Daixiwen was spot on - my descriptors (which were in the right place) were pointing to memory in the wrong place. 

 

Rusty
Altera_Forum

Actually you don't need to do that much hacking because you can redefine the functions used to allocate the MBufs. Have a look at application note 440 (www.altera.com/literature/an/an440.pdf) to learn how to use a packet memory.

Altera_Forum

Amusing. Now that we've gotten everything hooked where it belongs, I find that the initialization is copying 0xffffffff's from a bad location over the place where the data actually is (that is, from __flash_rwdata_start to __ram_rwdata_start, when ram_rwdata already has the correct data and flash_rwdata is in a nonexistent area). This has a tendency to really mess things up :-). 
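
To make the failure mode concrete, here is a minimal C sketch of a word-by-word section copy in the style of a startup loader. It is an illustration written for this thread, not the HAL source, and both function names are hypothetical. The second helper is one possible sanity check: if the whole source region reads back as 0xffffffff, the load address is probably unmapped or erased.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Copy words from the load address to the run address until 'to' reaches
 * 'end'. When load and run addresses coincide there is nothing to copy.
 * SKETCH ONLY: modeled on what a startup section loader does. */
void load_section(const uint32_t *from, uint32_t *to, const uint32_t *end)
{
    if (to == from) {
        return;
    }
    while (to != end) {
        *to++ = *from++;
    }
}

/* Returns true when every word of the source reads as 0xffffffff, which is
 * what erased flash or an unmapped region often looks like. */
bool source_looks_unmapped(const uint32_t *from, size_t nwords)
{
    assert(from != NULL);
    for (size_t i = 0; i < nwords; ++i) {
        if (from[i] != 0xffffffffu) {
            return false;
        }
    }
    return nwords > 0;
}
```

If the load address is bogus, the copy happily overwrites good data in RAM with whatever the bus returns, which is the behaviour described above.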

 

I used the 'jump' debugger command to not execute any of those alt_load_sections in alt_load, and now I discover that 512 - 1 equals 33489407 (or 0x01ff01ff - hmm, 512 is 0x200 - probably not a coincidence) when inside of OS_TaskStkClr at line 1082 where it subtracts 1 from size. 

 

Very strange. In fact, if I set size to 511 in the debugger and then print it, I get it duplicated like that. (that is, if I set size to 0 and print it, it is a zero, but if I set it to 512 I get 0x1ff01ff). And if I try to cheat and jump over the stack zeroing I find the program executing (after a while and after I hit control-C) at 0x5c77a74c, which is to say nowhere valid. 
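
For what it's worth, the odd value is exactly 511 with its low 16 bits repeated in the upper half of the word. The C sketch below just checks that arithmetic and gives a quick test for the pattern; the helper names are made up, and it says nothing about the cause, which could sit anywhere between the debugger and the hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build the 32-bit word you would see if a 16-bit value appeared in both
 * the upper and lower halves. */
uint32_t mirror16(uint16_t half)
{
    return ((uint32_t)half << 16) | half;
}

/* True when the upper and lower 16-bit halves of v are identical. */
bool halves_mirrored(uint32_t v)
{
    return (v >> 16) == (v & 0xffffu);
}
```

mirror16(0x1ff) gives 0x01ff01ff, which is 33489407 in decimal, and a value of 0 stays 0, consistent with what the debugger printed.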

 

I'm going to go try to get the FPGA and the program to work together to the point that I'm not debugging '0x1ff == 0x1ff01ff' issues and I'll post here once I'm past the weird stuff. I *expect* that now that I've got sgdma and ddr and tse all hooked together in the FPGA right (and I use the correct addresses in the descriptors (and the descriptors I use are actually in the descriptor memory)) - well, then I have half a hope of DHCP being able to send a packet... 

 

Thanks - I'll be back in a while with an update (I hope!)
Altera_Forum

You either have a problem in the communication with the debugger, or a big problem in the CPU itself... Can you ask your HW engineer to check that the FPGA design is properly constrained and meets all timing requirements?

Altera_Forum

Update: 

 

I'll spare everyone the big story, and cut to the chase. 

 

Summary: receive works - data is transferred into memory and the program sees it :D. Transmit goes busy and never finishes :(. Never transmits anything out the ethernet port, either, for that matter. 

 

I have a question on the thing which shows what module is connected to what thing. (The graphical thing that apparently shows all the modules and their interconnections (including clocks, addresses, etc)) 

 

Mine shows: 

 

For sgdma_rx: 'csr' and 'in' have incoming arrows; 'descriptor_read', 'descriptor_write', and 'm_write' all have outgoing arrows. 

For sgdma_tx: 'csr' has an incoming arrow; all others are outgoing arrows. 

Shouldn't m_read on sgdma_tx be incoming and not outgoing? (sgdma_tx's m_read is going to the DDR_Interface_0's DDR_INT.) Or does the arrow tell you something other than data direction? 

 

Thanks! 

 

Oh - the problem apparently was that someone reverted (in our revision control system) some part of the sof or the source for the sof, and so some part of the TSE got taken out. It has now been restored. However, beware that if you use optimization your variables may or may not display correctly...
Altera_Forum

About your question, the arrows don't show the direction of the data flow, but the master/slave relationship. So even if data is going to the sgdma_tx, it is the DMA that controls the transfer and is the master on the bus. The RAM is always a slave. That's why you see an arrow from the DMA to the RAM, even if the data is actually going in the other direction.

Altera_Forum

Rats. I suppose that would have been too easy... Will keep working on that with the hardware guy. 

 

I do have another interesting issue. If I run the SSS under the debugger, the system initializes fine (finds the PHY, negotiates GigE speed, etc). 

 

However, if I download and run without the debugger, it never gets into tse_mac_init, nor does it create any tasks or say 'prepped 1 interface, initializing...' as it does when I run in the debugger with breakpoints and such (a log of that is available if anyone wants to see it). It does get into prep_tse_mac, and it does figure out my MAC address (which I will admit to having temporarily hard-coded). 

 

Quality time in debugger: I set my first breakpoint at tcpport.c line 60 (which says ' e = nptcp_init(); /* call the NetPort init in nptcp.c */'), load the elf for SSS (using 'nios2-download --tcpport 2323 sss.elf'), connect with debugger once download done, and then say 'continue'. It stops there (line 60), and then I say 'next' until I get to 'allports.c' line 402. At that point, I've succeeded in finding the PHY! Is there some sort of minimum time between loading the elf and letting it run that I need to obey??? 

 

If anyone has seen any of this and/or has any ideas I'd appreciate hearing them! 

 

Thanks again! 

 

Rusty 

 

(By the way - for anyone else trying to debug the TSE - beware! If you try to single-step line 213 of altera_avalon_sgdma.c, which says: 

/* wait for the descriptor (chain) to complete */ 
while ( (IORD_ALTERA_AVALON_SGDMA_STATUS(dev->base) & 
         ALTERA_AVALON_SGDMA_STATUS_BUSY_MSK) ) ; 

you WILL lock up your gdb session until that while loop terminates. Should the chain never complete, your debugger is frozen and you have to kill the download process to get it back. Just consider it a feature, not a bug, since I've just documented it! ;-))
Altera_Forum

I've never used the debugger so I can't help a lot (I debug the good ol' way... LEDs and printf's :D ) but you have some really weird issues. You shouldn't have to wait before launching the application and it's hard to pinpoint the exact causes of those problems. 

Did you perform any low level verification of the board, such as a RAM test?
Altera_Forum

My previous board did have a RAM (DDR) issue. I will reload our code which has the RAM test and re-run it on my new board.... 

 

"I'm an expert, that's why I get the hard problems" :eek: 

 

LEDs? Wish I had some I could use... :) (Or, as some famous person almost said once: "LEDs? We don't need no steenking LEDs!") 

 

Anyway, thanks! I'll continue 'having fun' and update here once something useful/interesting/troubling/worth_mentioning happens. 

 

Rusty 

 

As a side note - I had trouble with printf debugging - I found that I could only do a certain number of them before the system died. If it weren't for that, I would have seriously considered putting the 'Fred Fish Debug' macros (http://sourceforge.net/projects/dbug/) into the SSS so I could trace the program using printfs. VERY handy tools, those - highly recommended. (I worked at a place once that had them compiled into even their production code so that, should they need to debug the production code, all they had to do was run it with debug enabled.)