Nios® V/II Embedded Design Suite (EDS)

Delay of TCP transmission

Altera_Forum

I use the uC RTOS and Nichestack, and I see a strange delay in the transmission of TCP data. I can't tell whether this is normal.

 

I establish a connection where my pc is the client and the Nios is the server. 

The transaction is this: 

1- PC sends 6 bytes of TCP data

2- Nios answers with 6 bytes of TCP data

3- after a short time (<5ms), Nios sends 7 more bytes of data. This small delay is normal because Nios waits for data coming from an upper layer. 

 

The problem is this: steps 1 to 3 execute in the expected time. It takes a few ms from when the PC application hands the data to its TCP stack to when the Nios passes the 7 bytes to the Nichestack tx function. But the Ethernet frame is actually transmitted much later, usually 100 to 200 ms afterwards.

 

I monitored this delay with an oscilloscope, using PIOs driven by the Nios and probing the rx/tx signals on the Ethernet PHY, so I'm pretty sure the delay is introduced by Nichestack or the MAC.

I tried to disable the Nagle algorithm with the TCP_NODELAY option, but nothing changed.

 

The strange thing is that if I force step 3 from the Nios (without any request from the client), the frame transmission is nearly immediate!

 

I also captured the TCP packets with Wireshark. 

time 0.000000 PC sends 6 bytes 

time 0.000378 Nios ACKs the 6 bytes

time 0.002695 Nios sends its 6 bytes

time 0.109926 PC ACKs the 6 bytes

time 0.110407 Nios sends the remaining 7 bytes

time 0.310343 PC ACKs the 7 bytes

 

Note that the delayed 7 bytes are transmitted just after the ack of the previous ones. 

Could it be that Nichestack delays because it waits for the ACK of the first 6 bytes before sending the other 7?
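The gaps in the capture can be checked with a few lines of arithmetic. The sketch below is in Python only for convenience; the timestamps are copied from the capture above, while the names `trace` and `gaps_ms` are made up for this illustration.

```python
from decimal import Decimal

# (timestamp in seconds, event) copied from the Wireshark capture in the post
trace = [
    (Decimal("0.000000"), "PC sends 6 bytes"),
    (Decimal("0.000378"), "Nios ACKs the 6 bytes"),
    (Decimal("0.002695"), "Nios sends its 6 bytes"),
    (Decimal("0.109926"), "PC ACKs the 6 bytes"),
    (Decimal("0.110407"), "Nios sends the remaining 7 bytes"),
    (Decimal("0.310343"), "PC ACKs the 7 bytes"),
]


def gaps_ms(events):
    """Return (event, milliseconds elapsed since the previous event) pairs."""
    return [
        (cur[1], (cur[0] - prev[0]) * 1000)
        for prev, cur in zip(events, events[1:])
    ]


for name, gap in gaps_ms(trace):
    print(f"{gap:9.3f} ms  {name}")
```

The PC's ACK arrives about 107 ms after the Nios's first 6 bytes, and the 7-byte segment follows that ACK by less than half a millisecond, which is consistent with the suspicion that the second segment is held until the first one is acknowledged.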

 

Thank you in advance for any advice. 

 

Cris
Altera_Forum

So you believe the delay is occurring between when the app delivers the packet to the Nichestack and when it is finally output on the wire.

 

Well there are only a few places where delay can occur. 

 

The Nichestack will hold onto this packet until the OS signals to the Nichestack that it can run its routines. So if your application is super busy performing higher priority tasks, the Nichestack may not get an opportunity to process the packet.

 

Eventually the Nichestack will deliver a packet to the MAC driver's transmit function. It's been a while since I examined that driver but I believe it performs a synchronous send (meaning it will transmit the packet and wait for it to finish transmitting before it moves on). If you want to debug further, you could put some code into the tse_mac_raw_send function of the driver "ins_tse_mac.c" to spit out the time or something. The driver initiates a DMA transfer to move the packet from memory to the MAC. 

 

After this it's all hardware. The only other place where delay can occur is in the TX FIFO inside the MAC. It's unlikely that this is occurring. 

 

Jake
Altera_Forum

Hi Jake 

Thank you for your reply. 

You can probably help me more than anyone else, since I'm using the OpenCores Ethernet MAC you ported to the Avalon bus.

 

 

--- Quote Start ---  

So you believe the delay is occurring between when the app delivers the packet to the Nichestack and when it is finally output on the wire.

 

--- Quote End ---  

 

I set PIOs just before and after the send call and probe the wire output: here I see the 100 to 200ms delay. 

The only non-standard aspect is that I use the socket in non-blocking mode: could this be relevant?

 

 

--- Quote Start ---  

 

The Nichestack will hold onto this packet until the OS signals to the Nichestack that it can run its routines. So if your application is super busy performing higher priority tasks, the Nichestack may not get an opportunity to process the packet.

 

--- Quote End ---  

 

That's what I thought at the very beginning. So I added a lot of sleeps to leave Nichestack plenty of time to process packets. Nothing changed. 

I profiled my app and verified that it correctly releases control to other tasks: after it sends the data, it is always in a suspended state.

 

 

--- Quote Start ---  

 

Eventually the Nichestack will deliver a packet to the MAC driver's transmit function. It's been a while since I examined that driver but I believe it performs a synchronous send (meaning it will transmit the packet and wait for it to finish transmitting before it moves on). If you want to debug further, you could put some code into the tse_mac_raw_send function of the driver "ins_tse_mac.c" to spit out the time or something. The driver initiates a DMA transfer to move the packet from memory to the MAC. 

 

--- Quote End ---  

 

As I said above, I use the OCM MAC, not the TSE. The eth_ocm driver, as you well know, can be configured for sync or async send; I tried both, but the delay is always there and always the same.

 

 

--- Quote Start ---  

 

After this it's all hardware. The only other place where delay can occur is in the TX FIFO inside the MAC. It's unlikely that this is occurring. 

 

--- Quote End ---  

 

I rule out a hardware problem and I think it is a TCP stack issue, because of two facts:

- as I wrote in the original post, it seems to me that the delay occurs only when there is TCP data that has not yet been acknowledged. If I transmit a single burst of data there is no delay at all.

- I use the same hardware for an EtherCAT master, and so without a TCP stack: there I get very precise timings, with delays of a few µs between the application send and the start of frame on the wire.

 

Regards 

Cris
Altera_Forum

You might throw something into the eth_ocm driver to output to your LED just to verify the time between when the data is given to the TCP/IP stack and when it's delivered to the MAC driver. 

 

Jake
Altera_Forum

 

--- Quote Start ---  

You might throw something into the eth_ocm driver to output to your LED just to verify the time between when the data is given to the TCP/IP stack and when it's delivered to the MAC driver. 

 

--- Quote End ---  

 

I have done it just now. I confirm what I wrote above: 

 

If there was a previous transmit, and therefore unacknowledged TCP data:

- my app calls the socket send() function at time 0

- the eth_ocm_raw_send() function is called after a delay of 100 to 200 ms

- I have data on the wire immediately, even before eth_ocm_raw_send() returns.

 

If there is no unacknowledged TCP data:

- my app calls the socket send() function at time 0

- the eth_ocm_raw_send() function is called after a minimal delay (< 2 ms)

- again, I have data on the wire immediately

 

I further analyzed the packets with Wireshark and I am almost sure the problem is due to the management of unacknowledged data.

In fact, the client (a PC with Win XP) delays its TCP ACKs by those 100 to 200 ms (delayed acknowledgment), and Nichestack transmits the next data ONLY AFTER all previous data has been acknowledged. Is this a Nichestack bug? From my knowledge of the TCP/IP protocol, it shouldn't behave this way.
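The interaction described here can be put into a toy timing model. This is only a sketch under stated assumptions: the function and parameter names are hypothetical, the 0.5 ms round-trip time is a guess, and the 107 ms ACK delay and 5 ms application gap are rounded from the measurements in this thread. It is not a description of how Nichestack is actually implemented.

```python
def second_segment_departure_ms(app_gap_ms, rtt_ms, ack_delay_ms, max_unacked_segments):
    """Time, in ms after the first segment left, at which a second small
    segment can reach the wire.

    app_gap_ms           -- when the application calls send() for the second segment
    rtt_ms               -- network round-trip time (assumed value)
    ack_delay_ms         -- how long the peer holds back its ACK
    max_unacked_segments -- how many segments the sender allows in flight
    """
    if max_unacked_segments >= 2:
        # The sender may have both segments outstanding: no waiting for the ACK.
        return app_gap_ms
    # Only one segment may be outstanding: wait for the peer's (delayed) ACK.
    ack_arrival_ms = rtt_ms + ack_delay_ms
    return max(app_gap_ms, ack_arrival_ms)


# One segment in flight at a time, peer delays its ACK by about 107 ms
print(second_segment_departure_ms(5, 0.5, 107, max_unacked_segments=1))
# Same traffic, but the sender tolerates two outstanding segments
print(second_segment_departure_ms(5, 0.5, 107, max_unacked_segments=2))
```

In this model the delay seen on the wire is set almost entirely by the peer's ACK timer, not by the application or the MAC, which matches the eth_ocm_raw_send() measurements above.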

 

 

Cris
Altera_Forum

Updated information. 

For now I have found a workaround that bypasses the problem in my application.

I slightly changed the code to avoid the double answer: on receipt of the command data from the client, the application now waits for the data from the upper layer and transmits everything with a single send() call. As I expected, the delay disappeared, at the price of somewhat less efficient code.

I still need to test whether this works for me under all conditions.

In fact, the TCP stack is now working in a very inefficient synchronous way, because the Niche stack requires every single packet to be acknowledged instead of sending data until the TCP window goes to zero.

A wide TCP window is useless: it behaves as if the stack zeroed the TCP window whenever it sends some data!
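The single-send workaround amounts to collecting every part of the answer before handing it to the stack. A minimal sketch follows, written in Python for brevity even though the Nios application would do the equivalent in C with the stack's socket API; `send_reply` is a hypothetical helper, and a local socket pair stands in for the Nios/PC link.

```python
import socket


def send_reply(sock, parts):
    """Join all reply fragments and hand them to the stack in one call,
    so no second small segment has to wait behind an unacknowledged first one."""
    payload = b"".join(parts)
    sock.sendall(payload)
    return len(payload)


# Demo: the 6-byte answer plus the 7 bytes from the upper layer go out together.
server, client = socket.socketpair()
sent = send_reply(server, [b"\x01" * 6, b"\x02" * 7])
received = client.recv(64)
print(sent, received)
server.close()
client.close()
```

The cost is the one mentioned in the post: the first part of the answer is held back until the upper layer has delivered the rest.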

 

Cris
Altera_Forum

Just came across your post. I've encountered this situation several times before when communicating via TCP between a Windows computer and some kind of embedded device.  

 

It seems many (most? all?) embedded RTOSs maintain a very small TCP transmit queue. Essentially they block a new packet from going out until the previous packet has been acknowledged, regardless of the TCP_NODELAY option. Note that this blocking occurs in the RTOS, not in the MAC/hardware. Embedded RTOSs do this to minimize the size of the memory buffers needed to retransmit TCP packets automatically in case a packet gets lost. But the embedded system waiting for an ACK, in combination with delayed ACKs on the Windows side, can cause the long delays you were seeing.

 

(The problem is made even worse if the Windows end of the pipe has not set the TCP_NODELAY option. In this case both transmit packets and empty ACK packets can be delayed.) 
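For the Windows end, setting the option might look like the sketch below. It is in Python purely for illustration, since the thread does not say what the PC application is written in, and `make_nodelay_socket` is a hypothetical helper name. Keep in mind that TCP_NODELAY governs how the local stack sends small segments; it is set per socket and has to be applied on each end separately.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with the Nagle algorithm disabled (TCP_NODELAY set)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back to confirm the stack accepted it (non-zero means enabled).
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```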

 

One work-around is to ensure that the Windows end sends out at least one byte of dummy data every time it receives a packet from the embedded side. Windows will immediately send out an ACK along with the dummy byte (assuming you're using TCP_NODELAY), thus avoiding the ACK delay. Note that the embedded end must be smart enough to throw away those dummy bytes. Obviously this only works if you have full control over the software on both ends of the pipe.
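A rough sketch of the dummy-byte idea, in Python for illustration only. Everything protocol-specific here is an assumption and not something from this thread: messages are framed with a one-byte length prefix, and a zero-length frame (the single byte 0x00) serves as the dummy. `recv_and_nudge` models the PC side and `parse_frames` models the filtering the embedded side would need.

```python
import socket

DUMMY = b"\x00"  # hypothetical: a zero-length frame used purely as filler


def recv_and_nudge(sock, bufsize=1024):
    """PC side: read whatever arrived, then immediately send one dummy frame
    so the ACK can travel with outgoing data instead of waiting on a timer."""
    data = sock.recv(bufsize)
    if data:
        sock.sendall(DUMMY)
    return data


def parse_frames(buffer):
    """Embedded-side logic: split a byte stream into length-prefixed frames,
    silently dropping zero-length dummy frames.
    Returns (frames, leftover_bytes_of_an_incomplete_frame)."""
    frames = []
    i = 0
    while i < len(buffer):
        length = buffer[i]
        if i + 1 + length > len(buffer):
            break  # incomplete frame: keep it until more data arrives
        if length:
            frames.append(bytes(buffer[i + 1:i + 1 + length]))
        i += 1 + length
    return frames, bytes(buffer[i:])


# Dummy frames mixed into the stream are discarded; real frames survive.
print(parse_frames(b"\x00\x03abc\x00\x02de"))
```

Reserving an explicit frame type for the filler avoids having to guess whether a given byte is payload or padding once TCP has merged several writes into one read.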

 

Another work-around is to use UDP packets instead of TCP. Obviously this also works only when you control software at both ends of the pipe. 

 

Paul
Altera_Forum

Thank you for your advice, Paul. 

In that situation I had actually overcome the problem by doing something similar to what you suggest. I implemented a communication protocol with some dummy data and other control codes in order to avoid the delay.

The application has been operating for a few months and it has worked perfectly. 

 

Regards