
Performance question about the Triple-Speed Ethernet IP

Altera_Forum
Honored Contributor II

After several days of hard work, I can ping my DE2-115 (from terasic.com; the FPGA is a Cyclone IV E EP4CE115F29C7), but the performance is not good enough.

The Ethernet data flow is: an 88E1111 as the PHY IC, connected to the Triple-Speed Ethernet (TSE) MAC IP, which feeds several user IPs (made by myself; they handle Ethernet protocols such as ARP, ICMP...), and then back to the TSE. The interfaces between the TSE and the user IPs, and between one user IP and the next, are Avalon-ST. The Ethernet speed is 100 Mbit/s.

The first test result is:

Ping time:
DE2-115 with user IPs => 0.33 ms
PC (with P4 3.0 GHz CPU) => 0.3 ms
PC (with i7 870 CPU) => 0.076 ms
DE2-115 with uC/OS-II and InterNiche stack => 0.3 ms

 

Then I raised the clock from 100 MHz to 200 MHz and used this clock for the TSE and the user IPs.

The second test result is:

Ping time:
DE2-115 with user IPs => 0.31 ms

 

Based on the first and second tests, I conclude that the bottleneck is the "Triple-Speed Ethernet IP". So my question is:

Is there anything I can do to speed up the Triple-Speed Ethernet?

My target is a ping time the same as the PC (with i7 870) or better.
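
As a rough sanity check on these figures, the time the frames spend on the wire can be estimated from the link speed alone. The sketch below assumes a 32-byte echo payload, which the post does not state, and counts the 8 bytes of preamble and start-of-frame delimiter. The helper name `wire_time_us` is made up for illustration.

```python
def wire_time_us(frame_bytes: int, link_mbps: float) -> float:
    """Time to serialize one Ethernet frame, in microseconds."""
    preamble_sfd = 8  # bytes sent before the destination MAC address
    bits = (frame_bytes + preamble_sfd) * 8
    return bits / link_mbps  # 1 Mbit/s is 1 bit per microsecond

# Header sizes in bytes; the payload size is an assumption, not from the post.
ETH_HEADER, IP_HEADER, ICMP_HEADER, FCS = 14, 20, 8, 4
payload = 32
frame = ETH_HEADER + IP_HEADER + ICMP_HEADER + payload + FCS

one_way_us = wire_time_us(frame, 100.0)  # 100 Mbit/s link, as in the post
round_trip_us = 2 * one_way_us           # request plus reply
measured_us = 330.0                      # 0.33 ms from the first test

print(f"frame size: {frame} bytes")
print(f"wire time, both directions: {round_trip_us:.2f} us")
print(f"share of measured ping time: {round_trip_us / measured_us:.1%}")
```

Under these assumptions serialization accounts for only a few percent of the 0.33 ms, so most of the measured time is spent somewhere other than on the wire.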

Altera_Forum

Did you take the tips in Altera AN440 into account?

Altera_Forum

AN440 is about the Niche stack, but in my case I can't use Niche. So I plan to run the TCP/IP stack in the FPGA, and I have made some IPs for ARP, ICMP, ...

Altera_Forum

If you do the ARP/ICMP handling in hardware then something is definitely wrong in the packet handling. The TSE has a latency of only a few clock cycles so it can't be responsible for such a big delay in the ICMP response. 

You could use SignalTap to watch the streams going into and out of the TSE and measure your system's actual response time.

Could you describe your hardware ARP and ICMP implementations a bit?

Altera_Forum

Even if you need to process some data in hardware, I'd suggest finding a way to 'hand off' non-critical things to software, even if you write your own (not unreasonable if you only need to do UDP).

Altera_Forum

ARP and ICMP are just the beginning. If the performance is good enough, I will do TCP.

 

The ARP and ICMP IPs are based on the "Nios II UDP Offload Example" at alterawiki. The basic structure is: get the 32-bit data from the ST sink interface, decode it according to the protocol, then send it to the next IP over the ST source interface.

The ARP flow is : 

TSE->error_check->alignment_remove->packet_select->arp_payload_extractor->arp_checker->alignment_inserter_0->Multiplexer->TSE

The ICMP flow is : 

TSE->error_check->alignment_remove->packet_select->icmp_checker->icmp_payload_inserter->alignment_inserter_1->Multiplexer->TSE

The symbol "->" is an Avalon-ST interface. If an IP has the same name in the two flows, it is the same instance.
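
To make the branching point in the two flows concrete, here is a minimal software model of the decision a stage like packet_select has to make. It is only a sketch, not the poster's HDL: the function name `classify` is hypothetical, the frame is assumed to start at the destination MAC address (that is, after alignment_remove), and VLAN tags and IP options are ignored.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806
IPPROTO_ICMP = 1

def classify(frame: bytes) -> str:
    """Return 'arp', 'icmp' or 'other' for an untagged Ethernet frame."""
    if len(frame) < 14:  # shorter than an Ethernet header
        return "other"
    # EtherType is the big-endian 16-bit field after the two MAC addresses.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETHERTYPE_ARP:
        return "arp"
    if ethertype == ETHERTYPE_IPV4 and len(frame) >= 14 + 20:
        # The protocol number is byte 9 of the IPv4 header.
        if frame[14 + 9] == IPPROTO_ICMP:
            return "icmp"
    return "other"
```

In hardware the same fields arrive across the first few 32-bit words of the stream, so the decision can be made after only a handful of beats.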

I used ModelSim to simulate each IP. The data and timing diagrams are correct. I find that each IP takes fewer than 60 clock cycles.

Then I did another test. I used the UDP IP as an echo server, with the PC sending data. The throughput is 800 Mbit/s (on a gigabit network), which is good.

So I don't understand why a single packet takes so long when the throughput is good?
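
The simulated figures can be turned into a rough upper bound on the time the ICMP chain itself contributes. This is only arithmetic on the numbers posted above, taking 60 cycles as a ceiling for every stage; it says nothing about whatever sits between the stages in the generated system. The function name is hypothetical.

```python
# Stages of the ICMP flow between the two TSE ports, as listed in the post.
ICMP_STAGES = [
    "error_check", "alignment_remove", "packet_select", "icmp_checker",
    "icmp_payload_inserter", "alignment_inserter_1", "Multiplexer",
]
MAX_CYCLES_PER_STAGE = 60  # "fewer than 60" per IP, used as a ceiling

def pipeline_latency_us(n_stages: int, cycles_per_stage: int,
                        clock_mhz: float) -> float:
    """Worst-case latency of a chain of stages, in microseconds."""
    return n_stages * cycles_per_stage / clock_mhz  # cycles / MHz = us

for clock_mhz in (100.0, 200.0):
    bound = pipeline_latency_us(len(ICMP_STAGES), MAX_CYCLES_PER_STAGE,
                                clock_mhz)
    print(f"{clock_mhz:.0f} MHz: at most {bound:.1f} us in the user IPs")
```

Even at 100 MHz the bound is a few microseconds, roughly a hundredth of the 0.33 ms ping time, which fits the observation that doubling the clock barely changed the result.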

Altera_Forum

Throughput and latency are very different network properties, although they can be somewhat related.

Suppose, for example, that your server were located on the other side of the Earth, roughly 20000 km away.

Because of the speed of light, you would have a latency of at least about 130 ms for every ARP response. On the other hand, if you stream UDP data and don't care about immediate acknowledgement, you can achieve the same throughput as if your server and client were in the same room.
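
The numbers in this example can be checked with a few lines. The sketch uses the vacuum speed of light, so it gives a lower bound; real links are slower. The 100 MB transfer size and both function names are made up for illustration.

```python
C_KM_PER_S = 299_792.458  # speed of light in vacuum

def round_trip_ms(distance_km: float) -> float:
    """Minimum request/response time over the given distance."""
    return 2 * distance_km / C_KM_PER_S * 1000

def stream_time_s(n_bytes: float, link_mbps: float,
                  distance_km: float) -> float:
    """Time until the last byte of a one-way stream arrives."""
    propagation_s = distance_km / C_KM_PER_S
    serialization_s = n_bytes * 8 / (link_mbps * 1e6)
    return propagation_s + serialization_s

print(f"round trip over 20000 km: {round_trip_ms(20_000):.1f} ms")
print(f"100 MB at 100 Mbit/s, 20000 km: {stream_time_s(100e6, 100, 20_000):.3f} s")
print(f"100 MB at 100 Mbit/s, same room: {stream_time_s(100e6, 100, 0):.3f} s")
```

The distance adds its delay once to the streamed transfer, so the achieved throughput is almost unchanged, while every individual request and response pays the full round trip.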

Altera_Forum

After several days of hard work, I found a way to improve the performance.

In the old way, I used Avalon-ST interfaces and SOPC Builder to connect the IPs.

In the new way, I combine all the IPs into one and keep the Avalon-ST interfaces (that is, the source of IP_A connects directly to the sink of IP_B).

The performance is good enough for me.

Altera_Forum

Dear Chour, good day,

Could you point me to any documents that led you to this experiment? I searched Altera's tutorials and couldn't find a ping IP or anything close to it. I hope it is not an imposition to ask you to publish a sample from your project to share the knowledge.

Thanks in advance.

Altera_Forum

Dear Chour,

I'm using a DE2-115.

I've managed to run bwping (bandwidth ping). However, the results aren't that good: 6 Mbit/s is the maximum I got.

Besides, the throughput couldn't go above 10 Mbit/s.

How did you manage to reach that result? Could you please give us brief details or steps?

I'm still struggling with your previous posts; some of them seem unfamiliar to me.

Did you raise the clock? Did you use Nios II? Did you optimize the Ethernet (TSE) in SOPC Builder?