Nios® V/II Embedded Design Suite (EDS)

Altera TSE driver and example program for lwIP (1.3.2)

Altera_Forum
Honored Contributor II
27,029 Views

After many, many requests and complaints about the lack of support and documentation for lwIP on the Altera TSE, I have developed a drop-in TSE driver and example program and made them available to the NIOS II community. This was done for NIOS II 8.1 SP0.01. I don't expect difficulty with version 9.x. 

 

This is for the latest version of lwIP as of this post (1.3.2). It provides a minimal program and an HTTP server based on the HTTP server in the lwIP contrib folder. The lwIP TSE driver uses the altera_avalon_tse driver and SGDMA as-is. There is a complete (as in 41-step) set of instructions for creating the project and example program. More information and the link to the driver are available here: 

 

http://lwip.wikia.com/wiki/available_device_drivers#lwip_1.3.2 

 

Please direct any questions, changes for NIOS II 9.1, or comments to this thread. 

 

12-16-2010 update: This example works with NIOS Version 10.0 with some tweaks to the procedure to create the project. Also, a lwIP 1.4 release candidate has been out for a while and it drops into this example (in place of 1.3) without changes. 

 

Bill
257 Replies
Altera_Forum
Honored Contributor II
1,087 Views

 

--- Quote Start ---  

This is tough to answer because it's application dependent and very dependent on available memory. 

 

I can say that for our products that send a lot of data (images) we used a custom UDP protocol, because TCP was under 100 Mbit/s while UDP can get upwards of 500 Mbit/s. 

 

For our product that receives a lot of data and sends very little, we set the MAC to 100 Mbit/s, because at 1000 Mbit/s the packet rate is so high that the Cyclone III cannot keep up with the interrupt rate. It then drops packets, which actually slows down the link. We found that at 100 Mbit/s there are no dropped packets and we can run the product at its max speed. It was probably faster at 1000 Mbit/s, but I prefer an error-free transport and 100 Mbit/s is enough. It also spreads the receive out over more time, which I thought was better. 

 

Take care, 

Bill 

--- Quote End ---  
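
As a rough check on the packet-rate point in the quote: assuming standard Ethernet framing (a 1518-byte maximum untagged frame, a 64-byte minimum frame, plus 20 bytes of preamble, start-of-frame delimiter and inter-frame gap on the wire), the worst-case frame rate can be estimated as below. The function name is made up for illustration, and whether every frame actually causes an interrupt depends on how the driver and SGDMA are set up.

```c
#include <stdint.h>

/* Bytes on the wire per frame beyond the MAC frame itself:
 * 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap. */
#define WIRE_OVERHEAD_BYTES 20u

/* Hypothetical helper: upper bound on frames per second for a saturated link.
 * frame_bytes is the MAC frame size including header and FCS (64..1518). */
uint32_t max_frames_per_second(uint32_t link_bits_per_second, uint32_t frame_bytes)
{
    uint32_t bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8u;
    return link_bits_per_second / bits_per_frame;
}
```

Under these assumptions a saturated gigabit link carries about 81,000 full-size frames per second (and up to roughly 1.49 million minimum-size frames), versus about 8,100 full-size frames per second at 100 Mbit/s.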

 

 

For your product that transmits a lot, may I ask what limited its TCP throughput to under 100 Mbps? Is the Nios II handling only the TCP/IP stack, or other tasks as well? 

My project's target is to achieve more than 200 Mbps of TCP transmission over GbE. If I let the Nios II handle only the raw-mode TCP/IP stack, do you think it is possible to achieve the desired throughput? 

Thank you. 

 

Twenty
Altera_Forum
Honored Contributor II
886 Views

 

--- Quote Start ---  

For your product that transmits a lot, may I ask what limited its TCP throughput to under 100 Mbps? Is the Nios II handling only the TCP/IP stack, or other tasks as well? 

My project's target is to achieve more than 200 Mbps of TCP transmission over GbE. If I let the Nios II handle only the raw-mode TCP/IP stack, do you think it is possible to achieve the desired throughput? 

--- Quote End ---  

 

 

The NIOS II was basically running lwIP 99% of the time - but CPU time is needed to transfer the image to memory. 

 

We were unable to clock the NIOS II faster than 100 MHz, although I believe it should have worked at 125 MHz or faster. lwIP is inefficient in places, although I contributed code to speed up what I could. It turns out the TCP checksum was a large consumer of time (overall for lwIP) because it has to touch all of the data. When we switched to putting the data in memory (in hardware) in TCP-packet-sized chunks *with* the checksum appended, speeds went up some 50%. The checksum at the end was added to the TCP header checksum to get the final checksum. I also put several of the most often called lwIP functions in on-chip memory - that made a *huge* difference. A post earlier in this thread lists all of the optimizations I made in order of impact. 

 

200 Mbit/s? Maybe, if you're running faster than I am with faster memory and you speed up the checksum (in hardware is best). We have no-copy TCP (UDP) writes too - if you're not using zero copy, I would say you won't reach 200 Mbit/s. Unfortunately your question can only really be answered by implementing it. 
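
The checksum trick described above works because the Internet checksum is a one's-complement sum, so a sum computed over the payload ahead of time can simply be added to the sum over the header. A minimal C sketch of that idea follows. The function names are hypothetical (this is not lwIP's API), the pseudo-header that a real TCP checksum also covers is left out, and the header length is assumed to be even so that the payload starts on a 16-bit boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Add len bytes, read as big-endian 16-bit words, to a running sum. */
uint32_t cksum_add(uint32_t sum, const uint8_t *data, size_t len)
{
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)                        /* odd trailing byte, zero-padded */
        sum += (uint32_t)data[i] << 8;
    return sum;
}

/* Fold the carries back in to get a 16-bit one's-complement sum. */
uint16_t cksum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Hypothetical helper: the CPU only touches the header bytes; payload_sum
 * is the folded sum that was stored alongside the payload (for example by
 * hardware, as in the post above). */
uint16_t cksum_with_precomputed(const uint8_t *hdr, size_t hdr_len,
                                uint16_t payload_sum)
{
    uint32_t sum = cksum_add(payload_sum, hdr, hdr_len);
    return (uint16_t)~cksum_fold(sum);
}
```

The point of the design is that the per-packet CPU cost drops from touching every payload byte to touching only the header.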

 

BillA
Altera_Forum
Honored Contributor II
886 Views

 

--- Quote Start ---  

The NIOS II was basically running lwIP 99% of the time - but CPU time is needed to transfer the image to memory. 

 

We were unable to clock the NIOS II faster than 100 MHz, although I believe it should have worked at 125 MHz or faster. lwIP is inefficient in places, although I contributed code to speed up what I could. It turns out the TCP checksum was a large consumer of time (overall for lwIP) because it has to touch all of the data. When we switched to putting the data in memory (in hardware) in TCP-packet-sized chunks *with* the checksum appended, speeds went up some 50%. The checksum at the end was added to the TCP header checksum to get the final checksum. I also put several of the most often called lwIP functions in on-chip memory - that made a *huge* difference. A post earlier in this thread lists all of the optimizations I made in order of impact. 

 

200 Mbit/s? Maybe, if you're running faster than I am with faster memory and you speed up the checksum (in hardware is best). We have no-copy TCP (UDP) writes too - if you're not using zero copy, I would say you won't reach 200 Mbit/s. Unfortunately your question can only really be answered by implementing it. 

 

BillA 

--- Quote End ---  

 

 

Thank you for your input, BillA.  

It looks like implementing the TCP/IP stack on a soft-core processor might not achieve high throughput, or is at least a bit risky.  

I will also look at alternatives such as a TCP/IP Offload Engine IP core. Thanks. 

 

Twenty
Altera_Forum
Honored Contributor II
886 Views

Hi BillA, 

 

I'm trying to implement lwIP in Quartus/NIOS 13.1. The example files were downloaded from https://github.com/engineeringspirit/freelwip-nios-ii

When building the project, I got the following error messages: 

 

 

**** Build of configuration Nios II for project LWIP **** 

 

make all  

Info: Building D:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp/ 

make --no-print-directory -C D:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp/ 

[BSP build complete] 

Info: Compiling ethernet.c to obj/default/ethernet.o 

nios2-elf-gcc -xc -MP -MMD -c -ID:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp//drivers/inc -ID:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp//LwIP/inc -ID:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp//LwIP/inc/ipv4 -ID:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp//FreeRTOS/inc -ID:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp//HAL/inc -ID:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp/ -ID:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp//drivers/inc -DSYSTEM_BUS_WIDTH=32 -pipe -D__hal__ -DALT_NO_INSTRUCTION_EMULATION -DALTERA_TRIPLE_SPEED_MAC -DALT_LWIP -D__freertos__ -O0 -g -Wall -mhw-div -mhw-mul -mno-hw-mulx -o obj/default/ethernet.o ethernet.c 

ethernet.c: In function 'StatusCallback': 

ethernet.c:44:2: error: too few arguments to function 'print_ipad' 

In file included from ethernet.c:9:0: 

D:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp//LwIP/inc/lwip_main.h:139:20: note: declared here 

ethernet.c:45:2: error: too few arguments to function 'print_ipad' 

In file included from ethernet.c:9:0: 

D:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp//LwIP/inc/lwip_main.h:139:20: note: declared here 

ethernet.c:46:2: error: too few arguments to function 'print_ipad' 

In file included from ethernet.c:9:0: 

D:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp//LwIP/inc/lwip_main.h:139:20: note: declared here 

ethernet.c: In function 'InitNetwork': 

ethernet.c:65:2: error: too few arguments to function 'lwip_initialize' 

In file included from ethernet.c:9:0: 

D:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp//LwIP/inc/lwip_main.h:147:13: note: declared here 

ethernet.c: In function 'WaitOnPHY': 

ethernet.c:101:4: warning: implicit declaration of function 'usleep' [-Wimplicit-function-declaration] 

ethernet.c: In function 'xEthernetRun': 

ethernet.c:193:2: warning: 'return' with a value, in function returning void [enabled by default] 

ethernet.c:146:6: warning: unused variable 'nDisconnectCnt' [-Wunused-variable] 

ethernet.c:143:14: warning: unused variable 'pmac' [-Wunused-variable] 

ethernet.c: In function 'get_ip_addr': 

ethernet.c:224:3: error: too few arguments to function 'print_ipad' 

In file included from ethernet.c:9:0: 

D:/works/FPGA/GRJCS331_EPL_DDR/test07/LWIP/LWIP_bsp//LwIP/inc/lwip_main.h:139:20: note: declared here 

make: *** [obj/default/ethernet.o] Error 1 

 

**** Build Finished **** 

 

Please help me. Thank you.
Altera_Forum
Honored Contributor II
886 Views

Hi BillA, 

 

I am trying to port the Altera TSE driver and example program for lwIP (1.3.2) (http://www.alteraforum.com/forum/showthread.php?t=23787) to 15.1. 

 

But 15.1 uses the Nios II Gen2 core, whose uncached memory regions differ from those of Gen1. 

 

https://www.altera.com/support/support-resources/knowledge-base/solutions/rd07072014_334.html 

 

How can I fix the function "alt_remap_uncached" that is used in alteraTseEthernetif.c & lwip_tse_mac.c? 

 

thanks!
Altera_Forum
Honored Contributor II
886 Views

Hi, 

 

I'm using the alt_remap_uncached() function from Nios Gen1 with minor changes: 

 

#include <stddef.h>
#include "sys/alt_warning.h"
#include "sys/alt_cache.h"
#include "system.h"

#ifdef NIOS2_MMU_PRESENT
/* Convert KERNEL region address to IO region address */
#define BYPASS_DCACHE_MASK (0x1u << 29)
#else
/* Set bit 31 of address to bypass D-cache */
#define BYPASS_DCACHE_MASK (0x1u << 31)
#endif

volatile void*
alt_remap_uncached(void* ptr, alt_u32 len)
{
    /* Round the length up to a whole number of data cache lines */
    const size_t num_lines = (len + ALT_CPU_DCACHE_LINE_SIZE - 1) / ALT_CPU_DCACHE_LINE_SIZE;
    const size_t aligned_size = num_lines * ALT_CPU_DCACHE_LINE_SIZE;

    /* Flush the cache lines covering the region, then return the bypass alias */
    alt_dcache_flush (ptr, aligned_size);
    return (volatile void*) (((alt_u32) ptr) | BYPASS_DCACHE_MASK);
}


In my case it works. Maybe it helps you. 
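
To see what the function above does to a buffer's length and address, here is a small host-side sketch of the same arithmetic for the non-MMU case. The 32-byte line size is an assumption (on a real system the value comes from ALT_CPU_DCACHE_LINE_SIZE in system.h), and the helper names are hypothetical.

```c
#include <stdint.h>

#define DCACHE_LINE_SIZE   32u                 /* assumed line size in bytes */
#define BYPASS_DCACHE_MASK (UINT32_C(1) << 31) /* non-MMU case: bit 31 */

/* Round a byte count up to a whole number of cache lines. */
uint32_t round_up_to_lines(uint32_t len)
{
    uint32_t lines = (len + DCACHE_LINE_SIZE - 1u) / DCACHE_LINE_SIZE;
    return lines * DCACHE_LINE_SIZE;
}

/* Compute the cache-bypass alias of a 32-bit address. */
uint32_t uncached_alias(uint32_t addr)
{
    return addr | BYPASS_DCACHE_MASK;
}
```

For example, a 33-byte buffer is flushed as 64 bytes (two lines), and the address 0x01000000 maps to the alias 0x81000000.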

 

Jens