
Nios C-Code Size

Altera_Forum
Honored Contributor II

Sorry if I am being a bit thick here. Been programming VHDL forever but C is still a little bit of a mystery to me. 

 

Can someone please tell me if there is a quick and easy way of working out how big my compiled C-Code is when I build an application for Nios? 

I am using the standard Eclipse tools. 

 

The problem is that I am very tight on memory. I don't have anything outside the FPGA, so Nios is running its code and also its workspace in Block RAM. 

Nios is doing various housekeeping functions and from time to time I need to move a few more functions in to software. 

 

Usually what happens is that I will start developing code and at some point in the day things will stop working or behave in mysterious ways.  

Of course, the first thing I suspect is that the changes I have made to the C-Code are responsible, but every once in a while it turns out that I have run out of memory, and presumably some variables or bits of code get overwritten when the system runs. 

 

As this is always a complete waste of time, I am sure that there must be a more scientific way to work out how much space I have left in my memory, or at least how big the code has become after I compile it. 

So far I cannot work out where to look, so any suggestions are greatly appreciated.
6 Replies
Altera_Forum
Honored Contributor II

Hi, 

 

In the NIOS console you'll find a line that says something like: 

 

Info: (mem_test.elf) 26 KBytes program size (code + initialized data). 

Info: 21 KBytes free for stack + heap. 

 

If you want more detail then it's all broken down in the .map file that's created. Have a look in there and you'll find a breakdown of where the linker put everything. 
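The "free for stack + heap" figure is a link-time number. To see how much of that space a running program actually touches, one common technique is to fill the free region with a known pattern at start-up and later count how much of the pattern survives. The sketch below is only an illustration that runs on a plain byte array; the names `paint_region`, `untouched_bytes` and `FILL_BYTE` are made up here, and finding the real start and end of the free region on a Nios target depends on the BSP linker script and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FILL_BYTE 0xA5u  /* arbitrary pattern, assumed unlikely in real stack data */

/* Fill a not-yet-used memory region with the known pattern. */
static void paint_region(uint8_t *base, size_t len)
{
    assert(base != NULL);
    for (size_t i = 0; i < len; i++)
        base[i] = FILL_BYTE;
}

/* Count bytes that still hold the pattern, scanning up from the lowest
 * address. A stack that grows downward (as on Nios II) overwrites the
 * region from the top, so the untouched bytes sit at the low end. */
static size_t untouched_bytes(const uint8_t *base, size_t len)
{
    assert(base != NULL);
    size_t n = 0;
    while (n < len && base[n] == FILL_BYTE)
        n++;
    return n;
}
```

If the untouched count ever reaches zero, the stack has probably run into whatever lies below it, which matches the kind of mysterious behaviour described in the question.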

 

Mark.
Altera_Forum
Honored Contributor II

If you are tight on memory, it is worth making sure your code isn't using any library functions that you don't explicitly need. 

IIRC even the 'small' BSP has a lot of extra stuff, the normal one is ridiculous. 

Altera ought to supply a 'minimal' example that contains absolutely nothing that isn't strictly necessary, and is designed for separate tightly coupled instruction and data memories (no caches). 

(It is a shame you can't connect the boot code (JTAG or EPCS) as tightly coupled instruction memory). 

Make sure you are compiling everything with -O2 (or -O3), not the default unoptimised setting, which will generate a lot more code. 

 

As well as the .map file, nios2-elf-objdump has options to show the symbol tables and disassembly of any of the object files.
Altera_Forum
Honored Contributor II

Thank you, guys. I knew that there had to be an easy way. 

 

I certainly agree with your comments about making an absolutely minimum system. I have been working with Xilinx for many years and so I am relatively new to Altera.  

In our Xilinx designs I was using a small microcontroller for all of this type of housekeeping stuff ... originally it was the KCPSM and then renamed PicoBlaze. 

 

Now don't shoot me I fully appreciate that the Pico is a completely different beast to Nios - only 8bit core and I had to write code in assembly language rather than C.  

The end result though is that even though it might take me a couple of weeks to get it to work I ended up with designs that would be nothing more than a handful of gates and 1 or 2K bytes of code. 

I thought about using a much simpler 8-bit MCU design for Altera [and there is even a code-compatible version of PicoBlaze which can be compiled for any FPGA] but I love the way that Nios2 is integrated into the workflow; it's very nice and convenient to use. These days I am less concerned about how many gates I am using up, but the trouble is that even a very basic program needs >20K of RAM before it does anything useful, and Block RAM is always in short supply. 

 

I am afraid I don't know enough about the IDE or C-code development in general to start pulling out and discarding libraries and other such things, but a reference design or even a set of instructions would be a very valuable thing! 

BTW The optimization setting was a very useful tip - took me a few minutes to find out where to change it but it saved nearly 5K of code space on this design and remarkably everything is still working !
Altera_Forum
Honored Contributor II

I run code that is compiled as a single compilation unit, and with all the functions inlined (they are small, or only called once). This helps the compiler never have to save any local variables (or other temporaries) on the stack. 

Using 'global pointer' relative addressing for all the main data and io (the 64k range covers 48k for 'memory' and 16k for 'io') also removes a lot of register pressure, cuts down code size and makes the code faster. 

I did have to fix gcc so the compiler would use gp relative addressing for structure members. 
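A minimal sketch of the single-compilation-unit style, assuming hypothetical helper names (`set_bits`, `clear_bits`, `update_leds`) and an assumed LED bit position, none of which come from the real code. Because every helper is `static inline` and defined in the same file before it is used, the compiler is free to fold it into the caller rather than emit a call:

```c
#include <stdint.h>

/* Hypothetical helpers: small, static, and defined before use so the
 * compiler can inline them at every call site. */
static inline uint32_t set_bits(uint32_t value, uint32_t mask)
{
    return value | mask;
}

static inline uint32_t clear_bits(uint32_t value, uint32_t mask)
{
    return value & ~mask;
}

/* The only externally visible function in this file. */
uint32_t update_leds(uint32_t leds, int button_pressed)
{
    const uint32_t LED0 = 1u << 0;  /* assumed bit position */
    return button_pressed ? set_bits(leds, LED0) : clear_bits(leds, LED0);
}
```

Whether the helpers really are inlined depends on the optimisation level, so it is worth checking the disassembly.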

 

Apart from working out how to get the jtag debugger (etc) to load the code, it is probably easier to do everything outside the IDE where you aren't constrained by what the IDE writers assume you want to do - which seems to be 'run the tutorial'. 

 

Maybe I should find time to put a minimal linker script on the wiki! 

This one might work - not sure if it assigns all the required sections though. 

Most of the lines save all the registers on any interrupt. 

/* Minimal linker script for a Nios cpu tightly coupled memory
 *
 * The SDRAM is mapped at its own boundary (16M-32M) in spite of
 * a probably spurious warning from the SOPC builder. */
OUTPUT_FORMAT("elf32-littlenios2")
OUTPUT_ARCH(nios2)

/* Address of some io */
io_base = 0x20000;

MEMORY {
    nios_code (x) : ORIGIN = 0x8000, LENGTH = 8k
    nios_data (rw) : ORIGIN = 0x14000, LENGTH = 12k
    sdram (rw) : ORIGIN = 16M, LENGTH = 16M
}

/* The nios_data memory (and some io) is accessible from %gp */
_gp = 128k - 16k;

/* Constants for building instructions by hand */
RA_SHIFT = 32 - 5;
RB_SHIFT = RA_SHIFT - 5;
RC_SHIFT = RB_SHIFT - 5;
OP_SHIFT = RC_SHIFT - 6;
IMM16_SHIFT = 6;

/* Instructions to set %gp */
SET_GP_HI = 0 << RA_SHIFT | 26 << RB_SHIFT | ((_gp + 32768) >> 16) << IMM16_SHIFT | 0x34;
SET_GP_LO = 26 << RA_SHIFT | 26 << RB_SHIFT | (_gp & 0xffff) << IMM16_SHIFT | 0x4;

/* addi instructions to set %sp and %et from %gp (IMM16 offset added later) */
SET_SP = 26 << RA_SHIFT | 27 << RB_SHIFT | 0x4;
SET_ET = 26 << RA_SHIFT | 24 << RB_SHIFT | 0x4;

SECTIONS {
    /* code */
    nios_code : {
        LONG(SET_GP_HI)
        LONG(SET_GP_LO)
        LONG(SET_SP | ((nios_stack_top - _gp) & 0xffff) << IMM16_SHIFT)
        LONG(SET_ET | ((nios_reg_save - _gp) & 0xffff) << IMM16_SHIFT)
        LONG(c_main << 4 | 1) /* jmpi c_main */

        . = 0x20;
        /* Save all registers in area addressed by 'et' (base memory area) */
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 0) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 1) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 2) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 3) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 4) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 5) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 6) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 7) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 8) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 9) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 10) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 11) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 12) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 13) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 14) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 15) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 16) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 17) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 18) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 19) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 20) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 21) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 22) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 23) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 24) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 25) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 26) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 27) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 28) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 29) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 30) | 0x15)
        LONG(24 << RA_SHIFT | ((1 << RB_SHIFT | 4 << IMM16_SHIFT) * 31) | 0x15)
        /* Save control registers 7 (exception cause) and 12 (badaddr) via r2
         * as registers 32 and 33. */
        LONG(2 << RC_SHIFT | 0x26 << OP_SHIFT | 7 << IMM16_SHIFT | 0x3a)
        LONG(24 << RA_SHIFT | 2 << RB_SHIFT | (4 * 32) << IMM16_SHIFT | 0x15)
        LONG(2 << RC_SHIFT | 0x26 << OP_SHIFT | 12 << IMM16_SHIFT | 0x3a)
        LONG(24 << RA_SHIFT | 2 << RB_SHIFT | (4 * 33) << IMM16_SHIFT | 0x15)
        LONG(. << 4 | 1) /* loopstop */
        *(.code*)
    } >nios_code

    /* Some 'tightly coupled data memory' */
    nios_data : {
        nios_reg_save = .;
        . = . + 256;
        nios_stack_top = .;
        *(.sdata*)
        *(.data*)
        *(.rodata)
    } >nios_data

    sdram : {
        *(.bss.sdram)
    } >sdram

    .comment 0 : { *(.comment) }

    /* Anything from any unexpected section ends up here and
     * generates an error because this overlaps another section */
    unwanted _gp : { *(*) }
}
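The shift constants in the script place register A in the top five bits, register B in the next five, then a 16-bit immediate and a 6-bit opcode. As a rough sketch of what each of the register-save `LONG(...)` lines computes, the same arithmetic can be written in C. The function name `nios2_itype` is made up for illustration and is not part of any toolchain:

```c
#include <assert.h>
#include <stdint.h>

/* Same values as the constants in the linker script. */
enum {
    RA_SHIFT    = 32 - 5,        /* 27 */
    RB_SHIFT    = RA_SHIFT - 5,  /* 22 */
    IMM16_SHIFT = 6
};

/* Pack two register numbers, a 16-bit immediate and a 6-bit opcode
 * into one 32-bit instruction word. */
static uint32_t nios2_itype(uint32_t ra, uint32_t rb, uint32_t imm16, uint32_t op)
{
    assert(ra < 32 && rb < 32 && op < 64);
    return (ra << RA_SHIFT) | (rb << RB_SHIFT)
         | ((imm16 & 0xffffu) << IMM16_SHIFT) | op;
}
```

For instance, `nios2_itype(24, 1, 4, 0x15)` gives 0xC0400115, the word the script emits for the second register-save line (register 1 stored at offset 4 from register 24).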
Altera_Forum
Honored Contributor II

 

--- Quote Start ---  

 

Now don't shoot me I fully appreciate that the Pico is a completely different beast to Nios - only 8bit core and I had to write code in assembly language rather than C.  

The end result though is that even though it might take me a couple of weeks to get it to work I ended up with designs that would be nothing more than a handful of gates and 1 or 2K bytes of code. 

--- Quote End ---  

 

 

Well, assuming you had apples-to-apples functionality that you wanted to hand craft assembly for on NIOS, you could do it. However, you're still left with a 32-bit vs. 8-bit architecture and need to be aware that every instruction is taking 4x the space. 

 

Regarding bugs related to running out of memory, the best tip is to not use dynamic memory allocation (malloc()/free() or new/delete) and instead allocate everything statically; that way the linker tells you when you run out before you load it into the hardware and find out the hard way. 
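A minimal sketch of what static allocation can look like, assuming a hypothetical pool of fixed-size message buffers; the names, sizes and the 1024-byte budget are invented for illustration. The pool is an ordinary static array, so its size is counted by the linker along with the rest of the program, and the `static_assert` fails the build if the pool outgrows its budget:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_SIZE  32u  /* hypothetical buffer size in bytes */
#define MSG_COUNT 8u   /* hypothetical number of buffers */

/* Statically allocated storage: visible in the .map file and in the
 * program size reported at build time. */
static uint8_t msg_pool[MSG_COUNT][MSG_SIZE];
static uint8_t msg_used[MSG_COUNT];

/* Build-time check against an assumed RAM budget for this pool. */
static_assert(sizeof msg_pool <= 1024u, "message pool exceeds its RAM budget");

/* Hand out one free buffer, or NULL if all are in use. */
void *msg_alloc(void)
{
    for (size_t i = 0; i < MSG_COUNT; i++) {
        if (!msg_used[i]) {
            msg_used[i] = 1;
            return msg_pool[i];
        }
    }
    return NULL;
}

/* Return a buffer to the pool; pointers not from the pool are ignored. */
void msg_free(void *p)
{
    for (size_t i = 0; i < MSG_COUNT; i++) {
        if (p == (void *)msg_pool[i]) {
            msg_used[i] = 0;
            return;
        }
    }
}
```

Running out of buffers then shows up as a NULL return that the code can handle, not as silent corruption of neighbouring memory.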

 

You may or may not find a book like "Programming Embedded Systems with C and GNU Development Tools" somewhat helpful.
Altera_Forum
Honored Contributor II

Getting the nios to run a few k of code is not that difficult. 

The multi-channel soft-hdlc code I use is under 3k code (and 12k of lookup tables), and written in C. It would be a little smaller in assembler, but not much. 

The problem is that Altera don't give you a sensible starting point for very small projects - the example that monitors the push buttons and changes the leds could be very small indeed.