Nios® V/II Embedded Design Suite (EDS)

NIOS-II C Code optimization

Altera_Forum

I have code that looks like the following:

 

Style 1: 

 

function()
{
    /* variable declaration and initialization */
    int N;
    N = read_iteration();

    for (n = 0; n < N; n++) {
        if (some expression) {
            /* do some assignments and expression computation */
        }
    }
}

 

Style 2: 

function()
{
    /* variable declaration and initialization */
    int N;
    N = read_iteration();

    for (n = 0; n < N; n++) {
        if (some expression) {
            /* do some assignments and expression computation */
        } else {
            /* do some other assignment and expression computation */
        }
    }
}

 

The 'some expression' in the 'if' clause always evaluates to TRUE in both styles. However, with Style 2 the function takes about 12% longer than with Style 1.

Any suggestions as to why this is happening, and how to optimize it so that the two take approximately the same time?
Altera_Forum

There isn't anything necessarily slow about the change in coding style. Assuming you're turning on the compiler optimization flags, your problem is most likely with the details of your functions. You might be able to answer your question yourself very quickly by single stepping through the assembly of the two implementations.

Altera_Forum

Extra code is needed to jump over the else code that is present only in Style 2. Branches can take significant time on some architectures. If you don't do much in the body of the if, the branch can take as long as simple code like "count++" or similar. If you add a pipeline stall or flush caused by the extra branch, then yes, a 12% difference sounds reasonable.
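To make the comparison concrete, here is a minimal sketch of the two loop shapes in C. The names style1 and style2, the data array, and the ">= 0" condition are made up for illustration; they are not from the original poster's code. The comment marks where a straightforward (unoptimized) translation would typically need the extra jump.

```c
/* Style 1: 'if' with no 'else' arm. The 'data' array and the
 * '>= 0' test are hypothetical stand-ins for 'some expression'. */
int style1(const int *data, int count)
{
    int sum = 0;
    for (int n = 0; n < count; n++) {
        if (data[n] >= 0) {
            sum += data[n];
        }
    }
    return sum;
}

/* Style 2: same loop with an 'else' arm added. */
int style2(const int *data, int count)
{
    int sum = 0;
    for (int n = 0; n < count; n++) {
        if (data[n] >= 0) {
            sum += data[n];
            /* A naive translation places an unconditional branch
             * here to skip over the 'else' code below. */
        } else {
            sum -= 1;
        }
    }
    return sum;
}
```

When the condition is always true, both functions return the same result; any timing difference comes from how the compiler lays out the branches, which is why looking at the generated assembly is the quickest way to confirm it.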

Altera_Forum

 

--- Quote Start ---  

Extra code is needed to jump over the else code that is present only in Style 2. Branches can take significant time on some architectures. If you don't do much in the body of the if, the branch can take as long as simple code like "count++" or similar. If you add a pipeline stall or flush caused by the extra branch, then yes, a 12% difference sounds reasonable.

--- Quote End ---  

 

 

I agree. Thanks for the valuable suggestion. I don't think this can be optimized. However, if you think I am wrong, please suggest how.
Altera_Forum

 

--- Quote Start ---  

Extra code is needed to jump over the else code that is present only in style 2. 

--- Quote End ---  

 

 

This is not necessarily true. GCC is smart enough to duplicate the loop overhead (the increment, compare, and branch back) so that each arm of the if/else runs to completion without additional branches, requiring only the initial if() 'bne' test (which Style 1 also requires).
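As a rough illustration of that transformation, here is a hand-written C sketch of what the duplicated loop tail looks like. This is not actual GCC output; the function names, the data array, and the "> 0" condition are all hypothetical, and whether GCC really does this for a given loop depends on the code and the optimization level.

```c
/* Straightforward version: the loop test lives in one place. */
int sum_plain(const int *data, int count, int *others)
{
    int sum = 0;
    *others = 0;
    for (int n = 0; n < count; n++) {
        if (data[n] > 0)
            sum += data[n];
        else
            (*others)++;
    }
    return sum;
}

/* Hand-written equivalent with the loop tail duplicated: each arm
 * ends with its own increment/compare/branch-back, so no jump over
 * the 'else' code is needed. */
int sum_tail_duplicated(const int *data, int count, int *others)
{
    int sum = 0;
    int n = 0;
    *others = 0;
    if (n >= count)
        return sum;
top:
    if (data[n] > 0) {
        sum += data[n];
        if (++n < count) goto top;  /* loop test copied into the 'if' arm */
        return sum;
    }
    (*others)++;
    if (++n < count) goto top;      /* ... and into the 'else' arm */
    return sum;
}
```

Both functions compute the same result; the second just spells out by hand the control flow described above.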
Altera_Forum

It might even be as simple as some of your code ending up sharing cache lines.

You won't get fully deterministic behaviour unless you put all the code/data in tightly coupled memory and disable the dynamic branch predictor. 

Once you've done that the execution time of code is independent of any external state and can be counted. 

 

You could try using:

if (__builtin_expect(some_condition, 1))

That will make the 'true' part of the code the fall-through path.

If 'some_condition' is non-trivial you'll need to add it to the correct part.
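A minimal sketch of how that might look in a loop like the one in the question. The LIKELY macro, the process function, and the ">= 0" condition are made-up names for illustration; __builtin_expect itself is a GCC builtin, and the hint only influences code layout, not the result.

```c
#include <stddef.h>

/* Hypothetical convenience macro; falls back to the plain
 * condition on compilers without __builtin_expect. */
#if defined(__GNUC__)
#define LIKELY(x) __builtin_expect(!!(x), 1)
#else
#define LIKELY(x) (x)
#endif

/* 'data' and the '>= 0' test stand in for the poster's
 * 'some expression', which is expected to be true nearly always. */
long process(const int *data, size_t count)
{
    long acc = 0;
    for (size_t n = 0; n < count; n++) {
        if (LIKELY(data[n] >= 0)) {
            acc += data[n];   /* hinted as the common path */
        } else {
            acc -= 1;         /* hinted as the rare path */
        }
    }
    return acc;
}
```

The "!!" normalizes the condition to 0 or 1 so it compares cleanly against the expected value of 1.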

 

Get gcc to generate a .s file (with -S --verbose-asm; the documented spelling of the second flag is -fverbose-asm) and look at the generated assembly.