Nios® II Embedded Design Suite (EDS)

Multiplication Optimization

Honored Contributor II



We use an EP3C55F484I7 FPGA with a Nios II soft processor, and I am working on a time-critical project. In software we perform some floating-point multiplications. We used the FPGA multiplication custom instructions, but that is not nearly enough. For example, 9 multiplications and 6 additions take approximately 52 µs. The optimization level is 3 in both the software project and the BSP.


Can we make this calculation faster?


Note: Nios II 9.1 and Quartus II 9.1 are being used in this project.
2 Replies
Honored Contributor II

Use fixed point :-) 
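As a rough illustration of the fixed-point suggestion, here is a minimal sketch in C. It assumes the 9 multiplications and 6 additions form a 3x3 matrix times 3-vector product (the thread does not say so), and it assumes a Q16.16 format is precise enough for the data. The type and function names (`q16_16`, `q_mul`, `q_mat3_vec3`) are made up for this example.

```c
#include <stdint.h>

/* Q16.16 fixed point: 16 integer bits, 16 fractional bits.
   The format is an assumption chosen for illustration only. */
typedef int32_t q16_16;

#define Q_FRAC_BITS 16
#define Q_ONE ((q16_16)1 << Q_FRAC_BITS)

/* Convert at the edges of the computation only, not inside the hot loop. */
q16_16 q_from_float(float x) { return (q16_16)(x * (float)Q_ONE); }
float q_to_float(q16_16 x) { return (float)x / (float)Q_ONE; }

/* Multiply two Q16.16 values. The product is widened to 64 bits so the
   intermediate cannot overflow, then shifted back down. Right-shifting a
   negative value is implementation-defined in C; GCC shifts arithmetically. */
q16_16 q_mul(q16_16 a, q16_16 b)
{
    return (q16_16)(((int64_t)a * (int64_t)b) >> Q_FRAC_BITS);
}

/* out = m * v, with m a row-major 3x3 matrix:
   9 multiplications and 6 additions, all on integers. */
void q_mat3_vec3(const q16_16 m[9], const q16_16 v[3], q16_16 out[3])
{
    int row;
    for (row = 0; row < 3; row++) {
        out[row] = q_mul(m[3 * row + 0], v[0])
                 + q_mul(m[3 * row + 1], v[1])
                 + q_mul(m[3 * row + 2], v[2]);
    }
}
```

Whether this is actually faster depends on the core having a hardware integer multiplier; the 64-bit intermediate may cost a few extra instructions on a 32-bit processor. The additions can also overflow if the values get close to the Q16.16 range limit, so the format needs to be sized to the real data.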

I've no idea how fast the FP instructions are, but: 

Have you actually verified that the custom instructions are being executed?

Remember they only do 'float', not 'double', so you need to make sure everything is 'float'; otherwise you'll get a lot of float<->double conversion happening.
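To show how easily a 'double' sneaks in, here is a small C sketch. The function names are hypothetical. The only difference between the two versions is the `f` suffix on the constants: in C an unsuffixed constant such as `0.5` has type double, so the whole expression is evaluated in double and then converted back.

```c
/* Unsuffixed constants are double, so x is converted to double, the
   multiply and add are done in double, and the result is converted
   back to float on return. */
float scale_with_double(float x)
{
    return x * 0.5 + 2.0;
}

/* With the f suffix every operand is float, so the expression can stay
   in single precision from start to finish. */
float scale_with_float(float x)
{
    return x * 0.5f + 2.0f;
}
```

The same applies to math library calls: `sqrt()` takes and returns double, while `sqrtf()` works on float.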

Does that FPGA have the DSP multiplier blocks in it? (And do the FP custom instructions use them if it does?) 

(Even for the integer multiply, Altera ought to give the option of throwing logic into the multiply instruction to support faster multiplies and/or 64-bit results.)
Honored Contributor II

"FPGA multiplication custom instructions" sounds like the integer hardware multiplier support in the Nios II processor itself. 


Floating point support is provided separately from that. You need to include the floating point custom instruction hardware in your Qsys/SOPC Builder project and connect it to your Nios II processor (then regenerate your BSP and recompile everything). 


In other words, it sounds like your software is currently using software emulation of floating point operations. 



I believe floating point multiply / add operations take around 6 clocks each when done in hardware (ALTFP_MULT megafunction etc.)
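As a back-of-the-envelope check on the numbers in this thread, the C sketch below turns the reported 52 µs for 15 operations into an average cycle count per operation. The clock frequency is not given anywhere in the thread, so it is a parameter here, and the 100 MHz used in the usage note is purely hypothetical. The function name is made up.

```c
/* Average clock cycles spent per operation.
   total_us  : measured time for the whole calculation, in microseconds
   ops       : number of arithmetic operations in that calculation
   clock_mhz : processor clock in MHz (NOT stated in the thread; supply
               the real value for your system) */
double cycles_per_op(double total_us, unsigned ops, double clock_mhz)
{
    /* microseconds * cycles-per-microsecond = total cycles */
    return (total_us * clock_mhz) / (double)ops;
}
```

For example, `cycles_per_op(52.0, 15, 100.0)` comes out at roughly 347 cycles per operation, and at 50 MHz it would be about half that. Both figures include load/store and loop overhead, but they are still far above 6 clocks, which would fit the software emulation theory.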