Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)

Reduce logic utilization

Altera_Forum

Hi, 

 

I have a part in my kernel that takes too much logic: 

if (relu == 1) { if (out < 0) conv_in = 0.1 * out; else conv_in = out; } 

 

out is a float. report.html shows this function using 4k ALUTs and 8k FFs, which is too much for my DE1-SoC to handle. Any idea how to reduce it? 

Btw, the function is a leaky activation function (leaky ReLU), where negative values are multiplied by 0.1. 
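For reference, here is a host-side C sketch of the same leaky ReLU (not the actual kernel). One easy win worth trying: write the constant as 0.1f rather than 0.1, since 0.1 is a double literal and can promote the multiply to double precision, which is far more expensive on the FPGA.

```c
#include <assert.h>

/* Leaky ReLU as in the kernel above. Using 0.1f (not 0.1) keeps the
 * multiply in single precision; a plain 0.1 is a double literal and
 * can pull in much larger double-precision hardware. */
static float leaky_relu(float out, int relu)
{
    float conv_in = out;
    if (relu == 1 && out < 0.0f)
        conv_in = 0.1f * out;
    return conv_in;
}
```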

Thanks in advance. 

 

EDIT: 

What are the ups and downs of using these two compiler flags? 

1) -fp-relaxed 

2) -fpc
Altera_Forum

Floating-point operations are not natively supported by the DSP blocks in the Cyclone V. For a floating-point multiplication, the mantissa multiply maps to DSPs, but everything else, including shifting (barrel shifters) and rounding, is implemented in logic and FFs. This is expected behavior and cannot be avoided unless you give up IEEE 754 compliance. 

 

--fp-relaxed allows the compiler to parallelize chained floating-point operations as a balanced tree, which requires reordering the operations. This can slightly reduce the logic/FF overhead at the cost of small changes in the output. However, it will not necessarily make any difference in your kernel unless you have chains of dependent floating-point operations. 
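A tiny host-side C illustration (not the OpenCL compiler itself) of why reordering changes results: float addition is not associative, so rebalancing a chain of additions into a tree can change the rounded result.

```c
#include <assert.h>

/* Float addition is not associative, so a compiler that rebalances a
 * chain of additions into a tree (as --fp-relaxed allows) can produce
 * slightly different results than strict left-to-right evaluation. */
static float sum_left(float a, float b, float c)  { return (a + b) + c; }
static float sum_right(float a, float b, float c) { return a + (b + c); }
```

With a = 1e8f, b = -1e8f, c = 1.0f, the left-to-right sum gives 1.0f, while the reassociated sum gives 0.0f, because 1.0f is lost when added to -1e8f first.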

 

--fpc can significantly reduce the logic and FF overhead of floating-point operations by reducing the area spent on rounding, at the cost of losing compliance with the IEEE 754 standard; i.e. with that switch you could get very different (less accurate) results compared to running the same code on a CPU/GPU. 

 

Another option you have is to use fixed-point numbers. Altera's documents outline how you can use bit masking to convert floating-point numbers to fixed-point in an OpenCL kernel.
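As a rough idea of what the fixed-point route could look like, here is a minimal host-side C sketch, assuming a signed Q16.16 format and approximating 0.1 by 0.125 - 0.03125 = 0.09375 so the multiply disappears entirely (the format and names are illustrative, not taken from Altera's documents):

```c
#include <stdint.h>

typedef int32_t q16_16;   /* signed Q16.16 fixed point: 1.0 == 1 << 16 */

/* Leaky ReLU in fixed point. 0.1 is approximated by
 * 0.125 - 0.03125 = 0.09375, i.e. (x >> 3) - (x >> 5),
 * so no multiplier is needed at all.
 * Note: right-shifting a negative signed value is
 * implementation-defined in C; on the usual two's-complement
 * targets it is an arithmetic shift, which is what we want here. */
static q16_16 leaky_relu_fx(q16_16 x)
{
    if (x < 0)
        return (x >> 3) - (x >> 5);
    return x;
}
```

For x = -1.0 (i.e. -(1 << 16)), this returns -6144, which is -0.09375 in Q16.16.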
Altera_Forum

jack12, try replacing "conv_in = 0.1*out" with "conv_in = 0.125*out", or with "conv_in = 0.125*out - 0.03125*out" for more precision -- these constants are powers of two, so the expressions are cheaper to implement.

Altera_Forum

The kernel mainly does floating-point convolutions repeatedly. Anyway, I will verify my results and compare them with the compiler flags on. Thanks HRZ.

Altera_Forum

Hi WitFed,  

 

I am trying to reduce the logic utilization, since the design cannot fit into the FPGA. I am confused why writing conv_in = 0.125*out - 0.03125*out would reduce logic utilization. Shouldn't the extra subtractor use more logic?
Altera_Forum

Because there is no exact 0.1 in hardware. 

If you use 0.1, the compiler needs a full multiplier to get a value as close to 0.1*out as it can. 

However, 0.125 and 0.03125 are exact powers of two (2^-3 and 2^-5), so (0.125 - 0.03125)*out is like (out >> 3) - (out >> 5): each multiplication reduces to a shift (or, in floating point, an exponent adjustment), which is much cheaper than a full multiplier. The subtractor is tiny by comparison.
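To make the power-of-two point concrete for floating point as well, here is a small host-side C sketch: multiplying a float by 2^-3 or 2^-5 only adjusts its exponent field, which is exactly what the standard library function ldexpf does.

```c
#include <math.h>

/* Multiplying a float by a power of two only changes its exponent
 * field (no mantissa multiply), which is why 0.125*out and
 * 0.03125*out are much cheaper than 0.1*out in hardware.
 * ldexpf(x, n) computes x * 2^n by exponent adjustment. */
static float leaky_pow2(float out)
{
    return ldexpf(out, -3) - ldexpf(out, -5);   /* == 0.09375f * out */
}
```

For out = -2.0f, this returns -0.25f - (-0.0625f) = -0.1875f, identical to 0.125f*out - 0.03125f*out.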
Altera_Forum

I see, thanks aazz44ss.
