Arria DSP block merge with fixed point in OpenCL

Altera_Forum · ‎09-02-2017

Hello,

I have been trying to implement fixed-point algorithms in OpenCL for an Arria 10. I am interested in this approach because I ran out of DSP blocks when I used floating point, so I figured I would use the two independent 18x18 multipliers in each DSP block with fixed point. I followed the tips in the programming guide and have verified in the compile reports that I am indeed using a single multiplier in each of the DSP blocks. However, it doesn't appear that the compiler is making use of the second 18x18 multiplier.
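For readers unfamiliar with the idea, here is a minimal sketch in plain C of how a value can be squeezed into an 18-bit fixed-point operand so that a multiply fits one 18x18 multiplier. The Q1.17 format, the `MASK18` constant, and the helper names `to_fixed`, `fixed_mul` and `to_double` are assumptions chosen for illustration; they are not taken from the programming guide or from my kernel.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 17        /* assumed format: unsigned Q1.17, values in [0, 2) */
#define MASK18    0x3FFFFu  /* keeps only the low 18 bits of an operand */

/* Convert a value in [0, 2) to unsigned Q1.17, rounding to nearest. */
static uint32_t to_fixed(double x) {
    assert(x >= 0.0 && x < 2.0);
    return (uint32_t)(x * (1u << FRAC_BITS) + 0.5) & MASK18;
}

/* Multiply two Q1.17 operands. The 36-bit product is held in 64 bits,
 * then shifted back down to 17 fractional bits (truncating). */
static uint32_t fixed_mul(uint32_t a, uint32_t b) {
    uint64_t p = (uint64_t)(a & MASK18) * (uint64_t)(b & MASK18);
    return (uint32_t)(p >> FRAC_BITS);
}

/* Convert a Q1.17 value back to double for inspection. */
static double to_double(uint32_t q) {
    return (double)q / (double)(1u << FRAC_BITS);
}
```

For example, `to_double(fixed_mul(to_fixed(1.5), to_fixed(0.5)))` gives 0.75. Each operand stays within 18 bits, which is the property the compiler has to recognize before it can map the multiply onto an 18x18 multiplier.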

I found the following and gave it a try. It's older, but I was hoping it would still apply:

https://www.altera.com/support/support-resources/knowledge-base/solutions/rd03192013_756.html

When I hit 100% DSP utilization, logic starts to take over the additional operations. So it appears I am only using half of the 18x18 multipliers and no block merging is taking place. I am waiting on a compile to verify this. Is there any way to leverage all of the fixed-point multipliers, or am I doing something wrong?

Thanks,

Rudy

Altera_Forum · ‎09-02-2017

It is likely that Altera's OpenCL compiler has not been equipped with the capability to perform fixed-point multiplication in 18 × 18 Full Mode yet, or maybe it is not correctly seeing your variables as fixed-point. Which version of Quartus/AOC are you using? I noticed that v16.1 acts much more intelligently when it comes to DSP packing, compared to previous versions. If you are using an older version, I recommend trying v16.1.2. The HTML report in this version will explicitly mention what type of operation is being performed by each DSP.

Altera_Forum · ‎09-05-2017

I am using 16.1.2. The report does say that I am using multiplies instead of hardened floating point. I am going to attempt 17.0 and see what that does.

Thanks!

Rudy

Altera_Forum · ‎09-06-2017

Have you done any full compilations? Chances are, the mapper might be smart enough to pack two multiplications in the same DSP. As long as the OpenCL compiler is correctly inferring fixed-point multiplication, this could happen.

Altera_Forum · ‎09-07-2017

I have done full compilations and have verified that the compiler is inferring fixed-point multiplication. One of the reports says that the DSP usage may be different than reported and that I should look at the optimization section of the report. However, the packing did not appear to be happening, and I couldn't fully understand which optimizations were taking place. I tried 17.0.297 today and it seems to have worked. I am going to run some further tests, but this seems to have solved my problem.

Rudy

Altera_Forum · ‎09-07-2017

And for completeness, I simply took the vector add example and changed it to a multiply while changing the data types from float to int. Then I masked the bits according to the programming guide. This used a single DSP block, as expected. When I changed to int2, it still used a single DSP block, which is what led me to believe it worked. Previously, with 16.1, this would have used two DSP blocks. I will do some more investigating, as I ran out of time today.
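A rough sketch of what that test computes, written as plain C so it can be run on a host. The 18-bit mask value, the `uint2_t` struct (a stand-in for OpenCL's two-element vector type) and the function names are assumptions for illustration. Unsigned integers are used here instead of int to avoid signed overflow in C. This only models the arithmetic; whether the two multiplies end up in one DSP block depends on the compiler version, as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MASK18 0x3FFFFu  /* assumed mask: keep the low 18 bits of each operand */

/* Hypothetical stand-in for an OpenCL two-element integer vector. */
typedef struct { uint32_t x, y; } uint2_t;

/* Two independent masked multiplies, one per vector lane.
 * Each operand is narrowed to 18 bits before the multiply;
 * only the low 32 bits of each product are kept. */
static uint2_t masked_mul2(uint2_t a, uint2_t b) {
    uint2_t r;
    r.x = (a.x & MASK18) * (b.x & MASK18);
    r.y = (a.y & MASK18) * (b.y & MASK18);
    return r;
}

/* Host-side loop standing in for one work-item per element. */
static void vector_mul(const uint2_t *a, const uint2_t *b, uint2_t *out, size_t n) {
    assert(a && b && out);
    for (size_t i = 0; i < n; ++i) {
        out[i] = masked_mul2(a[i], b[i]);
    }
}
```

The two lanes have no data dependence on each other, which is what would make them candidates for sharing the two 18x18 multipliers of a single DSP block.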

Rudy