Intel® Quartus® Prime Software

Inference of DSP block with accumulator does not work

Geert
Beginner

I'm trying to infer a DSP block with accumulator for Arria 10, using Quartus Prime 17.0.

The high-level functionality I need is:

if rising_edge(clk) then
  if sload = '1' then
    acc <= a * b;
  else
    acc <= acc + a * b;
  end if;
end if;

I started from the template provided in Quartus (VHDL/Full Designs/Arithmetic/Signed Multiply-Accumulate), but it does not work as expected: it uses a DSP block for the multiplier, but not the block's accumulator function.

Instead, for small word sizes, it creates a loop back path via the second multiplier inputs to bring the output back to the adder.

When I increase the accumulator width to 48, the accumulator is implemented entirely in LUTs.

Any ideas on how to force use of the DSP block accumulator (preferably through inference)?

Thanks, Geert

 

 

SengKok_L_Intel
Moderator

Hi Geert,


If you are using an independent multiplier, could you please increase the input data width to more than 19 bits? With inputs narrower than 18 bits, the design will not fit into the hard accumulator.


Regards -SK Lim


Geert
Beginner

Hi,

Thanks for your answer. 

I have tried several different input sizes. Indeed, with two 16-bit multiplier inputs it fails (the accumulator is implemented in LUTs), while with one 16-bit and one 24-bit input I get the expected implementation (hard accumulator).

Could you explain what the exact criterion is? Does the multiplier result need to have a minimum width, or is it sufficient that one of the multiplier inputs is > 18 bits?

regards,

 

Geert

 

 

SengKok_L_Intel
Moderator

Hi,


I used the template below and changed the width to 27, and the accumulator fits into the hard block. Alternatively, you could try the ALTERA_MULT_ADD IP core and configure the accumulator in the mode that you need.

Please refer to Table 25 for the accumulator function:

https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/arria-10/a10_memory.pdf



// Quartus Prime Verilog Template
// Unsigned multiply-accumulate

module unsigned_multiply_accumulate
#(parameter WIDTH=27)
(
    input clk, aclr, clken, sload,
    input [WIDTH-1:0] dataa,
    input [WIDTH-1:0] datab,
    output reg [2*WIDTH-1:0] adder_out
);

    // Declare registers and wires
    reg [WIDTH-1:0] dataa_reg, datab_reg;
    reg sload_reg;
    reg [2*WIDTH-1:0] old_result;
    wire [2*WIDTH-1:0] multa;

    // Store the results of the operations on the current data
    assign multa = dataa_reg * datab_reg;

    // Store the value of the accumulation (or clear it)
    always @ (adder_out, sload_reg)
    begin
        if (sload_reg)
            old_result <= 0;
        else
            old_result <= adder_out;
    end

    // Clear or update data, as appropriate
    always @ (posedge clk or posedge aclr)
    begin
        if (aclr)
        begin
            dataa_reg <= 0;
            datab_reg <= 0;
            sload_reg <= 0;
            adder_out <= 0;
        end
        else if (clken)
        begin
            dataa_reg <= dataa;
            datab_reg <= datab;
            sload_reg <= sload;
            adder_out <= old_result + multa;
        end
    end

endmodule
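
Since the original question is about VHDL, the Verilog template above could be ported roughly as shown here. This is only a sketch and not an official Quartus template: the internal signal name acc_reg and the use of ieee.numeric_std are assumptions made for the port, and it has not been confirmed in this thread that Quartus Prime 17.0 maps this VHDL version onto the hard accumulator. Only the Verilog template at WIDTH=27 was reported to fit.

```vhdl
-- Hypothetical VHDL port of the Verilog template above (unverified sketch).
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity unsigned_multiply_accumulate is
  generic (
    WIDTH : positive := 27  -- same default as the Verilog template
  );
  port (
    clk       : in  std_logic;
    aclr      : in  std_logic;
    clken     : in  std_logic;
    sload     : in  std_logic;
    dataa     : in  unsigned(WIDTH-1 downto 0);
    datab     : in  unsigned(WIDTH-1 downto 0);
    adder_out : out unsigned(2*WIDTH-1 downto 0)
  );
end entity unsigned_multiply_accumulate;

architecture rtl of unsigned_multiply_accumulate is
  signal dataa_reg  : unsigned(WIDTH-1 downto 0);
  signal datab_reg  : unsigned(WIDTH-1 downto 0);
  signal sload_reg  : std_logic;
  signal acc_reg    : unsigned(2*WIDTH-1 downto 0);  -- assumed name for the accumulator register
  signal old_result : unsigned(2*WIDTH-1 downto 0);
  signal multa      : unsigned(2*WIDTH-1 downto 0);
begin

  -- Product of the registered inputs (numeric_std "*" returns 2*WIDTH bits)
  multa <= dataa_reg * datab_reg;

  -- Feed back the accumulated value, or zero when a new accumulation starts
  old_result <= (others => '0') when sload_reg = '1' else acc_reg;

  -- Clear or update data, as appropriate
  process (clk, aclr)
  begin
    if aclr = '1' then
      dataa_reg <= (others => '0');
      datab_reg <= (others => '0');
      sload_reg <= '0';
      acc_reg   <= (others => '0');
    elsif rising_edge(clk) then
      if clken = '1' then
        dataa_reg <= dataa;
        datab_reg <= datab;
        sload_reg <= sload;
        acc_reg   <= old_result + multa;
      end if;
    end if;
  end process;

  adder_out <= acc_reg;

end architecture rtl;
```

As in the Verilog template, sload is registered together with the inputs, so asserting it makes the next result equal to the product alone instead of the running sum. Checking the fitter resource report after compilation is the way to tell whether the accumulator landed in the DSP block or in LUTs.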


SengKok_L_Intel
Moderator

If further support is needed in this thread, please post a response within 15 days. After 15 days, this thread will be transitioned to community support. The community users will be able to help you with your follow-up questions.

