Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)

Inference of DSP block with accumulator does not work

Geert
Beginner

I'm trying to infer a DSP block with accumulator for Arria 10, using Quartus Prime 17.0.

The high-level functionality I need is:

if rising_edge(clk) then
  if sload = '1' then
    result <= a * b;
  else
    result <= result + a * b;
  end if;
end if;
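
As a point of reference, the behaviour above could be spelled out as a complete module along the following lines. This is only a minimal sketch, written in Verilog to match the template posted later in this thread. The module name `mac_sload`, the port names and the three width parameters are assumptions made for illustration and do not come from the Quartus template. It assumes ACC_WIDTH is at least A_WIDTH + B_WIDTH, and whether a given tool version maps it onto the hard accumulator still has to be checked in the fitter report.

```verilog
// Hypothetical sketch: signed multiply-accumulate with synchronous load.
// Names and default widths are illustrative, not taken from the Quartus template.
module mac_sload
#(
    parameter A_WIDTH   = 18,
    parameter B_WIDTH   = 18,
    parameter ACC_WIDTH = 48   // assumed >= A_WIDTH + B_WIDTH
)
(
    input                             clk,
    input                             sload,
    input  signed [A_WIDTH-1:0]       a,
    input  signed [B_WIDTH-1:0]       b,
    output reg signed [ACC_WIDTH-1:0] acc
);

    // Full-precision signed product of the two inputs
    wire signed [A_WIDTH+B_WIDTH-1:0] prod = a * b;

    always @(posedge clk) begin
        if (sload)
            acc <= prod;        // restart: load the product (sign-extended)
        else
            acc <= acc + prod;  // accumulate
    end

endmodule
```

Keeping the two input widths as separate parameters makes it easy to try different width combinations and compare the resulting resource usage.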

I started from the template provided in Quartus (VHDL/Full Designs/Arithmetic/Signed Multiply-Accumulate), but this does not work: it uses a DSP block for the multiplier, but it does not use the block's accumulator function.

Instead, for small word sizes, it creates a loopback path via the second multiplier's inputs to bring the output back to the adder.

When I increase the accumulator width to 48, the accumulator is implemented entirely in LUTs.

Any ideas how to force use of the DSP block accumulator (preferably using inference)?

Thanks, Geert

 

 

4 Replies
SengKok_L_Intel
Moderator

Hi Geert,


If you are using an independent multiplier, could you please increase the input data width to more than 19 bits? If the inputs are narrower than 18 bits, the design will not fit into the hard accumulator.


Regards -SK Lim


Geert
Beginner

Hi,

Thanks for your answer. 

I have tried several input sizes and, indeed, two 16-bit multiplier inputs fail (the accumulator is implemented in LUTs), while with one 16-bit and one 24-bit input I get the expected implementation (hard accumulator).

Could you explain what the exact criterion is? Is it the multiplier result that needs a minimum width, or is it sufficient that one of the multiplier inputs is wider than 18 bits?

Regards,
Geert

 

 

SengKok_L_Intel
Moderator

Hi,


I used the template below with the width changed to 27, and the accumulator then fits into the hard block. Alternatively, you could try the ALTERA_MULT_ADD IP core to configure the accumulator in the mode that you need.

Please refer to Table 25 for the accumulator function:

https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/arria-10/a10_memory.pdf



// Quartus Prime Verilog Template
// Unsigned multiply-accumulate

module unsigned_multiply_accumulate
#(parameter WIDTH=27)
(
    input clk, aclr, clken, sload,
    input [WIDTH-1:0] dataa,
    input [WIDTH-1:0] datab,
    output reg [2*WIDTH-1:0] adder_out
);

    // Declare registers and wires
    reg [WIDTH-1:0] dataa_reg, datab_reg;
    reg sload_reg;
    reg [2*WIDTH-1:0] old_result;
    wire [2*WIDTH-1:0] multa;

    // Store the results of the operations on the current data
    assign multa = dataa_reg * datab_reg;

    // Store the value of the accumulation (or clear it)
    always @ (adder_out, sload_reg)
    begin
        if (sload_reg)
            old_result <= 0;
        else
            old_result <= adder_out;
    end

    // Clear or update data, as appropriate
    always @ (posedge clk or posedge aclr)
    begin
        if (aclr)
        begin
            dataa_reg <= 0;
            datab_reg <= 0;
            sload_reg <= 0;
            adder_out <= 0;
        end
        else if (clken)
        begin
            dataa_reg <= dataa;
            datab_reg <= datab;
            sload_reg <= sload;
            adder_out <= old_result + multa;
        end
    end

endmodule
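
To check the template's behaviour in simulation before looking at the fitter results, a small self-checking testbench along the following lines could be used. This is a sketch only: the testbench name, the `apply` and `check` tasks and the stimulus values are made up for illustration, and it assumes it is compiled together with the unsigned_multiply_accumulate module above. It illustrates that a product appears on adder_out two clock edges after its inputs are applied, and that asserting sload together with a pair of inputs restarts the sum with that pair's product.

```verilog
`timescale 1ns/1ps

// Hypothetical testbench for the unsigned_multiply_accumulate template.
module tb_unsigned_multiply_accumulate;

    localparam WIDTH = 27;

    reg                  clk = 0;
    reg                  aclr = 1;
    reg                  clken = 1;
    reg                  sload = 0;
    reg  [WIDTH-1:0]     dataa = 0;
    reg  [WIDTH-1:0]     datab = 0;
    wire [2*WIDTH-1:0]   adder_out;

    unsigned_multiply_accumulate #(.WIDTH(WIDTH)) dut (
        .clk(clk), .aclr(aclr), .clken(clken), .sload(sload),
        .dataa(dataa), .datab(datab), .adder_out(adder_out)
    );

    // 10 ns clock period
    always #5 clk = ~clk;

    // Drive new inputs on the falling edge so the DUT samples them
    // cleanly on the next rising edge.
    task apply(input [WIDTH-1:0] a, input [WIDTH-1:0] b, input s);
        begin
            @(negedge clk);
            dataa = a;
            datab = b;
            sload = s;
        end
    endtask

    // Compare the current output against an expected value.
    task check(input [2*WIDTH-1:0] expected);
        begin
            if (adder_out !== expected)
                $display("FAIL: adder_out=%0d, expected %0d", adder_out, expected);
            else
                $display("ok:   adder_out=%0d", adder_out);
        end
    endtask

    initial begin
        #12 aclr = 0;        // release the asynchronous clear
        apply(3, 4, 1);      // restart the sum with 3*4
        apply(5, 6, 0);      // add 5*6
        apply(2, 7, 1);      // restart the sum with 2*7
        check(12);           // 3*4, two clock edges after it was applied
        apply(0, 0, 0);
        check(42);           // 12 + 5*6
        apply(0, 0, 0);
        check(14);           // restarted with 2*7
        $finish;
    end

endmodule
```

Each `check` runs right after the `apply` call that follows the stimulus by two falling edges, which matches the two register stages (input registers, then the accumulator register) in the template.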


SengKok_L_Intel
Moderator

If further support is needed in this thread, please post a response within 15 days. After 15 days, this thread will be transitioned to community support. The community users will be able to help you with your follow-up questions.

