Inferring DSP in Arria 10

No matter how I code the HDL (Verilog or VHDL), the accumulator of a signed (or unsigned) multiply-accumulate (MAC) is always pulled out of the resulting DSP block and implemented in ALMs, as shown in the Fitter report, the Technology Map Viewer and the Resource Property Editor.
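
For illustration, a signed MAC of the kind described above might be written as in the following sketch. The module name, port names and widths are assumptions made up for the example, not code from the actual design, and nothing here guarantees that Quartus will pack the accumulator into the DSP block for this coding.

```verilog
// Hypothetical signed multiply-accumulate; all names and widths are illustrative.
module mac_sketch #(
    parameter A_W   = 18,  // assumed operand widths
    parameter B_W   = 18,
    parameter ACC_W = 44   // assumed accumulator width
) (
    input  wire                    clk,
    input  wire                    clr,  // synchronous restart of the accumulation
    input  wire signed [A_W-1:0]   a,
    input  wire signed [B_W-1:0]   b,
    output reg  signed [ACC_W-1:0] acc
);
    reg signed [A_W-1:0] a_r;   // input register stage
    reg signed [B_W-1:0] b_r;
    reg                  clr_r; // clear delayed by one cycle to line up with a_r/b_r

    // Full-width signed product of the registered operands
    wire signed [A_W+B_W-1:0] prod = a_r * b_r;

    always @(posedge clk) begin
        a_r   <= a;
        b_r   <= b;
        clr_r <= clr;
        // On restart, load the current product instead of adding it to the old sum
        acc   <= clr_r ? prod : acc + prod;
    end
endmodule
```

The accumulator here is the `acc` feedback register; that feedback adder is the part that ends up in ALMs rather than inside the DSP block.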


The Fitter also reports:



Warning (16067): 1 out of 1 DSP blocks in the design are not fully utilizing recommended internal DSP register banks. Design performance may be limited. To take full advantage of device resources, you should either enable the register banks directly (if using WYSIWYG entry) or provide additional registers in your design that the Quartus register packing optimization algorithm can convert to internal DSP register banks. 

Warning (16069): 1 DSP blocks are partially registered - they use some, but not all of the recommended internal DSP register banks. Intel advises using all the recommended internal DSP register banks for high performance designs. 


I have tried all code examples presented in that thread as well as 


No joy. 


Any advice would be greatly appreciated.
