This is a bit off-topic, but when I use the FIR Compiler to produce a simple FIR (single rate, 64 coefficients, 14-bit samples, 16-bit coefficients, either single channel or two channel), the compiler doesn't use the embedded DSP blocks. I would have assumed that the parallel implementation would use 64 multipliers and the serial one at least 1 multiplier. The newer Cyclone devices seem to have more DSP blocks, so I'm surprised the FIR Compiler doesn't make use of them. I am not using FFTs at the moment, so I don't know whether those use the DSP blocks. I don't use MATLAB. Do the filters generated from within MATLAB use the DSP blocks?
Fully parallel should create constant coefficient multipliers, which can be pretty efficient when implemented in LEs. Try "Variable/Fixed Coefficient: Multi-Cycle" to get it to use some DSP blocks, although the LE savings may not be as dramatic as you'd like to see. You might try FIR Compiler II to compare resource usage and fmax, or you might write your own FIR.
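To illustrate why a constant coefficient multiplier can be cheap in LEs, here is a small Python sketch of the general shift-and-add idea. It recodes the coefficient into canonical signed digits (CSD) so that the multiply becomes a handful of shifted adds and subtracts. This is only an assumption about the kind of structure involved, not a description of what FIR Compiler actually generates, and the names `csd_digits` and `shift_add_multiply` are made up for this example.

```python
def csd_digits(c):
    """Recode integer c into canonical signed digits, LSB first.

    Each digit is -1, 0 or +1, and no two adjacent digits are nonzero.
    """
    digits = []
    while c != 0:
        if c & 1:
            # c mod 4 == 1 gives +1, c mod 4 == 3 gives -1
            d = 2 - (c & 3)
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def shift_add_multiply(x, c):
    """Multiply x by the constant c using only shifts, adds and subtracts."""
    acc = 0
    for shift, d in enumerate(csd_digits(c)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


def adder_count(c):
    """Adders/subtractors needed for one constant multiply (nonzero digits - 1)."""
    nonzero = sum(1 for d in csd_digits(c) if d != 0)
    return max(nonzero - 1, 0)
```

For example, a coefficient of 7 recodes to 8 - 1, so `shift_add_multiply(x, 7)` needs a single subtractor rather than a general multiplier. In hardware the shifts are just wiring, which is why this kind of structure maps well onto LEs.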
I'm not complaining about the FIR produced. It uses few LEs and runs fast. It is more that I wonder why Altera put so much effort into including DSP blocks. Who uses them, and what for? I did write my own FIR, but it was hopeless. The serial version had a max clock of 32 MHz (in a Cyclone II) and the parallel version had a max clock of 16 MHz. It used a Wishbone interface. I had to convert Microchip-generated coefficients from C to VHDL. Hence I upgraded to the Quartus subscription edition.
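For anyone facing the same C-to-VHDL coefficient conversion, something like the following Python sketch can take the tedium out of it. It assumes the coefficients sit in a single brace-delimited C integer array with at least two entries. The function name `c_coeffs_to_vhdl`, the VHDL type name `coeff_array_t` and the constant name `COEFFS` are made up for illustration and are not taken from any Microchip or Altera tool.

```python
import re


def c_coeffs_to_vhdl(c_source, name="COEFFS", width=16):
    """Convert a C integer array initializer into a VHDL constant declaration.

    Assumes one brace-delimited list of integer literals (decimal or 0x hex)
    with at least two entries; C-style octal literals are not handled.
    """
    # Strip /* ... */ and // comments before looking for the initializer.
    stripped = re.sub(r"/\*.*?\*/|//[^\n]*", "", c_source, flags=re.S)
    match = re.search(r"\{(.*?)\}", stripped, flags=re.S)
    if match is None:
        raise ValueError("no brace-delimited initializer found")

    values = [int(tok.strip(), 0) for tok in match.group(1).split(",") if tok.strip()]

    # Check every coefficient fits a signed value of the requested width.
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    for v in values:
        if not lo <= v <= hi:
            raise ValueError(f"coefficient {v} does not fit in {width} bits")

    # Emit eight coefficients per line to keep the VHDL readable.
    rows = [
        "    " + ", ".join(str(v) for v in values[i:i + 8])
        for i in range(0, len(values), 8)
    ]
    return (
        f"type coeff_array_t is array (0 to {len(values) - 1}) "
        f"of integer range {lo} to {hi};\n"
        f"constant {name} : coeff_array_t := (\n"
        + ",\n".join(rows)
        + "\n);\n"
    )
```

Calling `print(c_coeffs_to_vhdl(open("coeffs.c").read()))` (with `coeffs.c` standing in for whatever file holds the array) would print a declaration that can be pasted into a package.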