
Pipeline parameter issues for LPM_Divide

FPGA_Newbie

Hey everyone, 

I'm new to VHDL and Quartus Prime. I'm working with ADC data and need to calculate its RMS using LPM_MULT, LPM_DIVIDE and ALTSQRT, but I'm having problems with the pipeline value, and only for the divider. The LPM_DIVIDE instance has the following parameters:

lpm_drepresentation => "UNSIGNED",
lpm_hint => "LPM_REMAINDERPOSITIVE=TRUE",
lpm_nrepresentation => "UNSIGNED",
lpm_pipeline => 24,
lpm_type => "LPM_DIVIDE",
lpm_widthd => 22,
lpm_widthn => 24

I have read in the user guide that the LPM_PIPELINE parameter cannot be set higher than LPM_WIDTHN (24 here), and I'm still getting a lot of negative slack (please refer to the photo below for details). Any ideas and help will be appreciated.

I'm using Quartus Prime 18.1 and a Cyclone IV FPGA.


sstrell

It's not clear that pipelining has anything to do with it.  You have over 4 ns of data delay on this failing path.  What would be really useful to see is the data arrival path section of that report to see where that extensive data delay is coming from.  Also, make sure that your design is fully constrained for timing.  Seeing your .sdc file would be useful as well.

FPGA_Newbie

Hey sstrell,

Thank you for your reply.

I took screenshots of my Timing Analyzer results:
- First the statistics showing the delay coming from the cells.
- Data Arrival Path showing my clock delay 0.619 ns, then the delay for the data path 4.328 ns.
- .sdc file showing the constrained inputs and outputs, plus the generated PLL 250 MHz and set_multicycle_path for controlling the setup and hold for my outputs.
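
A rough sanity check of the figures listed above, as a minimal Python sketch. It assumes a single-cycle path at the 250 MHz PLL clock and ignores clock skew, register setup time and clock uncertainty, so it is no substitute for the full Timing Analyzer report. The helper name clock_period_ns is made up for illustration.

```python
def clock_period_ns(f_mhz: float) -> float:
    """Return the clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / f_mhz


PLL_CLOCK_MHZ = 250.0   # generated PLL clock mentioned in the post
DATA_DELAY_NS = 4.328   # data path delay reported in the post

period_ns = clock_period_ns(PLL_CLOCK_MHZ)
# Positive overshoot means the data delay alone is longer than one clock period
# (assumption: single-cycle path, no skew or setup time taken into account).
overshoot_ns = DATA_DELAY_NS - period_ns

print(f"period = {period_ns:.3f} ns, data delay exceeds it by {overshoot_ns:.3f} ns")
```

Under these assumptions the data delay by itself already exceeds one clock period, which is consistent with the negative slack being reported.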

Unfortunately I can't really understand the cause of this timing error. I know it's coming from a lot of cell and interconnect (IC) delays, but only inside LPM_DIVIDE_component.

I also tried adding a component between the adder and the divider to delay the incoming data by one clock cycle, but it didn't work.

If you have any thoughts on how we can manage to control the timing error, I'd appreciate it.

thanks in advance

 

sstrell

There's still more of the data arrival path that you have not shown.  What you have shown does not show the significant delay that may be causing the issue.  Scroll down to show the rest.

And it would help to show your .sdc.  Multicycle can easily screw things up if you don't use it properly.

FPGA_Newbie

Hey sstrell,

In the zip file you'll find the .sdc file, the data path and the setup report. Hopefully this will help us find an answer to the timing error.

FvM

Hi,
I ran a test compilation of lpm_divide with the given parameters and found that it can't achieve a clock speed higher than about 190 MHz on a Cyclone IV, speed grade 6, with maximum lpm_pipeline, despite huge resource usage. For comparison, a sequential divider, e.g. divider.v in stx_cookbook11/arithmetic (see GitHub - thomasrussellmurphy/stx_cookbook: Altera Advanced Synthesis Cookbook 11.0), achieves 170 MHz with less than 1/5 of the lpm_divide logic resources.

This means you are expecting too much from Cyclone IV. You need to run the respective logic at a lower clock speed. Recent low-cost FPGAs, e.g. Cyclone 10 LP, achieve similar performance.
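
To illustrate what a sequential divider does, here is a bit-accurate Python model of a generic restoring divider that produces one quotient bit per iteration, i.e. one per clock in hardware. This is a sketch under stated assumptions, not the code of divider.v from the cookbook; the function name and the default width of 24 (matching lpm_widthn) are chosen for illustration only.

```python
def sequential_divide(numer: int, denom: int, width_n: int = 24) -> tuple[int, int]:
    """Unsigned restoring division, one quotient bit per loop iteration.

    Each iteration corresponds to one clock cycle of a sequential hardware
    divider, so a result takes width_n cycles.
    """
    if denom == 0:
        raise ZeroDivisionError("denominator must be non-zero")
    if not 0 <= numer < (1 << width_n):
        raise ValueError("numerator does not fit in width_n bits")

    quotient = 0
    remainder = 0
    for bit in range(width_n - 1, -1, -1):
        # Shift the next numerator bit (MSB first) into the partial remainder.
        remainder = (remainder << 1) | ((numer >> bit) & 1)
        quotient <<= 1
        # Subtract the denominator when it fits and record a 1 in the quotient.
        if remainder >= denom:
            remainder -= denom
            quotient |= 1
    return quotient, remainder


print(sequential_divide(1000, 7))
```

A divider of this kind delivers one result every width_n clocks, so it only fits when a new division is not required on every clock cycle.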

 

What's your ADC sampling rate? If it's really that high (250 MHz), consider an alternative RMS topology that doesn't need continuous full speed division.
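
One possible topology of that kind, sketched as an integer Python model: accumulate the squared samples over a fixed window of N = 2^log2_n samples, so that the mean becomes a right shift and only one square root per window is needed. The window length, the function name and the use of math.isqrt as a stand-in for ALTSQRT are assumptions for illustration, not details from this thread.

```python
from math import isqrt


def windowed_rms(samples, log2_n: int) -> list[int]:
    """Integer RMS over consecutive windows of 2**log2_n samples.

    The division by the window length is a right shift, so no divider is
    needed. A trailing partial window is discarded.
    """
    window_len = 1 << log2_n
    results = []
    acc = 0
    for count, x in enumerate(samples, start=1):
        acc += x * x                         # sum of squares (multiplier + adder)
        if count % window_len == 0:
            mean_square = acc >> log2_n      # divide by N = 2**log2_n via shift
            results.append(isqrt(mean_square))
            acc = 0                          # start the next window
    return results


print(windowed_rms([3, -3, 3, -3, 4, 4, 4, 4], log2_n=2))
```

If the window length cannot be a power of two, the division still happens only once per window rather than once per sample.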

FPGA_Newbie

Hi FvM,

 

Thanks for your reply and the info. I'm testing my design on the Cyclone IV first, before moving to another FPGA, so I think you're right about expecting too much from it.

I have to use set_multicycle_path for the outputs, otherwise I'll have setup and hold violations. Is there something else I can do there?
- set_multicycle_path -setup -end -from [get_registers {wr_fifo:DUT10|q_wurzel_s*}] -to [get_ports {q*}] 2
- set_multicycle_path -hold -end -from [get_registers {wr_fifo:DUT10|q_wurzel_s*}] -to [get_ports {q*}] 1

Yesterday I tried what Report Timing Closure Recommendations in the Timing Analyzer suggested:
- Turn off Auto Shift Register Replacement in the Analysis & Synthesis settings.
- Turn on physical synthesis for combinational logic in the Fitter settings.

Surprisingly it worked, and now the worst-case f_max is up to 257.4 MHz.
Is it right to change the settings like that?

thanks in advance

FvM
Hi,
it's not obvious to me that multicycle applies to the divider. It presumes that input data isn't updated every clock cycle. If so, why aren't you running the divider at a lower clock rate? To answer the question we would need to know design details.
FPGA_Newbie

Hi FvM,

 

Thanks for your reply.

You were right about the lpm_divide speed. I tried the PLL with 4 different output frequencies (100 MHz, 150 MHz, 200 MHz and 300 MHz), and I'm getting these timing errors for lpm_divide only when I'm using the 300 MHz clock. The maximum frequency it can reach with the same parameters is 220 MHz.

 

But now I'm getting removal timing errors. I didn't get these before and don't know why I'm getting them now.

Please refer to the .zip file, which contains:

- RTL Viewer
- Timing summary
- Removal 'reset' report
- .sdc file

 

PS: I've got it now. It should be connected (assigned) to the hardware 'reset' pin.

Please take a look at the zip file.

 

thanks in advance

RichardTanSY_Intel

I can see that there is significant clock skew and possibly a large data delay, which might be causing the issue.

Please check the data path to see if there is a high fanout or a point where there is a substantial delay.


Could you share your design by archiving the project (Project > Archive Project) so that I can investigate it further?


Regards,

Richard Tan


FvM

Hi,
I have a comment beyond the LPM_DIVIDE maximum speed question. I frequently implement RMS measuring circuits, and I wonder why you don't use a topology without a divider. However, I have difficulty relating your RTL schematic to a usual RMS scheme, which is either a fixed integration window over n periods of the signal fundamental or continuous recursive processing with a low pass (the equivalent of an analog RMS detector circuit). In the case of discontinuous processing, division is only performed once per window period or substituted by a multiply.
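
The continuous recursive variant can be sketched as an integer Python model of a first-order low pass on the squared samples, y[n] = y[n-1] + (x[n]^2 - y[n-1]) / 2^shift, where the division is a right shift. The shift value, the function name and math.isqrt as a stand-in for a hardware square root are assumptions for illustration; this is not necessarily the exact scheme described in the post.

```python
from math import isqrt


def recursive_rms(samples, shift: int) -> list[int]:
    """Running RMS estimate using a first-order low pass on x**2.

    `state` holds the filtered mean square scaled by 2**shift, so the
    filter update needs only a subtract, an add and a right shift.
    """
    state = 0
    outputs = []
    for x in samples:
        # Equivalent to y += (x*x - y) / 2**shift with y = state >> shift.
        state += x * x - (state >> shift)
        outputs.append(isqrt(state >> shift))
    return outputs


estimates = recursive_rms([100] * 1000, shift=4)
print(estimates[0], estimates[-1])
```

A larger shift gives a longer averaging time constant, at the cost of a slower response to changes in signal level.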
