Can't close timing on one clock domain

Altera_Forum · ‎02-17-2017

Hi,

I've been struggling to close timing on one clock domain in my design. It happens to be the one with all my control logic and the Nios II on it, so it is rather important for getting into functional test.

After reducing the system down to only the Altera IP, the problem remains, so I feel like something is fundamentally wrong with my .sdc, but I'm stumped as to what.

Brief system outline: Arria V GZ, design entirely in Qsys. The main IP blocks are the JESD204B Receiver, Nios II, and DDR3 Controller.

Two input clocks, 135MHz for JESD204B devclk, and 100MHz crystal for everything else.

100MHz clock drives PLL for DDR3 interface, one of the PLL outputs at 200MHz is used as the system clock (pll_afi_clk).

I cannot get "pll_afi_clk" to meet setup for transfers on the same domain, and a huge number of paths are failing (see the histogram I've attached). All other clocks, both base and derived, are fine.

There are a number of IP-generated .sdc files, plus my own .sdc file, which does the basic stuff: it runs "create_clock", "derive_pll_clocks" and "derive_clock_uncertainty", and then "set_clock_groups" for all the clocks generated from the PLLs. Going by the TQ messages when the .sdc files are read in, everything seems to have the desired outcome. I made sure to place my .sdc file at the end of the list, so the IP files can apply their constraints without conflict.
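A minimal sketch of an .sdc with the structure described above. The actual file is not shown in this thread, so the port names (jesd_devclk, clk_100m), the clock names, and the *pll_afi_clk* wildcard are all hypothetical, and the grouping is only one plausible arrangement.

```tcl
# Base clocks on the two input pins (port and clock names are hypothetical).
create_clock -name devclk_135 -period 7.407  [get_ports {jesd_devclk}]  ;# 135 MHz
create_clock -name clk_100    -period 10.000 [get_ports {clk_100m}]     ;# 100 MHz

# Let TimeQuest create generated clocks for every PLL output,
# then apply the device's clock uncertainty model.
derive_pll_clocks
derive_clock_uncertainty

# Treat the JESD204B device clock and the crystal-derived clocks as
# asynchronous to each other (the wildcard is an assumption about the
# name that derive_pll_clocks gives the 200 MHz output).
set_clock_groups -asynchronous \
    -group [get_clocks {devclk_135}] \
    -group [get_clocks {clk_100 *pll_afi_clk*}]
```

Note that set_clock_groups only cuts paths between different groups. Transfers that start and end on pll_afi_clk are still analyzed against the full 5 ns period, so constraints of this kind would not by themselves explain or hide same-domain setup failures.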

I've used the guidelines in http://www.alteraforum.com/alterawiki.com/uploads/3/3f/timequest_user_guide.pdf and scoured the thing for any ideas, but have not come up with anything yet. (super useful document btw.)

I'm confused because according to the TQ reports, all the clocks are defined as I would expect. The clock domains have false paths set between them as I would expect, and the IO constraints aren't relevant here as this clock is internal only. No matter how hard I push the P&R engine, it makes little difference to the result: a large number of paths still fail setup, and the Timing Recommendation Report always says "Too much combinatorial" for the worst failing paths. Sometimes the IC delay is also very large. The worst failing paths change depending on the P&R seed, from resets to the Avalon interconnect to other glue logic. If I probe the source or destination nodes in TQ, no placement constraints are reported.

Device utilization with this minimum working example is pretty small, 4% routing and 15% resource. So it shouldn't be struggling.

I'd really appreciate some input on better understanding the information TQ is giving me. It would seem the fitter is conflicted over how to lay out logic on this clock domain, or that my constraints are wrong and the tool is analyzing the paths incorrectly, but I don't know how to dissect the problem further.

Thanks,

Altera_Forum · ‎02-17-2017

Looks to me like you have legit timing problems, as in your logic just can't run at 200MHz. You're at or well beyond the fmax for Nios II on Cyclone V depending on which Nios flavor you're running. See Table 2 in this document:

https://www.altera.com/en_us/pdfs/literature/ds/ds_nios2_perf.pdf

Altera_Forum · ‎02-18-2017

Thanks for the reply,

I'm actually running on Arria V GZ, so either Nios II flavour should be fine at 200MHz.

It makes me wonder, however: if only part of the logic on the domain could not meet 200MHz, would I see such a massive failure as in the histogram, or would all the logic place and route correctly except the incompatible part?

For example if I had a fully constrained placement at 200MHz, and I added a small IP component only good for 150MHz, what would the histogram look like?

Altera_Forum · ‎02-20-2017

One big issue I see is that your report indicates that a burst adapter is getting added into your Qsys system design, which adds a large amount of combinatorial delay to the interconnect. There are a number of possible reasons for this:

- Data path (master / slave) width mismatch
    - Either master or slave can use a burst and have a burst count field on Avalon interface
    - When master is bursting to a slave and there's a data width mismatch

- Matching data path width but different burst counts
    - Burst count fields in the master and slave may be different widths
    - Master with narrower burst count width than slave: not usually a problem
    - Slave has narrower burst count width than master: adapter needed to split burst into smaller bursts; should be resolved if possible

- Burst type not supported by slave
    - Example: AXI master bursting to an Avalon slave

Solutions: match burst widths if possible or use a pipeline bridge component, matching the burst widths on either side of the bridge. So if you have a burst width of 8 from a master and slaves only support a burst of 1, add a pipeline bridge between the master and slaves with a width of 8 on its slave side and a width of 1 on its master side.

You can also add pipeline stages inside the interconnect on the Interconnect Requirements window in Qsys or manually in the Show System with Qsys Interconnect option from the Qsys System menu.
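For a scripted flow, the same system-wide pipeline limit can, as far as I know, be set with qsys-script. This is a rough sketch only: the system file name and the version in "package require" are hypothetical, and the assumption that the GUI's interconnect pipeline stage limit corresponds to the qsys_mm.maxAdditionalLatency requirement should be checked against your Quartus version.

```tcl
# Intended to be run with qsys-script, not the Quartus Tcl shell.
# The version number and file name below are placeholders.
package require -exact qsys 16.1

load_system my_system.qsys

# Allow Qsys to insert up to 4 pipeline stages in the interconnect
# (assumed to mirror the limit set in the Interconnect Requirements window).
set_interconnect_requirement {$system} {qsys_mm.maxAdditionalLatency} {4}

save_system
```

After changing the limit, the system has to be regenerated and recompiled before the extra stages show up in the timing report.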

Altera_Forum · ‎02-21-2017

That comment about adding pipeline stages was a one-click fix for my problems: it pushed my fMax up to 216MHz with no other changes! I increased the system-wide maximum number of pipeline stages from 1 to 4.

If only I knew about it before (it does hide slightly in the Qsys GUI tbh).

Very content with this fix as the overall latency is not important for my application.

Thanks very much!

Altera_Forum · ‎02-21-2017

Glad it works! See chapter 7 in volume 1 of the Quartus Prime handbook for all sorts of techniques for optimizing Qsys system designs.

Altera_Forum · ‎02-21-2017

Great call, sstrell!