Huge interconnect delays on Arria10 input paths

FlorianHMueller

Hello,

We are currently running trial synthesis for a planned switch from Stratix 5 to Arria 10 devices. We assumed that we would be able to meet timing after this switch. Mostly this is true, but on the IO timing paths we encounter large delays on Arria 10 that were not present on Stratix 5 devices.

Please refer to the following result from timing analysis:

timing_path_with_large_delay.PNG

As you can see, there is a huge delay caused by interconnect between the IOIBUF and the LABCELL. At the same place, in Stratix 5, a delay of only 7 ns was encountered, which was small enough to not cause an issue for timing closure.

We have already set

set_global_assignment -name PROGRAMMABLE_POWER_TECHNOLOGY_SETTING "FORCE ALL TILES WITH FAILING TIMING PATHS TO HIGH SPEED"

to make sure that there are no low power tiles present in the critical path. We have also tried synthesis for speed grade of 1 with OPTIMIZATION_MODE set to "AGGRESSIVE PERFORMANCE". But the problem persists. The transition from IO buffer to internal interconnect causes one single large delay which makes timing closure impossible.

A sidenote: We originally used tool version 15.1 (because originally that was our only licensed version) and in this version we got warnings that timing data was preliminary for Arria 10 devices. However, in this version 15.1 this IOBUF to interconnect delay was significantly smaller, similar to Stratix 5. When we subsequently switched to version 22.1 (also licensed; to avoid the warning about preliminary timing data for Arria 10) we encountered this large delay.

We now wonder how we can avoid this very large delay. If we cannot avoid it, then Arria 10 devices will not be suitable for our next hardware platform.

We already checked that the latch clock is on a global clock network, ran the fitter with and without IO location constraints, and tried different seeds. Unfortunately we cannot change the design (we are FPGA prototyping ASIC devices), so we cannot move to registered IO.

Is there something else we could try? Or is this kind of delay inherent in Arria 10 devices and unavoidable?

Thanks a lot in advance for any help/insight you can provide into this problem.

Florian

 

FlorianHMueller

I would like to add one really interesting additional observation:

When we set

set_instance_assignment -name VIRTUAL_PIN ON -to <all_io_with_timing_trouble>

then we would have expected the timing issues to go away. But the timing issues persist. Please refer to the following images showing the timing analysis and chip planner screenshots of the now internal paths which formerly were input paths:

virtual_timing_path_with_large_delay.PNG

virtual_timing_path_with_large_delay_cp.PNG

As can be seen, the huge delays are still present, but now on ridiculously short internal paths. This now really does not make sense.

Is there anything we could try to find out why the Timing Analyzer and Chip Planner think that this path has such a huge delay?

Again, thanks a lot in advance for any insight you could provide!

Florian

sstrell
Honored Contributor III

Very odd.  What do your timing constraints look like?  Perhaps something in the SDC is causing the issue though the issue with the virtual pins doesn't point to that.

Can you generate a timing report with the "show routing" option turned on?  That breaks down the wire connection into its individual elements.  If you have no other placement assignments, the route must be snaking all over the place adding to the delay, but there's no explanation as to why that may be happening.  The show routing option might help figure it out.

What does the hold timing look like on the same path?  Perhaps you have optimize hold timing turned on which causes the fitter to intentionally add additional routing on the path to help improve hold timing requirements (though this seems super excessive for that feature).  Try turning that off globally or with an assignment.
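A minimal sketch of the two checks suggested above, assuming a Quartus Prime project. The port name `my_input_port` and the panel name are placeholders and do not come from this thread. The first command would be run in the Timing Analyzer Tcl console, the second added to the project's .qsf file:

```tcl
# Timing Analyzer console: report the worst setup paths starting at one
# input port and break the interconnect delay into its routing elements.
# "my_input_port" is a hypothetical port name.
report_timing -setup -from [get_ports {my_input_port}] -npaths 5 \
    -detail full_path -show_routing -panel_name {input path routing}

# .qsf: turn off hold-time optimization for the whole design as an
# experiment, then refit and compare the setup slack on the same path.
set_global_assignment -name OPTIMIZE_HOLD_TIMING OFF
```

Running the same `report_timing` command with `-hold` instead of `-setup` gives the hold view of the path for comparison.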

FYI: I don't think power tiles are a thing in A10, so the assignment you mention doesn't do anything.

FlorianHMueller

Thanks a lot for this answer, it finally led me to the solution:

  • Fitting with hold time fixing deactivated did not show the setup time problem on these IO paths
  • The show routing option was indeed helpful to see the long, convoluted path that existed behind the short virtual IO path

So basically the intention was to constrain the IO with set_min_delay 0 and set_max_delay 10, so that the path from the IO to the registers would be at most 10ns long. But that is not what set_min_delay and set_max_delay do. Instead, they are added to the clock insertion delay, and the actual delay then has to lie between <clockinsertiondelay>+0ns and <clockinsertiondelay>+10ns. In the past, when fitting for Stratix 5, this does not seem to have been a problem. But for Arria 10 the fitter was unable to meet both the hold and the setup requirements of these constraints. It seems to fix the hold part by inserting a very long routing path, but then is unable to fix the setup part and gives up, and the STA reports the long path and a setup violation.
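For illustration only, constraints of the kind described above might have looked roughly like this. The actual .sdc was not posted, so the port filter `data_in[*]` and the register filter are hypothetical:

```tcl
# Hypothetical reconstruction of the problematic constraints.
# These are path-based exceptions: the analysis still includes the clock
# insertion delay to the latching register, so they do not bound the
# port-to-register routing delay on its own.
set_min_delay -from [get_ports {data_in[*]}] -to [get_registers *] 0.0
set_max_delay -from [get_ports {data_in[*]}] -to [get_registers *] 10.0

# Example with an assumed (made-up) clock insertion delay of 4 ns,
# ignoring setup/hold times and clock uncertainty:
#   hold  check: data delay >= 4 ns + 0 ns  = 4 ns
#   setup check: data delay <= 4 ns + 10 ns = 14 ns
```

Under that assumption the fitter has to add routing delay to satisfy the hold side, which matches the behavior described above.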

If you report hold timing for the same path in that case, you get a 5ns positive slack, while the setup violation is only about -3ns. I was confused by this, but it seems that the fitter is most likely considering on-chip variation and calculating worst clock path against best data path (for hold) and best clock path against worst data path (for setup), so the hold slack cannot be directly used for fixing the setup violation.

Anyway, by replacing the set_min_delay and set_max_delay with set_net_delay constraints (which actually match the intent of the original constraints) for the IOs the problem is finally solved and we can proceed looking for the right FPGA for our next project.
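A minimal sketch of such a replacement, assuming the same hypothetical port filter `data_in[*]` and the 10 ns limit mentioned above; the exact constraints used in the real project were not posted:

```tcl
# Bound only the routed net delay leaving the input ports, independent of
# any clock insertion delay. The port name is a placeholder.
set_net_delay -from [get_ports {data_in[*]}] -max 10.0
```

The result of a net delay constraint shows up in the Timing Analyzer's net delay report (`report_net_delay`) rather than in the setup and hold path reports.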

Thank you very much for your answer, sstrell! Learning both that there is an option to globally deactivate hold fixing and that the actually routed path can be shown was very helpful.

The remaining problem is that set_net_delay causes the IOs to be still reported as unconstrained. But we will discuss on our side here how to constrain these paths and not have them reported as unconstrained.

 

sstrell
Honored Contributor III
I should have asked to see your .sdc. As for the unconstrained I/O, you should have set_input_delay or set_output_delay on all of them to fully constrain the design. Asynchronous I/O should have set_false_path timing exceptions.
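As a rough sketch of what that could look like in the .sdc: the clock name `clk`, the port names, and the delay values below are all assumptions for illustration and would have to come from the real board and interface timing:

```tcl
# Synchronous inputs: describe the external delay relative to a clock that
# is assumed to be defined elsewhere with create_clock.
set_input_delay -clock [get_clocks {clk}] -max 2.0 [get_ports {sync_in[*]}]
set_input_delay -clock [get_clocks {clk}] -min 0.5 [get_ports {sync_in[*]}]

# Synchronous outputs: same idea on the output side.
set_output_delay -clock [get_clocks {clk}] -max 1.5 [get_ports {sync_out[*]}]
set_output_delay -clock [get_clocks {clk}] -min 0.0 [get_ports {sync_out[*]}]

# Asynchronous input: exclude it from setup/hold analysis.
set_false_path -from [get_ports {async_in}]
```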
ShengN_Intel
Employee

I'm glad that your question has been addressed. I will now transition this thread to community support. If you have a new question, feel free to open a new thread or log in to https://supporttickets.intel.com, view the details of the desired request, and post a response within the next 15 days. After 15 days, this thread will be transitioned to community support, and community users will be able to help with follow-up questions.

 
