Intel® Quartus® Prime Software

Clock network delay + internal cell delay > Minimum timing requirement

Altera_Forum

Hi, 

 

I have an SDRAM controller running at 133 MHz that currently violates the setup requirement of the RAM. A register connected to an I/O pin tri-states the data bus when the FPGA OE is deasserted. The timings for that register are:

 

Clock delay (2.7) + OE signal delay (0.2) + register delay (3.3) + RAM setup time (1.5) = 7.7 ns

This exceeds the 7.5 ns clock period by 0.2 ns. I'm not sure what to do, as the delay is entirely within one cell; there are no other routing delays apart from the global clock.
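The budget above can be checked with a few lines of arithmetic. This is only a sketch: the dictionary keys are illustrative labels, and the 7.5 ns period is the rounded figure used later in the thread for the 133 MHz clock. Values are in picoseconds to avoid floating-point rounding.

```python
# Setup budget for the OE register path, in picoseconds.
PERIOD_PS = 7_500  # assumed: 7.5 ns period used in the thread for 133 MHz

# Delay components as listed in the post (key names are illustrative labels).
oe_path_ps = {
    "clock_network_delay": 2_700,
    "oe_signal_delay": 200,
    "register_delay": 3_300,
    "ram_setup_time": 1_500,
}

total_ps = sum(oe_path_ps.values())
slack_ps = PERIOD_PS - total_ps  # negative slack means a setup violation
largest = max(oe_path_ps, key=oe_path_ps.get)  # biggest single contributor

print(f"total = {total_ps / 1000:.1f} ns, slack = {slack_ps / 1000:+.1f} ns")
print(f"largest contributor: {largest}")
```

With these numbers the register delay inside the I/O cell is the largest single term, followed by the clock network delay, which is why the rest of the thread concentrates on the clock routing.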

 

I have taken the following measures to avoid this problem: 

- Ensured the clock of the tri-stating register is global 

- Enabled "Fast Output Register" 

- Enabled "Fast Output Enable Register" 

- Enabled "Speed Optimization Technique for Clock Domains" 

- Registered the databus signal with the system clock 

- Fitter effort set to 3 

 

The Cyclone II I'm using is about half full, and I'm just about out of ideas.

 

Any advice would be greatly appreciated! 

 

Evan
Altera_Forum

Hi Rysc, 

 

Thanks for the advice, I'll run the fast model analysis as well. 

 

I don't think I can take it off global, as it fans out to over 500 locations. The only thing I can think of is to use a localized clock for my output registers and the global clock for all other logic. I don't know whether this is possible without the synthesizer removing it. I do happen to have another clock input free, so I wonder whether it would make sense to input the system clock twice: once for logic and once for registering the outputs. I could then specify this second clock input as local, which may reduce the clock network delay. I'll give it a go. Thanks!
Altera_Forum

 

--- Quote Start ---  

The only thing I can think of is to use a localized clock for my output registers and the global clock for all other logic.

--- Quote End ---  

 

 

 

I don't ever use this capability, but the "Global Signal" assignment in the Assignment Editor supports point-to-point settings (both -from and -to fields used) so that you can control global usage for individual destinations. It worked for my QII 7.2 test case in point_to_point_global.zip. 

 

In the test case, clk_in goes to 6 registers including 2 output registers in the I/O cell. With my "Global Signal" assignments, the clock had 4 global destinations for the internal registers and 3 nonglobal destinations for the 2 I/O cell registers plus the input to the global buffer. 

 

 

generic_test_case.qsf has this: 

 

 

--- Quote Start ---  

set_instance_assignment -name GLOBAL_SIGNAL "GLOBAL CLOCK" -to clk_in 

set_instance_assignment -name GLOBAL_SIGNAL OFF -from clk_in -to dff_out* 

--- Quote End ---  

 

 

generic_test_case.fit.rpt has this: 

 

 

--- Quote Start ---  

+-----------------------------------------------------------------------------------------...+
; Control Signals ...;
+--------+----------+---------+-------+--------+----------------------+------------------+...+
; Name ; Location ; Fan-Out ; Usage ; Global ; Global Resource Used ; Global Line Name ;...;
+--------+----------+---------+-------+--------+----------------------+------------------+...+
; clk_in ; PIN_H2 ; 3 ; Clock ; no ; -- ; -- ;...;
; clk_in ; PIN_H2 ; 4 ; Clock ; yes ; Global Clock ; GCLK2 ;...;
+--------+----------+---------+-------+--------+----------------------+------------------+...+

+------------------------------------------------------------------------...+
; Global & Other Fast Signals ...;
+--------+----------+---------+----------------------+------------------+...+
; Name ; Location ; Fan-Out ; Global Resource Used ; Global Line Name ;...;
+--------+----------+---------+----------------------+------------------+...+
; clk_in ; PIN_H2 ; 4 ; Global Clock ; GCLK2 ;...;
+--------+----------+---------+----------------------+------------------+...+

+--------------------------------------------------------+
; Non-Global High Fan-Out Signals ;
+----------------------------------------------+---------+
; Name ; Fan-Out ;
+----------------------------------------------+---------+
; clk_in ; 2 ;
. . .
+----------------------------------------------+---------+

. . .

+-----------------+
; Fitter Messages ;
+-----------------+
. . .
Info: Promoted node clk_in (placed in PIN H2 (CLK0, LVDSCLK0p, Input))
Info: Promoted destinations to use location or clock signal Global Clock CLKCTRL_G2
Info: Following destination nodes may be non-global or may not use global or regional clocks
Info: Destination node dff_out1
Info: Destination node dff_out0

--- Quote End ---  

 

 

The on-line help documenting this use of the "Global Signal" assignment: 

 

 

--- Quote Start ---  

Global Signal logic option  

 

-------------------------------------------------------------------------------- 

This option can be set in the Assignment Editor. 

 

 

A logic option that specifies whether the signal should be available throughout the device on the global routing paths. Global signals can be both pin- and logic-driven. Clock, output enable, register control, and memory control signals can be global signals. Turning on this option for a pin or a single-output logic function signal is equivalent to feeding the signal through a GLOBAL buffer. Turning off this option for a particular signal prevents any of the Auto Global options from using the signal as an automatic global signal. 

 

This option can be set in the Assignment Editor. This option is available for all Altera devices supported by the Quartus II software. 

 

You can select one of the following settings for these supported device (Arria GX, Cyclone, Cyclone II, Cyclone III, HardCopy II, MAX II, Stratix, Stratix II, Stratix II GX, Stratix III, and Stratix GX) families. 

 

Priority Level: 1 

Assignment Type/Location: Point-to-point assignment from source to destination (register or memory that is the intended global path). 

Affected Path(s): Includes the path defined by source and destination. 

 

Priority Level: 2 

Assignment Type/Location: Single point assignment to any node. 

Affected Path(s): Includes all fan-outs of the specified node. 

 

 

For these supported device (Arria GX, HardCopy II, Stratix, Stratix II, Stratix II GX, Stratix III, and Stratix GX) families, a setting of on is equivalent to "global clock". Any "fast regional clock" and "dual-fast regional clock" clock assignments assigned to HardCopy II or Stratix II devices are converted to "regional clock" and "dual-regional clock", respectively. 

--- Quote End ---  

Altera_Forum

The project you attached didn't compile (and it looked like it would take a number of changes to sort everything out). I did notice that the data bus you're concerned with is bidirectional, so there should be two Tco requirements: one through the OE register and one through the data register (I'm not positive, but I'm assuming so). The OE path will usually be slower because it takes longer for the driver to "turn around". That said, many interfaces don't need to turn around in one clock cycle, so the data path would need to meet the 7.5 ns period, but the OE path would not. Just another possibility.

But yes, before PLLs, it was relatively common that users chose faster speed grades to achieve IO performance rather than faster internal performance. (CPLD speed grades used to be based entirely on pin-to-pin delays, i.e. the speed grades would be -30, -45, etc., meaning a 30ns Tpd or 45ns Tpd.) So you might reach a point where the faster speed grade is the only solution.
Altera_Forum

In relation to Brad's suggestion:

 

Info: Promoted node sclk_p (placed in PIN H2 (CLK0, LVDSCLK0p, Input))
Info: Promoted destinations to use location or clock signal Global Clock CLKCTRL_G0
Info: Following destination nodes may be non-global or may not use global or regional clocks
Info: Destination node md_s[0]
Info: Destination node md_s[1]
Info: Destination node md_s[2]
Info: Destination node md_s[3]
Info: Destination node md_s[4]
Info: Destination node md_s[5]
Info: Destination node md_s[6]
Info: Destination node md_s[7]
Info: Destination node md_s[8]
Info: Destination node md_s[9]
Info: Non-global destination nodes limited to 10 nodes

 

It seems this approach may not work, as it's limited in the number of nodes. I connected the clock (by wire) to another clock input pin (sclk_p2) and used it to clock the output I/O registers. I tried setting it as a non-global signal, but the delay incurred was worse than for the global clock. So I'm not sure this approach will ever yield a saving, simply because the address bus is quite wide (32) and is spread out over the chip pins.

 

In relation to Rysc's comment, TimeQuest reports that the setup time is violated for md_s -> md_p and oe_s -> md_p. So although I haven't explicitly given two Tco requirements, TimeQuest has split the tri-state output into separate paths. I guess I can ignore the oe_s -> md_p violations since, as you pointed out, the bus is not required to turn around in one cycle.
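To put a number on that, here is a minimal sketch of how the OE path slack changes if the turnaround is allowed more than one clock period. The helper `setup_slack_ps` is hypothetical, and the two-cycle allowance is an assumption that would have to be confirmed against the SDRAM controller's actual bus protocol. In TimeQuest a relaxation like this is normally expressed as a multicycle exception on the OE path instead of simply ignoring the reported violation.

```python
def setup_slack_ps(path_delay_ps: int, period_ps: int, cycles: int = 1) -> int:
    """Slack when the path is allowed `cycles` clock periods to settle."""
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return cycles * period_ps - path_delay_ps


PERIOD_PS = 7_500   # 7.5 ns period used in the thread
OE_PATH_PS = 7_700  # oe_s -> md_p total from the first post, RAM setup included

single_cycle = setup_slack_ps(OE_PATH_PS, PERIOD_PS)           # one period
two_cycle = setup_slack_ps(OE_PATH_PS, PERIOD_PS, cycles=2)    # assumed allowance

print(f"1 cycle: {single_cycle} ps, 2 cycles: {two_cycle} ps")
```

This only covers the oe_s -> md_p path. The md_s -> md_p data path still has to meet the single-cycle requirement, and its delay breakdown is not given in the thread.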

 

P.S. Sorry, I wasn't sure what sort of information you were after, so it was basically just the top level of my project, where the registered data bus I/O is.
Altera_Forum

 

--- Quote Start ---  

Info: Non-global destination nodes limited to 10 nodes 

 

It seems this approach may not work, as it's limited in the number of nodes. I connected the clock (by wire) to another clock input pin (sclk_p2) and used it to clock the output I/O registers. I tried setting it as a non-global signal, but the delay incurred was worse than for the global clock. So I'm not sure this approach will ever yield a saving...

--- Quote End ---  

 

 

 

The second part of what I quoted sounds like you think the mix of global and nonglobal routing won't solve the timing problem, but my suggestion for controlling the routing probably did make it nonglobal for as many destinations as you intended. 

 

I suspected the "limited to 10 nodes" was just a reporting limit. I verified that with my test case and in on-line help. 

 

I changed my test case to 12 output pins (replicated the entire little circuit from input pin to output pin to create 12 copies). I got the "limited to 10 nodes" message. I got 12 nonglobal destinations for the I/O cell registers (13 in the table where the input to the global buffer is also counted) and 24 global destinations for the internal registers. You need to check your Fitter report tables like the ones I showed before to see how many destinations were made nonglobal. 
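For a wide bus, counting the destinations by eye gets tedious. The following is a rough sketch of a script that sums the Fan-Out column of the "Control Signals" table per Global flag. The function name and the column positions are assumptions based on the excerpt quoted earlier in this thread (Name, Location, Fan-Out, Usage, Global); a real .fit.rpt has more columns and the layout may differ between Quartus II versions.

```python
def count_clock_destinations(report_text: str, signal: str) -> dict:
    """Sum Fan-Out per Global flag ("yes"/"no") for one signal name."""
    totals = {"yes": 0, "no": 0}
    for line in report_text.splitlines():
        # Rows look like "; name ; location ; fan-out ; usage ; global ; ..."
        fields = [f.strip() for f in line.strip().strip(";").split(";")]
        if len(fields) < 5 or fields[0] != signal:
            continue  # border, header, other signal, or a different table
        fan_out, is_global = fields[2], fields[4].lower()
        if fan_out.isdigit() and is_global in totals:
            totals[is_global] += int(fan_out)
    return totals


# Rows copied from the Control Signals excerpt quoted in this thread.
SAMPLE = """
; Name ; Location ; Fan-Out ; Usage ; Global ; Global Resource Used ; Global Line Name ;...;
; clk_in ; PIN_H2 ; 3 ; Clock ; no ; -- ; -- ;...;
; clk_in ; PIN_H2 ; 4 ; Clock ; yes ; Global Clock ; GCLK2 ;...;
"""

result = count_clock_destinations(SAMPLE, "clk_in")
print(result)
```

Note that the non-global count includes the input to the global buffer, so the number of I/O cell registers is one less than the "no" total.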

 

 

The on-line help for the "limited to 10 nodes" message (right click messages to get to their help): 

 

 

--- Quote Start ---  

Non-global destination nodes limited to <number> nodes 

 

-------------------------------------------------------------------------------- 

CAUSE: The Fitter reported the nodes which may not have been promoted to be global or use global or regional clock networks in the design. the fitter is limiting the number of reported nodes to the specified amount. Refer to the message(s) that precede this message in the Messages window or in the Messages section of the Report Window for more information.  

ACTION: No action is required. 

--- Quote End ---  

Altera_Forum

Hi Brad, 

 

Yes, you're right, it did in fact route the signal non-globally to the 32 output registers. However, the delay incurred was still considerably worse than for the global clock network.

I think we've pretty much reached the end of the line, would you agree? Though I didn't fix my problem, I've learnt a lot about the internal routing and clocking schemes. Thanks for all the feedback!