Re: Expected ALTPLL behaviour

Altera_Forum · ‎08-12-2008

Hello all,

I would welcome some clarification on the expected behaviour of an ALTPLL when operating with a single compensated output clock in NORMAL mode. From the Stratix II GX Device Handbook Vol 2 page 7-23 I understand that in NORMAL mode the clock at the input pin of the device and clock at the target data register should be aligned - in other words, any delay on the input clock path (all the way from the input clock pin) is removed.

However, on my current design (8ns period, EP2SGX90EF1152C5N device), I have the following delays in the clock path.

Input pin (AN19) to the PLL input : 4.130ns.

PLL output to a CLKCTRL block input : 1.303ns.

CLKCTRL block output to the target I/O register : 1.524ns.

So a grand total of 6.957ns. A list_path report (setup) for the register of interest is saying that the offset between the PLL input and output is -3.082ns.

I'm totally confused by this figure, and uncertain as to what is actually being compensated for here - any help/clarification would be appreciated.

Declan.

Altera_Forum · ‎08-13-2008

Why is your input pin to PLL 4.130ns? I'm guessing you're not using the PLL next to the dedicated clock pin, which will cause a long route that can't be compensated for. (I think there's a warning amongst the thousand other warnings.) In simplistic terms, a PLL has two inputs, your clock and a feedback clock, and will phase shift its output until the feedback clock aligns with the input clock. Also, the output goes on a balanced, low-skew clock tree, and tries to hit all the destination registers at the same time. On one of those equally timed branches of the tree though, rather than feed a register's clock, it feeds the feedback port of the PLL.

Now, that's the simplistic version. Besides changing the feedback path, I think there's some other stuff they do to help get the other modes (like source-sync mode). But a mode like that won't be too far off from normal mode.

Anyway, when you do a report_timing -detail full_path, you should see good details, but that path from the input pin to a PLL that's far away can't be compensated for, and that's why your offset doesn't equal the full clock delay. (Even when it does feed the right PLL, they don't add up to 0 or anything like that, but they get a lot closer.)

Altera_Forum · ‎08-13-2008

Rysc,

Many thanks for your reply - very much appreciated.

As far as I can see, the correct dedicated PLL input is being used here. The target device is a Stratix II GX: EP2SGX90EF1152. The PLL is created in NORMAL mode, and connected to pin AN19 (clk4p) of the device. Under Fitter/Resource Section/PLL Summary the tool is reporting that the PLL_Location is "PLL12" and the Inclk0 signal type is "Dedicated Pin". There are no relevant warnings/critical warnings in the output**.

From the Stratix II GX device handbook, Vol.2, Page 7-13, Table 7-7, it seems that the chosen clock input is a valid dedicated input for PLL_12. So I'm still at a loss as to why the input does not seem to be properly compensated. A chip planner screenshot (which I will try to attach) suggests the pin and PLL are indeed adjacent, but reports the big delay.

Ultimately I'm also interested in source synchronous operation - in fact what I'm trying to do is a static timing analysis of a memory interface (for which I think source synchronous operation may be a better mode). I see an identical delay from the input pin when I try the same design with a source synchronous PLL - and the compensation figures, though different, do not tie up any better with the expected behaviour from the datasheet.

Given that I'm trying to be precise in analysing the timing uncertainties of the interface concerned, I really need to be sure that the delays I see in Quartus are correct. If the basic mode of operation of this PLL does not tie up with the datasheet (at least to within a believable delta, if not 0, like you say) I'm reluctant to conclude that everything is believable.

Cheers,

D.

**One warning which I DO see is the following:

Warning: Clock latency analysis for PLL offsets is supported for the current device family, but is not enabled

....However, enabling clock latency analysis in the settings (whilst it gets rid of this warning) does not change any of the reported numbers.

Altera_Forum · ‎08-13-2008

Do a report_timing -detail full_path where this clock domain is the source and destination clock. Have it output to a .txt file and then .zip it up and attach. It really doesn't make sense and I just wanted to take a look. I usually do the Chip Editor approach to see if the PLL is next to the dedicated pin, which you've done, so I'm not sure what's going on. (Chip Editor visualization is a lot easier than digging through the handbook tables...)

Altera_Forum · ‎08-13-2008

Rysc,

I'm having some difficulty getting the report_timing command to work (just syntax finger-trouble I suspect, I've not used this command much), but I've used list_path instead to generate a setup report for the DDR I/O registers clocked by the PLL....this breaks down the clock path, and reports the pin to input register delays, and the PLL offsets. Command syntax was......

list_path -npaths 30 -file path_report.txt -from qdr1_q* -to controller:qdr_cont_1\|alt_ddio_in_s2gx:data_split\|altddio_in:altddio_in_component\|ddio_in_3be:auto_generated* -clock_filter feedback_clk_in -tsu

Does this give you the info you wanted to see? If not, I'll keep playing around with report_timing until it works. Here's a snapshot from a single path in case there are any problems with the attachment upload.

----------------------------------------------------------------------------------------------

Path Number: 1

tsu for register "controller:qdr_cont_1|alt_ddio_in_s2gx:data_split|altddio_in:altddio_in_component|ddio_in_3be:auto_generated|dataout_h[26]" (data pin = "qdr1_q[26]", clock pin = "feedback_clk_in") is 1.860 ns

----------------------------------------------------------------------------------------------

+ Longest pin to register delay is 1.477 ns

1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = PIN_V23; Fanout = 2; PIN Node = 'qdr1_q[26]'

2: + IC(0.000 ns) + CELL(1.477 ns) = 1.477 ns; Loc. = IOC_X0_Y30_N1; Fanout = 1; REG Node = 'controller:qdr_cont_1|alt_ddio_in_s2gx:data_split|altddio_in:altddio_in_component|ddio_in_3be:auto_generated|dataout_h[26]'

Total cell delay = 1.477 ns ( 100.00 % )

+ Micro setup delay of destination is 0.122 ns

- Offset between input clock "feedback_clk_in" and output clock "test_pll_3:feedbk_clk_gen|altpll:altpll_component|_clk0" is -3.082 ns

- Shortest clock path from clock "test_pll_3:feedbk_clk_gen|altpll:altpll_component|_clk0" to destination register is 2.821 ns

1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = PLL_12; Fanout = 1; CLK Node = 'test_pll_3:feedbk_clk_gen|altpll:altpll_component|_clk0'

2: + IC(1.303 ns) + CELL(0.000 ns) = 1.303 ns; Loc. = CLKCTRL_G5; Fanout = 204; COMB Node = 'test_pll_3:feedbk_clk_gen|altpll:altpll_component|_clk0~clkctrl'

3: + IC(1.339 ns) + CELL(0.179 ns) = 2.821 ns; Loc. = IOC_X0_Y30_N1; Fanout = 1; REG Node = 'controller:qdr_cont_1|alt_ddio_in_s2gx:data_split|altddio_in:altddio_in_component|ddio_in_3be:auto_generated|dataout_h[26]'

Total cell delay = 0.179 ns ( 6.35 % )

Total interconnect delay = 2.642 ns ( 93.65 % )

----------------------------------------------------------------------------------------------
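
As a cross-check on how to read this report, the signed terms can be recombined to reproduce the reported tsu. This is a minimal sketch in Python using only the numbers from Path 1 above. The variable names are illustrative, and the sign convention (add the lines marked "+", subtract the lines marked "-") is inferred from the report layout rather than taken from Altera documentation.

```python
from decimal import Decimal

# Values copied from Path 1 of the list_path report (all in ns).
pin_to_register = Decimal("1.477")   # "+ Longest pin to register delay"
micro_setup     = Decimal("0.122")   # "+ Micro setup delay of destination"
pll_offset      = Decimal("-3.082")  # "- Offset between input clock ... and output clock"
clock_path      = Decimal("2.821")   # "- Shortest clock path ... to destination register"

# Lines marked "+" are added, lines marked "-" are subtracted.
tsu = pin_to_register + micro_setup - pll_offset - clock_path

print(f"tsu = {tsu} ns")  # agrees with the 1.860 ns in the report header
```

Because the offset is itself negative, subtracting it adds 3.082 ns, which nearly cancels the 2.821 ns clock path from the PLL output to the register.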

Interestingly, the report makes NO mention of the input delay from the clock input pin to the PLL - just reports the clock path from the PLL output to the destination I/O register.

Cheers,

Declan.

Altera_Forum · ‎08-13-2008

I would STRONGLY recommend using TimeQuest. That's what the report_timing command is used for. There's a learning curve, but once you ramp up you're going to like it 100x better. I don't know where you're at, and if you love the Classic Timing Analyzer and have been working for months to make constraints for it, then it might not be worth it, but if you're not too far along, and possibly if you are, I would recommend making the leap.

Is feedback_clk_in the name of the clock port, or are you using an external feedback for the PLL? I don't think so, but wanted to check based on the name. Anyway, your clock delay appears to be 2.821ns and your PLL is doing a -3.082ns shift, so it seems to be pretty close. Again, it won't be 0ns difference, but it's not in the ns range either.
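
To put a number on "pretty close", here is a rough Python sketch that nets the reported PLL offset against two candidate clock paths: the PLL-output-to-register path from the timing report, and the same path plus the 4.130ns pin-to-PLL figure quoted in the first post. The variable names are illustrative only.

```python
from decimal import Decimal

pll_offset = Decimal("-3.082")  # offset from the list_path report (ns)
pll_to_reg = Decimal("2.821")   # PLL output -> CLKCTRL -> register, from the report (ns)
pin_to_pll = Decimal("4.130")   # pin -> PLL input, as quoted in the first post (ns)

# Residual if only the path after the PLL output is being compensated.
residual_after_pll = pll_to_reg + pll_offset

# Residual if the pin-to-PLL delay is counted as part of the clock path too.
residual_from_pin = pin_to_pll + pll_to_reg + pll_offset

print(residual_after_pll)  # -0.261
print(residual_from_pin)   # 3.869
```

The first residual is a fraction of a nanosecond; the second is only that large if the 4.130ns figure really is in the clock path.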

Altera_Forum · ‎08-13-2008

I have been looking at TimeQuest over the last few weeks - mainly because some of the Altera guidelines on constraining source synchronous interfaces are written specifically for it......could yet make the switch.

To answer your question, yes, feedback_clk_in is the name of the clock port. I'm not using external feedback for the PLL, just creating it in NORMAL mode, which should nominally compensate for delays on the input clock path.

But, it seems, the PLL does not compensate for the delay I see between the device input pin and the PLL input port. So I guess my next question is, what is the nature of that delay? I guess there are two reasons for using a PLL on this path. 1. To deskew/remove the input clock delays and reduce uncertainty in the static timing analysis for that part of the interface. 2. To potentially add a phase shift for ease of resynchronisation.

I'm primarily concerned with static timing uncertainties at the moment - I have noticed that the input delay (from pin to PLL input) remains totally fixed at 4.13ns for this path regardless of whether I run "fast corner" or "slow corner" timing - so does it seem reasonable to conclude that there is minimal uncertainty (over PVT) with regards to this delay i.e. I need build no margin into the static timing to account for it? That conclusion makes me a bit nervous, but maybe it's the correct assumption......?

Altera_Forum · ‎08-13-2008

Note that you can't really do a source synchronous interface with TAN. (Though many people have, but you're basically writing down the clock's Tco and then comparing the other Tcos to it.) TimeQuest allows you to constrain data outputs in relationship to clock outputs.

I haven't looked specifically at input port -> PLL input delays over various models. Technically, I think they should change. That being said, the overall clock delay shouldn't change much, as that's one of the main features of a PLL, in that it's PVT invariant, i.e. as your global clock trees vary over PVT, the PLL compensates by the inverse amount, making the total delay relatively constant. Also, if it's the source synchronous side, the clock delay should pretty much cancel out since that delay feeds both the data and clock outputs. I have a document I wrote on constraining source synchronous interfaces, if you're interested (if you have a good handle, it's probably not worth wading through another one...).
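
A toy model may make the compensation argument concrete. The sketch below is in Python and rests on two assumptions that are not from the handbook: the feedback path delay tracks the clock-tree delay exactly, and the corner delays are made-up numbers chosen only for illustration. Under those assumptions the PLL shifts its output by the inverse of the tree delay, so the register clock stays put relative to the PLL input at both corners.

```python
from decimal import Decimal

def register_clock_arrival(tree_delay, feedback_delay):
    """Clock arrival at the register relative to the PLL reference input.

    The PLL shifts its output until the feedback clock lines up with the
    reference, i.e. it applies an offset of -feedback_delay.
    """
    pll_offset = -feedback_delay
    return tree_delay + pll_offset

# Hypothetical clock-tree delays (ns) at two corners; not from any datasheet.
corners = {"fast": Decimal("1.9"), "slow": Decimal("2.8")}

# Assumption: the feedback branch is one of the balanced branches of the
# same tree, so its delay equals the tree delay at every corner.
arrivals = {name: register_clock_arrival(delay, feedback_delay=delay)
            for name, delay in corners.items()}

print(arrivals)
```

In real silicon the feedback path only approximates the tree, so the offset and the clock path would be expected to differ by a small amount rather than cancel exactly.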

Altera_Forum · ‎08-13-2008

If your user-entered phase shift for the PLL is not zero, then even with the Classic Timing Analyzer you can see a better breakdown of the PLL timing numbers in the list-paths details. (I agree with Rysc's recommendation to use TimeQuest.)

When you turned on "Enable Clock Latency" in the "More Timing Settings" dialog box, you said that did "not change any of the reported numbers." You showed list-paths details for tsu. This setting doesn't help the list-paths reports for I/O tsu/th/tco/min tco. For register-to-register paths, however, it will separate the PLL compensation delay from the user-entered phase shift. That makes it easier to make sense of what the PLL is doing. Even if it is only I/O paths that you care about, you can look at a register-to-register path (if there is one) using the same PLL clock output to see the PLL compensation delay listed as a separate number. If you have register-to-register paths only on a different output of the PLL, you can still use one of those paths to see the same compensation delay. Just keep in mind that the line for the user-entered phase shift will be for that other PLL output.

If you want to try TimeQuest quickly for just reporting purposes (not to use during compilation where it would need to be completely set up correctly), then the automatic QSF2SDC conversion might be good enough. I/O timing constraints and some GXB clocks might not convert well, but you'll be able to see how TimeQuest reports the numbers for ordinary PLLs. The constraints needed for correct analysis and reporting of internal register-to-register paths on non-GXB clocks will probably convert fine.

Altera_Forum · ‎08-14-2008

Rysc, Brad,

Many thanks again for your input here -

Rysc.....

I take your point about the PVT invariance of the PLL - it's exactly this behaviour that I want....except that, in this case, there appears to be a portion of the input delay which is not compensated. i.e. rather than align the input PIN of the device with the clock as seen at the destination register, "NORMAL" mode appears to align the input PORT of the PLL with the clock as seen at the destination register. So I really need to understand the nature of the 4.13ns pin --> PLL input delay. Since this is nominally dedicated PLL routing, perhaps the uncertainty is negligible. Certainly, when comparing "fast corner" and "slow corner" timing, it's the only figure which doesn't change.

I'd be very interested to read your document on source synchronous constraints, if you're happy to share it.

Brad.....

Good idea to analyse the register-to-register paths. Unfortunately, there aren't any - the PLL clocks some DDR input registers, and the first register-to-register path is cross-domain, back into an independent system clock. That said, the phase shift I'm using (for the purposes of trying to understand the PLL behaviour better) is zero, which should make the numbers straightforward to interpret.

Ultimately, I think it boils down to whether or not you believe (or rather how you interpret) the figure I've attached from the device handbook. This shows the "PLL Reference Clock at the Input Pin" aligned with the "PLL Clock at the Register Clock Port". To me, that implies compensation of the COMPLETE clock path, from the input pin, to the target register - but I'm not seeing that reflected in the chip planner or in timing analysis figures.

Cheers.

Altera_Forum · ‎08-14-2008

Where do you get 4.13ns from the path listed above? There should not be 4.13ns from the dedicated clock input port to the PLL input.

Also, as a hint, make sure you're looking at the big picture which, for I/O timing, is all based on clock ports and data ports. The delay from the clock port to the PLL input doesn't really matter by itself, only as a component of your clock port's relationship to your data ports. What are you seeing for I/O timing across models? What is it you're trying to do with this number? In fact, the PLL compensation modes don't really matter without more info. If the input clock is perfectly aligned to the register clock, that doesn't matter if the delay to/from that internal register to the I/O port is 10ns. It's best to think of it at that system level (which you might be doing and have broken it down to the micro-parameter level), but I'm not sure what you're trying to do at the system level.

Altera_Forum · ‎08-14-2008

I don't see 4.13ns reported directly in the timing analyzer - but when I open chip planner, and look at the connection from the input pin to the PLL, the net is tagged with the 4.13ns delay. The other delays in chip planner tie up fine with the timing analysis reports.

I'm not concerned about absolute delays as such, just surprised, like you, about how big this one is considering the dedicated routing involved - makes me wonder if I'm doing something silly, or missing something obvious.

As for the wider context, I'm basically trying to analyse all the timing uncertainties associated with a QDRII memory interface. The transmit side is classically source synchronous - data sent to the memory with a clock phase shifted by 90deg. The receive side uses a feedback clock (FPGA output--> PCB -->FPGA input) to clock the read data from the memory into FPGA DDR input registers. The PLL in question is in the receive clock path, nominally to deskew the clock (an attempt to reduce timing uncertainties; works for Xilinx devices) and potentially to add a phase shift to improve setup/hold margins for resynchronisation.
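
For reference, the 90deg transmit shift can be converted to time with a one-line calculation. This Python sketch assumes the 8ns clock period mentioned at the start of the thread also applies to the transmit clock; the helper name is made up for illustration.

```python
from fractions import Fraction

def phase_to_ns(period_ns, phase_deg):
    """Convert a phase shift in degrees to a delay in ns for a given period."""
    return Fraction(period_ns) * Fraction(phase_deg, 360)

shift_ns = phase_to_ns(8, 90)
print(float(shift_ns))  # 2.0
```

With DDR data changing every half period (4ns under this assumption), a 2ns shift places the clock edge nominally in the middle of each bit.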

Altera_Forum · ‎08-14-2008

Ahhh. I would ignore it, or file a Service Request. If it's not in the timing report, then the Chip Editor must be doing something wrong, or displaying something different than what you expect. The timing report is what really should be used.

You should get the same type of analysis as you get in Xilinx. Note that for source-synchronous you're actually going to miss some uncertainty. For example, at the slow corner you get a worst case data and clock Tco. But on a given piece of silicon, one of the paths might be "less worst case" than the other. You won't see that in SII, and didn't see it in Xilinx. (It should be small, but it's real).

For all 65nm families and beyond (SIII, CIII, etc.), and using TimeQuest, the models have this type of information. So at a corner, say the slow corner, the paths have a fast On-Die Variation and a slow On-Die Variation value. So for a source synchronous output, TQ will use the slow version for the data and the fast version for the clock going off chip, and vice versa for the external hold analysis. Also, rise/fall times and unateness come into effect, as well as common clock path pessimism removal. This doesn't help you with your current design, but is a plug for things in the newer family models that TimeQuest takes advantage of.
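
The pairing described above can be sketched with a few numbers. This is Python, and every delay value is hypothetical (chosen for illustration, not taken from any device model); the point is only which variant is paired with which for each check.

```python
from decimal import Decimal

# Hypothetical Tco values (ns) at the slow corner, with on-die variation.
data_tco  = {"slow_odv": Decimal("3.10"), "fast_odv": Decimal("2.95")}
clock_tco = {"slow_odv": Decimal("3.05"), "fast_odv": Decimal("2.90")}

# Without on-die variation, data and clock both use the same (slow) value.
skew_no_odv = data_tco["slow_odv"] - clock_tco["slow_odv"]

# Setup check: latest data against earliest clock.
skew_setup = data_tco["slow_odv"] - clock_tco["fast_odv"]

# Hold check: earliest data against latest clock.
skew_hold = data_tco["fast_odv"] - clock_tco["slow_odv"]

print(skew_no_odv, skew_setup, skew_hold)
```

With these made-up values the single-number analysis sees 0.05ns of data-to-clock skew, while the on-die variation pairing widens that to a range from -0.10ns to 0.20ns. That spread is the uncertainty that a model without on-die variation would miss.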