Re: Speed grades and internal timings

Altera_Forum · ‎09-25-2008

Hi,

This is not so much a problem I'm having as something I would like to understand better.

I am currently using a Stratix II GX I-Grade -4 FPGA for a project. I am using ModelSim to carry out timing simulations on my design and am wondering how the .vho file that Quartus produces specifies the timing delays in the FPGA.

For example, if I have a combinational block of logic in the design, does the .vho file always specify the slowest possible time it could take for a signal to propagate through for a given speed grade?

For example in my -4 (mid speed grade device), would the results of the timing simulation specify a delay on an FPGA which is almost a -5 i.e. a slow -4, or a typical -4 which is centered on the -4 distribution? Or is there a setting somewhere to specify this in Quartus?

Likewise, if a device is almost a -3, but let's say there is one parameter that downbins it to a -4, would Quartus produce the slowest possible timings in the .vho file?

Also just one further question:

Does the -3, -4 or -5 have any real meaning, or is it just an arbitrary number assigned to distinguish the different speed grades? I read somewhere that, for example, -3 would mean a 3ns delay from point x to point y, but I'm not sure. Perhaps it has a meaning to the design engineers regarding internal timings, but to the average user it's just a number that doesn't need to be understood. Could somebody please clarify this?

Thank you very much for your time

Ardni

Altera_Forum · ‎09-25-2008

The timing delays are written to the accompanying .sdf files (Standard Delay Format, I believe), and I think you'll get a slow model, which is based on your selected speed grade, and a fast model.

The slow model is the worst case analysis for every path in the design. The fastest is, well, the fastest. There is only one fast model, since FPGAs are not binned against this, and therefore any speed grade device could potentially be "fast". Note that these models are the same as the timing models used during static timing analysis.

The numbers used to mean something with CPLDs: specifically, they represented the Tpd between any two pins, so parts would be marked -100, -70, -50, or for newer stuff -15, -10 and -7. For the FPGAs they don't mean anything except relative to each other (smaller is faster). Of course, there is no standard, as X flipped their numbering a while back, and for them the higher number is faster.

Finally, just as an FYI, I see fewer and fewer people do timing simulations. Instead they do RTL simulations (much faster) and static timing analysis (more accurate from a timing perspective). That doesn't mean things won't be caught in a timing simulation, but the more synchronous design practices you follow, the less likely that is.

Altera_Forum · ‎09-26-2008

Thank you for your reply Rysc.

When using ModelSim for the timing simulation I include the .sdo file. But does this file contain both the fast and slow models? Or how can I switch between them? I assumed that by default I would be simulating the worst case (slowest).

Generally it would be more useful to do a simulation using the slow model, I am thinking, but what would be the advantages of simulating the fastest model? To investigate hold times etc.? But I suppose those are presented in the static timing analysis?

I did read about quite a few people saying that they haven't been doing timing simulations recently, and to be honest it takes so much time to do them properly that I'm kinda glad to hear that it may not be totally necessary. I have a workmate who swears by them, so I have also been doing them. I have spotted several things in the timing simulations that I didn't see in a functional simulation or in the static timing analysis. Things like signals going undefined when I change them from within a state machine. Sometimes when I alter a signal in one state it does not work (i.e. goes undefined), but if I alter the code to change the signal in a different state, it works. Anyway, I don't understand it well, but it's something I had been meaning to ask here, though perhaps it's for a different thread. Perhaps it's something to do with my coding style.

Do you have any links to the 'synchronous design practices' that you mentioned in your last post? I'd be interested to have a read.

Also, I am still using the Classic Timing Analyzer in Quartus, but for my next project I plan on switching to TimeQuest. Perhaps when you mentioned that it's possible to get away without doing timing simulations you had in mind that I would be using TimeQuest with the entire design constrained? Unfortunately this is not the case, but from reading quite a bit on this forum recently, I see almost everyone is using TimeQuest, so hopefully in the near future I can make the change too.

Thanks for your help

Altera_Forum · ‎09-26-2008

Altera have a document on coding style:

http://www.altera.com/literature/hb/qts/qts_qii51007.pdf?gsa_pos=1&wt.oss_r=1&wt.oss=coding%20style

Most of the FPGA manufacturers seem to produce a similar document and on the whole they pretty much agree.

You can select between best and worst case timing models in the simulator GUI. It's a while since I've done it myself but if memory serves me correctly then the simulators tend to give you a choice between min, typ and max timing constraints - for Altera min and max are the same (worst case) and typ is obviously typical. Perhaps somebody could correct me if this is wrong or out of date information.

(I usually find it takes too long and isn't necessary unless you find a problem - just do a really good RTL simulation with a comprehensive testbench and make sure that you apply timing constraints to the place and route - let Quartus do the work for you)

Altera_Forum · ‎09-26-2008

There is no typical, just min and max, which are two separate models (and significantly different). For coding styles, just try to minimize clock domains, don't gate clocks, be careful when transferring data between clock domains, and use the asynchronous set/reset for domain resetting only (i.e. don't use it like logic). I know that's brief, but what can be done depends on your system and requirements. (I've worked on some systems with ~100 clock domains, huge clock muxes in logic and all sorts of yucky stuff. They had to do all this for what the system did, and they did everything through RTL sims and timing analysis, so it's definitely possible.)

I would double check what your timing sim showed you compared to the RTL sim (and/or static timing analysis). Note that nothing goes "unknown" in a real device, besides metastability events, which would show up in static timing analysis. So it's worth investigating what your timing sim is showing you and figuring out what it means in hardware.

Altera_Forum · ‎09-30-2008

Hi again, many thanks for the responses.

Just one or two final things I'm not 100% on:

How can I switch between the max and min models in a timing simulation in ModelSim? I've been looking at it for some time now, but I don't see where I can choose.

Also, what are the advantages of simulating the max and min models? Is it ever necessary to simulate with the min model? I know people have put forward the argument that timing simulations are not necessary, and it's something I hope not to do in my next project, but for the moment I'm curious to understand a little more about them.

Finally, when Rysc mentioned 'don't gate clocks' and 'be careful when transferring data between clock domains', could someone (or Rysc if you are reading) please expand just a little? I know it's probably for another thread, but I'm aware that I have probably done some things that maybe I shouldn't as regards my coding style. I am gating 1 clock in my design and also in many cases I am passing signals between clock domains.

Why should clocks not be gated? And what is the alternative?

What precautions should be taken when passing signals between clock domains? Is it enough to pass them through 2 flip flops?

I really appreciate the help you guys give, so thanks again.

Altera_Forum · ‎09-30-2008

Just to clarify, in my previous post I was talking about simulating in ModelSim (I've never used the Quartus simulator). Reading the bit of the Quartus manual which describes simulation in ModelSim:

--- Quote Start ---

You do not have to choose from the Delay list because the Quartus II EDA Netlist Writer generates the SDO using the same value for the triplet (minimum, typical, and maximum timing values). The value is derived from either the fast (minimum) timing model or worst case (maximum) timing model, depending on which timing model was used in the last timing analysis. In the standard compilation flow, the Quartus II software writes the SDO using timing values from the worst case (maximum) timing model.

--- Quote End ---

Like I said, it's been a while since I've done this (I've usually found it to be not necessary), but if you're simulating in ModelSim then you should ensure that you select the timing model you want when you run the compilation in Quartus (back when I did this last you could still select max, min or typ in ModelSim).

If you gate a clock then you can muck up the timing analysis because you're adding in a delay on the clock reaching some registers relative to others.

Instead of using a gated clock use a clock enabled register:

process(clk)
begin
    if rising_edge(clk) then
        if clk_enable = '1' then
            -- put your code in here
        end if;
    end if;
end process;

There's some discussion of this in the Altera coding standard document.

How you pass signals between clock domains depends on the application. If you've got one signal, or several signals which are unrelated, then you can simply do what you described with one or two synchronising registers. If, for example, you're passing a data bus across clock domains then you need something a bit more clever to ensure that the data doesn't get corrupted (i.e. you don't want some of the bits appearing on the opposite side a clock cycle earlier or later than the others, otherwise you'll end up with corrupted data). In this case you would need to do something clever with handshaking, like a FIFO sort of structure. This takes several clock cycles to get the data across but prevents corruption - e.g. freeze the data in the old domain and synchronise the "frozen" signal in the new domain; latch the data into the new domain and acknowledge this back to the old domain; synchronise the acknowledgement in the old domain and then unfreeze the data.

This is obviously more complex so you need to think about what you want to do, but there's probably a MegaWizard thing to help you do it.
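For the simple single-bit case mentioned above, a two-register synchroniser might look something like the sketch below. The entity and signal names (sync_2ff, clk_dst, async_in, sync_out) are made up for illustration and are not from any Altera document, and this only suits a single signal where you don't care which cycle it arrives on. It is not suitable for a multi-bit bus.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-flop synchroniser for ONE bit crossing into clk_dst.
entity sync_2ff is
    port (
        clk_dst  : in  std_logic;  -- clock of the receiving domain
        async_in : in  std_logic;  -- signal generated in the other domain
        sync_out : out std_logic   -- version that is safe to use in clk_dst logic
    );
end entity sync_2ff;

architecture rtl of sync_2ff is
    signal meta_reg : std_logic := '0';  -- first stage, may go metastable
    signal sync_reg : std_logic := '0';  -- second stage, gives it a cycle to settle
begin
    process (clk_dst)
    begin
        if rising_edge(clk_dst) then
            meta_reg <= async_in;  -- sample the foreign signal
            sync_reg <= meta_reg;  -- re-register before anything uses it
        end if;
    end process;

    sync_out <= sync_reg;
end architecture rtl;
```

Only sync_out should feed logic in the receiving domain; nothing else should read meta_reg.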

Good luck

Altera_Forum · ‎09-30-2008

I think timing simulations are most valuable for non-synchronous design. Everything that's synchronous should behave in the timing simulation in an identical manner to the RTL simulation.

One thing people often don't realize is that timing simulations don't fail when you miss an edge. For example, let's say you have data that you expect to be latched into a register 10ns after the data is launched, but for some reason that data is delayed by 2ns, so it actually gets there at 12ns. Your timing simulation will not issue any sort of warning. Instead, your data just gets latched in a clock cycle later. Only if your testbench is able to recognize this as an error will you see a problem, and it will probably show up as a functional error until you trace it back to the data getting there too late.

If you go to a fast timing model, you'll now simulate the other extreme, where the data gets there much faster and is captured by that clock. So now you have two simulations, one with the data getting there on time and one with it getting a cycle late.

But now let's say this is a bus, and the system doesn't really care what cycle it gets there on, as long as it all gets there together. Neither simulation would show an error, since one would have all the data arrive early, and the other shows it all arriving late, while in hardware you could have some of both, which is an error. This is the type of thing static timing analysis catches that timing sims often do not.

The one thing timing sims do catch is a violation of the uTsu and uTh of the register. So if a register captures at time 10ns and the data changes in that small uTsu/uTh window, the register goes unknown and you get a direct warning (I think). But this window is very small, and it's usually more a matter of luck that a timing simulation catches this than the result of a rigorous testbench.

That being said, I'm working on some phase-aligning circuitry that muxes four phases of the clock together and will actually extend or shorten the capture clock so it matches the incoming data stream. It's a really complicated design from a static timing analysis perspective, and this is the type of thing where a timing simulation might show how it's supposed to work better than an RTL simulation (it's not my design).

As for gating clocks, use a clock enable or use the megafunction altclkctrl (which has a nice synchronous enable for turning on/off the clock) when possible. For going between clock domains, batfink nicely covered that. At the end of the day, the simplest solution is to a) make sure the clocks are synchronized (outputs of the same PLL), or b) use a DCFIFO, which treats them as completely asynchronous. Other solutions take some understanding of what can happen. As batfink said, if it's a single signal crossing domains and you don't care what cycle it arrives on, then just re-registering it a few times should be sufficient.
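To make the uTsu/uTh window concrete, here is a rough simulation-only sketch of a register that drives 'X' and reports a warning when its data input changes too close to the clock edge. It is not taken from the Altera simulation libraries; the entity name dff_check and the generic values T_SU and T_H are invented for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical behavioural register with setup/hold checks (not synthesisable).
entity dff_check is
    generic (
        T_SU : time := 100 ps;  -- assumed setup time, illustrative value only
        T_H  : time := 50 ps    -- assumed hold time, illustrative value only
    );
    port (
        clk : in  std_logic;
        d   : in  std_logic;
        q   : out std_logic
    );
end entity dff_check;

architecture sim of dff_check is
begin
    process (clk, d)
    begin
        if rising_edge(clk) then
            -- Setup check: d must not have changed within T_SU before the edge.
            if d'last_event < T_SU then
                report "setup violation on d" severity warning;
                q <= 'X';
            else
                q <= d;
            end if;
        elsif d'event and clk = '1' and clk'last_event < T_H then
            -- Hold check: d changed less than T_H after the rising edge.
            report "hold violation on d" severity warning;
            q <= 'X';
        end if;
    end process;
end architecture sim;
```

The point is only that the check fires when a data edge happens to land inside the window, which is why a testbench can easily miss it.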

Altera_Forum · ‎09-30-2008

Rysc makes some good points there and the distinction between asynchronous designs and synchronous ones is important - I've not bothered with timing simulations because all of my designs in recent years have been deliberately synchronous. If you're doing asynchronous design, Ardni, then pay close attention to what Rysc is saying - his posts are really on the ball here!

Altera_Forum · ‎10-02-2008

Many thanks to both of you for the responses.

Just one more thing:

Perhaps I have misinterpreted the meaning of 'gated clock', but would using the altclkctrl megafunction be considered clock gating?

I am using this megafunction in my current design to 'gate' a clock used to program a clock generator. The reason is to disable the clock once the generator is programmed after reset, to save power. I saw in the timing simulation that the output of this buffer was skewed. I also received a warning in Quartus saying:

"found 1 node in clock paths which may be acting as ripple and / or gated clocks --nodes analyzed as buffer(s) resulting in clock skew. "

Is it possible to use this megafunction and not receive this warning? In my design, I am ignoring this warning (which I assume is safe to do) as it is only used to program the generator before being disabled.

My current design is synchronous, but thank you for the info on the DCFIFO; I hadn't been aware of it before. After having a read in the handbook it seems as though it could be quite useful in some situations. My next project is to do some work on an old design which controls a VME Bus. From my initial research it looks horrible with the asynchronous aspect, but certainly the information given here should help me out in the future with it.

Thanks again.

Altera_Forum · ‎10-02-2008

Looking at the help for this function:

--- Quote Start ---

Notes: When the altclkctrl megafunction is used to select between two or more clocks, the Timing Analyzer issues a gated clock warning message. The Quartus II software takes into account the additional clock skew during timing analysis.

--- Quote End ---

...technically I guess it is a gated clock, but done in such a way that Quartus can correct for it. There may be some specialised silicon to perform this function, which is well characterised. I think that the problem with gating a clock with any old gate is the predictability of that gate's propagation delay and the routing delay associated with it. With this megafunction, those delays may be tied down much more tightly.

I'm guessing as I haven't used it before, but you could raise a question on the Altera Mysupport. Not sure if Rysc could offer any pearls of wisdom.

Altera_Forum · ‎10-06-2008

--- Quote Start ---

Perhaps I have misinterpreted the meaning of 'gated clock', but would using the altclkctrl megafunction be considered clock gating?

I am using this megafunction in my current design to 'gate' a clock used to program a clock generator. The reason is to disable the clock once the generator is programmed after reset, to save power. I saw in the timing simulation that the output of this buffer was skewed. I also received a warning in Quartus saying:

"found 1 node in clock paths which may be acting as ripple and / or gated clocks --nodes analyzed as buffer(s) resulting in clock skew. "

Is it possible to use this megafunction and not receive this warning? In my design, I am ignoring this warning (which I assume is safe to do) as it is only used to program the generator before being disabled.

--- Quote End ---

While clock control blocks for on-off gating are much preferred over doing the gating in logic resources, some of the issues discussed at http://www.alteraforum.com/forum/showthread.php?t=2388 will still apply. You can eliminate the Classic Timing Analyzer warning by assigning a derived-clock setting to the output of the clock control block, but that does not eliminate those considerations.