how to assign clock on a dedicated path

Altera_Forum
Honored Contributor II

Hi all, 

 

I am seeing a clock skew of +3 ns on the input-to-register path and a skew of -3 ns on the register-to-output path.

The answer I got on the Altera forum was simply to route the clock through a dedicated pin, onto what is commonly known as a global clock buffer and global clock net.

Now the question is: how can I identify this particular net and pin on the family I am using? I checked the data sheet for Stratix V and it doesn't have any pin named "global clock". It does have clkp and clkn pins [27:0], but if I assign clk to these pins the results are unchanged. One more thing: the clk pin was originally assigned by Quartus itself, which placed it on 11clkp. So the question about this global clock remains unanswered, and I have limited knowledge about it.

I am attaching the report illustrating the clock skew and the clock path. 

 

Some notes on the reports: with the positive clock skew of around 3 ns the slack was positive, around 0.6 ns, and with the negative clock skew the slack was negative, around -4.3 ns.

I am working at 225 MHz, i.e. a 4.444 ns clock period.

Any comments would be helpful.

Altera_Forum

I just replied to your earlier post about why clock skew is normal. One thing from your first screenshot: CLKCTRL_G4 means the clock is on a global, the 4th one. Also, it's driving an M20K, so if this is an input path, the timing won't be good. (An input pin to an M20K is going to have a long data delay.)

For outputs, the launch clock path in the FPGA will hurt setup time, as it takes longer to get your data out, but helps hold time (for the same reason). For inputs, the long clock delay helps setup, since it latches the data later in time, but hurts hold. Also, I would strongly recommend a PLL at those speeds. 225 MHz for I/O timing is pretty fast. Doable, but fast.
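
The input-side half of this trade-off can be sketched numerically. The following is a minimal Python model, not what TimeQuest actually computes: it assumes one delay value for both the min and max cases, ignores clock uncertainty, and uses made-up numbers for everything except the 225 MHz clock and the roughly 4.6 ns input-to-register data delay reported in this thread. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InputPath:
    """Simplified input-port timing model (all values in ns, hypothetical)."""
    period_ns: float        # clock period
    input_delay_ns: float   # external launch edge -> FPGA data pin (assumed)
    data_delay_ns: float    # FPGA data pin -> capturing element
    tsu_ns: float = 0.0     # setup requirement of the capturing element
    th_ns: float = 0.0      # hold requirement of the capturing element

    def setup_slack(self, clock_delay_ns: float) -> float:
        # Data launched at t=0 must arrive before the NEXT edge reaches the register.
        arrival = self.input_delay_ns + self.data_delay_ns
        required = self.period_ns + clock_delay_ns - self.tsu_ns
        return required - arrival

    def hold_slack(self, clock_delay_ns: float) -> float:
        # Data must not change until the SAME edge has reached the register.
        arrival = self.input_delay_ns + self.data_delay_ns
        required = clock_delay_ns + self.th_ns
        return arrival - required


# 225 MHz clock; 1.0 ns external delay is an assumption for illustration only.
path = InputPath(period_ns=1000.0 / 225.0, input_delay_ns=1.0, data_delay_ns=4.6)

for clock_delay in (0.0, 3.0):
    print(f"clock delay {clock_delay:.1f} ns: "
          f"setup slack {path.setup_slack(clock_delay):+.3f} ns, "
          f"hold slack {path.hold_slack(clock_delay):+.3f} ns")
```

In this model every extra nanosecond of clock delay adds a nanosecond of setup slack and removes a nanosecond of hold slack, which is the behaviour described above for inputs.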

Altera_Forum

Yes RYSC, I followed your reply, and you are perfectly right in what you said in this thread. I only knew the analysis results, not the reasons behind them. You mentioned the M20K; I will look up this RAM in the Altera documentation, but I would like to hear a few words from you about it.

Is there a way to keep the clock from driving this RAM so that the delay is smaller? And yes, I will use a PLL in order to get the timing right.

Altera_Forum

You only showed the clock going to the M20K (the data required path). If the data arrival path is from an input port, then it will have a long data delay, because the M20Ks are inside the die, not in the I/O cell like a register is. Generally that makes timing more difficult. You would most likely want to have a register on your inputs and outputs.

Altera_Forum

OK, you mean my inputs and outputs should be registered. One thing I forgot to mention: this is a module-level analysis, not the integrated system. It's a single block, and I am analyzing just that.

You earlier guided me to make the outputs virtual in order to remove the output buffers.

The output timing was solved with some constraints and by removing the buffers.

The data delay from input to register is around 4.6 ns, of which the IC (interconnect) delay is around 3 ns on every input-to-register path.

If I remove the input buffers, the IC delays increase and make the data delays even worse.

The reason might be that at module level the module is scattered all over the FPGA, which leads to large IC delays.

Can I constrain the IC delays? (I know I am deviating from the thread topic, but I also know I will get the answer from you in the next thread :)).

Altera_Forum

And this path was determined by the tool; I haven't guided the tool to use an M20K. Is there an alternative to this? (I am a newbie.)

Altera_Forum

Nothing wrong with an M20K on an internal path.  

Most users don't spend too much time on I/O analysis when looking at a sub-module. Note that your clock skew is going to disappear when the module is hooked into the full design, because the logic on the other side will probably be fed by the same clock, so your constraints would have to account for this. There are just a lot of things that can't fully be taken into account at this stage, so most users work on everything but the I/O of the sub-module, and then, once it's hooked together, they see what doesn't work. (They don't completely ignore the I/O boundaries of a sub-module: register wherever you can so that it won't have long delays to/from the next module, or at least keep the logic as short as possible.) But I think you're spending too much time on that, when you'll be able to analyze it much better later on. If everything within each module meets timing with some margin, the interfaces will most likely (hopefully) not be too difficult.

Altera_Forum

Oh, thank you, RYSC, for your guidance. You have been a great help to me, and yes, I will keep following your comments. :)

Altera_Forum

Hi RYSC,

--- Quote Start ---

I would strongly recommend a PLL at those speeds. 225 MHz for I/O timing is pretty fast

--- Quote End ---

You recommended using a PLL for high-frequency operation. Can you explain how it makes a difference to skew or slack if we use a PLL instead of a direct clock?

Thank you

Altera_Forum

When the clock drives a clock tree, the main advantage is that it is low-skew, and hence hits all the registers at approximately the same time. The actual delay, though, can be quite large, say roughly 3-4 ns (device and speed grade dependent). For an output I/O, a long clock tree hurts the setup analysis (think Tco) and helps the hold analysis (think min Tco). For an input it does the opposite, helping the setup analysis (think Tsu) and hurting the hold analysis. So where you want the clock to be depends on your I/O specs. If you have tight Tco requirements, then you want the clock path to be as short as possible. If you have tight Tsu requirements on the inputs, then you want it to be longer. Hold works in the opposite direction.
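
For the output side, the same reasoning can be written as a small Python sketch. It treats Tco as the clock delay plus a fixed register-to-pin delay and compares it against a hypothetical Tco window. All numbers and function names here are illustrative assumptions, not values from this design.

```python
def tco_ns(clock_delay_ns: float, reg_to_pin_ns: float) -> float:
    """Clock-to-output seen at the pin: clock path plus register-to-pin delay."""
    return clock_delay_ns + reg_to_pin_ns


def output_margins(clock_delay_ns, reg_to_pin_ns, tco_max_ns, tco_min_ns):
    """Return (setup_margin, hold_margin) against an assumed Tco window."""
    tco = tco_ns(clock_delay_ns, reg_to_pin_ns)
    setup_margin = tco_max_ns - tco   # positive when data is out early enough
    hold_margin = tco - tco_min_ns    # positive when data does not change too soon
    return setup_margin, hold_margin


# Hypothetical spec: data must appear between 1.0 ns and 5.0 ns after the clock edge.
REG_TO_PIN_NS = 2.0
TCO_MAX_NS, TCO_MIN_NS = 5.0, 1.0

long_tree = output_margins(3.5, REG_TO_PIN_NS, TCO_MAX_NS, TCO_MIN_NS)
short_tree = output_margins(0.5, REG_TO_PIN_NS, TCO_MAX_NS, TCO_MIN_NS)

print("long clock path  (setup, hold):", long_tree)
print("short clock path (setup, hold):", short_tree)
```

With these assumed numbers the long clock path fails the max Tco check while leaving plenty of hold margin, and the short clock path passes both, which is why a tight Tco requirement favours a short clock path.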

Just adding a PLL will compensate for the clock tree delay. This makes the effective clock path shorter. By itself, this can be a good thing or a bad thing, depending on the situation (although more often than not it's a good thing).

The PLL also lets you phase-shift the clock. So on top of shortening it, you can phase-shift it wherever you want. Let's say you really wanted it long: you could just phase-shift the clock forward by 3 ns and get back to where you started.

Finally, and this is important, the PLL makes the clock tree process, voltage, and temperature (PVT) compensated. Let's say you start with a 4 ns global clock tree, and that is a good value for your design. That would be 4 ns in the slow corner, but it might be 2 ns in the fast timing corner. So your clock tree varies between 2 and 4 ns over PVT. That's 2 ns of margin that is completely lost, because you can't compensate for it.

Now add a PLL. Your clock delay might now drop to 0.5 ns (I'm not going into how a PLL does that). Plus, it's PVT compensated, so it will stay at 0.5 ns. Let's say you really wanted the original 4 ns delay, so you manually add another 3.5 ns shift to your PLL, getting you back to 4 ns. With the PLL, it will be 4 ns at both the slow and fast corners, so you don't lose 2 ns of margin to PVT.
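
The arithmetic in this example can be checked with a short Python sketch. It takes the figures above at face value (4 ns slow corner, 2 ns fast corner, 0.5 ns compensated delay) and assumes ideal PVT compensation. The conversion of the 3.5 ns shift into degrees at 225 MHz is added only for illustration, and the function names are hypothetical.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in ns for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def pvt_window_ns(slow_corner_ns: float, fast_corner_ns: float) -> float:
    """Spread of the clock delay between the slow and fast timing corners."""
    return slow_corner_ns - fast_corner_ns


def phase_shift_degrees(shift_ns: float, freq_mhz: float) -> float:
    """Express a time shift as a phase angle of the given clock."""
    return 360.0 * shift_ns / period_ns(freq_mhz)


# Without a PLL: the global clock tree delay tracks PVT (figures from the post).
no_pll_window = pvt_window_ns(slow_corner_ns=4.0, fast_corner_ns=2.0)

# With a PLL: assumed ideal compensation, so both corners see the same delay.
COMPENSATED_DELAY_NS = 0.5
MANUAL_SHIFT_NS = 3.5
pll_slow = COMPENSATED_DELAY_NS + MANUAL_SHIFT_NS
pll_fast = COMPENSATED_DELAY_NS + MANUAL_SHIFT_NS
pll_window = pvt_window_ns(pll_slow, pll_fast)

shift_deg = phase_shift_degrees(MANUAL_SHIFT_NS, 225.0)

print(f"period at 225 MHz:       {period_ns(225.0):.3f} ns")
print(f"margin lost without PLL: {no_pll_window:.1f} ns")
print(f"margin lost with PLL:    {pll_window:.1f} ns")
print(f"3.5 ns shift at 225 MHz: {shift_deg:.1f} degrees")
```

At 225 MHz the 2 ns uncompensated window is close to half of the 4.444 ns period, which gives a sense of how much margin PVT compensation can recover.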

Altera_Forum

OK, now I get it. Using a PLL gives me control over the skew and over how much I want the clock shifted. Of course, that 0.5 ns delay is inherent and not under my control, but I can phase-shift my clock according to the requirements of the inputs and outputs.

Thanks :)