
Buffer chain output from DLL in ALTDLL

Altera_Forum

Hi, I need clocks with several different phase shifts. In theory, the output of each buffer stage in the delay chain of the DLL inside the ALTDLL megafunction would provide them. However, ALTDLL exposes only dll_delayctrlout; the individual buffer clock outputs are not available. Is there any way to access them? Thanks.

Altera_Forum

I'm fairly certain the DLL is dedicated silicon for a specific function, and therefore only has one driver output. Why not use a PLL?

Altera_Forum

The DLL has delay elements (buffers) which shift the phase of the clock (as shown in the attached image). The application requires a clock with 8 synchronized phase delays (i.e. 45°, 90°, ..., 360°). I am getting the desired clock frequency from a PLL; however, to get these phase delays, I need to use a DLL.

Yes, it is possible to achieve the same thing with a PLL by generating a high-speed clock and then using digital logic to derive the required number of slower, phase-delayed clocks. I'm trying to avoid this option because a PLL output at that high a frequency is not permissible on the DE4 board.

 

It would be much easier if I could somehow get the buffer outputs instead.
Altera_Forum

Not sure what device you're looking at, but usually those delay taps go directly into a mux and only one is selected. There is no individual driver from each tap into the fabric. For multiple phases, users generally take a PLL, have its outputs drive 0/45/90/135 degrees, and then invert those (which is done for free at the LAB level) to get the other phases. No digital logic is needed, as the PLL has multiple counter outputs.
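As a rough illustration of why four PLL outputs plus inversion cover all eight phases, here is a minimal Python sketch of the arithmetic only. The function name `phases_with_inversion` is hypothetical and not part of any Altera tool; it simply treats inversion as a 180-degree shift, and 360 degrees is the same phase as 0.

```python
def phases_with_inversion(base_phases_deg=(0, 45, 90, 135)):
    """Return every phase reachable from the base outputs and their inversions.

    Inverting a 50% duty-cycle clock is equivalent to a 180-degree phase shift.
    """
    inverted = {(p + 180) % 360 for p in base_phases_deg}
    return sorted(set(base_phases_deg) | inverted)


phases = phases_with_inversion()
print(phases)  # [0, 45, 90, 135, 180, 225, 270, 315]
```

This assumes an ideal 50% duty cycle; any duty-cycle distortion would show up as an error on the inverted phases.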

Altera_Forum

Thanks for the answer! The suggested method will work, except for one problem: the inverters will add a finite delay on top of the 180° phase shift they apply to the 0/45/90/135 clocks. This delay might be a problem for the required high-speed application.

Just for clarification, to implement the 45/90/135 phase delays, I would need to drive three different DLLs from the PLL output, right?

The board I'm using has a Stratix IV device, and the software is Quartus Prime Standard Edition.
Altera_Forum

No, no DLL at all. Just a single PLL with four outputs. 

What speeds? If I were to guess, you plan on capturing your data input with 8 phases of the clock and having some logic determine the right phase? The delay of a NOT gate is negligible. It's not done through a LUT but through a dedicated inversion going into the LAB, so it's probably less than 20ps. There are all sorts of other things that are going to hurt you.

First, the PLL has to drive global clock trees. If you drive four global clock trees, they will naturally have a decent amount of on-die variation, just because they are so large. Next, there are only three clock lines per LAB, so at most you'll get 3 of the 8 clocks into a LAB, which means you'll really be driving three different LABs. There will be sizable delays in your datapath to each LAB, much larger than the NOT gate. Finally, if you can get it all correct with minimal variation, you'll need to lock down the placement and routing. All very difficult.

Is your IO standard LVDS? Could you use the dedicated altlvds silicon which can overclock it at a very high rate, and then once it's parallelized you can do what you want with the logic?
Altera_Forum

There are many things here that I don't understand. 

Yes, the I/O standard is LVDS. The function I'm trying to implement is for sampling. For sampling, I am using the Terasic data conversion board, which gets its sampling clock from the DE4 board through the LVDS clock pins on the HSMC-A connector. Every microsecond, I have to use one of eight phase-delayed clocks (selected with a MUX) for sampling. That's why the eight phase-delayed clocks need to be synchronized. Can you please suggest a method to implement this scheme?

I am not sure the altlvds megafunction will work, as it serializes the data and sends it at a very high frequency, while I need to select a phase-delayed 500MHz clock every microsecond.
Altera_Forum

I'm confused about your timing. What frequency are the 8 clocks running at: 500MHz, phase-shifted 25ps from each other? If that's right, what happens every microsecond? Is a new clock simply selected in case the data edge has shifted?

Altera_Forum

I select one out of eight clocks every microsecond and sample the incoming signal for 200ns. With eight phase shifts of the 500MHz clock, the delay between adjacent clocks is 250ps.
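The 250ps figure follows from dividing one clock period evenly across the phases. A minimal sketch of that calculation, assuming evenly spaced phases (the helper name `phase_spacing_ps` is made up for illustration):

```python
def phase_spacing_ps(clock_hz, num_phases):
    """Time between adjacent phases, in picoseconds, for evenly spaced phases."""
    period_ps = 1e12 / clock_hz  # one clock period in picoseconds
    return period_ps / num_phases


spacing = phase_spacing_ps(500e6, 8)
print(spacing)  # 250.0 (a 2000 ps period split into 8 steps)
```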

Altera_Forum

Sorry, 250ps, not 25ps. Still, that granularity is really fine, and I don't think you will achieve it with generic structures. Let's pretend you were capturing the data with a single clock (which in many ways is much better since it would have minimal variation in delays): it would require a 4GHz clock to get the same number of samples. Instead you're trying to build this with generic FPGA structures, which have a lot more variation.

Your on-die variation on the clock trees will probably be large enough to swamp out 250ps. Add to that the fact that things will not be perfectly laid out (the clock trees are not identical down to the last ps, the data paths to the different LABs will differ, etc.).

The reason I suggest altlvds is that it uses all dedicated silicon and therefore has much tighter control. Do an altlvds_rx with /8 deserialization, and you'll get 8 bits of data. But that won't run at the speeds you want either. At best you might be able to do 1500Mbps sampling (so 3 samples for every 2ns data bit instead of 8 samples). That may not be enough for what you want though.
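The two numbers in this reply (a 4GHz equivalent clock, and 3 samples per 2ns bit at 1500Mbps) can be checked with a short sketch. Both helper names are hypothetical and only do the arithmetic; they say nothing about what the device can actually achieve.

```python
def equivalent_single_clock_hz(clock_hz, num_phases):
    """Single-clock frequency that yields the same number of sample points."""
    return clock_hz * num_phases


def samples_per_bit(sample_rate_hz, bit_period_ps):
    """Number of samples falling inside one data bit of the given duration."""
    return sample_rate_hz * bit_period_ps / 1e12


equivalent_hz = equivalent_single_clock_hz(500_000_000, 8)
per_bit = samples_per_bit(1_500_000_000, 2000)
print(equivalent_hz)  # 4000000000, i.e. 4 GHz
print(per_bit)        # 3.0
```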
Altera_Forum

Hi, the main reason for sampling with clocks of different phases was the sampling rate limit of the ADC. From what I understand, to use altlvds I would need to sample at 1500Mbps, which won't be possible. The best option I had was to have a locked DLL and take the clocks from the cascaded buffers, but it turns out that I can't do that.

 

To increase the duration between the phase-delayed clocks, I can reduce the clock frequency to a minimum of 150MHz and have 8-12 phase-delayed clocks. Thus, I can increase the time gap to 800ps. The processing after I receive the sampled signal is relatively simple; thus, I am not constrained by the speed of input to the DE4 board. Can I implement it with the PLL scheme you had suggested? Please let me know if you find any better options.  

 

Thanks. :)
Altera_Forum

If you go down to 150MHz, then a lot more is possible. I'm guessing you can do it with PLL outputs and it should work. But you do need to monitor the timing on the place-and-route, as it can change compile to compile.  

When you say altlvds at 1500Mbps isn't possible, how come? Note that altlvds has a PLL in it, so you could, for example, drive it with a 150MHz clock; internally that would be multiplied 8x to a 1200Mbps sampling clock, and the data would then be deserialized /8 back to 8 bits coming out at 150MHz. Since it's dedicated silicon, it would all be laid out properly and would not change from compile to compile.
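To make the rate bookkeeping in this example concrete, here is a small sketch of the arithmetic. It is not the altlvds parameter interface; `lvds_rx_rates` and its arguments are hypothetical names used only to show how the reference clock, serial rate, and parallel output rate relate.

```python
def lvds_rx_rates(ref_clock_hz, deser_factor):
    """Return (serial sample rate in bps, parallel word rate in Hz, sample spacing in ps)."""
    serial_bps = ref_clock_hz * deser_factor      # internal PLL multiplies the reference
    parallel_hz = serial_bps // deser_factor      # one word out per deser_factor serial bits
    sample_spacing_ps = 1e12 / serial_bps         # time between consecutive samples
    return serial_bps, parallel_hz, sample_spacing_ps


serial_bps, parallel_hz, sample_spacing_ps = lvds_rx_rates(150_000_000, 8)
print(serial_bps)   # 1200000000, i.e. 1200 Mbps
print(parallel_hz)  # 150000000, i.e. 8-bit words at 150 MHz
print(round(sample_spacing_ps, 1))
```

With these numbers the samples land roughly 833ps apart, which is the same spacing as eight evenly spaced phases of a 150MHz clock.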
Altera_Forum

Oh, okay. I am going to try the altlvds_rx first.  

I am grateful that you spent your time on this discussion. Thanks a lot!