Altera_Forum
Honored Contributor I
745 Views

Wit's end: Cyclone III PLLs lose lock when more than one FPGA connected?

Please forgive me, I'm at wit's end here. Any suggestions welcome. 

 

I've got a very dense Cyclone III design (~90% utilization) with a very large number of DDR I/Os (>250 pins), using all four PLLs (in order to spread out the power consumption by phase-shifting unrelated logic). 

 

Surprisingly, it works great. No errors, no bit-flips, no PLL lock lost. 

 

The Cyclone and the peripherals it talks to are on a card; eight of these cards plug into a backplane which supplies power, a 10 MHz base clock, and JTAG. I can use any card in any slot and things work great. But if I so much as plug in a second card (never mind programming it!), all four PLLs on the first card immediately lose lock (the "locked" output glitches low) and misbehavior begins. There are no high-speed signals between the cards and the backplane (the fastest toggle is the 10 MHz base clock). 

 

This baffles me because the unprogrammed card draws hardly any power, and nothing on it is switching. I've probed the shared 10 MHz clock, VCCA, VCCINT, and VCCIO on a scope, and nothing budges by more than 5 mV, which is less than 1%. In particular the clock waveform is the nice smooth curve you'd expect. Everything is decoupled out the wazoo (excessively, perhaps... over 1200 uF of bulk capacitance per FPGA on each of VCCINT and VCCIO, plus two dozen 1 uF + 0.1 uF ceramics and a 47 uF tantalum for good luck). 

 

I'm at a loss here. I've gone through the list of reasons for PLL lock loss and none of them apply -- if any of them were the cause, then one card wouldn't work in isolation -- but it works great! 

 

Very puzzled. Is there any other advice out there on troubleshooting PLL lock loss besides the checklist at this link? 

 

https://www.altera.com/support/support-resources/operation-and-testing/pll-and-clock-management/pll-... 

 

Edit: I should also add that it isn't the "plugging" action that's responsible -- if I power up the system with two cards physically installed, then attempt to program one of them, I get the same PLLs-frequently-losing-lock behavior. So any electrical noise caused by the physical action of inserting the second card into the slot can't explain the failures.
6 Replies
Altera_Forum
Honored Contributor I
33 Views

Is the base clock wave somehow affected by the insertion of more than one board? 

I mean, do you observe any change in duty cycle, level or wave shape compared to the case when you have one single board? 

Does the base clock travel a long path in the backplane? Do you use any termination on the trace? 

Another guess: is it possible that the big bulk capacitance is actually the cause of the problem? The added capacitance could slow down the power supply rise time and prevent the PLL circuitry from resetting correctly.
Altera_Forum
Honored Contributor I
33 Views

 

--- Quote Start ---  

Is the base clock wave somehow affected by the insertion of more than one board? 

I mean, do you observe any change in duty cycle, level or wave shape compared to the case when you have one single board? 

--- Quote End ---  

 

 

Definitely not; that's what I meant when I wrote the following, although perhaps I wasn't making it clear: 

 

 

--- Quote Start ---  

I've probed the shared 10 MHz clock, ... on a scope, and nothing budges by more than 5 mV, which is less than 1%. In particular the clock waveform is the nice smooth curve you'd expect. 

--- Quote End ---  

 

 

The presence of the second card definitely doesn't affect the clock waveform by more than 1%. That trace is driven by a pretty powerful buffer/line-driver IC (not directly by the clock can). 

 

 

--- Quote Start ---  

Does the base clock travel a long path in the backplane? 

--- Quote End ---  

 

 

Yes. Total trace length on the backplane is around 180 mm, plus eight card-edge connectors and another 30 mm or so on each plugged-in card. That's a lot, but keep in mind it's only a 10 MHz clock. 

 

 

--- Quote Start ---  

Do you use any termination on the trace? 

--- Quote End ---  

 

 

No, mainly because it's such a slow clock. Probing doesn't show any ringing at all. 

 

Also, nothing is actually synchronous to that 10 MHz clock; it serves only as a source for the FPGA PLLs to produce the high-speed DDR clocks. Communication with the "outside world" is entirely via JTAG at low speeds (~500 kbit/s, synchronous to TCK, which is terminated on each board). 

 

 

--- Quote Start ---  

Another guess: is it possible that the big bulk capacitance is actually the cause of the problem? The added capacitance could slow down the power supply rise time and prevent the PLL circuitry from resetting correctly. 

--- Quote End ---  

 

 

Hrm, I don't think so; the problem occurs even if I power up the system first (i.e. charge up all those huge caps; the 1200uFs are on the backplane) then plug in the cards after power-up.
Altera_Forum
Honored Contributor I
33 Views

(posting this here in case others run into the same problem -- for every question I ask on this forum it seems I solve ten by reading the archives... it's an amazing resource to have, especially for somebody who had "brand X" drilled into their head in grad school) 

 

Unfortunately I can't say I've solved the problem, and it's still deeply disconcerting. However, I did manage to establish that the 10 MHz base clock is involved in the problem. The PCB has one of the GCLK input pins tied back to an output pin, so I hooked up that output pin to the 60 MHz-ish internal oscillator (altint_osc), brought it back onto the chip (why oh why do you make us send the signal off-chip?), and sent it through a PLL with an obscenely low bandwidth (Ipump=0, Rfilt=30, Cfilt=3). 

 

It's rock solid. Even coming from an internal oscillator. 

 

If needed, I can periodically measure the actual frequency from off-chip and reconfigure the PLL to bump the multiplier up or down in response to temperature changes, although so far that has not been necessary (and one could even argue that you want the clock to slow down when the room warms up). 
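
A minimal sketch of that retuning step, in Python purely for illustration (the real version would drive the PLL reconfiguration logic). The function names, the 1080 MHz VCO target, and the 1..512 multiplier bounds are assumptions made for the example, not values taken from this design.

```python
def pick_multiplier(f_measured_hz, f_target_hz, n=1, m_min=1, m_max=512):
    """Pick the integer multiplier M so that f_measured * M / n lands
    closest to f_target. m_min/m_max are assumed counter limits."""
    ideal = f_target_hz * n / f_measured_hz
    m = round(ideal)
    # Clamp to the assumed legal range of the multiplier counter.
    return max(m_min, min(m_max, m))


def retune(current_m, f_measured_hz, f_target_hz, n=1):
    """Return (new_m, changed) so the caller only reconfigures the PLL
    when the multiplier actually needs to move."""
    new_m = pick_multiplier(f_measured_hz, f_target_hz, n)
    return new_m, new_m != current_m


if __name__ == "__main__":
    target_vco = 1080e6  # hypothetical VCO target, not from the design
    for f_osc in (58e6, 60e6, 62e6):  # oscillator drifting with temperature
        m, changed = retune(18, f_osc, target_vco)
        print(f"{f_osc / 1e6:.0f} MHz -> M={m} (reconfigure: {changed})")
```

With a coarse integer multiplier the output only tracks the target in steps of one input period, so this keeps the VCO near its target rather than holding an exact frequency.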

 

So it's fixed for now, but I'm still deeply disturbed by the fact that the PLL somehow couldn't hold a lock on the 10 MHz clock. At such low speeds it ought to be possible to trust what you see on even a low-end 100 MSPS oscilloscope, no? The waveform was the nice clean shape you'd expect. Is the fact that this input is so slow a source of trouble? The internal oscillator is ~6x faster. 

 

Oh well. It's fixed for now, but if anybody has further insights I'm still interested. I'd like to be able to go back to the quartz clock in the future if necessary. Unfortunately respinning the PCB to give each FPGA its own private oscillator isn't really an option.
Altera_Forum
Honored Contributor I
33 Views

The minimum input frequency for Cyclone III PLLs is 5MHz, so your 10MHz is not too slow. The problem could be with a limited PLL locking range which depends on the output frequency you require. I guess a high frequency difference between in and out could lead to a limited locking range. 

In the Quartus compilation report you should find the min and max frequency values. If, for example, the resulting locking range is 9.95MHz to 15MHz, the PLL lock with Fin=10MHz could be very unreliable. 

In any case you can try to cascade two PLLs if you have spare resources: the first one would raise clock frequency from 10MHz to a conveniently high intermediate frequency that the PLL can easily manage, say 100MHz; then you tune the second PLL to the required frequency. 

 

Addendum: 

I just tested with a fake design here. I set a 10 MHz input frequency and a 433 MHz output. The resulting PLL locking range is 5.4 to 10 MHz, so a small Fin drift away from 10 MHz could lead to the locking problems you experienced.
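
The margin check described here can be sketched in a few lines of Python. The helper name is made up for illustration; the numbers are the ones quoted in this post (5.4 to 10 MHz with Fin = 10 MHz).

```python
def lock_margin(fin_mhz, lock_min_mhz, lock_max_mhz):
    """Return (lower, upper) headroom as a fraction of fin.
    A value at or below zero means fin sits on or outside that edge."""
    lower = (fin_mhz - lock_min_mhz) / fin_mhz
    upper = (lock_max_mhz - fin_mhz) / fin_mhz
    return lower, upper


if __name__ == "__main__":
    lower, upper = lock_margin(10.0, 5.4, 10.0)
    print(f"lower margin {lower:.1%}, upper margin {upper:.1%}")
```

For the test design above, the upper margin comes out as zero: the input sits exactly on the top edge of the reported range, which is the situation to avoid.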
Altera_Forum
Honored Contributor I
33 Views

 

--- Quote Start ---  

I guess a high frequency difference between in and out could lead to a limited locking range. 

--- Quote End ---  

 

 

Hi Cris, the Cyclone III PLL (like most PLLs, for that matter) doesn't have a direct constraint on the difference between the input and output frequency. 

 

There's a limited range for the VCO, but even at its slowest (600 MHz) it runs faster than the clock network is able to. 

 

Basically the PFD nudges the VCO to a near-gigahertz multiple of the input clock, and then the output counters divide that down. The output counters have no effect on the VCO, so the output frequency does not affect the PLL's ability to lock (the VCO parameters do, of course, affect it). 
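
As a rough Python sketch of that arithmetic: the counter names m, n, and c follow the usual feedback/pre-divide/post-divide convention, the 600 MHz lower VCO limit is the figure mentioned above, and the 1300 MHz upper limit and the m=108 setting are assumptions picked for illustration, not values read out of this design.

```python
def pll_frequencies(fin_hz, m, n, c):
    """Return (pfd, vco, out) for input fin, feedback multiplier m,
    input divider n, and output counter c."""
    pfd = fin_hz / n   # frequency seen by the phase-frequency detector
    vco = pfd * m      # VCO runs at m times the PFD frequency
    out = vco / c      # output counter only divides the VCO down
    return pfd, vco, out


def input_lock_range(m, n, vco_min_hz=600e6, vco_max_hz=1300e6):
    """Input frequencies that keep the VCO inside its (assumed) range.
    Note that the output counter c does not appear here at all."""
    return vco_min_hz * n / m, vco_max_hz * n / m


if __name__ == "__main__":
    pfd, vco, out = pll_frequencies(10e6, m=108, n=1, c=4)
    fmin, fmax = input_lock_range(m=108, n=1)
    print(f"VCO {vco / 1e6:.0f} MHz, output {out / 1e6:.0f} MHz")
    print(f"input range {fmin / 1e6:.2f} to {fmax / 1e6:.2f} MHz")
```

Under these assumptions the usable input range is set entirely by m, n, and the VCO limits; the range a tool actually reports may be narrower because of other constraints.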

 

 

--- Quote Start ---  

In the Quartus compilation report you should find the min and max frequency values. 

--- Quote End ---  

 

 

Yeah, I checked those. The report claimed a lock range of 5.60-12.04 MHz. I'm inside by 20%. 

 

 

--- Quote Start ---  

In any case you can try to cascade two PLLs if you have spare resources: the first one would raise clock frequency from 10MHz to a conveniently high intermediate frequency that the PLL can easily manage, say 100MHz; then you tune the second PLL to the required frequency. 

--- Quote End ---  

 

 

Yeah, I tried that; the first PLL loses lock.
Altera_Forum
Honored Contributor I
33 Views

quantized - 

 

Have you looked at everything that is affected when a second card is plugged in? Before an FPGA is configured, all of its programmable I/O are tri-stated and weakly pulled high by internal pullups. Are there any common signals that could be affected by this? How about the programming pins? Is anything shared between cards, or does each card configure itself? What you described does not make any sense if the 10 MHz clock is not affected by the second card and there are no other interactions that could cause the problem. There is obviously a cause and effect here. You just have to figure out what it is. 

 

What all can cause a PLL to lose lock? Loss or corruption of the input clock, reset (explicit or due to loss of configuration such as due to trip of POR circuit), power supply noise, etc. Identify all of the possibilities and then look at all the possible ways the second card could be a factor. 

 

Interesting problem!