
Cyclone IV GX Oscillator failures

bob_bitchen
New Contributor I

We have an oscillator driving the FPGA clock input through a 22 ohm series resistor. The trace is very short.


We produced 30 boards and had 6 oscillators fail. 

 

We have done many PCIe designs with the CIVGX and have not seen this problem. All of the previous designs use a 50 MHz oscillator with a reconfig PLL that generates 50 MHz and 125 MHz for the PCIe hard IP. We switched this one to 64 MHz to improve the accuracy of an 8 MHz clock output and forgot to change the reconfig PLL INCLK setting. The PCIe actually worked until a board failed. Then we fixed all of the PLLs that we had programmed.


Replacing the oscillator fixes the board temporarily. The oscillators fail again several hours to a day later. We have tried several different types and manufacturers of the oscillators. 


I replaced the 22 ohm resistor with a 220 ohm resistor in an attempt to characterize the output and input. The FPGA operates normally and doesn't seem to present much of a load. I can see the effect of a scope probe on the input, and the FPGA input loads the line not much more than the probe does.


Both oscillators with the 220 ohm resistors have eventually failed.
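
As a rough bound on what the series resistor change means, here is a small Python sketch of the worst-case current through the resistor. It assumes an ideal 3.3 V oscillator output and a dead short to ground at the FPGA pin. Both are simplifying assumptions: a real oscillator output has its own impedance, so actual currents would be lower. The helper name `worst_case_current_ma` is made up for illustration.

```python
def worst_case_current_ma(supply_v, series_ohms):
    """Current in mA through the series resistor if the far end is shorted to ground.

    Assumes an ideal voltage source; the oscillator's own output impedance is ignored.
    """
    return supply_v / series_ohms * 1000.0

for ohms in (22, 220):
    current = worst_case_current_ma(3.3, ohms)
    print(f"{ohms:>3} ohm series resistor: at most {current:.0f} mA into a short")
```

Under those assumptions the 220 ohm resistor limits the current to a tenth of what the 22 ohm resistor allows, yet the oscillators still failed.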

The oscillator and the FPGA are powered from the same 3.3V supply coming from a PCIe slot in a computer and are located directly next to the gold fingers.

I don't see any power supply noise, and 3 different test environments have been used.

I don't see any kind of output characterizations or detailed data for the oscillator.

Any help here would be greatly appreciated.

 

 

FvM
Honored Contributor II
Hi,
If the oscillator fails (stops sending a clock), it's a problem with the oscillator device, not particularly related to the FPGA. There might be a problem with how the oscillator enable pin is connected. Although most devices have an internal pull-up or pull-down, some require an external connection.
bob_bitchen
New Contributor I

These are active-high enable devices, and the enable pin is tied high.

We have tried several manufacturers and models in the same spot.

 

_AK6DN_
Valued Contributor II

Very strange.

Is this a commercial board or your own design?

Is the location of the oscillator a hot spot on the board?

Are the oscillators SMT or DIP? I am guessing SMT.

Were they hand soldered onto the board or soldered in a manufacturing reflow process?

If you take one of the failed oscillators and mount on a standalone test fixture is it still dead?

Are you considering sending the failed part(s) back to the manufacturer or to a test house for failure analysis?

It would be interesting to know what part of the oscillator failed. Is it the internal oscillator, or the output driver?

bob_bitchen
New Contributor I

 

Is this a commercial board or your own design?

> One of many that we do. We have done many PCIe designs with the CIVGX and have not seen this problem.

Is the location of the oscillator a hot spot on the board?

> This one is a PCIe add-in card. It's right next to the fingers.

Are the oscillators SMT or DIP? I am guessing SMT.

> SMT

Were they hand soldered onto the board or soldered in a manufacturing reflow process?

> They were all replaced by hand.

If you take one of the failed oscillators and mount on a standalone test fixture is it still dead?

> I remove the resistor and check the output. It's the same.

Are you considering sending the failed part(s) back to the manufacturer or to a test house for failure analysis?

> That's probably going to take too long. Yes.

It would be interesting to know what part of the oscillator failed. Is it the internal oscillator, or the output driver?

bob_bitchen
New Contributor I

Replaced 2 FPGAs. Both boards stopped killing the oscillators. It looks like a 28% change in INCLK value can permanently damage the part.

FvM
Honored Contributor II
Hi,
What is a "28% change in INCLK value" in commonly understood technical terms? Are you talking about frequency or voltage?
bob_bitchen
New Contributor I

A 50 MHz oscillator was replaced with a 64 MHz one before re-programming the PLL, and the FPGA was permanently damaged.

_AK6DN_
Valued Contributor II

"A 50 MHz oscillator was replaced with a 64 MHz one before re-programming the PLL, and the FPGA was permanently damaged."

What?

In your first post you said:

"Replacing the oscillator fixes the board temporarily. The oscillators fail again several hours to a day later. We have tried several different types and manufacturers of the oscillators. "

 

So which is it? Bad FPGA or bad oscillators? Or something else?

50MHz and 64MHz are both well within the input spec of the FPGA clock input.

I still don't think you have root caused the failure.

bob_bitchen
New Contributor I

 

1) 50 MHz oscillator with 50 MHz design in the FPGA was tested w/ no failures. 10 of these units are being used in the field.

2) During qualification testing, it's discovered that an 8 MHz output is actually 8.006 MHz. The internal divider is set to 32/25 and the output should be exact. I couldn't fix this in the FPGA. 

3) I change the 50 MHz oscillator to 64 MHz and the PLL to 1/1 and now my 8 MHz is really 8 +/- 50 ppm.

4) All of the boards that I have are re-tested and a few are sent to be used.

5) One of the boards that I have and one in the field show no activity on the PCIe. Then 2 more.

6) The PLL that drives the PCIe still has an INCLK value of 50 MHz. The downstream outputs of the PLL should be 50, 125, 312.50, and 1250 MHz but are actually running at 64, 160, 400, and 1600 MHz. I fixed the INCLK value of that PLL, but the boards do not recover.

7) The failed boards have little or no amplitude on the 64 MHz clock, so we replace those oscillators. The oscillators all fail within 24 hours.

8) We bought several different kinds of oscillators -- same results. The oscillators all fail within 24 hours.

9) I replaced the 22 ohm series termination with 220 ohm -- same results. The oscillators all fail within 24 hours.

10) I contacted EPSON and Abracon engineers who told me that the oscillators will tolerate a short circuit indefinitely and recover when the short is removed. 

11) I replaced 2 FPGAs so far and they have been in continuous operation for 3 days now. I'm sending 2 more out to be replaced.

12) I have 11 boards that have never failed in a burn-in environment and have been running for 3 days.

 

So, YES the FPGAs were damaged and somehow killed bulletproof oscillators along the way.

In the timing analyzer, the higher clock values are shown to be derived. Other than that, there was no way to tell that my chips were time bombs in the field.
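
For reference, the frequency scaling described in item 6 can be reproduced with a few lines of Python. This is only an arithmetic sketch. It assumes every PLL output is a fixed ratio of INCLK, so driving a PLL configured for 50 MHz with 64 MHz multiplies each output by 64/50. The function name `scaled_outputs` is made up for illustration and is not part of any Intel tool.

```python
from fractions import Fraction

def scaled_outputs(configured_inclk_mhz, actual_inclk_mhz, intended_outputs_mhz):
    """Scale each intended PLL output by the ratio actual INCLK / configured INCLK."""
    ratio = Fraction(actual_inclk_mhz) / Fraction(configured_inclk_mhz)
    # Fraction(str(f)) keeps decimal values such as 312.5 exact.
    return [float(Fraction(str(f)) * ratio) for f in intended_outputs_mhz]

# Output frequencies the PCIe PLL was configured to produce from a 50 MHz INCLK.
intended = [50, 125, 312.5, 1250]

# Frequencies that result when the same settings are driven with 64 MHz.
actual = scaled_outputs(50, 64, intended)

for want, got in zip(intended, actual):
    print(f"intended {want:>7} MHz -> actual {got:>7} MHz")
```

Comparing a list like this against the derived clocks reported by the timing analyzer is one way to catch a stale INCLK setting before boards ship.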

 

 

 

 

bob_bitchen
New Contributor I

It's bad FPGAs.

The FPGAs failed in a way that was surprising.

In the original design, the 50 MHz clock drove a Qsys PLL that used a 32/25 ratio to produce an 8 MHz output. The GUI reported this ratio as exact; however, the output was off by about 1/1000.

In the second iteration of the design, the 64 MHz clock used a divider of 1 and the 8 MHz output was correct to 50 ppm. The PCIe PLL was supposed to produce 50, 125, 312.50, and 1250 MHz outputs. They were actually 64, 160, 400, and 1600 MHz, and the PCIe was functioning.
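
To put the two accuracy figures in the same units, here is a small Python sketch. It uses the 8.006 MHz value measured during qualification testing and the 8 MHz nominal. The helper name `ppm_error` is made up for illustration. The first calculation only shows that 50 MHz times 32/25 is exactly 64 MHz on paper, which is presumably why the GUI called the ratio exact.

```python
from fractions import Fraction

def ppm_error(measured_hz, nominal_hz):
    """Return the error of measured_hz relative to nominal_hz in parts per million."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6

# Ideal arithmetic: 50 MHz x 32/25 is exactly 64 MHz, so the ratio itself is exact.
ideal_mhz = Fraction(50) * Fraction(32, 25)

# Measured output from qualification testing: 8.006 MHz against an 8 MHz nominal.
error_ppm = ppm_error(8.006e6, 8.0e6)

print(f"50 MHz x 32/25 = {ideal_mhz} MHz")
print(f"8.006 MHz is off by {error_ppm:.0f} ppm")
```

The measured error works out to about 750 ppm, roughly 15 times the 50 ppm achieved with the 64 MHz oscillator.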

There were no indications of a problem unless you looked at the derived clocks from the timing analyzer and noticed the values to be wrong.

In the third design, the INCLK of the PCIe PLL was corrected to 64 MHz.

I communicated with engineers from EPSON and Abracon, and both told me that the devices are capable of driving a short circuit indefinitely and then fully recovering when the short is removed.

We have replaced 2 FPGAs and they have been operating for 3 days each. I sent 2 more out to be replaced.

I have also put a handful of boards that have never failed into a burn-in environment, and they have been in operation for the same amount of time.

Somehow, the FPGAs have managed to damage bulletproof oscillators while failing in a very obscure way.

 

 

 

_AK6DN_
Valued Contributor II

My reading of the timeline is that you compromised or damaged either the FPGA or the oscillator, or both, during the first rework process to replace the 50MHz parts with 64MHz parts. The initial boards (built by a PCB assembly vendor I expect) worked 100%.

Only after rework of the 50MHz to 64MHz parts did failures occur.

My conclusion would be damage during that rework; it could be an ESD event, high temperature, or something related. I am guessing the rework was done by hand by a local tech.

Subsequent rework replacing oscillators and FPGAs was OK. I am guessing that rework was done by a specialty rework tech, as BGA rework requires special equipment and skills.

bob_bitchen
New Contributor I

Whatever the cause, the FPGAs failed in such a way that they remained functional yet destroyed good oscillators until the FPGA was replaced.

 
