Can an FPGA partially fail? (EPM7128SLI84-10)

Altera_Forum
Honored Contributor II

Hi All,

I have a production run of 110 plug-in modules, each using a single EPM7128SLI84-10. Two of these devices are suspected of failing: both modules show exactly the same fault, in that 8 LEDs that the FPGA drives are permanently lit when normally they are off (on power-up, anyway). It's easy to assume that the FPGA has gone 'POP', but the other functions that the FPGAs provide are working properly.

Is it possible that just part of the FPGA has failed? If so, is there any reason why it would be the same part, such that the same fault is shown on both failed modules? I've checked that there is +5V on each 5V pin and that all the 0V pins are connected to 0V.

You can tell that this isn't a new design. In fact, the modules have been in a UK Inter-City express locomotive for the last 16 years, so a couple of (possible) failures is quite acceptable. The faults have only come to light recently, so we are having to buy a new programming module, since we don't have Windows 95, a 25-way parallel printer port, or an RS232 port on any of our PCs.

We are hoping that the programming software will tell us if there is a fault with the FPGA. If there is, and a 'chunk' of the FPGA has indeed failed, is there a way of programming it so that the faulty part isn't used? Our customer (who has the modules now) is not really geared up to swap out the chips; a 100W soldering iron seems to be quite destructive!

Cheers

James

9 Replies

Altera_Forum

Partial failure may result from timing margin issues. If the problem occurs at power-up, then I would also suspect some floating inputs. Try touching the area around the chip and see whether it has any effect, but beware of static damage.

Altera_Forum

In terms of programming: USB-to-RS232 converters are easy and cheap to get hold of, and older devices were supported up to Quartus 9, so you should be able to use a USB-Blaster cable (it may work with Windows 7, but I don't think it is officially supported; otherwise XP or Linux should do the trick). This is, of course, unless you're using some EPROMs that aren't programmed from Quartus.

I would be careful though: if the design was compiled in MAX+PLUS II, I would stick with that. I've had Quartus fail with illegal synthesis options for an old FLEX 10K design that compiled just fine in MAX+PLUS II.

Altera_Forum

 

--- Quote Start ---

Partial failure may result from timing margin issues. If the problem occurs at power-up, then I would also suspect some floating inputs. Try touching the area around the chip and see whether it has any effect, but beware of static damage.

--- Quote End ---

Hi

Thanks for the reply, which we will try.

When you say failure, do you mean physical failure or functional failure? The LEDs are driven by a simple latch circuit which does not operate at any great speed, nor does the FPGA do anything really 'clever'. There is a push button that clears the latch, so even if it were set by a timing issue, the button should clear it. This button is one of the working features, as it connects through the FPGA to a diagnostic LED which works OK.

Cheers

James

Altera_Forum

 

--- Quote Start ---

In terms of programming: USB-to-RS232 converters are easy and cheap to get hold of, and older devices were supported up to Quartus 9, so you should be able to use a USB-Blaster cable (it may work with Windows 7, but I don't think it is officially supported; otherwise XP or Linux should do the trick). This is, of course, unless you're using some EPROMs that aren't programmed from Quartus.

--- Quote End ---

Hi

We have tried a USB-to-RS232 converter with MAX+PLUS II, which was the original software used. As expected, we had enormous trouble with 'drivers' for both items, and even the MAX+PLUS II software messed around for a while before we could beat it into submission (apparently by simply reloading it repeatedly). We gave up eventually, and our customer now has the gear and is looking to buy 'new' hardware and software, hopefully driven directly from a USB port!

Cheers

James

Altera_Forum

 

--- Quote Start ---

Hi

Thanks for the reply, which we will try.

When you say failure, do you mean physical failure or functional failure? The LEDs are driven by a simple latch circuit which does not operate at any great speed, nor does the FPGA do anything really 'clever'. There is a push button that clears the latch, so even if it were set by a timing issue, the button should clear it. This button is one of the working features, as it connects through the FPGA to a diagnostic LED which works OK.

Cheers

James

--- Quote End ---

I meant functional failure of its switching logic due to a timing failure. If it is indeed an unclocked latch, then that is not recommended; it is better changed to clocked logic so that it can be checked correctly by the timing tool.
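
To illustrate the general point about unclocked latches, here is a minimal Python sketch. It is a toy model only: the real design inside the EPM7128S is not shown in this thread, and the class names, waveforms, and clock timing below are all assumptions made up for the example. It shows that a level-sensitive latch captures a short glitch on its set input, whereas a register that samples only on rising clock edges ignores a pulse that falls between edges.

```python
from dataclasses import dataclass


@dataclass
class SRLatch:
    """Hypothetical unclocked set/reset latch: reacts to its inputs at every time step."""
    q: int = 0

    def step(self, set_in: int, reset_in: int) -> int:
        if reset_in:
            self.q = 0
        elif set_in:
            self.q = 1
        return self.q


@dataclass
class ClockedRegister:
    """Hypothetical clocked equivalent: inputs are sampled only on a rising clock edge."""
    q: int = 0
    prev_clk: int = 0

    def step(self, clk: int, set_in: int, reset_in: int) -> int:
        rising_edge = clk == 1 and self.prev_clk == 0
        if rising_edge:
            if reset_in:
                self.q = 0
            elif set_in:
                self.q = 1
        self.prev_clk = clk
        return self.q


def run(set_wave, clk_wave):
    """Drive both models with the same set waveform (reset held low); return final (latch, register) outputs."""
    latch, reg = SRLatch(), ClockedRegister()
    for clk, s in zip(clk_wave, set_wave):
        latch.step(s, 0)
        reg.step(clk, s, 0)
    return latch.q, reg.q


# One list entry per time step; the assumed clock rises at steps 2 and 6.
CLK = [0, 0, 1, 1, 0, 0, 1, 1]
GLITCH = [0, 0, 0, 1, 0, 0, 0, 0]  # short pulse at step 3, between clock edges
STEADY = [0, 1, 1, 1, 1, 1, 1, 1]  # held long enough to cover a clock edge

print("glitch :", run(GLITCH, CLK))
print("steady :", run(STEADY, CLK))
```

With the glitch waveform the latch ends up set while the register stays clear; with the steady waveform both end up set. This does not by itself explain the fault James describes (his clear button should still reset a latch), but it shows why a timing tool can reason about the clocked version and not about the unclocked one.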

Altera_Forum

 

--- Quote Start ---

Hi

We have tried a USB-to-RS232 converter with MAX+PLUS II, which was the original software used. As expected, we had enormous trouble with 'drivers' for both items, and even the MAX+PLUS II software messed around for a while before we could beat it into submission (apparently by simply reloading it repeatedly). We gave up eventually, and our customer now has the gear and is looking to buy 'new' hardware and software, hopefully driven directly from a USB port!

Cheers

James

--- Quote End ---

Did you try with Quartus? If you're not recompiling, then Quartus (up to v9) should work just fine.

Altera_Forum

 

--- Quote Start ---

Did you try with Quartus? If you're not recompiling, then Quartus (up to v9) should work just fine.

--- Quote End ---

Hi

Thanks for the info; we haven't tried Quartus. It's good to know that it gives us another option for reprogramming the FPGA. We won't be recompiling, as the design is mature.

Just to reiterate one of the original questions: is it possible to 'lock out' any damaged sections of the FPGA? I realise that this would mean recompiling, but our biggest problem is replacing the FPGA. We can afford to do it, but don't want to have to send it away to be done if we can help it. Though even as I type this, I'm thinking that it would cost more to get someone up to speed on the software than to pay for third-party rework!

In any event, we will need to be able to reprogram, so this has been a useful post for us.

Cheers All

James

Altera_Forum

From my understanding, an FPGA should either work or not work. If there's something failing then I, like Kaz, would assume it's a timing problem. Or could other parts on the board have failed?

There is no way to "lock out" a failed part of the FPGA, as you would have no way of knowing which cells have failed, and I would assume more cells would just fail afterwards.

Can you not just try a replacement/swapped board?

Altera_Forum

 

--- Quote Start ---

From my understanding, an FPGA should either work or not work. If there's something failing then I, like Kaz, would assume it's a timing problem. Or could other parts on the board have failed?

There is no way to "lock out" a failed part of the FPGA, as you would have no way of knowing which cells have failed, and I would assume more cells would just fail afterwards.

Can you not just try a replacement/swapped board?

--- Quote End ---

Hi

I've assumed that with large areas of silicon you can't easily route +5V or 0V everywhere from a single connection without serious voltage drops, hence the need for several 5V and 0V connections. I don't know, however, whether these 'assumed' sections of the silicon are connected to each other. That is why we checked that all the 5V connections were intact externally, but they may have fused internally. It's all conjecture, but that is the logic behind the thought that perhaps one of these silicon areas has failed.

The module is quite simple. The only other part we can't check easily is a microcontroller, but currently the main suspect is the FPGA. We need to check this with the programming software now, as we are scratching our heads!

I'll report back when we get to program it, or at least attempt to!

Cheers

James