Connecting two FPGAs in synchronous mode.

Altera_Forum · ‎10-10-2007

Hi all

I must say that this is a tough one. I've been working on it for about two weeks.

I use a big board which includes three FPGAs: two Stratix-II180 and one Stratix-IIGX130.

The middle FPGA is connected to each side FPGA through 220 pins. The clock for the three FPGAs is balanced on the board, and each clock input is connected to a PLL in normal mode. The main problem is that I use the three FPGAs as if they were one, which means that I don't transmit the data from FPGA to FPGA using a source-synchronous clock. All interconnect pins are registered on both the input and output sides, and the FFs are located inside the pads to reduce the delay from FPGA to FPGA. The clock frequency of this system is 80 MHz.

The problem: the data is corrupted in one section of the interconnect pins. I tried many options to fix that: series terminations, changing the current drive strength, phase-shifting the PLLs, and more. The results vary; some are better and some are not. So far, after many tries, I have one FPGA version that works, and I can't say why. I took one of the versions with the problematic interface and added SignalTap to the input pins in order to see what happens when the data is corrupted. Surprisingly, this is the only version that works.

I'll be happy to hear any expert advice on this.

Thanks a lot

Moti

Altera_Forum · ‎10-10-2007

Did you fully constrain the I/O timing? All the inputs and outputs of each FPGA need to have maximum and minimum input delay or output delay constraints. (tsu/th/tco/min tco in the Classic Timing Analyzer would work, but input delay and output delay is preferred for either timing analyzer.) Budget the available timing between the driving FPGA and receiving FPGA for each connection.
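As a rough sketch of how such a budget turns into constraint values, the Python below computes maximum and minimum input-delay and output-delay numbers for one FPGA-to-FPGA link, using the commonly used formulas for a system-synchronous interface with a shared, balanced clock. The class and function names are made up for illustration, and every number (tco, tsu, and th of the other FPGA, and the board trace delay) is a hypothetical placeholder rather than a value from this design.

```python
from dataclasses import dataclass


@dataclass
class ExternalDevice:
    """Pin-referenced timing of the FPGA on the other end of the trace (ns)."""
    tco_max: float  # slowest clock-to-output
    tco_min: float  # fastest clock-to-output
    tsu: float      # setup requirement at its input pin
    th: float       # hold requirement at its input pin


def input_delays(ext, trace_max, trace_min):
    """(max, min) input delay: when data from the other FPGA reaches our pin."""
    return ext.tco_max + trace_max, ext.tco_min + trace_min


def output_delays(ext, trace_max, trace_min):
    """(max, min) output delay: what the other FPGA needs after our pin."""
    return ext.tsu + trace_max, trace_min - ext.th


# Hypothetical placeholder values, NOT taken from the design in this thread.
other_fpga = ExternalDevice(tco_max=5.0, tco_min=2.0, tsu=2.5, th=0.5)
TRACE_MAX_NS = 0.5
TRACE_MIN_NS = 0.4

in_max, in_min = input_delays(other_fpga, TRACE_MAX_NS, TRACE_MIN_NS)
out_max, out_min = output_delays(other_fpga, TRACE_MAX_NS, TRACE_MIN_NS)
print(f"input delay  max/min: {in_max:.2f} / {in_min:.2f} ns")
print(f"output delay max/min: {out_max:.2f} / {out_min:.2f} ns")
```

The resulting four numbers are what would go into the maximum and minimum input delay and output delay constraints for that pin group. Clock skew between the two FPGAs is not included in this sketch and would need to be added to the budget separately.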

Are the PLLs doing a divide? If you are going from a register on a half-speed clock in the driving FPGA to a register on a half-speed clock in the receiving FPGA (for example), the half-speed clocks in the two FPGAs might be out of alignment by a full-speed clock period.
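To illustrate the misalignment, here is a minimal Python sketch that lists the rising edges of two divide-by-2 clocks derived from the same full-speed clock. It assumes the full-speed clock is the 80 MHz clock mentioned above and that the two dividers happen to start on adjacent full-speed edges; both are assumptions made for illustration, and the function name is hypothetical.

```python
def divided_clock_edges(full_period_ns, first_edge_index, count):
    """Rising-edge times (ns) of a divide-by-2 clock.

    first_edge_index is the full-speed rising edge on which the divided
    clock first goes high; after that it rises on every second edge.
    """
    return [(first_edge_index + 2 * k) * full_period_ns for k in range(count)]


FULL_PERIOD_NS = 12.5  # assumption: 80 MHz full-speed clock

# Hypothetical: FPGA A's divider starts on edge 0, FPGA B's on edge 1.
edges_a = divided_clock_edges(FULL_PERIOD_NS, 0, 4)
edges_b = divided_clock_edges(FULL_PERIOD_NS, 1, 4)

# Offset between the two half-speed clocks, in ns.
offset_ns = edges_b[0] - edges_a[0]
print("FPGA A half-speed rising edges:", edges_a)
print("FPGA B half-speed rising edges:", edges_b)
print("offset between them (ns):", offset_ns)
```

Both half-speed clocks are perfectly aligned to the full-speed clock, yet they are offset from each other by one full-speed period, which is the situation described above.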

There is probably some significant clock uncertainty for going from a PLL domain in one FPGA to a PLL domain in another FPGA (even more than going between separate PLLs in the same FPGA). With TimeQuest, you can use virtual clocks for the input delay and output delay constraint clock fields and apply clock uncertainty settings for the transfer between a register on the PLL domain in a given FPGA to/from the virtual clock that effectively represents the clock of the external register that is actually in the other FPGA. An uncertainty setting done this way with virtual clocks won't affect the timing analysis of paths internal to an FPGA (which might themselves need an uncertainty setting, but with a smaller uncertainty value). If you have no idea what I mean by clock uncertainty settings and don't want to read the documentation about it, just make sure you have some positive slack for the I/O timing (I don't know how much is needed--maybe a few tenths of a nanosecond).

Altera_Forum · ‎10-11-2007

Brad

First, thanks for the fast response.

I use a different technique to do the timing budget. These are the facts that I count on:

1. All output FFs are located inside the pads, which means that I have ~2ns delay from FF to pad.

2. All input FFs are located inside the pads and I use the input delay element in the minimum option (~0.5ns) - this gives me ~2ns delay total

3. The trace on the board is ~3 inches, which is about 0.5ns

4. The PLLs are in normal mode which means that the clock is locked to the input pins - this ensures that the clock tree starts at the same timing point in all FPGAs

5. I've measured the clock relationship from FPGA to FPGA using a scope and it looks very good - ~1ns phase shift.

So the total delay is 2 ns + 2 ns + 0.5 ns ± 1 ns, which is 3.5 to 5.5 ns.

This gives me at least 6.5ns setup time.
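The arithmetic in this budget can be written out as a short Python sketch. It uses only the figures listed above (the 80 MHz clock, ~2 ns on each side, ~0.5 ns of trace, and ±1 ns of clock phase difference); the variable names are made up for illustration. A plain subtraction from the clock period leaves 7.0 ns after the worst-case delay, so the 6.5 ns figure above presumably reserves some allowance that is not itemized in the list.

```python
CLOCK_MHZ = 80.0
period_ns = 1000.0 / CLOCK_MHZ   # 12.5 ns clock period

# Figures quoted in the post (approximate estimates, not reported values).
output_ff_to_pad_ns = 2.0        # item 1
pad_to_input_ff_ns = 2.0         # item 2 (includes the ~0.5 ns delay element)
trace_ns = 0.5                   # item 3
clock_phase_ns = 1.0             # item 5, applied as +/-

nominal_ns = output_ff_to_pad_ns + pad_to_input_ff_ns + trace_ns
min_delay_ns = nominal_ns - clock_phase_ns
max_delay_ns = nominal_ns + clock_phase_ns

# Time left in the clock period after the slowest data arrival.
margin_ns = period_ns - max_delay_ns

print(f"clock period: {period_ns} ns")
print(f"total delay:  {min_delay_ns} to {max_delay_ns} ns")
print(f"left over after worst-case delay: {margin_ns} ns")
```

Note that this budget only covers the setup side. The minimum delay matters for hold at the receiving register and is not checked here.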

I prefer this approach because once I lock all the parameters, I will probably not see variation in results between versions.

Is this correct? Do you think it's better to use virtual clocks and timing constraints?

Moti

Altera_Forum · ‎10-11-2007

With manually forced I/O cell registers for both inputs and outputs and manually controlled input and output delay chains, the timing should be the same for every compile. In that case it is acceptable to do the I/O timing analysis by hand if you use reported numbers (not your estimates) in that analysis. But to be proper you should always in every design constrain every path for which the timing matters and designate every path as a false path for which the timing does not matter. Someone maintaining your design later (even you after you've forgotten what you've done) will benefit from those constraints as a form of documentation for what you intended. If you later do something to the design that makes a Fast Input Register or Fast Output Register setting no longer work (synthesis itself can cause this to happen), the reported negative slack will alert you to the resulting timing violation.

Even though you do not have timing constraints at the moment, if you are using the Classic Timing Analyzer you can report the tsu, th, tco, and minimum tco to see whether the timing is what you expected. Your description sounds like your manual estimate is using output-register-to-output-pin and input-pin-to-input-register delays where it should be using tco and tsu. tco and tsu include the clock-path portion of the timing inside the FPGAs, which also accounts for how much the PLL compensation delay differs from the ideal (the PLL output inside the FPGA is probably not exactly aligned with the clock at the PLL input device pin).

Be sure to check the timing with both the slow and fast timing models. You might have lots of margin for tco of the driving device going to tsu of the receiving device with the slow model but have a timing violation for minimum tco of the driving device going to th of the receiving device with the fast model. Remember to include that 1 ns difference from FPGA to FPGA for the clock input pins in the min tco/th check--that clock skew might be causing a hold violation at the receiving device.
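A minimal Python sketch of the two checks described above: the setup check uses slow-model tco of the driver and tsu of the receiver, and the hold check uses fast-model minimum tco and th, with the 1 ns clock skew charged against both. The function names are hypothetical and all timing values are placeholders chosen to show the effect, not numbers reported for this design; only the 12.5 ns period (80 MHz) and the 1 ns skew come from the thread.

```python
def setup_slack(period_ns, tco_max, trace_max, tsu, skew):
    """Setup slack (ns); worst case is the receiver clock arriving early by skew."""
    return period_ns - (tco_max + trace_max + tsu) - skew


def hold_slack(tco_min, trace_min, th, skew):
    """Hold slack (ns); worst case is the receiver clock arriving late by skew."""
    return tco_min + trace_min - th - skew


PERIOD_NS = 12.5  # 80 MHz, from the thread
SKEW_NS = 1.0     # measured clock difference between FPGAs, from the thread

# Hypothetical slow-model numbers (driver tco, receiver tsu).
slow = setup_slack(PERIOD_NS, tco_max=5.0, trace_max=0.5, tsu=2.5, skew=SKEW_NS)

# Hypothetical fast-model numbers (driver minimum tco, receiver th).
fast = hold_slack(tco_min=1.0, trace_min=0.4, th=0.6, skew=SKEW_NS)

print(f"setup slack (slow model): {slow:+.2f} ns")
print(f"hold slack  (fast model): {fast:+.2f} ns")
print("hold violation" if fast < 0 else "hold met")
```

With these placeholder values the setup check passes with several nanoseconds to spare while the hold check fails by a small amount, which is the pattern described above: plenty of margin in the slow model can coexist with a hold violation in the fast model once the clock skew is included.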

Altera_Forum · ‎10-13-2007

Brad

I do use timing constraints: I use the max delay attribute to verify that my numbers are correct. Pad to input register is 2.5 ns and the output register is defined as 2.5 ns. It still doesn't work well. I suspect that the cause is the high load on the power supply, since I toggle so many pins simultaneously. What do you think? Is it possible? I tried to measure the voltage on the power supply capacitors, but I didn't see anything wrong.

Moti