Cyclone V readout of 300MHz DDR interface not meeting timing

Altera_Forum
Honored Contributor II
2,339 Views

Hi all, 

 

I'm trying to use the Cyclone V SoCKit from Terasic (http://www.terasic.com.tw/cgi-bin/page/archive.pl?language=english&no=816) to read out a daughter card containing an LTC2153-14 ADC (http://www.linear.com/product/ltc2153-14). The ADC interface is source synchronous DDR, running at 300MHz, so 600Mbps. I'm using the ALTDDIO_IN module to translate the ADC's 7-bit DDR words into 14-bit parallel words inside the FPGA. The design is failing timing in TimeQuest, and I'm not sure how to proceed.  

 

My timing constraint file is below. A reference design showing this timing failure is also attached.  

# SDC file for this bare-bones version, just to see how fast the ADC_DDR runs
# Create the reference clock
create_clock -name r_clk -period 50MHz

# ##########################################################################################################
# Constrain the LTC2153-14 ADC DDR I/F:
# Identify a virtual clock as the "launch clock" (i.e., clock the data valid edges).
create_clock -name vclk -period 300MHz

# Identify ADC_ENC
# From datasheet, Tenc-to-data_valid_low
# min = 1.7ns
# typ = 2.0ns
# max = 2.3ns
# So, the rising edge occurs -2.0ns typ before vclk falling edge (i.e, the rising edge is at -2.0 - Tc/2)
# Since -waveform doesn't allow negative numbers, so we advance by Tc until we get a positive edge
create_clock -name adc_enc -period 300MHz -waveform { 3.000 4.666 }

# Identify ADC_CLKOUT as the input clock
# From datasheet, Tskew = 0.4ns
# So, rise edge occurs at -0.4ns before vclk rising edge
# Since -waveform doesn't allow negative numbers, advance by 360deg = Tc
create_clock -name adc_clkout -period 300MHz -waveform { 2.933 4.600 }

# Identify the DDR latch clock, which is a same edge transfer at the center of the data eye
# Want to latch data at center of data-valid period.
# Nominally, 360deg x (Tc/4 + Tskew)/Tc
create_generated_clock -name ddr_clk -source .gpll~PLL_OUTPUT_COUNTER|vco0ph}] .gpll~PLL_OUTPUT_COUNTER|divclk}] -phase 135 -add

derive_pll_clocks
derive_clock_uncertainty

# From datasheet, Tenc-to-data_valid_skew (i.e., ADC_ENC to ADC_D skew):
# min = 0.30ns
# typ = 0.40ns
# max = 0.55ns
# So, max delay = 0.55ns - 0.40ns = 0.15ns
# min delay = 0.30ns - 0.40ns = -0.10ns
set_input_delay -clock vclk -max 0.15
set_input_delay -clock vclk -min -0.10
set_input_delay -clock vclk -max 0.15 -clock_fall -add_delay
set_input_delay -clock vclk -min -0.10 -clock_fall -add_delay

# Add a false path statements for edges that should not be analyzed (this one is easiest to see with
# a timing diagram).
set_false_path -setup -rise_from -fall_to
set_false_path -setup -fall_from -rise_to
set_false_path -hold -rise_from -rise_to
set_false_path -hold -fall_from -fall_to

# Don't analyze some paths
set_false_path -from -to *
set_false_path -from -to *
set_false_path -from -to *
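As a cross-check on the numbers in the SDC comments, here is a minimal Python sketch that recomputes the -waveform edges and the input delays from the datasheet figures quoted in the post. The function names are made up for illustration. It only reproduces the arithmetic and says nothing about whether the constraints are the right ones. Note that rounding gives a fall edge of 4.667 for adc_enc where the SDC file has 4.666.

```python
# Sketch only: reproduces the arithmetic in the SDC comments (hypothetical helper names).
F_CLK_MHZ = 300.0
TC_NS = 1000.0 / F_CLK_MHZ  # clock period, about 3.333 ns


def wrap_positive(edge_ns, period_ns):
    """Advance an edge by whole periods until it is non-negative."""
    while edge_ns < 0:
        edge_ns += period_ns
    return edge_ns


def waveform(rise_ns, period_ns):
    """Return (rise, fall) for a 50% duty-cycle clock, rounded to 1 ps."""
    rise = wrap_positive(rise_ns, period_ns)
    fall = rise + period_ns / 2
    return (round(rise, 3), round(fall, 3))


# adc_enc: rising edge at -2.0 - Tc/2, as stated in the SDC comment.
adc_enc = waveform(-2.0 - TC_NS / 2, TC_NS)

# adc_clkout: rising edge 0.4 ns (Tskew) before the vclk rising edge.
adc_clkout = waveform(-0.4, TC_NS)

# Input delays relative to the typical ENC-to-data skew of 0.40 ns.
input_delay_max = round(0.55 - 0.40, 2)
input_delay_min = round(0.30 - 0.40, 2)

print("adc_enc    -waveform", adc_enc)
print("adc_clkout -waveform", adc_clkout)
print("input delay max/min ", input_delay_max, input_delay_min)
```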

 

Note that I'm using the "Explicit Shift" method discussed in Rysc's Source Synchronous Timing guide on the altera wiki (http://www.alterawiki.com/wiki/file:source_synchronous_timing.pdf), which has been very helpful. I feed the 300MHz adc_clkout signal (i.e., the parallel clock coming in with the data) to an altera_pll run in source synchronous mode, and apply a 135deg phase shift in order to generate ddr_clk, which clocks the ALTDDIO_IN registers. Note that 135deg phase shift is used instead of a simple 90deg phase shift because the ADC's datasheet says the data transition edges are 0.4ns after the adc_clkout edges, so 135deg is necessary to center the latching clock on the data eye.  
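The phase-shift reasoning can be checked with a short calculation. This is a sketch using the formula from the SDC comment, 360deg x (Tc/4 + Tskew)/Tc, with the 300MHz clock and 0.4ns skew quoted above; the function name is hypothetical. The nominal result is about 133.2deg, so the 135deg in the post is presumably a rounding to a convenient PLL setting.

```python
# Sketch only: nominal capture-clock phase shift for a DDR interface.
F_CLK_MHZ = 300.0   # adc_clkout frequency from the post
T_SKEW_NS = 0.4     # data transitions lag adc_clkout edges by this much


def centering_phase_deg(f_clk_mhz, t_skew_ns):
    """Phase shift that puts the latch edge at the center of the data eye."""
    tc_ns = 1000.0 / f_clk_mhz
    # Each DDR bit lasts Tc/2, so the eye center is Tc/4 after the data
    # transition, and the transition itself is t_skew_ns after the clock edge.
    shift_ns = tc_ns / 4 + t_skew_ns
    return 360.0 * shift_ns / tc_ns


phase = centering_phase_deg(F_CLK_MHZ, T_SKEW_NS)
print(f"nominal phase shift: {phase:.1f} deg")
```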

 

I need help understanding whether I'm doing the timing analysis incorrectly (n.b. this is the first time I've worked with a source synchronous I/F, and I'm worried I'm missing something obvious), or whether the Cyclone V IOE registers just aren't fast enough for 600Mbps. If the latter, could others provide suggestions on how to make the design meet timing? 

 

Any help is greatly appreciated!
5 Replies
Altera_Forum
Honored Contributor II
1,595 Views

I need to update that document. The DDIO are not getting faster in newer generations, and are actually probably slower. (The problem is that performance is dictated not by raw speed but by how much variation there is between paths, and the smaller geometries seem to have more of it.) That being said, all 28nm families have dedicated altlvds hardware to help overcome this, and it should run much faster. Cyclone V devices are spec'd in the Data Sheet at 875, 840 and 640 Mbps depending on speed grade. So instantiate the altlvds block.  

Also note that the timing is completely different with this hardware. I think the ssync user guide talks about this. It should be much easier, as most of the time people don't apply any constraints and just look at the Report RSKM (also explained in the Cyclone V Data Sheet).
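For a quick sense of the headroom, the sketch below compares the 600Mbps this interface needs against the three Data Sheet figures quoted above. It deliberately doesn't map the figures to particular speed grades, since the post doesn't say which is which.

```python
# Sketch only: required data rate vs. the altlvds limits quoted in this reply.
F_CLK_MHZ = 300
REQUIRED_MBPS = 2 * F_CLK_MHZ          # DDR: two bits per clock per pin
QUOTED_LIMITS_MBPS = [875, 840, 640]   # figures quoted from the Data Sheet

# Headroom in Mbps for each quoted limit (positive means the rate fits).
headroom_mbps = [limit - REQUIRED_MBPS for limit in QUOTED_LIMITS_MBPS]

# Bit period (unit interval) at the required rate, in ns.
unit_interval_ns = 1000.0 / REQUIRED_MBPS

for limit, margin in zip(QUOTED_LIMITS_MBPS, headroom_mbps):
    print(f"limit {limit} Mbps: headroom {margin} Mbps")
print(f"unit interval at {REQUIRED_MBPS} Mbps: {unit_interval_ns:.3f} ns")
```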
Altera_Forum
Honored Contributor II
1,595 Views

Thanks! I'm starting to look into the altlvds block now. 

 

Two questions: 

 

1.) How does the SERDES factor used with altlvds change the timing margin of the interface? My application on the Cyclone V SX only really needs a SERDES factor of 2, but I could split the datastream with a larger SERDES factor of, say, 4 or 8, if there is a timing advantage to larger SERDES factors. 

 

2.) Are the allowable physical pin locations for the altlvds interface more restrictive for the dedicated LVDS hardware than for the DDR hardware in the IOEs? I already have a daughter card with traces corresponding to pin locations I've chosen for this project, and would like to try to get this particular board to meet timing, if possible...
Altera_Forum
Honored Contributor II
1,595 Views

1) SERDES of 2 results in a DDIO implementation, so that won't work. You need to go with a SERDES of 4 or higher and you'll get the dedicated logic. (Are you running your Cyclone V core at 300MHz? Do-able, but not easy.) The SERDES of /2 limitation didn't use to be a limitation because the rates you need the real SERDES logic for were so high you would need to do a higher deserialization rate anyway. There's kind of a donut hole now where in theory you could be required to use SERDES, yet your fabric is fast enough it could handle /2 internally. It doesn't happen often, but I recommend just going down to /4 and immediately muxing it back up to /2 data rate. 

 

2) Yes, but I believe they're available on all dedicated LVDS receivers, so if your board is using the LVDS receivers, the SERDES logic should be there. (It can be a little more complicated than that, i.e. the SERDES clock can't drive LVDS receivers on opposite sides of the die, but your board probably isn't laid out like that anyway.) Throw it down in your design, put a deserialization of /4 (you don't even have to hook up the extra 2 bits for now) and see if it fits.
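To make the /4 suggestion concrete: with a deserialization factor of 4 at 600Mbps, each of the 7 data lanes hands the fabric 4 bits per 150MHz parallel clock, which is two 14-bit ADC samples per clock. The Python sketch below is a behavioral model of that regrouping, not HDL. The bit ordering (lane k carries bit 2k first, then bit 2k+1) and the assumption that the word boundary is already aligned to a sample boundary are both illustrative guesses, so check them against the ADC datasheet and the altlvds alignment settings. All function names are hypothetical.

```python
# Sketch only: behavioral model of regrouping /4 SERDES output into samples.
NUM_LANES = 7        # 7 DDR data pairs for a 14-bit sample
SERDES_FACTOR = 4    # bits per lane per parallel clock
LINE_RATE_MBPS = 600
PARALLEL_CLK_MHZ = LINE_RATE_MBPS / SERDES_FACTOR  # 150 MHz


def serialize_sample(sample):
    """Split a 14-bit sample into 7 lanes x 2 bits (ASSUMED order: even bit first)."""
    return [((sample >> (2 * k)) & 1, (sample >> (2 * k + 1)) & 1)
            for k in range(NUM_LANES)]


def deserialize_by_4(sample_a, sample_b):
    """Model one parallel clock: each lane yields 4 bits spanning two samples."""
    a = serialize_sample(sample_a)
    b = serialize_sample(sample_b)
    return [a[k] + b[k] for k in range(NUM_LANES)]  # tuple concatenation


def regroup(lanes):
    """Rebuild the two 14-bit samples from 7 lanes x 4 bits."""
    samples = []
    for half in range(SERDES_FACTOR // 2):
        word = 0
        for k, bits in enumerate(lanes):
            word |= bits[2 * half] << (2 * k)
            word |= bits[2 * half + 1] << (2 * k + 1)
        samples.append(word)
    return samples


lanes = deserialize_by_4(0x2A5B, 0x1234)
recovered = regroup(lanes)
print([hex(s) for s in recovered], "at", PARALLEL_CLK_MHZ, "MHz parallel clock")
```

If the fabric needs one sample per clock again, the two recovered samples would be muxed out alternately at 300MHz, which is the "muxing it back up to /2" step mentioned above.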
Altera_Forum
Honored Contributor II
1,595 Views

Great help, as always, Rysc. Thanks! 

 

 

--- Quote Start ---  

1) SERDES of 2 results in a DDIO implementation, so that won't work. You need to go with a SERDES of 4 or higher and you'll get the dedicated logic. (Are you running your Cyclone V core at 300MHz? Do-able, but not easy.) The SERDES of /2 limitation didn't used to be a limitation because the rates you need the real SERDES logic for were so high you would need to do a higher deserialization rate anyway. There's kind of a donut hole now where in theory you could be required to use SERDES, yet your fabric is fast enough it could handle /2 internally. It doesn't happen often, but I recommend just going down to /4 and immediately muxing it back up to /2 data rate. 

--- Quote End ---  

 

 

I'm actually already splitting my datastream by 4x a little further down the line, because I discovered early in the design process that only the simplest of logic designs could meet timing at 300MHz. I guess I'll just move that split right up to the output of the SERDES. 

 

 

--- Quote Start ---  

 

 

2) Yes, but I believe they're available on all dedicated LVDS receivers, so if your board is using the LVDS receivers, the SERDES logic should be there. (It can be a little more complicated than that, i.e. the SERDES clock can't drive LVDS receivers on opposite sides of the die, but your board probably isn't laid out like that anyway). Throw it down in your design, put a deserialization of /4(you don't even have to hook up the extra 2 bits for now) and see if it fits. 

--- Quote End ---  

 

 

Unfortunately, Quartus II gives me the following error with the SERDES for the current pin assignment: 

 

Error (175001): Could not place PLL LVDS output 

Info (175028): The PLL LVDS output name: DDR_PLL2:DDR_PLL2_0|DDR_PLL2_0002:ddr_pll2_inst|altera_pll:altera_pll_i|general[0].gpll~PLL_LVDS_OUTPUT 

Error (175006): Could not find path between the PLL LVDS output and destination pin 

Info (175027): Destination: pin ADC_D[1] 

Info (175015): The I/O pad is constrained to the location PIN_K12 due to: User Location Constraints (PIN_K12) 

 

I've attached a screenshot from the pin planner, where I've highlighted the pins used in the source synchronous interface. The ADC DDR inputs to the FPGA are the ones closer to the upper left corner (they use DIFFIO_RX_*_p and n), and the ssync clock is the differential pair closer to the bottom right (it uses CLK5p and n). I'm guessing the problem is that these are so far apart. 

 

For the current board revision, is there a way to get the clock up to the SERDES blocks for the data bus? The desire here is to debug the rest of the board/firmware as much as possible before the next board revision. 

 

For the next board revision, I'll still need the ssync clock to end up on one of the pins mapped to the HSMC port of the Terasic SoCKit I'm using. Would the best strategy be to try compiling with a few different pin locations for the ssync clock signal, and just see if Quartus can fit it? Or is there a better/faster/more sophisticated strategy that can be recommended? 

 

Thanks!
Altera_Forum
Honored Contributor II
1,595 Views

Yeah, that's the problem. To get the high speeds, a low-skew dedicated clock goes from the PLL to the I/O. It doesn't drive all I/O (but there are multiple PLLs that can do this, so as long as they're close to the I/O it should work). 

 

I was thinking of trying to cascade the PLLs or something like that, but to be honest, if this is just a test environment, going back to DDR and a design that fails timing (and probably pretty badly) will still probably work in hardware. Since the board's laid out, that's what I would try first.