
How well does the FPGA adjust delays based on the .sdc?

paw_93
Novice

Hi,


I have a question regarding the way the .sdc timing constraints affect the design. Imagine a source synchronous design (in both directions): the FPGA drives a flip-flop on the ASIC side, and the ASIC then propagates both data and clock back. A simple and standard situation.


I see two ways to handle it.

1) Send data on the falling edge and receive/latch it with the rising edge on the ASIC side (so that the data is sampled in the middle), then send data back to the FPGA on the rising edge and sample it there on the falling edge. Then we constrain the design, even with just a virtual clock, and we more or less know that the FPGA will receive the data at the proper moment and sample it in the middle of the window.

 

 

create_clock -name clkin -period 100MHz [get_ports {clkin}] -waveform {0ns 5ns}
set_clock_uncertainty -from clkin 0.1
set_input_delay -max -clock clkin [expr {$someCalcMax1}] [get_ports {datain}]
set_input_delay -min -clock clkin [expr {$someCalcMin1}] [get_ports {datain}]
set_output_delay -max -clock clkout -clock_fall [expr {$someCalcMax2}] [get_ports {dataout}]
set_output_delay -min -clock clkout -clock_fall [expr {$someCalcMin2}] [get_ports {dataout}]

 

 

2) Second way: we always latch on the rising edge, on both the FPGA and the ASIC side, but we need to constrain it properly. Constraining the source synchronous interfaces from FPGA to ASIC and from ASIC to FPGA separately might not be enough if the paths on the PCB/ASIC are not well balanced (so that the data does not arrive at the same moment the clock changes, or there is a clock skew/slew-rate problem). So we can add a dependency between the FPGA-to-ASIC clock (clkout) and clkin: jitter, delay, and then latency for the data path.

2.1) Full-path constraints

 

 

set interface_clk {*|altera_iopll_i|*_pll|outclk[0]}

# Clock delay on PCB trace:
set T_IF_CLK_PCBmin 0.3
set T_IF_CLK_PCBmax 0.8

# Data delays on PCB trace
set T_IF_DTA_PCBmin 0.2
set T_IF_DTA_PCBmax 0.5

create_generated_clock [get_ports {clkout}] -name OUT_IF_CLK -divide_by 1 -source $interface_clk
# clkin is a clock derived from OUT_IF_CLK, fed back by the ASIC
create_generated_clock [get_ports {clkin}] -name IN_IF_CLK -divide_by 1 -source [get_ports {clkout}]

# Add the PCB trace latency on top of the clkout latency; this makes up the clkin source latency
set_clock_latency -source -late $T_IF_CLK_PCBmax [get_clocks IN_IF_CLK]
set_clock_latency -source -early $T_IF_CLK_PCBmin [get_clocks IN_IF_CLK]

set_clock_uncertainty -from [get_clocks IN_IF_CLK] 0.01

set_input_delay -max -clock OUT_IF_CLK [expr {$T_IF_COmax + $T_IF_DTA_PCBmax + $T_IF_CLK_PCBmax}] [get_ports {datain}]
set_input_delay -min -clock OUT_IF_CLK [expr {$T_IF_COmin + $T_IF_DTA_PCBmin + $T_IF_CLK_PCBmin}] [get_ports {datain}]

 

 

Let me know if there is something wrong with these constraints.

2.2) Constrain only the input interface (cutting the ties to the FPGA as the source of the internal ASIC clock)

 

 

create_clock -name clkin -period 100MHz [get_ports {clkin}] -waveform {0ns 5ns}
set_clock_uncertainty -from clkin 0.1
set_input_delay -max -clock clkin [expr {$someCalcMax1}] [get_ports {datain}]
set_input_delay -min -clock clkin [expr {$someCalcMin1}] [get_ports {datain}]
set_output_delay -max -clock clkout [expr {$someCalcMax2}] [get_ports {dataout}]
set_output_delay -min -clock clkout [expr {$someCalcMin2}] [get_ports {dataout}]

 

 

The questions are:

1) If we take the second way, will the FPGA compensate for the needed latencies (even if we use fast input/fast output registers)? How? If the PCB traces are badly aligned or the clock jitter is very high, will the FPGA development tools try to compensate/optimise?

2) Since I had to struggle with the first approach: what are the pros of it (I only see the latency; it is a half-cycle transfer, so it may affect the timing negatively)? Is it a better way to constrain when we don't know the details about the latencies in the external device? Does it really enforce sampling in the middle of the data (in combination with VHDL, where proper latching is ensured by the falling_edge function)?

3) Are fast input/fast output registers subject to any timing adjustment? The Intel documentation says "to lock the input register in the LAB adjacent to the I/O cell feeding it". Should I understand that it will not add delay if needed?

4) In the case of the second approach, should I disable the fast input/fast output register assignments for better alignment?

5) In Cyclone V one could control the delay (in ps) for many pins; in Cyclone 10 this is no longer possible dynamically, and one has to use assignments and resynthesize. So there used to be a way to fine-tune the interfaces, and now it is not as easy. The question that comes to my mind: should we think of timing constraints as something that will do the job so that the programmable delay is no longer needed? Do they force the Fitter to do what is needed (e.g. by adding longer routing inside the FPGA), or are they just a recommendation, so that if a given placement produces a path that violates timing we need to read the report and intervene?

 

 

[Attached diagram: syncinterf.png]

 

The frequency is not very high, and the signal is single-ended.

1 Solution
7 Replies
sstrell
Honored Contributor III

To paraphrase, "There's an app [note] for that" as well as a training:

https://www.intel.com/content/www/us/en/content-details/653688/an-433-constraining-and-analyzing-source-synchronous-interfaces.html

https://learning.intel.com/developer/learn/courses/168/constraining-source-synchronous-interfaces

Answering your questions:

1) Clock latency constraints and a generated clock for the input clock are not needed. Just use create_clock for the virtual clock driven by the external device (launch edge) and for the clock arriving at the FPGA (latch edge), either center-aligned or edge-aligned using the -waveform option of create_clock. set_input_delay will refer to the virtual clock to select the launch clock, and the calculated max/min values will be based on info from the external device (tco, setup/hold requirements, or skew). If tco is used, trace delay differences can be included in the calculation (see the sketch after this list).

2) Method 1 is not really correct for constraining an SS interface.

3) A better idea would be to use the DDIO registers. Even if this is single data rate, you can tie the low data input to the high data input. The DDIO registers provide matching delays for the clock and data, for both inputs and outputs.

4) Not relevant if DDIO registers are used.

5) Timing constraints guide the Fitter to meet the timing requirements you specify.  Fine tuning may be possible in hardware for particular devices as you note, but it's not normally required.  The constraints handle everything.
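
To make 1) a bit more concrete, here is a minimal sketch of the virtual-clock style of input constraints. It is not taken from the app note verbatim: the 10 ns period, the 2.0/0.5 ns values and the name virt_clk_ext are placeholder assumptions, and clkin/datain reuse the port names from the question.

# Virtual clock modelling the launch clock inside the external device (assumed 100 MHz)
create_clock -name virt_clk_ext -period 10.000

# Clock as it arrives at the FPGA clock input pin; the shifted waveform models a
# center-aligned relationship (rising edge in the middle of the data valid window)
create_clock -name clkin -period 10.000 -waveform {2.5 7.5} [get_ports {clkin}]

# Input delays referenced to the virtual launch clock; 2.0/0.5 stand in for the
# external device tco max/min plus the board trace delay difference
set_input_delay -clock virt_clk_ext -max 2.0 [get_ports {datain}]
set_input_delay -clock virt_clk_ext -min 0.5 [get_ports {datain}]

The output direction would be handled the same way, with set_output_delay referenced to a virtual clock that represents the latch clock inside the external device.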

RichardTanSY_Intel

I believe sstrell has answered your inquiries.

Do you need any further assistance with regard to this case?


Regards,

Richard Tan


paw_93
Novice

Hi Richard and sstrell,

 

Thank you for your answers. I was just trying to understand answer 2). Why is the first way incorrect? Is it incorrect in the sense that the .sdc constraints are wrong, or is the general approach with the rising-to-falling transfer wrong? In the end we have two source synchronous interfaces. I know I forgot to add the declaration of the virtual clock for clkin.

Is 1) actually an answer to the above question? Do I understand correctly that the proposed center alignment information added in the clock declaration (-waveform) should eliminate the -clock_fall attribute? In the end, saying the data is aligned to the falling edge is synonymous with saying it is center-aligned to the rising edge.
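
To sketch what I mean (the 10 ns period and the 1.0 ns value are only placeholders, and the two variants are alternatives, not meant to be applied together), I would expect these two ways of constraining dataout to describe the same launch/latch edges:

# Variant A: reference the falling edge of clkout explicitly
set_output_delay -clock clkout -clock_fall -max 1.0 [get_ports {dataout}]

# Variant B: reference a virtual clock whose rising edge sits where the falling
# edge of clkout is (waveform shifted by half a period), with no -clock_fall
create_clock -name virt_clkout_shifted -period 10.000 -waveform {5.0 10.0}
set_output_delay -clock virt_clkout_shifted -max 1.0 [get_ports {dataout}]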

RichardTanSY_Intel

Perhaps sstrell could help to clarify more on your first question.

Anyhow, we use one of two methods for deriving the input and output constraints for a source synchronous interface.
From what I see in the diagram, you may consider using the system-centric method.

[Screenshot: RichardTanSY_Intel_0-1715040695043.png]

By specifying the appropriate clock waveform using the -waveform option, you can effectively align the clock edges to accommodate the data transfer requirements. Whether you choose center alignment or edge alignment depends on your specific design needs.
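
For illustration only (the 10 ns period is an assumption, and you would keep just one of the two definitions), the choice shows up in how the clock arriving at the FPGA pin is declared:

# Edge-aligned: clock and data edges coincide at the FPGA pin
create_clock -name clkin -period 10.000 -waveform {0.0 5.0} [get_ports {clkin}]

# Center-aligned: the clock waveform is shifted by a quarter period so the
# rising edge lands in the middle of the data valid window
create_clock -name clkin -period 10.000 -waveform {2.5 7.5} [get_ports {clkin}]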

You may check out the "Clock and Data Relationship" section in AN433 and the screenshot below.

[Screenshot: RichardTanSY_Intel_1-1715041791484.png]

 

Regards,

Richard Tan

 

RichardTanSY_Intel

Dropping a note to ask if my last reply was helpful to you?

Do you need any further assistance from my side? 


Regards,

Richard Tan


RichardTanSY_Intel

We noticed that we haven't received a response from you regarding the previous question/reply/answer, and we will now be transitioning your inquiry to our community support. We apologize for any inconvenience this may cause and appreciate your understanding.

If you have any further questions or concerns, please don't hesitate to let us know. 

Thank you for reaching out to us!


Best Regards,

Richard Tan


paw_93
Novice

Hi Richard,

 

Thank you for your answer. Yes, it was helpful. From that I conclude that Intel prefers to use the -waveform option instead of -clock_fall in these specific scenarios.

 

Have a nice day!

 
