This page is for users who see poor timing closure on Cyclone V, Arria V, and Stratix V DDR3 interfaces related to half-rate core-to-periphery (c2p) registers, DDR address command, and DDR DQS vs CK. All of these violations appear when running Report DDR in TimeQuest or when viewing the Timing Analyzer results in Quartus. The violations prevent the interface from reaching the highest specified DDR3 rate shown in the EMIF Spec Estimator and the FPGA family datasheet.
External Memory Interface (EMIF) Spec Estimator - Intel
The violations cannot be controlled from user RTL. Achieving timing closure requires manual placement of the half-rate registers and post-fit ECO changes to the D5 delay value.
This article will list common violations and provide solutions to meet expected DDR3 EMIF interface timing.
Examples of timing violations
*ureset_n_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFHIO
*ureset_n_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFLO
*uras_n_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFHIO
*uras_n_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFLO
*ucas_n_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFHIO
*ucas_n_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFLO
*ubank_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFHIO
*ubank_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFLO
*ucs_n_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFHIO
*ucs_n_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFLO
*uodt_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFHIO
*uodt_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFLO
*ucke_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFHIO
*ucke_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFLO
*uwe_n_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFHIO
*uwe_n_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFLO
*uaddress_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFHIO
*uaddress_qr_to_hr*dataout_r[*][*] to ddio_outa[*]~DFFLO
These paths run from the half-rate clock, pll_hr_clk, to the phase-shifted half-rate clock, pll_addr_cmd_clk:
*|pll0|pll_hr_clk to *|pll0|pll_addr_cmd_clk
DDR address command negative setup or hold slack that may look similar to the following:
DDR DQS vs CK negative setup or hold slack that may look similar to the following:
Quartus interpretation of *dataout_r[*][*] to ddio_outa[*]~DFFHIO register timing relationship
It seems that Quartus does not properly recognize the phase relationship between the half-rate clock, pll_hr_clk, and the phase-shifted half-rate clock, pll_addr_cmd_clk. Even though the clocks are synchronous, Quartus does not appear to make enough effort to place the source dataout_r registers close enough to the periphery IO register.
Quartus interpretation of Address Command and DQS vs CK timing
It seems that Quartus does not choose the ideal D5 delay value within the IO cell, so the balance between setup and hold is not managed properly. In most cases, a setup violation comes with enough hold slack to reduce the D5 delay, and a hold violation comes with enough setup slack to increase the D5 delay.
First, the phase relationship between the half-rate clock, pll_hr_clk, and the phase-shifted half-rate clock, pll_addr_cmd_clk, makes it very challenging to meet timing. By default, the DDR3 IP generates pll_addr_cmd_clk as a half-rate clock with a 225 degree phase shift. pll_hr_clk is not phase shifted, so the maximum setup window from pll_hr_clk to pll_addr_cmd_clk is roughly 5/8 of the half-rate clock period. For a 250 MHz half-rate clock, the maximum setup is roughly (1/250 MHz * 5/8) = 2.5 ns. To gain more setup time for the long interconnect (IC) delay, we can also phase shift pll_hr_clk. In many cases a 315 degree phase shift is appropriate: it launches the data 1/8 of a period earlier relative to the capturing edge, which adds another 1/8 of a period to the maximum setup, or another 500 ps at 250 MHz (3.0 ns in total). However, the DDR3 IP generation has no option to change the phase shift of pll_hr_clk. The changes need to be made manually in two of the IP-generated files.
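The setup-window arithmetic above can be checked with a short calculation. This is only a sketch of the ideal launch-to-latch window between two clocks of the same frequency. It ignores clock uncertainty, skew, and register micro-parameters, so the numbers Report DDR shows will be smaller. The function name setup_window_ns is made up for this illustration.

```python
def setup_window_ns(f_mhz, launch_phase_deg, latch_phase_deg):
    """Ideal launch-to-latch window in ns for two same-frequency clocks."""
    period_ns = 1000.0 / f_mhz
    # Phase distance from the launch edge to the next latch edge, in (0, 360].
    delta_deg = (latch_phase_deg - launch_phase_deg) % 360.0
    if delta_deg == 0.0:
        delta_deg = 360.0
    return period_ns * delta_deg / 360.0


# 250 MHz half-rate clock, pll_addr_cmd_clk at the default 225 degrees.
default_ns = setup_window_ns(250.0, 0.0, 225.0)    # pll_hr_clk unshifted
shifted_ns = setup_window_ns(250.0, 315.0, 225.0)  # pll_hr_clk shifted 315 degrees

print(f"default window: {default_ns:.3f} ns")
print(f"shifted window: {shifted_ns:.3f} ns")
print(f"gain:           {shifted_ns - default_ns:.3f} ns")
```

The same function can be used to try other half-rate frequencies or phase shifts before editing the generated files.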
<DDR3_IP_name>_p0_parameters.tcl
<DDR3_IP_name>_pll0.sv
Open <DDR3_IP_name>_p0_parameters.tcl and update the p0_pll_phase for the following two lines:
set ::GLOBAL_<DDR3_IP_name>_p0_pll_phase(7) 0.0
set ::GLOBAL_<DDR3_IP_name>_p0_pll_phase(PLL_HR_CLK) 0.0
Change to the following for a 315 degree phase shift.
set ::GLOBAL_<DDR3_IP_name>_p0_pll_phase(7) 315.0
set ::GLOBAL_<DDR3_IP_name>_p0_pll_phase(PLL_HR_CLK) 315.0
Open <DDR3_IP_name>_pll0.sv and update the HR_CLK_PHASE parameter on the following line:
parameter HR_CLK_PHASE = "0 ps";
Update it to reflect a 315 degree phase shift, expressed in ps. For a 250 MHz half-rate clock, that is (1/250 MHz * 7/8) = 3500 ps.
parameter HR_CLK_PHASE = "3500 ps";
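For other half-rate frequencies, the degrees-to-picoseconds conversion can be sketched as below. The helper names are hypothetical. The only thing taken from the IP file is the "<value> ps" string format of HR_CLK_PHASE shown above.

```python
def phase_deg_to_ps(f_mhz, phase_deg):
    """Convert a phase shift in degrees to a delay in ps at f_mhz."""
    period_ps = 1e6 / f_mhz
    return round(period_ps * (phase_deg % 360.0) / 360.0)


def hr_clk_phase_param(f_mhz, phase_deg):
    """Format the delay the way the HR_CLK_PHASE parameter is written."""
    return f'"{phase_deg_to_ps(f_mhz, phase_deg)} ps"'


# 315 degrees at a 250 MHz half-rate clock.
print(hr_clk_phase_param(250.0, 315.0))
```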
If you need to re-generate the DDR3 IP, the above files will be overwritten and you will need to manually make the changes again.
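Because the edits are lost on every regeneration, it can help to script them. The following is a minimal sketch, not an Intel-provided tool: it assumes the two lines in the parameters file and the HR_CLK_PHASE line look exactly like the ones quoted above, and the function names are invented for this example. It works on the file contents as text, so reading and writing the actual files (and keeping a backup) is left to you.

```python
import re


def patch_parameters_tcl(text, phase_deg):
    """Set the pll_phase entries for index 7 and PLL_HR_CLK to phase_deg."""
    pattern = re.compile(
        r"^(\s*set\s+::GLOBAL_\w+_p0_pll_phase\((?:7|PLL_HR_CLK)\)\s+)\S+",
        re.MULTILINE,
    )
    return pattern.sub(lambda m: f"{m.group(1)}{phase_deg:.1f}", text)


def patch_pll0_sv(text, phase_ps):
    """Set the HR_CLK_PHASE parameter string to '<phase_ps> ps'."""
    pattern = re.compile(r'(parameter\s+HR_CLK_PHASE\s*=\s*)"[^"]*"')
    return pattern.sub(lambda m: f'{m.group(1)}"{phase_ps} ps"', text)


# Sample input; "my_ddr3" is a placeholder IP name for this demonstration.
tcl_before = (
    "set ::GLOBAL_my_ddr3_p0_pll_phase(6) 225.0\n"
    "set ::GLOBAL_my_ddr3_p0_pll_phase(7) 0.0\n"
    "set ::GLOBAL_my_ddr3_p0_pll_phase(PLL_HR_CLK) 0.0\n"
)
sv_before = 'parameter HR_CLK_PHASE = "0 ps";\n'

print(patch_parameters_tcl(tcl_before, 315.0))
print(patch_pll0_sv(sv_before, 3500))
```

Entries for other PLL outputs are left untouched, so the pll_addr_cmd_clk phase is not changed by the script.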
The above change may be enough to meet timing on the core *dataout_r* to periphery IO registers.
If Quartus still cannot close timing, you may need to use Logic Lock regions to place the *dataout_r* registers as close as possible to the periphery IO register.
Use Quartus Logic Lock to lock down the failing *dataout_r[*][*] registers, or all of them, as close as possible to the EMIF pin within the periphery. The process can be painful, as you may need to lock down each register individually. In many cases, multiple *uaddress_qr_to_hr*dataout_r[*][*] and control registers can be placed with the same Logic Lock region. Here is an example of locking down specific registers close to the periphery at the top of an Arria V.
Showing the routing from the locked down registers to the periphery IO registers:
Here is an example of similar lock downs at the bottom of an Arria V:
Showing the routing from the locked down registers to the periphery IO registers:
Please see the attached DDR3_qr_to_hr_dataout_r_Logic_Lock.tcl file for an example of locking down core dataout_r[*][*] registers close to the destination periphery IO registers.
The solution for Address Command and DQS vs. CK timing violations is a post-compile ECO edit of the D5 delay, followed by an ECO compile.
The post compile ECO will be changing the D5 delay value. Please refer to the forum article
Changing the D5_Delay value in Quartus post fit to help meet timing on external interfaces
Here are the rules to determine if you should be incrementing or decrementing the D5 delay setting:
If DQS vs. CK for ddr3 IP has negative setup slack in slow corner, then decrease D5_DELAY of ddr3_dqs_io[3:0] and ddr3_ndqs_io[3:0] by 1.
If DQS vs. CK for ddr3 IP has negative hold slack in fast corner, then increase D5_DELAY of ddr3_dqs_io[3:0] and ddr3_ndqs_io[3:0] by 1.
If Address Command for ddr3 IP has negative setup slack in slow corner, then increase D5_DELAY of ddr3_clk_o[0] and ddr3_nclk_o[0] by 1.
If Address Command for ddr3 IP has negative hold slack in fast corner, then decrease D5_DELAY of ddr3_clk_o[0] and ddr3_nclk_o[0] by 1.
If Address Command for ddr3 IP has negative hold slack in slow corner, then decrease D5_DELAY of ddr3_clk_o[0] and ddr3_nclk_o[0] by 1.
If DQS vs. CK for ddr3 IP has negative hold slack in slow corner, then increase D5_DELAY of ddr3_dqs_io[3:0] and ddr3_ndqs_io[3:0] by 1.
After each change, check the Report DDR timing report to confirm that the negative slack is being absorbed on the other side. For example, if there is a large negative setup slack, the D5_DELAY change should reduce or remove the setup violation and reduce the margin on the hold side. Continue to adjust the D5 delay on individual pins until all of them meet timing in Report DDR.
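Before iterating, it can be useful to check whether the opposite side has enough margin to absorb the violation at all. The sketch below assumes that one D5 step moves a fixed amount of slack from one side to the other. The step size step_ps is a hypothetical input, not a datasheet value, so take the real per-step change from the difference between two Report DDR runs. The function name and return format are also made up for this illustration.

```python
import math


def d5_steps_needed(setup_slack_ps, hold_slack_ps, step_ps):
    """Return (side_to_fix, steps), or None if shifting alone cannot fix it.

    Assumes each D5 step moves step_ps of slack from one side to the other.
    """
    if setup_slack_ps >= 0 and hold_slack_ps >= 0:
        return ("none", 0)
    if setup_slack_ps + hold_slack_ps < 0:
        return None  # the total window is too small; shifting cannot help
    if setup_slack_ps < 0:
        side, deficit, donor = "setup", -setup_slack_ps, hold_slack_ps
    else:
        side, deficit, donor = "hold", -hold_slack_ps, setup_slack_ps
    steps = math.ceil(deficit / step_ps)
    if steps * step_ps > donor:
        return None  # step granularity too coarse to land inside the window
    return (side, steps)


# Slack values in ps as read from Report DDR; 25 ps per step is illustrative.
print(d5_steps_needed(-60, 200, 25))
print(d5_steps_needed(100, -30, 25))
print(d5_steps_needed(-60, 40, 25))
```

A result of None means the violation cannot be traded against the other side, and the fix has to come from elsewhere, such as the placement changes described earlier.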