Verilog code fails timing analysis for register within same clock domain

Altera_Forum
Honored Contributor II
2,988 Views

Hello-- 

 

I've written some code in Verilog to sample data from six ADCs. Some signals in the code must cross clock domains. I have used a synchronizer to ensure that the signals are effectively propagated. 

 

Essentially what I am doing is grabbing data from all of the ADCs at the same time, and then using a system of flags to offload the data to another module. The data that is being offloaded in the code below is "adc_data." 

 

However, the code fails slow classical timing analysis in Quartus II, although it passes fast classical timing analysis. 

 

I am using a 30MHz external clock, a 70MHz clock generated from the 30MHz by a PLL, and a 280MHz clock also generated by the same PLL. 

 

Strangely enough, the timing analysis is failing for a register that appears to be in the same clock domain: 

 

Info: Slack time is -2.175 ns for clock "my_pll:pll|altpll:altpll_component|_clk1" between source register "signal:my_signal|adc:my_adc|ccc" and destination register "signal:my_signal|adc:my_adc|adc_data"
Info: Fmax is 174.03 MHz (period= 5.746 ns)
Info: + Largest register to register requirement is 3.311 ns
Info: + Setup relationship between source and destination is 3.571 ns
Info: + Latch edge is 3.571 ns
Info: Clock period of Destination clock "my_pll:pll|altpll:altpll_component|_clk1" is 3.571 ns with offset of 0.000 ns and duty cycle of 50
Info: Clock offset from Destination is based on specified offset of 0.000 ns and phase shift of 0.000 degrees of the derived clock
Info: Multicycle Setup factor for Destination register is 1
Info: - Launch edge is 0.000 ns
Info: Clock period of Source clock "my_pll:pll|altpll:altpll_component|_clk1" is 3.571 ns with offset of 0.000 ns and duty cycle of 50
Info: Clock offset from Source is based on specified offset of 0.000 ns and phase shift of 0.000 degrees of the derived clock
Info: Multicycle Setup factor for Source register is 1
Info: + Largest clock skew is 0.004 ns
Info: + Shortest clock path from clock "my_pll:pll|altpll:altpll_component|_clk1" to destination register is 2.515 ns
Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = PLL_1; Fanout = 1; CLK Node = 'my_pll:pll|altpll:altpll_component|_clk1'
Info: 2: + IC(0.916 ns) + CELL(0.000 ns) = 0.916 ns; Loc. = CLKCTRL_G2; Fanout = 126; COMB Node = 'my_pll:pll|altpll:altpll_component|_clk1~clkctrl'
Info: 3: + IC(0.933 ns) + CELL(0.666 ns) = 2.515 ns; Loc. = LCFF_X14_Y4_N15; Fanout = 2; REG Node = 'signal:my_signal|adc:my_adc|adc_data'
Info: Total cell delay = 0.666 ns ( 26.48 % )
Info: Total interconnect delay = 1.849 ns ( 73.52 % )
Info: - Longest clock path from clock "my_pll:pll|altpll:altpll_component|_clk1" to source register is 2.511 ns
Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = PLL_1; Fanout = 1; CLK Node = 'my_pll:pll|altpll:altpll_component|_clk1'
Info: 2: + IC(0.916 ns) + CELL(0.000 ns) = 0.916 ns; Loc. = CLKCTRL_G2; Fanout = 126; COMB Node = 'my_pll:pll|altpll:altpll_component|_clk1~clkctrl'
Info: 3: + IC(0.929 ns) + CELL(0.666 ns) = 2.511 ns; Loc. = LCFF_X16_Y5_N3; Fanout = 26; REG Node = 'signal:my_signal|adc:my_adc|ccc'
Info: Total cell delay = 0.666 ns ( 26.52 % )
Info: Total interconnect delay = 1.845 ns ( 73.48 % )
Info: - Micro clock to output delay of source is 0.304 ns
Info: - Micro setup delay of destination is -0.040 ns
Info: - Longest register to register delay is 5.486 ns
Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = LCFF_X16_Y5_N3; Fanout = 26; REG Node = 'signal:my_signal|adc:my_adc|ccc'
Info: 2: + IC(0.474 ns) + CELL(0.623 ns) = 1.097 ns; Loc. = LCCOMB_X16_Y5_N20; Fanout = 26; COMB Node = 'signal:my_signal|adc:my_adc|Add2~3'
Info: 3: + IC(1.892 ns) + CELL(0.624 ns) = 3.613 ns; Loc. = LCCOMB_X15_Y3_N8; Fanout = 4; COMB Node = 'signal:my_signal|adc:my_adc|ShiftLeft0~30'
Info: 4: + IC(1.141 ns) + CELL(0.624 ns) = 5.378 ns; Loc. = LCCOMB_X14_Y4_N14; Fanout = 1; COMB Node = 'signal:my_signal|adc:my_adc|adc_data~141'
Info: 5: + IC(0.000 ns) + CELL(0.108 ns) = 5.486 ns; Loc. = LCFF_X14_Y4_N15; Fanout = 2; REG Node = 'signal:my_signal|adc:my_adc|adc_data'
Info: Total cell delay = 1.979 ns ( 36.07 % )
Info: Total interconnect delay = 3.507 ns ( 63.93 % )

What could I do to improve timing analysis? Here is the code for my module: 

 

module adc(
    // inputs
    rst, set, clk, hs_clk, vhs_clk, sc_enable, miso, rs_offload_flag,
    // outputs
    tr_adc, sclk_adc, adc_data, offload_flag
);

// Vector widths below are inferred from the constants used in the code
// (6 ADCs, 2-bit edge detectors, 18 samples of 6 bits each).

// Inputs
input rst;              // reset
input set;              // helps to ensure state
input clk;              // main clk for the system (30 MHz)
input hs_clk;           // High speed 70 MHz clock for ADCs
input vhs_clk;          // Very high speed clock for sampling SPI (280MHz)
input sc_enable;        // samples will occur on sample_clk
input [5:0] miso;       // six (6) inputs for each ADC
input rs_offload_flag;  // resets the offload_flag

// Outputs
output [5:0] tr_adc;    // used to trigger ADCs
output [5:0] sclk_adc;  // sclk line for the ADCs
// data from the ADCs
output [107:0] adc_data; // this is the freeze register
output offload_flag;     // Data offloaded from the ADCs

// Regs
reg [5:0] tr_adc;       // six (6) triggers for each ADC
reg sample_flag;        // flag to indicate the beginning of each sample
// ADC data register holds the output from the ADCs
reg [107:0] adc_data;
reg offload_flag = 0;
reg [4:0] ccc;          // ccc = 20, so this is the maximum number
reg [5:0] sclk_adc;

// Flags
reg collect_data = 0;

// Shift registers
reg [1:0] miso_sr;
reg [1:0] adc_sclk_sr;

// determine the falling edge of MISO
// note the use of the reduction OR | operator
// which is used to check if all of MISO from each
// ADC is low
always @(posedge vhs_clk) miso_sr <= { miso_sr[0], |miso };
wire falling_edge_miso = (miso_sr == 2'b10);

// determine the rising and falling edges of sclk_adc
always @(posedge vhs_clk) adc_sclk_sr <= { adc_sclk_sr[0], hs_clk };
wire rising_edge_sclk_adc = (adc_sclk_sr == 2'b01);
wire falling_edge_sclk_adc = (adc_sclk_sr == 2'b10);

// initiate the conversion by toggling tr_adc
always @(posedge clk) begin
    if( rst || !set ) begin
        sample_flag <= 1'b0;
        tr_adc <= 6'b0;
    end
    else begin
        if(sc_enable && !sample_flag) begin
            tr_adc <= 6'b111111;
            sample_flag <= 1'b1;
        end
        else begin
            tr_adc <= 6'b0;
        end
        if(!sc_enable) begin
            sample_flag <= 1'b0;
            tr_adc <= 6'b0;
        end
    end
end

// these signals are cascaded to ensure that we pass timing analysis
wire rloc_set;  // net driven by the first synchronizer's output
Signal_CrossDomain my_first_crossdomain(
    .clkA(clk), .SignalIn(set),
    .clkB(hs_clk), .SignalOut(rloc_set)
);

wire loc_set;
Signal_CrossDomain my_second_crossdomain(
    .clkA(hs_clk), .SignalIn(rloc_set),
    .clkB(vhs_clk), .SignalOut(loc_set)
);

wire loc_rs_offload_flag;
Signal_CrossDomain my_third_crossdomain(
    .clkA(hs_clk), .SignalIn(rs_offload_flag),
    .clkB(vhs_clk), .SignalOut(loc_rs_offload_flag)
);

// sample all of the ADCs at the same time
always @(posedge vhs_clk) begin
    if( (rst || !loc_set) ) begin
        collect_data <= 0;
    end
    else begin
        // wait until the falling edge of miso
        if(falling_edge_miso && !collect_data) begin
            collect_data <= 1;
        end
        if(collect_data) begin
            if(rising_edge_sclk_adc) begin
                sclk_adc <= 6'b111111;
                ccc <= ccc + 1'b1;
                if( ccc >= 1 )
                    adc_data <= (adc_data | (miso << (6 * (ccc-1) )));
            end
            if(falling_edge_sclk_adc) begin
                sclk_adc <= 6'b000000;
                if(ccc == 19) begin
                    collect_data <= 0;
                    offload_flag <= 1;
                end
            end
        end // collect_data
        // clear the offload_flag and the adc_data in preparation
        // for the next measurement
        if( loc_rs_offload_flag ) begin
            offload_flag <= 0;
            adc_data <= 0;
            ccc <= 0;
        end
    end //else
end // posedge vhs_clk

// ends the module
endmodule
11 Replies
Altera_Forum
Honored Contributor II
1,439 Views

The timing violation apparently has nothing to do with clock domain crossing. 

 

Simply put, the logic below doesn't run at 280 MHz. The timing analyzer calculates an Fmax of 174 MHz for it. 

 

--- Quote Start ---  

adc_data <= (adc_data | (miso << (6 * (ccc-1) ))); 

--- Quote End ---  
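
The 174 MHz figure can be reproduced from the numbers in the posted report. As a worked check using only values from that log (here $t_{co}$ and $t_{su}$ are shorthand for the "micro clock to output" and "micro setup" delays, and $t_{reg \to reg}$ for the longest register to register delay):

```latex
\begin{align*}
\text{requirement} &= T_{clk} + t_{skew} - t_{co} - t_{su}
                    = 3.571 + 0.004 - 0.304 - (-0.040) = 3.311\ \text{ns} \\
\text{slack} &= \text{requirement} - t_{reg \to reg}
              = 3.311 - 5.486 = -2.175\ \text{ns} \\
F_{max} &= \frac{1}{T_{clk} - \text{slack}}
         = \frac{1}{(3.571 + 2.175)\ \text{ns}}
         = \frac{1}{5.746\ \text{ns}} \approx 174.03\ \text{MHz}
\end{align*}
```

In other words, the combinational path from ccc to adc_data needs about 2.2 ns more than one 280 MHz period allows.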

 

 

Generally, when acquiring high-speed serial data, e.g. from an ADC with an LVDS interface, you should use a suitable design structure: run a simple, low-complexity deserializer at the fast (bit) clock and everything else at the slow (frame) clock. The Quartus altlvds receiver may be usable too, but the required word length may be unsupported. 

 

As another point, you can use double data rate (DDR) input registers to run the logic at half the bit clock; that's what all software LVDS receivers do. 

 

If you do need to run similar logic at the fast clock for some reason, you should not use variable shifts like << (6 * (ccc-1)). They infer a large multiplexer that takes multiple LEs for each destination bit, which is very slow.
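
A minimal sketch of what such a low-complexity fast-clock deserializer could look like, using a fixed shift instead of a variable one. The module name adc_deser, its ports, and the 108-bit width (18 samples of 6 bits) are assumptions for illustration, not code from the original design. It also assumes sample_en is a one-cycle pulse that is only asserted for valid samples, and that clear is asserted before the first frame.

```verilog
// Hypothetical fixed-shift deserializer (names and widths are assumptions).
module adc_deser (
    input  wire         clk,        // fast sampling clock
    input  wire         clear,      // synchronous clear before a new frame
    input  wire         sample_en,  // one-cycle pulse: capture miso now
    input  wire [5:0]   miso,       // one data line from each of six ADCs
    output reg  [107:0] data,       // 18 samples x 6 bits
    output reg          done        // high once 18 samples are captured
);
    reg [4:0] count;                // number of samples captured so far

    always @(posedge clk) begin
        if (clear) begin
            data  <= 108'd0;
            count <= 5'd0;
            done  <= 1'b0;
        end else if (sample_en && !done) begin
            // Each register bit has a single fixed source: the bit six
            // positions above it (or miso for the top six bits), so no
            // wide multiplexer is needed in front of the registers.
            data  <= {miso, data[107:6]};
            count <= count + 5'd1;
            if (count == 5'd17)     // this is the 18th sample
                done <= 1'b1;
        end
    end
endmodule
```

After 18 shifts the first sample ends up in data[5:0] and the last in data[107:102], which is the same layout the variable shift by 6 * (ccc-1) produces.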
Altera_Forum
Honored Contributor II
1,439 Views

Hello FvM-- 

 

Thank you so much for your response! The logic deserializer is a good idea, and it is also nice to know that the variable shifts are slow, so that they should not be used with such a fast clock. 

 

So thank you!
Altera_Forum
Honored Contributor II
1,439 Views

Besides looking at the timing analyzer results, it's always instructive to check the physical mapping in the netlist viewer.

Altera_Forum
Honored Contributor II
1,439 Views

Will do - it is kind of neat to see the physical mapping as compared to the code. 

 

How might it be possible to re-write the variable shift << (6 * (ccc-1)) so that the multiplexer is not inferred? Could concatenation {} be used here instead? 

 

 

Altera_Forum
Honored Contributor II
1,439 Views

What I did was replace the code containing the variable shifts with the following: 

 

if(ccc >= 1) begin
    // slice indices inferred from the original shift by 6 * (ccc-1)
    case(ccc)
        1:  adc_data[5:0]     <= miso;
        2:  adc_data[11:6]    <= miso;
        3:  adc_data[17:12]   <= miso;
        4:  adc_data[23:18]   <= miso;
        5:  adc_data[29:24]   <= miso;
        6:  adc_data[35:30]   <= miso;
        7:  adc_data[41:36]   <= miso;
        8:  adc_data[47:42]   <= miso;
        9:  adc_data[53:48]   <= miso;
        10: adc_data[59:54]   <= miso;
        11: adc_data[65:60]   <= miso;
        12: adc_data[71:66]   <= miso;
        13: adc_data[77:72]   <= miso;
        14: adc_data[83:78]   <= miso;
        15: adc_data[89:84]   <= miso;
        16: adc_data[95:90]   <= miso;
        17: adc_data[101:96]  <= miso;
        18: adc_data[107:102] <= miso;
    endcase
end

 

This fixes the problem! However, timing analysis now fails for another variable "collect_data": 

 

 

Info: Slack time is -1.363 ns for clock "my_pll:pll|altpll:altpll_component|_clk1" between source register "signal:my_signal|adc:my_adc|collect_data" and destination register "signal:my_signal|adc:my_adc|adc_data"
Info: Fmax is 202.68 MHz (period= 4.934 ns)
Info: + Largest register to register requirement is 3.283 ns
Info: + Setup relationship between source and destination is 3.571 ns
Info: + Latch edge is 3.571 ns
Info: Clock period of Destination clock "my_pll:pll|altpll:altpll_component|_clk1" is 3.571 ns with offset of 0.000 ns and duty cycle of 50
Info: Clock offset from Destination is based on specified offset of 0.000 ns and phase shift of 0.000 degrees of the derived clock
Info: Multicycle Setup factor for Destination register is 1
Info: - Launch edge is 0.000 ns
Info: Clock period of Source clock "my_pll:pll|altpll:altpll_component|_clk1" is 3.571 ns with offset of 0.000 ns and duty cycle of 50
Info: Clock offset from Source is based on specified offset of 0.000 ns and phase shift of 0.000 degrees of the derived clock
Info: Multicycle Setup factor for Source register is 1
Info: + Largest clock skew is -0.024 ns
Info: + Shortest clock path from clock "my_pll:pll|altpll:altpll_component|_clk1" to destination register is 2.452 ns
Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = PLL_1; Fanout = 1; CLK Node = 'my_pll:pll|altpll:altpll_component|_clk1'
Info: 2: + IC(0.916 ns) + CELL(0.000 ns) = 0.916 ns; Loc. = CLKCTRL_G2; Fanout = 126; COMB Node = 'my_pll:pll|altpll:altpll_component|_clk1~clkctrl'
Info: 3: + IC(0.870 ns) + CELL(0.666 ns) = 2.452 ns; Loc. = LCFF_X22_Y9_N21; Fanout = 1; REG Node = 'signal:my_signal|adc:my_adc|adc_data'
Info: Total cell delay = 0.666 ns ( 27.16 % )
Info: Total interconnect delay = 1.786 ns ( 72.84 % )
Info: - Longest clock path from clock "my_pll:pll|altpll:altpll_component|_clk1" to source register is 2.476 ns
Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = PLL_1; Fanout = 1; CLK Node = 'my_pll:pll|altpll:altpll_component|_clk1'
Info: 2: + IC(0.916 ns) + CELL(0.000 ns) = 0.916 ns; Loc. = CLKCTRL_G2; Fanout = 126; COMB Node = 'my_pll:pll|altpll:altpll_component|_clk1~clkctrl'
Info: 3: + IC(0.894 ns) + CELL(0.666 ns) = 2.476 ns; Loc. = LCFF_X21_Y12_N21; Fanout = 7; REG Node = 'signal:my_signal|adc:my_adc|collect_data'
Info: Total cell delay = 0.666 ns ( 26.90 % )
Info: Total interconnect delay = 1.810 ns ( 73.10 % )
Info: - Micro clock to output delay of source is 0.304 ns
Info: - Micro setup delay of destination is -0.040 ns
Info: - Longest register to register delay is 4.646 ns
Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = LCFF_X21_Y12_N21; Fanout = 7; REG Node = 'signal:my_signal|adc:my_adc|collect_data'
Info: 2: + IC(1.107 ns) + CELL(0.623 ns) = 1.730 ns; Loc. = LCCOMB_X23_Y12_N10; Fanout = 12; COMB Node = 'signal:my_signal|adc:my_adc|adc_data~4'
Info: 3: + IC(1.539 ns) + CELL(0.206 ns) = 3.475 ns; Loc. = LCCOMB_X22_Y9_N8; Fanout = 6; COMB Node = 'signal:my_signal|adc:my_adc|adc_data~9'
Info: 4: + IC(0.316 ns) + CELL(0.855 ns) = 4.646 ns; Loc. = LCFF_X22_Y9_N21; Fanout = 1; REG Node = 'signal:my_signal|adc:my_adc|adc_data'
Info: Total cell delay = 1.684 ns ( 36.25 % )
Info: Total interconnect delay = 2.962 ns ( 63.75 % )

What would I have to change to ensure that this code now passes timing analysis?
Altera_Forum
Honored Contributor II
1,439 Views

I think that I've now managed to fix this particular timing problem. Removing the "collect_data" signal from this block seems to make the timing problem go away: 

 

if( falling_edge_sclk_adc ) begin
    sclk_adc <= 6'b000000;
    if(ccc == 19) begin
        //collect_data <= 0;
        offload_flag <= 1;
    end
end

 

The following code is added at the bottom of the always block: 

 

 

if(offload_flag) collect_data <= 0;  

 

 

 

Now another timing problem arises between the "adc_sclk_sr" and "adc_data": 

 

 

Info: Slack time is -1.203 ns for clock "my_pll:pll|altpll:altpll_component|_clk1" between source register "signal:my_signal|adc:my_adc|adc_sclk_sr" and destination register "signal:my_signal|adc:my_adc|adc_data"
Info: Fmax is 209.47 MHz (period= 4.774 ns)
Info: + Largest register to register requirement is 3.321 ns
Info: + Setup relationship between source and destination is 3.571 ns
Info: + Latch edge is 3.571 ns
Info: Clock period of Destination clock "my_pll:pll|altpll:altpll_component|_clk1" is 3.571 ns with offset of 0.000 ns and duty cycle of 50
Info: Clock offset from Destination is based on specified offset of 0.000 ns and phase shift of 0.000 degrees of the derived clock
Info: Multicycle Setup factor for Destination register is 1
Info: - Launch edge is 0.000 ns
Info: Clock period of Source clock "my_pll:pll|altpll:altpll_component|_clk1" is 3.571 ns with offset of 0.000 ns and duty cycle of 50
Info: Clock offset from Source is based on specified offset of 0.000 ns and phase shift of 0.000 degrees of the derived clock
Info: Multicycle Setup factor for Source register is 1
Info: + Largest clock skew is 0.014 ns
Info: + Shortest clock path from clock "my_pll:pll|altpll:altpll_component|_clk1" to destination register is 2.486 ns
Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = PLL_1; Fanout = 1; CLK Node = 'my_pll:pll|altpll:altpll_component|_clk1'
Info: 2: + IC(0.916 ns) + CELL(0.000 ns) = 0.916 ns; Loc. = CLKCTRL_G2; Fanout = 126; COMB Node = 'my_pll:pll|altpll:altpll_component|_clk1~clkctrl'
Info: 3: + IC(0.904 ns) + CELL(0.666 ns) = 2.486 ns; Loc. = LCFF_X21_Y6_N13; Fanout = 1; REG Node = 'signal:my_signal|adc:my_adc|adc_data'
Info: Total cell delay = 0.666 ns ( 26.79 % )
Info: Total interconnect delay = 1.820 ns ( 73.21 % )
Info: - Longest clock path from clock "my_pll:pll|altpll:altpll_component|_clk1" to source register is 2.472 ns
Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = PLL_1; Fanout = 1; CLK Node = 'my_pll:pll|altpll:altpll_component|_clk1'
Info: 2: + IC(0.916 ns) + CELL(0.000 ns) = 0.916 ns; Loc. = CLKCTRL_G2; Fanout = 126; COMB Node = 'my_pll:pll|altpll:altpll_component|_clk1~clkctrl'
Info: 3: + IC(0.890 ns) + CELL(0.666 ns) = 2.472 ns; Loc. = LCFF_X24_Y8_N25; Fanout = 5; REG Node = 'signal:my_signal|adc:my_adc|adc_sclk_sr'
Info: Total cell delay = 0.666 ns ( 26.94 % )
Info: Total interconnect delay = 1.806 ns ( 73.06 % )
Info: - Micro clock to output delay of source is 0.304 ns
Info: - Micro setup delay of destination is -0.040 ns
Info: - Longest register to register delay is 4.524 ns
Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = LCFF_X24_Y8_N25; Fanout = 5; REG Node = 'signal:my_signal|adc:my_adc|adc_sclk_sr'
Info: 2: + IC(0.464 ns) + CELL(0.370 ns) = 0.834 ns; Loc. = LCCOMB_X24_Y8_N22; Fanout = 3; COMB Node = 'signal:my_signal|adc:my_adc|Equal1~0'
Info: 3: + IC(1.015 ns) + CELL(0.206 ns) = 2.055 ns; Loc. = LCCOMB_X22_Y8_N28; Fanout = 3; COMB Node = 'signal:my_signal|adc:my_adc|adc_data~1'
Info: 4: + IC(1.093 ns) + CELL(0.206 ns) = 3.354 ns; Loc. = LCCOMB_X21_Y6_N24; Fanout = 6; COMB Node = 'signal:my_signal|adc:my_adc|adc_data~20'
Info: 5: + IC(0.315 ns) + CELL(0.855 ns) = 4.524 ns; Loc. = LCFF_X21_Y6_N13; Fanout = 1; REG Node = 'signal:my_signal|adc:my_adc|adc_data'
Info: Total cell delay = 1.637 ns ( 36.18 % )
Info: Total interconnect delay = 2.887 ns ( 63.82 % )

 

 

What would I have to do to fix this particular timing analysis problem?
Altera_Forum
Honored Contributor II
1,439 Views

There now seem to be timing issues associated with two registers, "collect_data" and "ccc", both of which affect paths linked to "adc_data." 

 

Why are these registers failing timing analysis, and what can I do to fix this problem?
Altera_Forum
Honored Contributor II
1,439 Views

I wonder whether this particular code is to blame: 

 

if(ccc >= 1) begin
    // slice indices inferred from the original shift by 6 * (ccc-1)
    case(ccc)
        1:  adc_data[5:0]     <= miso;
        2:  adc_data[11:6]    <= miso;
        3:  adc_data[17:12]   <= miso;
        4:  adc_data[23:18]   <= miso;
        5:  adc_data[29:24]   <= miso;
        6:  adc_data[35:30]   <= miso;
        7:  adc_data[41:36]   <= miso;
        8:  adc_data[47:42]   <= miso;
        9:  adc_data[53:48]   <= miso;
        10: adc_data[59:54]   <= miso;
        11: adc_data[65:60]   <= miso;
        12: adc_data[71:66]   <= miso;
        13: adc_data[77:72]   <= miso;
        14: adc_data[83:78]   <= miso;
        15: adc_data[89:84]   <= miso;
        16: adc_data[95:90]   <= miso;
        17: adc_data[101:96]  <= miso;
        18: adc_data[107:102] <= miso;
    endcase
end

 

Is there a way to re-write this code using the Verilog concatenation operator?
Altera_Forum
Honored Contributor II
1,439 Views

I think a shift register for adc_data would involve less complex logic. I also wonder whether part of the processing could be performed at the slow (70 MHz) clock.
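
One way the slow-clock part could be sketched: the fast domain only collects the word and raises a level flag, and the slow domain synchronizes that flag and copies the word. The module name adc_offload and all port names are hypothetical, and the sketch assumes the fast domain holds data_fast constant for as long as done_fast is high (similar in spirit to the offload_flag handshake in the posted code). The multi-bit data path relies on that stability and would still need appropriate timing constraints.

```verilog
// Hypothetical slow-clock capture stage (names and widths are assumptions).
module adc_offload (
    input  wire         slow_clk,   // e.g. the 70 MHz clock
    input  wire         done_fast,  // level flag from the fast clock domain
    input  wire [107:0] data_fast,  // held constant while done_fast is high
    output reg  [107:0] data_slow,  // copy of the word in the slow domain
    output reg          valid       // one slow_clk cycle pulse per new word
);
    // Bits [1:0] form a two-flop synchronizer for done_fast;
    // bit [2] is the previous synchronized value, used for edge detection.
    reg [2:0] done_sync = 3'b000;

    always @(posedge slow_clk) begin
        done_sync <= {done_sync[1:0], done_fast};
        valid     <= 1'b0;
        // Rising edge of the synchronized flag: the word has been stable
        // for at least two slow_clk edges, so it can be copied.
        if (done_sync[1] && !done_sync[2]) begin
            data_slow <= data_fast;
            valid     <= 1'b1;
        end
    end
endmodule
```

With this split, only the sample capture itself has to meet the fast clock period; the bookkeeping that consumes the word runs with the much longer slow clock period.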

Altera_Forum
Honored Contributor II
1,439 Views

Yes, you are completely right, FvM! Reducing the processing speed of some of the logic does indeed help. I changed the PLL output that generates "vhs_clk": rather than running "vhs_clk" at 4x the 70MHz clock, I am now running it at 140MHz. I don't know whether this is fast enough to sample the SPI bus running at 70MHz, but the code does pass both slow and fast classical timing analysis. 

 

Thank you once again for your help!
Altera_Forum
Honored Contributor II
1,439 Views

For my specific application, I had to modify things a little bit more before I could reliably read from SPI. 

 

I changed the PLL so that the SPI ADCs were clocked at 60MHz and the SPI MISO bus was sampled at 180MHz, which is 3x the 60MHz clock. Running the design with these clocks allowed timing to be met. 
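
As a rough consistency check against the reports posted earlier in the thread (a slow-model Fmax of 174.03 MHz for the original code, then 202.68 MHz and 209.47 MHz after the rewrites), both new sampling clocks sit below the reported limits. This only compares the posted numbers; Fmax changes with every fit, so it is an illustration rather than a guarantee.

```latex
\begin{align*}
T_{140} &= \frac{1}{140\ \text{MHz}} \approx 7.143\ \text{ns}, &
T_{180} &= \frac{1}{180\ \text{MHz}} \approx 5.556\ \text{ns} \\
140\ \text{MHz} &< 174.03\ \text{MHz}, &
180\ \text{MHz} &< 202.68\ \text{MHz} < 209.47\ \text{MHz}
\end{align*}
```

For comparison, the original 280 MHz clock has a period of 3.571 ns, shorter than every longest register to register delay reported above (5.486 ns, 4.646 ns, and 4.524 ns).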

 

Once again, thank you for your help, FvM!