Reduce Clock Setup and Clock Hold times

Altera_Forum
Honored Contributor II
2,261 views

Hi there, 

I have a state machine which performs some math operations. The question is: how can I reduce the clock setup and hold times so that I can meet my timing requirements? Here is the code of the state machine:

module math_synthesis (
    input clk20M,
    input a, b,
    output reg result_out
);

    //Math registers
    reg dataa_mult, datab_mult, dataa_add, datab_add, dataa_sub, datab_sub, denom_sig, numer_sig;
    wire quotient_sig, remain_sig, result_add, result_sub;
    wire result_mult;

    //State machine registers:
    reg currentState;

    //Registers initial:
    initial begin
        currentState = 4'd0;
    end

    //States:
    parameter STATE_1 = 4'd0,
              STATE_2 = 4'd1,
              STATE_3 = 4'd2,
              STATE_4 = 4'd3,
              STATE_5 = 4'd4,
              STATE_6 = 4'd5,
              STATE_7 = 4'd6,
              STATE_8 = 4'd7;

    always @(posedge clk20M) begin
        case(currentState)
            STATE_1: begin
                dataa_mult <= a;
                datab_mult <= 16'd977;
                currentState <= STATE_2;
            end
            STATE_2: begin
                numer_sig <= result_mult;
                denom_sig <= 16'd1000;
                currentState <= STATE_3;
            end
            STATE_3: begin
                dataa_mult <= quotient_sig;
                datab_mult <= 16'd256;
                currentState <= STATE_4;
            end
            STATE_4: begin
                dataa_sub <= result_mult;
                datab_sub <= 16'd25600;
                currentState <= STATE_5;
            end
            STATE_5: begin
                numer_sig <= result_sub;
                denom_sig <= 16'd100;
                currentState <= STATE_6;
            end
            STATE_6: begin
                result_out <= quotient_sig;
                currentState <= STATE_1;
            end
        endcase
    end

    math_mult math_mult_inst (
        .dataa ( dataa_mult ),
        .datab ( datab_mult ),
        .result ( result_mult )
    );

    math_divide math_divide_inst (
        .denom ( denom_sig ),
        .numer ( numer_sig ),
        .quotient ( quotient_sig ),
        .remain ( remain_sig )
    );

    math_add math_add_inst (
        .dataa ( dataa_add ),
        .datab ( datab_add ),
        .result ( result_add )
    );

    math_sub math_sub_inst (
        .dataa ( dataa_sub ),
        .datab ( datab_sub ),
        .result ( result_sub )
    );

endmodule

 

I attached the clock setup and hold times. 

If there is any technique which I could use to improve the timing, please share it :)
9 Replies
Altera_Forum

Hi,

What is your required clock speed? How do you generate the clock (PLL?)

Kind regards,
GPK
Altera_Forum

Hi,

My clock speed is 20 MHz, but if I want to do more complicated calculations or change the signal widths, it doesn't fit. The clock is generated by an external oscillator.

Best regards,
VT
Altera_Forum

Hi VT,

Did you write your arithmetic functions yourself?

Kind regards,
GPK
Altera_Forum

The math functions are asynchronous, generated by Altera's MegaWizard Plug-In.

By the way, what does the following warning mean?

Warning: Synthesized away the following LCELL buffer node(s):
    Warning (14320): Synthesized away node "math_mult:math_mult_inst|lpm_mult:lpm_mult_component|mult_k3n:auto_generated|le10a"
    Warning (14320): Synthesized away node "math_mult:math_mult_inst|lpm_mult:lpm_mult_component|mult_k3n:auto_generated|le10a"
    ...

Thanks,
VT
Altera_Forum

Hi VT,

The warning means that the synthesis engine found logic which could be removed without changing the design behaviour.

As far as I know, all arithmetic functions support so-called pipelining. That means that register stages are inserted in order to improve the timing. Of course, the result will then be available some clock cycles later.

I have attached an example of a divider. Maybe it will help you.

Kind regards,
GPK
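
To make the pipelining idea concrete, here is a minimal hand-written sketch. It is not the MegaWizard-generated core and not the attached divider example; the module name `pipelined_mult`, the 16-bit operand width, and the split into two partial products are all assumptions for illustration. The long multiply path is cut in two by a register stage, so each clock cycle only has to cover part of the logic, at the cost of one extra cycle of latency.

```verilog
// Hypothetical example: names and widths are assumptions, not taken from
// the original design or from the MegaWizard output.
module pipelined_mult (
    input  wire        clk,
    input  wire [15:0] dataa,
    input  wire [15:0] datab,
    output reg  [31:0] result
);
    // Stage 1 registers: two 16x8 partial products (24 bits each)
    reg [23:0] part_lo;
    reg [23:0] part_hi;

    always @(posedge clk) begin
        // Stage 1: multiply dataa by the low and high byte of datab
        part_lo <= dataa * datab[7:0];
        part_hi <= dataa * datab[15:8];

        // Stage 2: shift the high partial product by 8 bits and add
        result  <= {8'd0, part_lo} + {part_hi, 8'd0};
    end
endmodule
```

With this structure the product appears two clock edges after the operands are applied, so a state machine driving it would need to wait one additional state before reading `result`.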
Altera_Forum

Thanks for the reply.

I know I could improve the timing by adding some pipelining, but I want to understand what those timings depend on. Maybe I should read more about the lower levels of the synthesis.

Best regards,
VT
Altera_Forum

Hi,

I have attached a small drawing to show you how retiming works.

Kind regards,
GPK
Altera_Forum

OK, I have one question now :). Does this mean that if I use a PLL clock multiplier to get, for example, 80 MHz, and the timings are OK at 20 MHz, they would be OK at 80 MHz as well? What I mean is: is the synthesis adjusted to the clock speed, so that the connection timings scale in proportion to the 20 MHz case?

Best regards,
VT
Altera_Forum

Hi,

As long as you have enough register stages defined, it should work. Of course there is a limit, because more pipelining also means higher device utilization. I have attached a small divider example.

By the way, I mixed up some terms:

Pipelining: You or the tool put additional register stages into your design. Your latency changes: the result is available some clock cycles later.

Retiming: This is an additional feature to speed up your design. With this feature enabled, the synthesis tool tries to move registers through your logic in order to improve the clock speed. The latency does not change.

Hopefully you're not too confused now. Sorry.

Kind regards,
GPK
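
The distinction between the two terms can be sketched in code. The module below is a hypothetical illustration (the name `mult_retime_candidate` and the 16-bit widths are assumptions, not part of the original design). Writing the extra registers is the pipelining step and fixes the latency at three clock cycles. The registers `stage1` and `stage2` have no logic between them, so a synthesis tool with retiming enabled may move them back into the multiplier to balance the path delays. Whether it actually does so depends on the tool settings and the target device. For reference, a 20 MHz clock gives each register-to-register path 50 ns, while 80 MHz leaves only 12.5 ns.

```verilog
// Hypothetical example: names and widths are assumptions for illustration.
module mult_retime_candidate (
    input  wire        clk,
    input  wire [15:0] dataa,
    input  wire [15:0] datab,
    output reg  [31:0] result
);
    reg [31:0] stage1;
    reg [31:0] stage2;

    always @(posedge clk) begin
        stage1 <= dataa * datab; // as written, the whole multiplier sits before this register
        stage2 <= stage1;        // no logic here: a candidate register for retiming
        result <= stage2;        // no logic here either
    end
endmodule
```

As written, the slowest path is still the full multiplier in front of `stage1`; the design only gets faster if the tool redistributes the registers, or if the logic is split by hand.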