How to close timing with Negative Setup Slack

Altera_Forum · ‎05-13-2016

Hi,

I am working on a hardware design running at a clock frequency of 200 MHz (5 ns period). Quartus reports a setup violation of -0.265 ns. The source and destination clocks are the same, and all inputs and outputs of the block are registered. The critical path appears to go through a ripple carry adder, which is crucial for my design. How can I use this hardware reliably? How bad is a slack of -0.265 ns with a clock period of 5 ns? If I change the clock uncertainty, what are the implications?

Reference: FPGA - Stratix V 5SGXMA7K2F40C2

Regards

Jeebu

Altera_Forum · ‎05-13-2016

Have you already set the best options in the project settings?

I mean enabling timing-driven synthesis and maximum routing effort.

Altera_Forum · ‎05-18-2016

Thanks @Cris72 for the reply.

I tried changing to maximum routing effort, but in vain. Any other suggestions? If the hardware is working at ambient conditions (I am not quite sure about the clock jitter, though), will it work reliably?

Regards

Jeebu

Altera_Forum · ‎05-18-2016

Ripple carry over how many bits? If it is small (i.e., 8 bits or fewer), ripple carry is probably the fastest.

However, if it is an adder of 16 bits or more, a carry lookahead implementation will likely give higher performance.

What is the timing slack on the next 5 or 10 worst case nets? Is it OK, or also in the negative?

If you have a large blob of combinatorial logic that does not meet timing (i.e., its delay is greater than the clock interval minus the reg-reg delay), you either have to reduce the number of logic levels within the logic (shortening the delay) or add internal pipeline registers within the combinatorial logic, so that you have twice as much time to accomplish the function. Of course, this assumes you can tolerate the additional pipeline clock delay externally.
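
The pipelining option can be illustrated with a small cycle-level model. The Python sketch below splits a wide addition into a low half and a high half separated by a register stage, so the carry chain evaluated in any one clock period is roughly half as long, at the cost of one extra cycle of latency. The class name `TwoStageAdder`, the 12-bit width and the even split are assumptions chosen for illustration; they are not taken from the poster's design.

```python
class TwoStageAdder:
    """Cycle-level model of an unsigned adder pipelined into two stages.

    Stage 1 adds the low halves and registers the carry-out together with
    the untouched high halves. Stage 2 adds the high halves plus that carry.
    (Hypothetical model for illustration only, not synthesizable HDL.)
    """

    def __init__(self, width: int = 12):
        if width < 2:
            raise ValueError("width must be at least 2")
        self.low_bits = width // 2
        self.high_bits = width - self.low_bits
        self.stage1 = None  # pipeline register between the two stages

    def clock(self, a: int, b: int):
        """Advance one clock edge.

        Returns the sum (modulo 2**width) of the operands presented on the
        previous call, or None while the pipeline is still filling.
        """
        low_mask = (1 << self.low_bits) - 1
        high_mask = (1 << self.high_bits) - 1

        # Stage 2: finish the operation captured on the previous edge.
        result = None
        if self.stage1 is not None:
            low_sum, carry, a_high, b_high = self.stage1
            high_sum = (a_high + b_high + carry) & high_mask
            result = (high_sum << self.low_bits) | low_sum

        # Stage 1: short carry chain over the low half only.
        low_total = (a & low_mask) + (b & low_mask)
        self.stage1 = (
            low_total & low_mask,               # low half of the sum
            low_total >> self.low_bits,         # carry into the high half
            (a >> self.low_bits) & high_mask,   # high half of a, delayed
            (b >> self.low_bits) & high_mask,   # high half of b, delayed
        )
        return result


adder = TwoStageAdder(width=12)
operands = [(100, 23), (4095, 1), (2048, 2047)]
# One extra (0, 0) input flushes the last real result out of the pipeline.
outputs = [adder.clock(a, b) for a, b in operands + [(0, 0)]]
print(outputs)
```

Each result appears one call after its operands were presented, which is exactly the latency penalty mentioned above. Whether that penalty is acceptable depends on the design.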

If this is just a one-off project, then a slack of -265 ps is tolerable, provided you operate at nominal voltage and more or less room temperature.

However, if this is a product that must operate over a wide environmental range (e.g., 0 °C to 55 °C ambient), then you need to do some work.
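
To put the -0.265 ns figure in perspective, the reported setup slack can be converted into the highest clock frequency at which this path would pass static timing analysis at the analyzed corner. The Python sketch below does that arithmetic. The function name `fmax_from_slack` is made up for illustration, and the calculation assumes a single-clock register-to-register path where everything except the clock period stays constant.

```python
def fmax_from_slack(period_ns: float, setup_slack_ns: float) -> float:
    """Return the highest passing clock frequency in MHz.

    Assumes a single-clock reg-to-reg path, so the shortest period that
    meets timing is the constrained period minus the (signed) setup slack.
    """
    min_period_ns = period_ns - setup_slack_ns
    if min_period_ns <= 0:
        raise ValueError("slack must be smaller than the period")
    return 1000.0 / min_period_ns


period_ns = 5.0     # 200 MHz constraint from the original post
slack_ns = -0.265   # worst-case setup slack reported in the thread

fmax_mhz = fmax_from_slack(period_ns, slack_ns)
overshoot_pct = 100.0 * (-slack_ns) / period_ns

print(f"minimum passing period: {period_ns - slack_ns:.3f} ns")
print(f"fmax at the analyzed corner: {fmax_mhz:.1f} MHz")
print(f"path is {overshoot_pct:.1f}% longer than the clock period")
```

With the numbers from this thread, the path needs about 5.265 ns, which corresponds to roughly 190 MHz instead of the targeted 200 MHz.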

Altera_Forum · ‎05-18-2016

Hi ak6dn, Thanks for the quick reply.

So the ripple carry is the result of the synthesized hardware, as interpreted by the synthesizer. It is a parallel, 256-sized block of combinational logic, which got translated into a ripple carry adder; the critical path includes most of the adders, which led to a negative slack of -0.265 ns.

The next timing slack is also negative, but negligibly small (~5 ps).

Regarding cutting down the combinational delay with pipeline registers: the main objective of my low-latency hardware is to bring the latency down as far as possible, so I am looking for angles other than splitting the combinational path (since the negative slack is less than 1 ns).

Yes, assuming I run the hardware at ambient conditions and nominal voltage, can I expect reliable operation? Any other suggestions?

Regards

Jeebu

Altera_Forum · ‎05-18-2016

--- Quote Start ---

So the ripple carry is the result of the synthesized hardware, as interpreted by the synthesizer. It is a parallel, 256-sized block of combinational logic, which got translated into a ripple carry adder; the critical path includes most of the adders, which led to a negative slack of -0.265 ns.

--- Quote End ---

So are you saying this is a 256 bit adder? 256 bit op1 plus 256 bit op2 gives a 256b result? Your answer above is unclear.

If it is indeed a 256-bit adder, I'm not at all surprised that a ripple carry implementation does not meet timing. You need to implement a multi-level carry lookahead design (over four- or eight-bit operand slices) to improve performance.
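
For reference, the idea behind a multi-level carry lookahead adder over four-bit slices can be modelled in a few lines. The Python sketch below is a behavioural model only; it is not synthesizable HDL and not what Quartus generates. Each slice computes a group generate and a group propagate signal, a second level derives the carry into every slice from those group signals, and the slices then compute their sum bits from their own carry-in. The function names and default widths are illustrative assumptions.

```python
def slice_gp(a: int, b: int, bits: int):
    """Return (group_generate, group_propagate) for one operand slice."""
    g_group, p_group = 0, 1
    for i in range(bits):
        a_i, b_i = (a >> i) & 1, (b >> i) & 1
        g_i, p_i = a_i & b_i, a_i ^ b_i   # per-bit generate / propagate
        # Fold bit i on top of the lower bits of the slice.
        g_group = g_i | (p_i & g_group)
        p_group = p_i & p_group
    return g_group, p_group


def cla_add(a: int, b: int, width: int = 16, slice_bits: int = 4, cin: int = 0):
    """Add two unsigned `width`-bit values; return (sum, carry_out)."""
    if width % slice_bits:
        raise ValueError("width must be a multiple of slice_bits")
    mask = (1 << slice_bits) - 1
    n_slices = width // slice_bits
    slices = [((a >> (k * slice_bits)) & mask, (b >> (k * slice_bits)) & mask)
              for k in range(n_slices)]

    # Level 1: every slice produces its group signals independently.
    gp = [slice_gp(sa, sb, slice_bits) for sa, sb in slices]

    # Level 2: carry into each slice, derived from group signals only.
    # Written as a recurrence here; a hardware lookahead unit would expand
    # it so that each carry is a flat function of cin and the group signals.
    carries = [cin]
    for g_group, p_group in gp:
        carries.append(g_group | (p_group & carries[-1]))

    # Each slice forms its sum bits from its own carry-in. Python's "+" is
    # used for brevity; in hardware this is a short 4-bit adder per slice.
    total = 0
    for k, (sa, sb) in enumerate(slices):
        total |= ((sa + sb + carries[k]) & mask) << (k * slice_bits)
    return total, carries[-1]


print(cla_add(0xFFFF, 0x0001))      # carry ripples across every slice
print(cla_add(0x123, 0x0FF, width=12))
```

The point of the structure is that the carry into a high slice no longer has to pass through every lower bit, only through one group-level term per slice.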

If it is not a 256 bit wide adder then I don't understand your response.

Altera_Forum · ‎05-18-2016

Hi ak6dn,

What I meant to say is that I built a 256-entry ordertable in hardware. During synthesis, the tool must have created some additional logic (maybe for address translation) that includes ripple carry adders, as seen in the STA report (I can confirm this in the RTL Viewer if you need more clarity).

Attached are the STA report details for the violation, with the actual instance location masked as _instancelocation_.

Thanks again.

Jeebu

Altera_Forum · ‎05-18-2016

OK, got it. From the timing report, it appears there is a 9-bit adder that feeds into a 12-bit adder, and both are ripple carry implementations.

What you need to do is coerce Quartus to synthesize faster carry lookahead adders that (ideally) use the capabilities of the logic cells in the FPGA.

I found this app note: https://www.altera.com/support/support-resources/design-examples/design-software/vhdl/v_cl_addr.html but it refers to a much older version of the software and an older architecture. However, setting the tool to do more aggressive synthesis/optimization may still work.

Altera_Forum · ‎05-18-2016

Thanks a lot ak6dn,

I will try this out and report back with my observations.

Jeebu