Guidelines to avoid Negative Slack with State Machines

Altera_Forum · ‎06-29-2016

Hi,

I'm working on a design that uses a hardware block (let's call it X), which communicates with a host machine, processes the requests the host sends, and returns a response once processing is done.

The X block is written as a state machine with a total of 54 usable states. The coding is done as follows (just a sample demonstration, but exactly the way the state machine is defined in my hardware):

always @(posedge clk)
begin
    if (rst)
        presentstate <= IDLE;
    else
    begin
        presentstate <= nextstate;
        // registered outputs, decoded from nextstate (bodies omitted in this sample)
        case (nextstate)
            IDLE:    ;
            STATE1:  ;
            STATE2:  ;
            STATEX:  ;
            default: ;
        endcase
    end
end

always @(*)
begin
    nextstate = presentstate;
    case (presentstate)
        IDLE:
        begin
            if (condition1)
                nextstate = STATE1;
        end
        STATE1:
        begin
            if (condition2)
                nextstate = STATE2;
            else if (condition3)
                nextstate = STATEX;
            else
                nextstate = IDLE;
        end
        STATE2:
            nextstate = STATEX;
        STATEX:
            nextstate = IDLE;
    endcase
end

The states are defined using localparam with binary-encoded values (but the Quartus Map report says the state machine was encoded as one-hot):

localparam IDLE = 0;

localparam STATE1 = 1;

This block consistently shows negative slack after routing, with the launch clock and latch clock being the same, and with both the From Node and the To Node inside block X.

Some of the incoming signals and some of the outgoing signals are not registered (latched), to save a clock cycle. But with the To Node and From Node within the same block, will registering the signals make a difference to the timing? Is there anything wrong with the way the coding is done? Any suggestions on how to get rid of the negative slack?

When you get negative slack in the route report, how do you systematically debug it? The route report shows the long combinational path in terms of net names generated during synthesis; how would you relate those to your code?

Regards

Jeebu

Altera_Forum · ‎06-29-2016

--- Quote Start ---

When you get a negative slack in the route report, how do you systematically debug it?. The route report shows long combinational path in terms of the net names generated in the synthesis, and how would you relate it to your code?

--- Quote End ---

I'm not sure whether it is related to your problem, but expressions like the following may be inferring too much combinational and sequential logic if the if expressions do not cover every case, particularly because you have a lot of states:

STATE1:
    begin
        if(condition2)
            nextstate = STATE2;
        else 
            if(condition3)
                    nextstate = STATEX;
            else
                    nextstate = IDLE;
    end

I would recommend you start thinking about using the when-case statement instead.

Altera_Forum · ‎07-05-2016

Jeebu -

Your state machine coding looks fine to me, although I don't care for the split combinational nextstate with registered presentstate style any more. There was a day when synthesis tools needed the help of that coding style, but modern tools don't need it and I find it a bit archaic. I know a lot of books still push that style, but don't believe everything you read.

That aside, do you have the design fully constrained for timing? And is that host machine that your state machine is interacting with in the same clock domain? If not, did you structure the design to handle clock domain crossing on the interface?

Bob

Altera_Forum · ‎07-07-2016

--- Quote Start ---

Jeebu -

Your state machine coding looks fine to me, although I don't care for the split combinational nextstate with registered presentstate style any more. There was a day when synthesis tools needed the help of that coding style, but modern tools don't need it and I find it a bit archaic. I know a lot of books still push that style, but don't believe everything you read.

That aside, do you have the design fully constrained for timing? And is that host machine that your state machine is interacting with in the same clock domain? If not, did you structure the design to handle clock domain crossing on the interface?

Bob

--- Quote End ---

Dear Bob,

Thanks for the reply,

Yes, I'm working within the same clock domain. The framework I'm working with takes care of the clock domain crossing and gives me the data in the same clock domain. Can you point me in some direction on how to systematically debug it?

Sometimes the same code, when built, gives positive slack, and sometimes negative slack. How do we reliably implement this state machine without any negative slack?

Regards

Jeebu

Altera_Forum · ‎07-07-2016

--- Quote Start ---

Your state machine coding looks fine to me, although I don't care for the split combinational nextstate with registered presentstate style any more. There was a day when synthesis tools needed the help of that coding style, but modern tools don't need it and I find it a bit archaic. I know a lot of books still push that style, but don't believe everything you read.

Bob

--- Quote End ---

A two- or a three-process state machine is the better approach, unless you have something simple that fits perfectly in the synchronous process.

Back to the main issue: if a state machine is failing timing there are 2 possibilities:

the required clock frequency is simply too high
there is too much decision logic in the state transitions

Usually it is the second case. This happens e.g. when you test a large counter for a certain value.
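To illustrate the counter case, here is a minimal sketch (not taken from the original design) in which the wide comparison is registered one count early, so the state transition depends on a single flip-flop instead of a wide comparator. All names (wait_fsm, cnt, cnt_hit, LIMIT) are hypothetical, and the sketch assumes LIMIT is at least 2.

```verilog
// Hypothetical example: module, signal and parameter names are illustrative only.
module wait_fsm #(
    parameter WIDTH = 24,
    parameter [WIDTH-1:0] LIMIT = 24'd10_000_000   // assumed >= 2
) (
    input  wire clk,
    input  wire rst,
    input  wire start,
    output reg  done
);
    localparam IDLE = 1'b0, WAIT = 1'b1;

    reg             state;
    reg [WIDTH-1:0] cnt;
    reg             cnt_hit;   // registered result of the wide compare

    always @(posedge clk) begin
        if (rst) begin
            state   <= IDLE;
            cnt     <= {WIDTH{1'b0}};
            cnt_hit <= 1'b0;
            done    <= 1'b0;
        end else begin
            done <= 1'b0;
            // Compare one count early: cnt_hit is high in the cycle where cnt == LIMIT.
            cnt_hit <= (cnt == LIMIT - 1'b1);
            case (state)
                IDLE: begin
                    cnt <= {WIDTH{1'b0}};
                    if (start)
                        state <= WAIT;
                end
                WAIT: begin
                    cnt <= cnt + 1'b1;
                    // The transition now tests one register bit, not a WIDTH-bit comparator.
                    if (cnt_hit) begin
                        state <= IDLE;
                        done  <= 1'b1;
                    end
                end
            endcase
        end
    end
endmodule
```

The comparator still exists, but its path ends at the cnt_hit register rather than feeding the state-transition logic, which is usually what shortens the critical path.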

Altera_Forum · ‎07-09-2016

--- Quote Start ---

A two- or a three-process state machine is the better approach, unless you have something simple that fits perfectly in the synchronous process.

--- Quote End ---

This could definitely be debated, josyb!

Altera_Forum · ‎07-09-2016

--- Quote Start ---

This could definitely be debated, josyb!

--- Quote End ---

Try me!

Almost every state machine I write, inside an Avalon_ST module, has asynchronous outputs. Can you show me how to do that with a synchronous (only) process?

Altera_Forum · ‎07-09-2016

--- Quote Start ---

Try me!

Almost every state machine I write, inside an Avalon_ST module, has asynchronous outputs. Can you show me how to do that with a synchronous (only) process?

--- Quote End ---

I believe by asynchronous you mean not assigned in a clocked process (but in a comb. section following a clocked process and so as part of comb cloud between registers).

I have done designs for the last 16 years, including Avalon interfacing, and never used anything but a single clocked process.

In fact, in the last two years I gave up the state machine methodology altogether and just use a counter (which by itself is a state machine). You can run as many states as your counter size allows, with the only drawback being that you use numbers instead of names. If I need a change to occur before the next state, I can start the assignment at the end of the current state.
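A rough Verilog sketch of what such a counter-based sequencer might look like follows. This is only one reading of the description above, not the poster's actual code; the module name, the ports and the 10-step sequence are all assumptions.

```verilog
// Hypothetical example: a counter used as the state register of a simple sequencer.
module counter_sequencer (
    input  wire clk,
    input  wire rst,
    input  wire start,
    output reg  load,
    output reg  busy,
    output reg  done
);
    reg [3:0] step;   // the counter value is the "state"; numbers replace state names

    always @(posedge clk) begin
        if (rst) begin
            step <= 4'd0;
            load <= 1'b0;
            busy <= 1'b0;
            done <= 1'b0;
        end else begin
            load <= 1'b0;   // single-cycle pulses by default
            done <= 1'b0;
            case (step)
                4'd0: if (start) begin
                          step <= 4'd1;
                          // Assigned at the end of step 0, so the registered
                          // outputs are already valid during step 1.
                          load <= 1'b1;
                          busy <= 1'b1;
                      end
                4'd9: begin
                          step <= 4'd0;
                          busy <= 1'b0;
                          done <= 1'b1;
                      end
                default: step <= step + 4'd1;   // plain counting between decision points
            endcase
        end
    end
endmodule
```

Every output here is registered in the one clocked process, which is the property being argued for in this post.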

Altera_Forum · ‎07-09-2016

--- Quote Start ---

Try me!

Almost every state machine I write, inside an Avalon_ST module, has asynchronous outputs. Can you show me how to do that with a synchronous (only) process?

--- Quote End ---

Sure, just encode the combinational outputs separately, either with explicit assign statements or using a separate case statement. That's a separate issue than encoding the state machine itself. As a general rule I think it's bad practice to have combinational signals flow through a state machine unclocked to create combinational outputs. I understand that there are times where that's the only option, but those times are pretty rare in my experience. If you allow yourself to code that way as general practice then you probably end up creating combinational outputs even when NOT necessary. Bottom line is that I think the multi-process approach makes reading and debugging the code much more difficult. Especially if you don't give special names to clocked vs. non-clocked state machine outputs. If you don't do that then it can become very confusing. If combinational outputs are used as an exception then you can assign them separately with a comment block to explain the what and why.
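As a minimal sketch of the style described above (a single clocked process, with the one combinational output assigned separately and commented), something like the following could be used. The module, the signals req, fifo_full, busy and ready_comb, and the three states are all hypothetical.

```verilog
// Hypothetical example: single clocked process plus one documented combinational output.
module single_process_fsm (
    input  wire clk,
    input  wire rst,
    input  wire req,
    input  wire fifo_full,
    output reg  busy,        // registered output
    output wire ready_comb   // combinational output, used as an exception
);
    localparam [1:0] IDLE = 2'd0, WORK = 2'd1, RESP = 2'd2;

    reg [1:0] state;

    always @(posedge clk) begin
        if (rst) begin
            state <= IDLE;
            busy  <= 1'b0;
        end else begin
            case (state)
                IDLE: if (req && !fifo_full) begin
                          state <= WORK;
                          busy  <= 1'b1;
                      end
                WORK: state <= RESP;
                RESP: begin
                          state <= IDLE;
                          busy  <= 1'b0;
                      end
                default: state <= IDLE;
            endcase
        end
    end

    // Combinational exception: ready must react to fifo_full in the same cycle,
    // so it is decoded from the registered state outside the clocked process.
    assign ready_comb = (state == IDLE) && !fifo_full;
endmodule
```

Note that the !fifo_full condition appears both in the transition and in the assign, which is the duplication cost of this style.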

I've been a consultant for a very long time so have worked in many different organizations. I try to adopt coding practices that make it as easy as possible for my customers to maintain my code after I leave. Good business practice. State machine coding is a big part of that, and I use state machines as often as I can because they can be self-documenting if you take the time to add appropriate comments.

This is all just my personal opinion based on many years of doing this stuff. Please don't take offense. I'm just trying to share my experiences and what works for me. All the Mealy/Moore multi-process state machine crap that gets taught in schools and perpetuated in books is just bad information IMO.

Edit: While I'm at it I may as well make half the world angry and do a little VHDL trashing. For the life of me I can't understand why anyone writing rtl code would want to use VHDL vs. verilog/systemverilog. It used to be that VHDL was much more capable for modeling, but systemverilog has changed that. For synthesizable rtl code VHDL is just way too much work to write and to read. If I never had to read a line of VHDL code again I would be a happy man!!

Bob

Altera_Forum · ‎07-09-2016

--- Quote Start ---

Sure, just encode the combinational outputs separately, either with explicit assign statements or using a separate case statement.That's a separate issue than encoding the state machine itself.

--- Quote End ---

To do this you have to copy-paste transition conditions from the state machine? Distributing them all over the code?

--- Quote Start ---

As a general rule I think it's bad practice to have combinational signals flow through a state machine unclocked to create combinational outputs. I understand that there are times where that's the only option, but those times are pretty rare in my experience. If you allow yourself to code that way as general practice then you probably end up creating combinational outputs even when NOT necessary.

--- Quote End ---

That's pure conjecture ... see also below.

--- Quote Start ---

Bottom line is that I think the multi-process approach makes reading and debugging the code much more difficult. Especially if you don't give special names to clocked vs. non-clocked state machine outputs. If you don't do that then it can become very confusing. If combinational outputs are used as an exception then you can assign them separately with a comment block to explain the what and why.

I've been a consultant for a very long time so have worked in many different organizations. I try to adopt coding practices that make it as easy as possible for my customers to maintain my code after I leave. Good business practice. State machine coding is a big part of that, and I use state machines as often as I can because they can be self-documenting if you take the time to add appropriate comments.

--- Quote End ---

You must consider that these combinational outputs are eventually used in some other synchronous process further down or up the hierarchy. E.g. I always instantiate a counter component if I need to count something. To have the state machine load a down-counter with some value, I either need a combinatorial output or, in the case of a synchronous output, must load one value less. If I want to test the counter in the next clock, when the state machine has also stepped to another state, I have a problem if the external counter is loaded through the synchronous output, because the counter has not been loaded with the new value yet.

--- Quote Start ---

This is all just my personal opinion based on many years of doing this stuff. Please don't take offense. I'm just trying to share my experiences and what works for me. All the Mealy/Moore multi-process state machine crap that gets taught in schools and perpetuated in books is just bad information IMO.

--- Quote End ---

None taken.

I couldn't care less either whether Moore or Mealy or the two together are doing the job. But your style is exactly one of those two, or am I mistaken? I had to look it up (searching at least 5 VHDL books) and it is Moore. So I most often do Mealy things :)

--- Quote Start ---

Edit: While I'm at it I may as well make half the world angry and do a little VHDL trashing. For the life of me I can't understand why anyone writing rtl code would want to use VHDL vs. verilog/systemverilog. It used to be that VHDL was much more capable for modeling, but systemverilog has changed that.

--- Quote End ---

I nowadays mostly code in myhdl (http://www.myhdl.org) ... and for VHDL I use a good editor: sigasi (http://www.sigasi.com) (Which also does some Verilog)

--- Quote Start ---

For synthesizable rtl code VHDL is just way too much work to write and to read. If I never had to read a line of VHDL code again I would be a happy man!!

Bob

--- Quote End ---

If we compare VHDL as being Pascal/Ada-like and Verilog as C-like (as you Verilog guys classify them), I can quote an old boss of mine: if you get a Pascal program to compile without errors, you have a good chance it will run; with C, on the other hand, you are nowhere yet.

I once wrote some VHDL code to compare all these styles and posted it on All Programmable Planet, but the sponsor of that website pulled it off the air without warning, and I seem to have lost the text ... I put the VHDL source code here: https://gist.github.com/josyb/a84d067f0a468d0931599f1891bc83ff

I haven't got much free time lately, but let's define a non-trivial exercise and compare the coding styles and the QoR.

One more defence: the latest book on VHDL, Effective Coding with VHDL: Principles and Best Practice by Ricardo Jasinski, page 510: "The two-process styles offers several advantages [over the one-process style]".

Josy

Altera_Forum · ‎07-09-2016

--- Quote Start ---

I believe by asynchronous you mean not assigned in a clocked process (but in a comb. section following a clocked process and so as part of comb cloud between registers).

I have done designs for last 16 years including Avalon interfacing and never used but a single clocked process.

In fact in the last two years I gave up state machine methodology altogether and just use a counter (which by itself is a state machine) and you can run as many states as your counter size allows with only drawback being using numbers instead of names. If a need a change to occur before next state then I can start the assignment at end of current state.

--- Quote End ---

I'd love to see that code. I might learn something new.

Altera_Forum · ‎07-11-2016

Thanks a lot @josyb and @rsefton for your inputs.

@josyb: Thanks for sharing your VHDL code.

Just one doubt to clear up regarding Verilog/VHDL. I see that the outputs are driven within the nextstate logic, and not in the synchronous block (@(posedge clk)). In josyb's code, the nextstate process() block has a finite number of signals in its sensitivity list. But let's say that in a similar scenario in Verilog we have the nextstate always@(*) block; in the presence of any asynchronous input signals, won't the outputs get corrupted? Is driving outputs in the nextstate logic a safe way of coding?

I have gone through Clifford Cummings' paper: http://www.sunburst-design.com/papers/cummingssnug2003sj_systemverilogfsm.pdf. Can any of you share some content or experience on the different coding styles (preferably Verilog) and the hardware (Altera-specific) inferred in each case?

Regards

Jeebu