Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)

Timing issues

Altera_Forum
Honored Contributor II
1,953 Views

Hello all, 

I am not an expert in FPGA design, so excuse me if I am asking a silly question. 

In my design, I need to compute a 16-bit CRC over a 64-bit word. I found a design on opencores.org that provides a parallel implementation. However, due to its parallel nature, the design was synthesized into many levels of combinational logic, which means a significant propagation delay. I am hoping to have this design running at 80 MHz (12.5 ns period). 

First, I only constrained my clock to 80 MHz. TimeQuest gives me very bad setup slack. The worst-case slack is even worse than -12.5 ns, so some signals would miss not only the first rising edge but also the second one as the latch edge. How do I fix this? I tried adding a set_max_delay constraint of X ns, but it doesn't seem to solve the issue. Any suggestions? 
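As a rough illustration of what a slack worse than -12.5 ns implies, here is a minimal sketch of the arithmetic. The 27 ns data delay is a made-up number, not a value from the poster's report, and the model ignores clock skew, clock-to-output delay, and register setup time, all of which a real timing analyzer accounts for.

```python
import math

def setup_slack_ns(period_ns, data_delay_ns, cycles=1):
    """Simplified setup slack: time available minus data path delay."""
    return cycles * period_ns - data_delay_ns

def cycles_needed(period_ns, data_delay_ns):
    """Smallest whole number of clock periods that covers the data delay."""
    return math.ceil(data_delay_ns / period_ns)

PERIOD_NS = 12.5          # 80 MHz clock, from the post
ASSUMED_DELAY_NS = 27.0   # hypothetical combinational delay (assumption)

slack = setup_slack_ns(PERIOD_NS, ASSUMED_DELAY_NS)
print(f"single-cycle slack: {slack} ns")
print(f"clock periods needed: {cycles_needed(PERIOD_NS, ASSUMED_DELAY_NS)}")
```

With these assumed numbers the single-cycle slack is below -12.5 ns, which is exactly the situation where the data misses the second clock edge as well and needs a third.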

 

Thx
Altera_Forum

Wow. That's a lot of logic. Check out: 

http://www.altera.com/literature/manual/stx_cookbook.pdf?gsa_pos=1&wt.oss_r=1&wt.oss=cookbook 

or check out: 

http://www.easics.com/webtools/crctool 

 

How many clock cycles do you need the result in? The smallest CRC, and the one that runs at the highest clock rate, processes just 1 bit at a time, but that would take 64 clock cycles. If your system has no problem with that, it should meet timing pretty easily. The parallel algorithms roll up 8, 16, or however many of these single-bit operations into one clock cycle. 
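To make the 1-bit-per-clock idea concrete, here is a minimal software model of a bit-serial CRC-16. The polynomial 0x1021 (CRC-16-CCITT), the zero initial value, and MSB-first bit order are assumptions for illustration; the thread never says which polynomial or conventions the design uses.

```python
POLY = 0x1021            # assumed polynomial (CRC-16-CCITT); not stated in the thread
WIDTH = 16
MASK = (1 << WIDTH) - 1

def crc16_step(crc, bit):
    """One clock cycle of a bit-serial CRC: shift in a single data bit."""
    feedback = ((crc >> (WIDTH - 1)) & 1) ^ bit
    crc = (crc << 1) & MASK
    if feedback:
        crc ^= POLY
    return crc

def crc16_serial(word, nbits=64, init=0x0000):
    """Feed nbits of word MSB first; return (crc, number of clock steps)."""
    crc = init
    steps = 0
    for i in range(nbits - 1, -1, -1):
        crc = crc16_step(crc, (word >> i) & 1)
        steps += 1
    return crc, steps

crc, steps = crc16_serial(0x0123456789ABCDEF)
print(f"crc=0x{crc:04X} after {steps} steps")
```

Each call to `crc16_step` corresponds to one shift register update with a single XOR layer, which is why the serial version is easy on timing but needs 64 cycles for a 64-bit word.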

By the way, CRC generation has probably been one of the most vexing things I've worked on. Even with a known polynomial, there are other parameters, such as whether the register starts at all ones, or whether you invert the input and/or the output. If you're off on any of those, the output just looks like noise (CRCs are similar to random number generators). Just an FYI to have a good test in place. (Unless you're calculating the CRC to send and using the same algorithm to calculate the CRC on the other side to see if they match. In that case, as long as you're consistent, it all works out. My problem involved creating a CRC to match the output of some other hardware that didn't fully describe how it calculated its CRC.)
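A small sketch of how much those extra parameters matter: the same polynomial and the same data give unrelated-looking results once the initial value, bit reflection, or final XOR changes. The parameter names (`init`, `refin`, `refout`, `xorout`) follow the commonly used CRC parameter model and are assumptions here, as is the polynomial 0x1021; none of them come from the poster's design.

```python
def reflect(value, width):
    """Reverse the bit order of a width-bit value."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out

def crc16(data, poly=0x1021, init=0x0000, refin=False, refout=False, xorout=0x0000):
    """Bitwise CRC-16 over a bytes object with configurable conventions."""
    crc = init
    for byte in data:
        if refin:
            byte = reflect(byte, 8)
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    if refout:
        crc = reflect(crc, 16)
    return crc ^ xorout

data = b"123456789"
variants = {
    "init=0x0000": crc16(data),
    "init=0xFFFF": crc16(data, init=0xFFFF),
    "init=0xFFFF, xorout=0xFFFF": crc16(data, init=0xFFFF, xorout=0xFFFF),
    "reflected in/out": crc16(data, refin=True, refout=True),
}
for name, value in variants.items():
    print(f"{name:28s} 0x{value:04X}")
```

When matching third-party hardware, a known test vector like the one above is a quick way to find out which combination the other side uses.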
Altera_Forum

I don't really need the result in one clock cycle; I could probably let it stabilize for one or two clock cycles. But just out of curiosity, how fast can I make it run? 

 

Yes, I am implementing a transmitter and a receiver, so both sides will have similar CRC structures. 

 

BTW, if TimeQuest says I have timing issues (setup, hold, ...) but my timing simulation (both slow and fast mode) suggests that the design is OK, can I ignore the warnings from TimeQuest? Or should I ALWAYS try to eliminate the warnings by adding constraints and so on? 

 

Thx
Altera_Forum

It's not really "stabilizing", it's pipelining. Since it's a feedback structure, you'd take a couple of clock cycles to get your output, and you couldn't put a new input into it in the meantime. So you're not calculating at the line rate. 

I would look at some other implementations. In the current one, you have 40 levels of logic with no carry chains, which won't get anywhere near 80 MHz. 

Yes, I would trust static timing analysis and meet timing. I'm not sure why your timing simulation would work unless the path is a multicycle, but on anything that feeds back like this, I doubt it. Basically, the analysis is saying that when data is launched from the source register, it will not make it to the destination register in time, and on the next clock the destination register will clock in some unknown value. Again, I'm surprised your simulation worked. One possibility is that timing sims are exact, so they will calculate the same value every time because the delay they use is exact. But in hardware your delays will vary some, and the CRCs will calculate different values. Just a guess.
Altera_Forum

Anyway, go to the easics site, put in your polynomial, and set the data bus width to 64. That means you'll get your result in 1 clock cycle. See what it runs at. If it doesn't meet timing, then set it to 32, so you'll require 2 clock cycles (but it should run much faster). See what you can get. 

Also try the cookbook, which should be more optimized, but not as easy to modify.
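The width trade-off can be modeled in software. In this sketch, one call to `crc16_update` stands for one clock cycle of a parallel CRC with the given bus width; the inner loop is what hardware unrolls into combinational logic, so a wider bus means deeper logic per cycle but fewer cycles. The polynomial 0x1021, zero initial value, and MSB-first ordering are assumptions, not the poster's actual parameters.

```python
POLY = 0x1021  # assumed polynomial; the thread does not state one

def crc16_update(crc, chunk, nbits):
    """One 'clock cycle' of a parallel CRC that consumes nbits at once."""
    for i in range(nbits - 1, -1, -1):
        feedback = ((crc >> 15) & 1) ^ ((chunk >> i) & 1)
        crc = (crc << 1) & 0xFFFF
        if feedback:
            crc ^= POLY
    return crc

def crc16_word(word, bus_width, init=0x0000):
    """CRC of a 64-bit word fed bus_width bits per cycle; returns (crc, cycles)."""
    if 64 % bus_width != 0:
        raise ValueError("bus_width must divide 64")
    crc = init
    cycles = 0
    for shift in range(64 - bus_width, -1, -bus_width):
        chunk = (word >> shift) & ((1 << bus_width) - 1)
        crc = crc16_update(crc, chunk, bus_width)
        cycles += 1
    return crc, cycles

word = 0x0123456789ABCDEF
for width in (64, 32, 1):
    crc, cycles = crc16_word(word, width)
    print(f"bus width {width:2d}: crc=0x{crc:04X} in {cycles} cycle(s)")
```

All widths produce the same CRC; only the number of cycles (and, in hardware, the logic depth per cycle) changes.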
Altera_Forum

I guess the reason my simulation works is that the data input does not arrive on EVERY rising edge of the clock, i.e. the line rate is not 80 MHz. As a matter of fact, it's coming in quite slowly, so I only have one 64-bit word to process every few clock cycles, and that's when I enable the CRC (by driving par_clken high). So how can I solve this issue?

Altera_Forum

Since you're doing it in 1 clock cycle, that makes sense. (If you were putting in 32 bits and then another 32 bits over two clock cycles, then it would fail.) Anyway, add: 

set_multicycle_path -setup 2 -to CRC:inst|ucrc_par:PGEN|match_o 

set_multicycle_path -hold 1 -to CRC:inst|ucrc_par:PGEN|match_o 

set_multicycle_path -setup 2 -to CRC:inst|ucrc_par:PGEN|crc* 

set_multicycle_path -hold 1 -to CRC:inst|ucrc_par:PGEN|crc* 

 

This says that there are two clock cycles for the data to get through (basically, you're ignoring whatever these registers clock in immediately after the data is sent). If you can wait 3 clock cycles, then change the values to 3 and 2 respectively. Note that I haven't dissected the design, and you want to be careful to add multicycles only to paths that are true multicycles, or else you may pass timing analysis and fail on the board.
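The pattern in the constraints above (setup N, hold N-1) can be sketched as a small helper that prints the two lines for a given number of cycles. The helper names are hypothetical; the only SDC syntax used is the `set_multicycle_path -setup/-hold ... -to` form already shown in this post, and the time budget ignores clock skew and register setup time.

```python
def multicycle_constraints(target, cycles):
    """Return the setup/hold multicycle lines for a path given `cycles` periods."""
    if cycles < 2:
        raise ValueError("a multicycle path needs at least 2 cycles")
    return [
        f"set_multicycle_path -setup {cycles} -to {target}",
        f"set_multicycle_path -hold {cycles - 1} -to {target}",
    ]

def setup_budget_ns(period_ns, cycles):
    """Approximate time the data path gets before the latch edge."""
    return period_ns * cycles

TARGET = "CRC:inst|ucrc_par:PGEN|crc*"   # register name pattern from the post
for n in (2, 3):
    print(f"# {n} cycles, about {setup_budget_ns(12.5, n)} ns at 80 MHz")
    for line in multicycle_constraints(TARGET, n):
        print(line)
```

The generated lines are only valid if the surrounding logic really holds the inputs stable and ignores the outputs for that many cycles, as cautioned above.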
Altera_Forum

Let me try it. Many thanks!
