
Multicycle path – enable signal, basic questions

Altera_Forum

Hello guys,  

 

I have a few basic questions about multicycle paths. I have read many documents (e.g. the TimeQuest User Guide) and many threads on this forum that deal with constraining multicycle paths, but I still have some doubts. 

 

Let’s say I am working with an example like this: 

 

process(clk)
begin
  if rising_edge(clk) then
    a_reg <= a;
    b_reg <= b;
    c_reg <= c;
    d_reg <= d;
    e_reg <= e;
    f_reg <= f;
    g_reg <= g;
    h_reg <= h;
    o <= x1_mul4_x2;
  end if;
end process;

a_mul_b <= a_reg * b_reg;
c_mul_d <= c_reg * d_reg;
e_mul_f <= e_reg * f_reg;
g_mul_h <= g_reg * h_reg;
x1_mul1_x2 <= a_mul_b(2*N-1 downto N) * c_mul_d(2*N-1 downto N);
x3_mul1_x4 <= e_mul_f(2*N-1 downto N) * g_mul_h(2*N-1 downto N);
x1_mul2_x2 <= x1_mul1_x2(2*N-1 downto N) * x3_mul1_x4(2*N-1 downto N);
x1_mul3_x2 <= x1_mul2_x2(2*N-1 downto N) * x1_mul2_x2(2*N-1 downto N);
x1_mul4_x2 <= x1_mul3_x2(2*N-1 downto N) * x1_mul3_x2(2*N-1 downto N);

 

My clock is f = 50 MHz, and after compilation (in TimeQuest I have constrained only the clock) my fmax is ~35 MHz. I do not need a result on every clock, so to achieve a better fmax I can use multicycles. 

 

I have three options. Which one is correct in my case? 

 

1. I found somewhere that all I need to do is add multicycle constraints as below, but I read here (http://www.alteraforum.com/forum/showthread.php?t=5576&p=22537#post22537) that multicycles can’t be used in a situation where data is changing every clock. And I agree with this. Am I right? Is adding only multicycle constraints when all registers change every clock not enough, and therefore an improper use of multicycle constraints (since this is not really a multicycle path)?  

However, I added the constraints, compiled the project and got fmax ~ 60 MHz. Is that result a mistake, because the constraints were used incorrectly? 

 

set_multicycle_path -setup -end -to ~reg0}] 2
set_multicycle_path -hold -end -to ~reg0}] 1

 

2. Next I decided to add enable signal, but only for destination registers.  

enable <= not enable;
if enable = '1' then
  o <= x1_mul4_x2;
end if;

 

Now all source registers are updated at f = 50 MHz, and the destination registers are updated at f = 25 MHz (but the clock is still 50 MHz). Can I use multicycles here, and will they work properly? I saw a lot of examples where data is transferred from a fast clock domain to a slower one. Is that the situation I have here? Can I just add an enable signal for the destination registers, add multicycle constraints, and that’s all?  

After compilation, I got fmax ~ 60 MHz. 

 

3. I added enable signal to all (source and destination) registers.  

 

enable <= not enable;
if enable = '1' then
  a_reg <= a;
  b_reg <= b;
  c_reg <= c;
  d_reg <= d;
  e_reg <= e;
  f_reg <= f;
  g_reg <= g;
  h_reg <= h;
  o <= x1_mul4_x2;
end if;

 

And now I have a clear situation. I have a multicycle path, where all registers (launch and latch) are clocked with f = 50 MHz and are enabled every second cycle.  

Can I use multicycles here? 

After compilation, I got fmax ~ 60 MHz. 

 

Now to sum up. I described three different cases. When I have data which needs more than one clock cycle to propagate from register to register, is the only (best?) solution to add an enable signal to both registers (source and destination) and add multicycle constraints? Is the third case the most proper one? And what about the first two cases? Are they always bad, or can they also be used in some situations? The main question is: should both source and destination registers be gated by the enable signal, or does only the destination register need an enable signal when I want to treat the path as a multicycle? 

 

Regarding point 2: when I transfer data from clk1 to clk2, where the source domain is 2x faster (clk1 = 2 * clk2), should I add enable signals to the registers in the clk1 domain? I think the data there should be stable for two clk1 cycles. 

 

Could someone briefly answer all my questions? 

 

Regards, 

kolas

Altera_Forum

What's your incoming data doing? If a,b,c,d,etc. have new data every other clock cycle, then Case 3 just ignores every other bit of data coming in and therefore ignores every other calculation you want done. So yes, it is correct in that the paths multicycled all hold their data every cycle, but wrong for how your design should work.  

If the data doesn't change every clock cycle and holds its value for two cycles, then Case 1) or 2) should work.  

Other ideas: 

- Assuming the logic changes every cycle, you could add a parallel path of multipliers that do the same thing and feed every other value into each path, using an enable to hold the value for two cycles. Then, on the output, mux the two paths back together. This is a lot of extra logic, but it is one way to handle the high data rate. 

- Multicycles are similar to just pipelining the path. Rather than multicycling it at all, add a register stage into the middle of the path. The reason multicycles are sometimes preferred is that you don't have to "find the middle of the path" and add a register. For example, if your clock period were 10ns and you had a data path that was 19.5ns long, you would have to find an almost exact mid-point to add a register, such that the new paths to this register and from it are each less than 10ns long. This can be difficult, and it is not an issue with multicycles. But if you can do it, I think adding registers is often better in that all the misunderstandings and gotchas of multicycles disappear, and the design works as coded and toggles data on every clock cycle. (Plus you can run your data at the full throughput. Multicycles like what you're doing assume you are not). 

- One last point: it is recommended not to use multicycles like this in Stratix 10 and instead register the path. The beauty with S10 is that the added registers are basically free, and you don't have to find the midpoint of the path since retiming can push the registers to the midpoint. There are other benefits besides that.
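
A minimal VHDL sketch of the parallel-path idea from the first bullet, assuming a single N-bit multiply stands in for the slow logic. The entity name, the `phase` signal and the lane signal names are hypothetical and not from the original design. Each lane holds its inputs for two clock cycles, so the paths from the lane registers to `o` would still need multicycle constraints (setup 2, hold 1), while `phase` itself remains an ordinary single-cycle path.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity interleave_sketch is
  generic (N : positive := 8);  -- hypothetical operand width
  port (
    clk  : in  std_logic;
    a, b : in  unsigned(N-1 downto 0);      -- new data on every clock
    o    : out unsigned(2*N-1 downto 0)
  );
end entity;

architecture rtl of interleave_sketch is
  signal phase          : std_logic := '0';  -- assumes a power-up value of '0'
  signal a0, b0, a1, b1 : unsigned(N-1 downto 0);
  signal p0, p1         : unsigned(2*N-1 downto 0);
begin
  -- Slow combinational logic, one copy per lane.
  p0 <= a0 * b0;
  p1 <= a1 * b1;

  process(clk)
  begin
    if rising_edge(clk) then
      phase <= not phase;
      if phase = '0' then
        -- Lane 0 takes a new input and hands over the result
        -- of the input it captured two cycles earlier.
        a0 <= a;
        b0 <= b;
        o  <= p0;
      else
        -- Lane 1 does the same on the alternate cycles.
        a1 <= a;
        b1 <= b;
        o  <= p1;
      end if;
    end if;
  end process;
end architecture;
```

With this arrangement a new result still appears on `o` every clock cycle, two cycles after its inputs were captured.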

Altera_Forum

Rysc,  

 

I am a little bit confused: 

 

 

--- Quote Start ---  

If a,b,c,d,etc. have new data every other clock cycle, then Case 3 just ignores every other... 

--- Quote End ---  

 

 

Shouldn't that be "every clock cycle"? Because if it said "every clock cycle", everything would be clear to me. I understand that in Case 3, if I have new data on every cycle and an enable signal which is high every other cycle, I lose my calculations every other cycle, correct? That is why you wrote  

 

--- Quote Start ---  

...but wrong for how your design should work 

--- Quote End ---  

 

The rest of your post is clear to me. Thanks for the suggestions. I never thought of adding a parallel path.  

 

Regarding Case 3, to clarify: if I have new data on every clock cycle, I can add an enable signal which will create multicycle paths in the design, but I will lose calculation results every other clock cycle, correct? 

 

And one last thing. If I have new data on every clock cycle (with no enable signal, so this is Case 1), I can't just add multicycle constraints (because there are no multicycle paths) and I have to redesign the code?  

 

Regards,  

kolas

Altera_Forum

Yes, that was a typo. (I actually made the typo a few times while writing that up but fixed the other ones). 

Yes, if you have data changing on every cycle but add the enable, then it can be multicycled but you lose every other calculation.  

If the data were changing every other cycle, then you could add multicycles and wouldn't even need an enable on the source registers. On the destination registers, you might not need an enable either: if the logic the destinations drive only grabs data on the correct cycle, it would work. Basically, without an enable your destinations would alternate between good1/bad/good2/bad/... The enable prevents the bad data from being clocked in, so your output would look like good1/good1/good2/good2/... So you would always have good data, it would just repeat.
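
A minimal sketch of the destination-side enable described above, assuming the source data changes every other cycle and that the `enable` phase has been lined up with it (that alignment is not shown and is an assumption). The entity name and the `result` port, which stands for the end of the slow path, are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dest_enable_sketch is
  generic (N : positive := 8);  -- hypothetical data width
  port (
    clk    : in  std_logic;
    result : in  std_logic_vector(N-1 downto 0);  -- end of the multicycle path
    o      : out std_logic_vector(N-1 downto 0)
  );
end entity;

architecture rtl of dest_enable_sketch is
  signal enable : std_logic := '0';  -- assumes a power-up value of '0'
begin
  process(clk)
  begin
    if rising_edge(clk) then
      enable <= not enable;   -- high on every second clock cycle
      if enable = '1' then
        o <= result;          -- captured only on the "good" cycles
      end if;
      -- On the other cycles o keeps its previous value, which gives
      -- the good1/good1/good2/good2/... pattern on the output.
    end if;
  end process;
end architecture;
```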

Altera_Forum

Thanks for explanations and ideas,  

now I will know what to do with multicycles! 

 

Regards, 

kolas

Altera_Forum

 

--- Quote Start ---  

Thanks for explanations and ideas,  

now I will know what to do with multicycles! 

 

Regards, 

kolas 

--- Quote End ---  

 

 

If your data rate is the same as your clock rate, you should not consider multicycles at all. 

Your problem is that you have two layers of multipliers without registers. Moreover, by inferring the multipliers you cannot insert pipeline registers inside them. 

So the path from the input through two multipliers to the output is pretty long. The solution is to insert registers between the two multiplier stages, or even inside the multipliers by instantiating multipliers with higher latency.
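
A minimal sketch of registering between multiplier stages, assuming `unsigned` operands from `ieee.numeric_std`. Only two multiplier levels of the original post's chain are shown; the entity name and the `ab_reg` / `cd_reg` pipeline registers are hypothetical. The other levels could be registered the same way.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mul_pipe_sketch is
  generic (N : positive := 8);  -- hypothetical operand width
  port (
    clk        : in  std_logic;
    a, b, c, d : in  unsigned(N-1 downto 0);
    o          : out unsigned(2*N-1 downto 0)
  );
end entity;

architecture rtl of mul_pipe_sketch is
  signal a_reg, b_reg, c_reg, d_reg : unsigned(N-1 downto 0);
  signal ab_reg, cd_reg             : unsigned(2*N-1 downto 0);
begin
  process(clk)
  begin
    if rising_edge(clk) then
      -- Input registers, as in the original post.
      a_reg <= a;
      b_reg <= b;
      c_reg <= c;
      d_reg <= d;
      -- Stage 1: first multiplier level, registered.
      ab_reg <= a_reg * b_reg;
      cd_reg <= c_reg * d_reg;
      -- Stage 2: second level works on the upper halves, registered.
      o <= ab_reg(2*N-1 downto N) * cd_reg(2*N-1 downto N);
    end if;
  end process;
end architecture;
```

Every register-to-register path now contains a single multiplier, so no multicycle constraints are needed, and the design accepts new data on every clock at the cost of extra latency (three clock cycles from input to `o` in this sketch).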