
Cyclone IV E Speed Grade and Design Issues

Altera_Forum

Hello,

 

We have test hardware built on the DE0-Nano board, which carries the EP4CE22F17C6 FPGA. The target Fmax for our design is 50 MHz, and it can be achieved on that FPGA, which has a speed grade of 6. When we move the design to the EP4CEF17C8, where speed grade 8 is slower than speed grade 6, the restricted Fmax drops to 40 MHz for the same design and the same setup.

 

Can anyone explain the fundamental difference between the two speed grades, and why I need to change the whole design setup? I get a message from the TimeQuest analyzer saying that there is a long combinational path when using speed grade 6, but no such message when using speed grade 8.

 

What surprises me is that, according to the Altera website, C8 can go up to 400 MHz and C6 up to 300 MHz, so I would guess that 50 MHz should not be a problem for either of them.

Altera_Forum

 

--- Quote Start ---  

We have test hardware built on the DE0-Nano board, which carries the EP4CE22F17C6 FPGA. The target Fmax for our design is 50 MHz, and it can be achieved on that FPGA, which has a speed grade of 6. When we move the design to the EP4CEF17C8, where speed grade 8 is slower than speed grade 6, the restricted Fmax drops to 40 MHz for the same design and the same setup.

--- Quote End ---  

 

 

It is no surprise that you get a lower Fmax after migrating your design to the higher speed grade number C8, which means a slower device.

 

 

--- Quote Start ---  

Can anyone explain the fundamental difference between the two speed grades?

--- Quote End ---  

 

 

In older devices, the speed grade gives the actual delay through a macrocell. In the Cyclone family, the speed grade indicates the relative performance of the device: a higher speed grade number means a slower device (C8 is slower than C6). I think individual ICs differ because of variation in the manufacturing process. After manufacturing, each IC is tested, and slower devices are "thrown into one bin" (a higher speed grade, e.g. C8) while faster devices go into "another bin" (a lower speed grade, e.g. C6).

 

 

--- Quote Start ---  

What surprises me is that, according to the Altera website, C8 can go up to 400 MHz and C6 up to 300 MHz, so I would guess that 50 MHz should not be a problem for either of them.

--- Quote End ---  

 

 

Those numbers describe the best-case scenario and do not mean that you will achieve them, because the result depends heavily on your design. For example, if a timing path runs between two registers with no combinational logic in between, you might achieve the performance specified in the data sheet. If you add a long combinational path, however, your design's performance will suffer and you will no longer achieve the specified figure.
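To make the relationship concrete, here is a rough sketch in Python of how the longest register-to-register delay sets Fmax. It is a simplification that lumps clock-to-output, logic, routing, and setup time into one number and ignores clock skew and uncertainty. The function names are made up for illustration, and the two path delays are simply the clock periods implied by the 50 MHz and 40 MHz figures reported in this thread.

```python
def fmax_mhz(critical_path_ns: float) -> float:
    """Highest clock frequency (MHz) a path with this total delay can meet."""
    return 1000.0 / critical_path_ns


def setup_slack_ns(target_mhz: float, critical_path_ns: float) -> float:
    """Clock period minus path delay; a negative value means timing fails."""
    return 1000.0 / target_mhz - critical_path_ns


# Path delays implied by the Fmax figures reported in this thread.
c6_path_ns = 20.0  # same design on the C6 part: 50 MHz
c8_path_ns = 25.0  # same design on the C8 part: 40 MHz

print(fmax_mhz(c6_path_ns))              # 50.0
print(fmax_mhz(c8_path_ns))              # 40.0
print(setup_slack_ns(50.0, c8_path_ns))  # -5.0, fails the 50 MHz target
```

The same logic is simply slower on the C8 part, so the identical critical path needs a longer clock period.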

Altera_Forum

Hello vlrean,

 

Thanks a lot for your detailed explanation. So when migrating the design from speed grade C6 to C8, we need to restructure our design to make sure that the desired Fmax is achieved.

Altera_Forum

Yes, exactly. You can start by looking at the longest combinational paths reported by TimeQuest and see whether you can change the code there or add some pipelining.
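As a rough illustration of what pipelining buys you, the Python sketch below splits a chain of logic delays into stages wherever a register is inserted. The slowest stage then sets Fmax. The delay values are invented (they just sum to the 25 ns implied by 40 MHz), the helper name `stage_delays` is hypothetical, and register overhead (clock-to-output and setup time) is ignored.

```python
def stage_delays(logic_delays_ns, register_after):
    """Sum the logic delays between registers.

    register_after holds the indices of the elements after which a
    pipeline register is inserted.
    """
    stages = []
    current = 0.0
    for index, delay_ns in enumerate(logic_delays_ns):
        current += delay_ns
        if index in register_after:
            stages.append(current)
            current = 0.0
    if current > 0.0 or not stages:
        stages.append(current)
    return stages


path = [6.0, 7.0, 5.0, 7.0]  # hypothetical delays along one long path

unpipelined = stage_delays(path, set())
pipelined = stage_delays(path, {1})  # one register after the second element

print(unpipelined, 1000.0 / max(unpipelined))  # [25.0] 40.0
print(pipelined, 1000.0 / max(pipelined))      # [13.0, 12.0], about 76.9 MHz
```

The price is latency: every inserted register makes the result appear one clock cycle later.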

Altera_Forum

 

--- Quote Start ---  

Yes, exactly. You can start by looking at the longest combinational paths reported by TimeQuest and see whether you can change the code there or add some pipelining.

--- Quote End ---  

 

 

That's what I have done: I have identified the problematic paths. Adding pipeline registers solves the Fmax problem, but it somehow destroys the final output. We have a custom noise-shaping block at the output that needs a lot of calculation, which is what makes it slow. Adding pipelining introduces delay blocks at the end, which makes the final output go wrong. We need to find a way to change our design so that it meets timing.

 

I am using Simulink and MATLAB HDL Coder to generate the VHDL code, and then I compile it in Quartus.

Altera_Forum

Yes, you probably need to make sure your data flow is correctly synchronized. Do you have a test bench for your component? It helps a lot to see the differences after a code change.
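The kind of check a test bench performs can be sketched in Python as follows. It assumes you can export the output samples of the original model and of the pipelined version as plain lists, and that you know how many clock cycles of latency the pipelining added. The function name `first_mismatch` and the sample values are made up for illustration.

```python
def first_mismatch(reference, dut, latency):
    """Compare dut[n + latency] with reference[n].

    Returns the index of the first reference sample that differs,
    or None if every overlapping sample matches.
    """
    for n, expected in enumerate(reference):
        if n + latency >= len(dut):
            break  # ran out of pipelined samples to compare
        if dut[n + latency] != expected:
            return n
    return None


reference = [1, 2, 3, 4, 5]
pipelined_ok = [0, 0, 1, 2, 3, 4, 5]   # same data, two cycles later
pipelined_bad = [0, 0, 1, 2, 9, 4, 5]  # a real functional difference

print(first_mismatch(reference, pipelined_ok, 2))   # None
print(first_mismatch(reference, pipelined_ok, 0))   # 0, latency not accounted for
print(first_mismatch(reference, pipelined_bad, 2))  # 2
```

If the outputs match once the latency is accounted for, the pipelining only delayed the result. If they still differ, some paths inside the design are no longer aligned with each other.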

Altera_Forum

I don't have a test bench as such, but I can generate a validation model from MATLAB, which is the same model in Simulink as the VHDL code, with all the pipeline registers and other HDL Coder optimizations. I compare the results there. Can you please tell me how to check whether my data flow is correctly synchronized?

Would adding a PLL help? At the moment, though, I have no idea how to add a PLL to a Simulink design.

Altera_Forum

It strongly depends on how the code is architected. Basically, each time a block needs inputs from several other blocks to produce an output, you need to be sure that all inputs are provided on the same clock edge, or at least that the output is only generated once all inputs are valid. The problem when you start to add pipelining is that data in one flow can arrive one or several clock cycles after the data in another flow. If this is not properly taken into account, the block using both flows as inputs will produce bad output.
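Here is a small Python model of that effect, assuming two data flows that feed an adder. The `delay` helper is hypothetical and stands in for a chain of pipeline registers, and the sample values are arbitrary.

```python
def delay(samples, cycles, fill=0):
    """Model `cycles` pipeline registers: the output lags the input."""
    return ([fill] * cycles + list(samples))[:len(samples)]


def add(flow_a, flow_b):
    """Adder that combines whatever both flows present on each clock."""
    return [a + b for a, b in zip(flow_a, flow_b)]


a = [1, 2, 3, 4, 5]
b = [10, 20, 30, 40, 50]

ideal = add(a, b)                         # [11, 22, 33, 44, 55]
unbalanced = add(delay(a, 2), b)          # only flow a was pipelined
balanced = add(delay(a, 2), delay(b, 2))  # matching delay added to flow b

print(unbalanced)                      # [10, 20, 31, 42, 53], wrong pairs added
print(balanced)                        # [0, 0, 11, 22, 33]
print(balanced == delay(ideal, 2))     # True, correct result two cycles late
```

Adding the same number of registers to the other flow restores the correct pairing, and the only remaining difference is the extra latency.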

Altera_Forum

Hi, 

 

For Altera Cyclone FPGAs, the documentation does not state an fmax at which the core logic can operate. The device speed grade is only an indication of the relative propagation delay (tpd) of the device. As stated above, the higher the speed grade number, the lower the performance, so a speed grade of -6 is faster than -8. The speed grade also determines the fmin and fmax of the PLLs and clock circuitry used in the FPGA.

 

The fmax of any design implemented on an FPGA depends on the architecture, critical paths, and logic utilization of the design. For example, if your design uses a lot of LUTs and has a long combinational path, the fmax will be drastically reduced. To get maximum performance from a digital design on an FPGA, keep combinational paths short and use registers freely (FPGAs provide plenty of flip-flops). Because the speed grade is only a relative indication, the limit to check against is the clock period: making sure your critical path delay does not exceed the target clock period (for example 20 ns for a 50 MHz clock) will ensure that the design meets timing.