FMAX on Cyclone V is ~50% slower than other Cyclones and Max 10.

BrianHG

Hello,

    I'm having difficulty reaching an FMAX target on a Cyclone V device, where the same design compiled for a MAX 10 and a Cyclone IV E device achieves the target with no problem.

 

  In fact, the Cyclone V runs at an utterly horrible 53% of the speed of the older Cyclone IV E and MAX 10 devices. Here are the compile results for the PLL clock[4] output:

 

MAX 10        10M50DAF484C6GES   116 MHz    5:30 min compile time.
Cyclone IV E  EP4CE30F23C6       115 MHz    3:46 min compile time.
Cyclone V     5CEFA4F23C6         62 MHz   10:54 min compile time.

 

    Cyclone V is ~50% slower. The minimum target is supposed to be 100 MHz. The lemon nets in question are between shift registers which are set up to emulate a FWFT (first-word fall-through) FIFO.
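
    To put numbers on the gap, here is a minimal Python sketch of the arithmetic behind the 53% figure and the shortfall against the 100 MHz target. The function names are only for illustration, and the slack estimate assumes FMAX is simply the reciprocal of the worst register-to-register path delay on this clock, so treat it as a rough figure rather than what the timing analyzer reports.

```python
# FMAX figures reported above for the PLL clock[4] domain, in MHz.
FMAX_MHZ = {
    "MAX 10 10M50DAF484C6GES": 116.0,
    "Cyclone IV E EP4CE30F23C6": 115.0,
    "Cyclone V 5CEFA4F23C6": 62.0,
}
TARGET_MHZ = 100.0  # minimum required clock for this domain


def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def approx_slack_ns(fmax_mhz, target_mhz):
    """Rough worst-case setup slack: target period minus achieved period.

    Assumption: FMAX = 1 / (worst path delay). Negative means timing fails.
    """
    return period_ns(target_mhz) - period_ns(fmax_mhz)


def speed_ratio_percent(fmax_mhz, reference_mhz):
    """Achieved FMAX as a percentage of a reference FMAX."""
    return 100.0 * fmax_mhz / reference_mhz


if __name__ == "__main__":
    max10 = FMAX_MHZ["MAX 10 10M50DAF484C6GES"]
    for device, fmax in FMAX_MHZ.items():
        print(
            f"{device}: {fmax:.0f} MHz, "
            f"{speed_ratio_percent(fmax, max10):.1f}% of MAX 10, "
            f"approx. slack at {TARGET_MHZ:.0f} MHz: "
            f"{approx_slack_ns(fmax, TARGET_MHZ):+.2f} ns"
        )
```

    Under that assumption the Cyclone V build misses the 10 ns period by roughly 6 ns, while the other two builds have a bit over 1 ns to spare.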

 

    I've attached a .zip file containing the 3 Quartus projects in question. The results I've listed above were obtained with Quartus Prime 20.1. Each of the following 3 folders is a full project which just needs compilation to rebuild all the missing files and obtain the FMAX results I listed.

 

Folders:

BrianHG_DDR3_DECA_GFX_DEMO - This project folder contains a functional 10M50DAF484C6GES build for Arrow DECA MAX 10 FPGA development board.


BrianHG_DDR3_CIV_GFX_FMAX_Test - This project folder contains an EP4CE30F23C6 build of the same project.


BrianHG_DDR3_CV_GFX_FMAX_Fail - This folder contains the LEMON Cyclone V 5CEFA4F23C6 build which cannot get an FMAX anywhere close to the other 2 builds.

 

    The folders 'BrianHG_DDR3' and 'BrianHG_DDR3_GFX_source' contain the additional shared source code required for the project to work.

 

    Is there anything I've done wrong? Even the compile time is two to three times as long. The PLL output clock[4] is really deep in the core of the FPGA design and has no real ties to IO timing, unlike the first 3 PLL output clocks, which reach their desired 400 MHz. Is there some sort of compiler setting to get the Cyclone V to achieve at least 90% of the older Cyclones' performance, if not the same? Or is the Cyclone V truly a half-speed FPGA?


FMAX clock requirements:
PLL CLK 0,1,2 - 400 MHz each.
PLL CLK 3 - 200 MHz.
PLL CLK 4 - 100 MHz.

 

BrianHG

Is there anything I can do?

The portion of my code in question, the middle of my 'BrianHG_DDR3_COMMANDER.sv' running on PLL CLK 4 at 100 MHz, boils down to a 4:1 selector mux, ~168 bits wide, where the selection is based on comparing 4 inputs, with 4x2 compares, to a stored 28-bit address. It easily surpasses the required 100 MHz on Cyclone III, Cyclone IV and MAX 10, not to mention the faster fabrics, but for some reason it really dies on a -6 Cyclone V.

 

My 200 MHz and 400 MHz sections are nothing more than block-shifting serializers, so the Cyclone V appears to be able to cope with those. But why is it such a massive downgrade from all the older FPGAs?

 

Even worse, if I leave the compile options on the default 'Balanced', the FMAX is only ~41 MHz, while the other Cyclones will still pass the required 100 MHz. There isn't much I can do to simplify the 4:1 selection mux which feeds the next source code section of my design. A multistage pipeline would destroy the code's ability to correctly select which of the 4 inputs should be prioritized to run next.

 

Is there something in the Cyclone V fitter's settings which has somehow decided to massively impact the core's performance?

Or, once again, should I consider the Cyclone V to be truly a half speed FPGA and scratch it from my list of potential devices I can use?

 
