Re: Full compilation for large IO pins

Altera_Forum · 09-03-2017

Hi

I would like to run a full compilation of a component (not the top level) that has a very large number of I/O pins; I'm looking at thousands of output bits. I need to know the total LEs and the Fmax of that particular component. Is there any way to do so? Apparently no device provides that many pins. Thanks

Altera_Forum · 09-03-2017

To compile a design, you have to select a particular architecture (Cyclone V, for example) and a specific device and package within that architecture. You can't compile for a device that does not physically exist. Can you explain further what you are attempting to do?

Altera_Forum · 09-03-2017

If you're just getting resource estimates for a block, then simply create a new project containing the block and the sub-modules it needs to compile. In the Assignment Editor, create a Virtual Pin assignment and assign it to *. That ensures all I/Os of the module are assigned virtual pins, which lets you compile the module and get its logic usage. You will still need to pick a device; I suggest using the same device as your eventual target.
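
The same assignment can also be written directly into the project's .qsf file. A minimal sketch, assuming a hypothetical entity name my_block; the wildcard follows the suggestion above and may need narrowing in a real project:

```tcl
# Hypothetical entity name; replace my_block with the module under test.
set_global_assignment -name TOP_LEVEL_ENTITY my_block
# Treat every top-level port as a virtual pin so no physical I/O is consumed.
set_instance_assignment -name VIRTUAL_PIN ON -to *
```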

Altera_Forum · 09-04-2017

Yes, you are right. That is exactly what I wanted to do. However, I later found that the full compilation takes a very long time and has yet to complete successfully. I found this message in the console:

Warning (16618): Fitter routing phase terminated due to routing congestion. Congestion details can be found in Chip Planner.

Any advice?

Altera_Forum · 09-04-2017

Without seeing the code, I cannot really say a lot.

But long compilation times for individual modules (e.g. >30 mins) usually point to a problem in the code. The easiest way to blow up the compile time is to have a large RAM that gets inferred as registers instead of RAM blocks because the behaviour described in the code does not match what the RAM blocks can do.
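
For comparison, a memory described along the following lines is the kind that synthesis tools generally map to RAM blocks: a single clock, a synchronous write, a registered read, and no reset on the array itself. This is a generic sketch with made-up names and sizes, not something taken from your design:

```verilog
// Hypothetical simple dual-port RAM (one write port, one read port).
module simple_dual_port_ram #(
    parameter DATA_WIDTH = 64,
    parameter ADDR_WIDTH = 5
) (
    input  wire                  clk,
    input  wire                  we,
    input  wire [ADDR_WIDTH-1:0] waddr,
    input  wire [ADDR_WIDTH-1:0] raddr,
    input  wire [DATA_WIDTH-1:0] wdata,
    output reg  [DATA_WIDTH-1:0] rdata
);
    // The storage array has no reset, and is only read and written
    // inside the clocked block below.
    reg [DATA_WIDTH-1:0] mem [0:(1 << ADDR_WIDTH) - 1];

    always @(posedge clk) begin
        if (we)
            mem[waddr] <= wdata;  // synchronous write
        rdata <= mem[raddr];      // registered (synchronous) read
    end
endmodule
```

Things like resetting the whole array, or reading many addresses in the same cycle, tend to push the tools back to registers.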

Can you post your code?

Altera_Forum · 09-04-2017

Basically the top module is as such:

module KC(clk, reset, in, out);
    input clk;
    input reset;
    input [575:0] in;
    output [511:0] out;

    wire [1599:0] r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10, r11;
    wire [1599:0] r12, r13, r14, r15, r16, r17, r18, r19, r20, r21, r22, r23;

    assign out = r23[1599:1599-511];

    mKC round0 (clk, reset, 8'h01, {in, 1024'b0}, r0);
    mKC round1 (clk, reset, 8'h32, r0, r1);
    mKC round2 (clk, reset, 8'hba, r1, r2);
    .
    // basically there are 23 identical port maps as above
endmodule

Each mKC module contains several logic expressions, and its output is registered.

Altera_Forum · 09-04-2017

Why are those buses so huge? What are you trying to do?

What is this block?

Altera_Forum · 09-04-2017

Keccak hash function with 24 rounds

Altera_Forum · 09-04-2017

Is there any reason you're not doing this using ram?

Altera_Forum · 09-04-2017

For now, I just want to explore various implementation methodologies. Initially, I was thinking of customizing some optimizations of the function, but it seems my approach is not practical to implement.

Altera_Forum · 09-04-2017

If you have large buses where bits interact with a large number of other bits, then the routing (and timing) will suffer. FPGAs benefit from pipelining, so a design with high latency can often process much more data than a huge parallel implementation with low latency.

Altera_Forum · 09-04-2017

So in that case, would inserting more pipeline stages help?

Altera_Forum · 09-04-2017

Possibly. I have no idea how you've implemented the design (and I have no experience with encryption algorithms).

If it's currently a large amount of interdependent logic, pipeline stages could reduce the interdependence per clock cycle (and hence reduce the routing requirement).
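
To illustrate the general idea only (this is not your algorithm), here is a rough sketch that folds a wide bus down to one word in two registered stages rather than one. The module name, widths and grouping are all made up for the example:

```verilog
// Hypothetical example: XOR-fold a 1600-bit bus (25 words of 64 bits)
// down to one 64-bit word in two registered stages instead of one.
module xor_fold_pipelined (
    input  wire          clk,
    input  wire [1599:0] din,
    output reg  [63:0]   dout
);
    reg  [319:0] partial_q;  // stage 1 result: five 64-bit partial XORs
    wire [319:0] partial;

    genvar g;
    generate
        for (g = 0; g < 5; g = g + 1) begin : g_partial
            // Each partial XOR depends on only five of the 25 input words.
            assign partial[64*g +: 64] =
                din[64*(5*g + 0) +: 64] ^
                din[64*(5*g + 1) +: 64] ^
                din[64*(5*g + 2) +: 64] ^
                din[64*(5*g + 3) +: 64] ^
                din[64*(5*g + 4) +: 64];
        end
    endgenerate

    always @(posedge clk) begin
        // Stage 1: register the five partial results.
        partial_q <= partial;
        // Stage 2: combine the registered partials one cycle later.
        dout <= partial_q[63:0]    ^ partial_q[127:64]  ^
                partial_q[191:128] ^ partial_q[255:192] ^
                partial_q[319:256];
    end
endmodule
```

The result appears two clock cycles after din, but a new din can be accepted every cycle, and each register only depends on a small slice of the logic in front of it.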

Altera_Forum · 09-04-2017

Alright! Thanks for the advice. I shall explore this further!