Hello,
I am a student, and for my final year project I am investigating the hardware implementation of AES-128 in an Altera FPGA, using SystemVerilog. The AES algorithm has 2 mandatory lookup tables, known as the SBox and Inverse-SBox, which each hold 256 x 8-bit values. Currently I am using 2 separate arrays of bytes to store these values. However, I want to look at the possibility of using some of the on-FPGA memory bits, probably in the form of a ROM, but I have very limited knowledge of this and am looking for a push in the right direction.
My concern is that ROMs have a clock cycle delay, which will obviously slow down my overall design, although for my project it will make a good discussion. Is there a way to implement a ROM in the memory bits without this clock cycle delay? Some asynchronous ROM? Any replies would be greatly welcome :)
You can refer to Altera's cookbook. I don't think you can infer an asynchronous ROM in the memory blocks. I don't understand what you mean by "slow design". If you mean latency, a registered ROM adds one cycle per lookup, but with proper pipelining the overall throughput need not suffer.
The M*K blocks in Altera's FPGAs can only implement synchronous RAMs/ROMs, i.e., they have the 1-cycle delay.
Asynchronous ROMs need to be implemented in LUTs.
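To make the distinction concrete, here is a minimal sketch of an asynchronous (combinational-read) S-box. The module name `sbox_async` and the init file `sbox.hex` are hypothetical names chosen for illustration, and the assumption is that the synthesis tool builds this table from logic elements, because there is no clock on the read path.

```systemverilog
// Sketch only: combinational S-box lookup, expected to map to LUTs.
// "sbox.hex" is a hypothetical file holding the 256 S-box bytes in hex.
module sbox_async (
    input  logic [7:0] addr,   // byte to substitute
    output logic [7:0] q       // substituted byte, valid in the same cycle
);
    logic [7:0] rom [0:255];

    initial $readmemh("sbox.hex", rom);

    // No clock on the read path, so the result carries no cycle of latency.
    assign q = rom[addr];
endmodule
```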
Thank you both for your replies - very much appreciated.
My main issue with using a synchronous ROM is as follows. My design uses an FSM with 13 states, and the state advances on every clock cycle (generally speaking, each state performs 1 round of encryption/decryption). When using an array, and thus the LUTs, the statement

    for (shortint c = 0; c < 4; c++) begin
        State[r][c] = Sbox[State[r][c] >> 4][State[r][c] & 8'h0f];
    end

carries out all 4 substitutions within a single clock cycle, whereas with a ROM that piece alone would take 4 clock cycles. How would I synchronise my current FSM with a ROM? Would I have to advance states every X clock cycles instead of every cycle?
The solution is to simply use 4 synchronous ROMs, one for each "c". Or actually, 2 since each M*K block has two independent read ports.
So, as far as I can see, you can use M*K blocks for your problem without a performance penalty. To use them, you basically have two options.
One, you can use a ROM function such as LPM_ROM or ALT_ROM. LPM_ROM is portable; ALT_ROM has more features, such as 2 ports. It will require some changes to your code, and you need to write a .HEX/.MIF file for the ROM's contents.
The other solution, which I prefer, is to infer the ROM from your code. This will require you to change your code to follow a ROM template Quartus can recognize, which may involve a bit of trial and error. Take a look at the HDL coding guidelines to see which templates are supported, and start from a simple case. Quartus may also decide that, despite your efforts, your ROM is best implemented in LUTs.
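As a rough illustration of the inference route, the sketch below follows the general shape of a dual-port synchronous ROM template: an initialised array read inside a clocked block. The module name `sbox_rom_dp` and the file `sbox.hex` are hypothetical, and whether Quartus actually places this in a memory block depends on the device and settings, so check the fitter report.

```systemverilog
// Sketch only: dual-port synchronous ROM written for inference.
// "sbox.hex" is a hypothetical file holding the 256 S-box bytes in hex.
module sbox_rom_dp (
    input  logic       clk,
    input  logic [7:0] addr_a,  // first byte to substitute
    input  logic [7:0] addr_b,  // second byte to substitute
    output logic [7:0] q_a,     // S-box of addr_a, one clock edge later
    output logic [7:0] q_b      // S-box of addr_b, one clock edge later
);
    logic [7:0] rom [0:255];

    initial $readmemh("sbox.hex", rom);

    // Registered reads: this is what gives the one-cycle delay
    // and lets the array map onto a synchronous memory block.
    always_ff @(posedge clk) begin
        q_a <= rom[addr_a];
        q_b <= rom[addr_b];
    end
endmodule
```

Two instances of such a module would cover the four bytes of one row, matching the "2 ROMs with two read ports each" suggestion.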
Great reply- thanks!
However, even if I were to use 4 x ROM (or 2 x dual-port), I don't understand how there can be no performance penalty, as you still need to wait a full clock cycle to get a result?
You need to be able to generate the ROM's read address in the clock cycle before you need the data.
I was looking at your code snippet and it looked possible. But I may be wrong here.
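One possible reading of this advice, sketched under assumptions: drive the ROM address from the value that is about to be written into the state register (its D input) rather than from the register output. After each clock edge the ROM output then already corresponds to the newly registered byte, so the round logic never waits. The names `sbox_lookahead`, `state_d`, `state_q`, `sbox_q` and the file `sbox.hex` are hypothetical.

```systemverilog
// Sketch only: address the ROM one cycle early using the next-state value.
// "sbox.hex" is a hypothetical file holding the 256 S-box bytes in hex.
module sbox_lookahead (
    input  logic       clk,
    input  logic [7:0] state_d,  // next value of the state byte (register D input)
    output logic [7:0] state_q,  // registered state byte
    output logic [7:0] sbox_q    // equals rom[state_q] after every clock edge
);
    logic [7:0] rom [0:255];

    initial $readmemh("sbox.hex", rom);

    always_ff @(posedge clk) begin
        state_q <= state_d;       // ordinary state register
        sbox_q  <= rom[state_d];  // lookup issued in the same edge, in parallel
    end
endmodule
```

The rest of the round (ShiftRows, MixColumns, AddRoundKey) would then compute `state_d` combinationally from `sbox_q`, so each round can still complete in one clock cycle.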
OK great - thank you very much for your help here :)
Why don't you use a single ROM and advance your state machine only after every 4 cycles?
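A minimal sketch of that single-ROM approach, assuming one read port and a small counter that steps through the four bytes of a row. The names `sub_row_serial`, `start`, `done` and the file `sbox.hex` are hypothetical. Note that with a registered read the last byte arrives one edge after its address is issued, so the four lookups finish one cycle later than the four address cycles alone.

```systemverilog
// Sketch only: substitute four bytes through one synchronous ROM port.
// "sbox.hex" is a hypothetical file holding the 256 S-box bytes in hex.
module sub_row_serial (
    input  logic       clk,
    input  logic       start,        // pulse for one cycle to begin a row
    input  logic [7:0] row_in  [4],  // four state bytes, held stable while busy
    output logic [7:0] row_out [4],  // substituted bytes
    output logic       done          // high for one cycle once row_out is complete
);
    logic [7:0] rom [0:255];
    initial $readmemh("sbox.hex", rom);

    logic [2:0] cnt = 3'd5;  // 0..3: issue address, 1..4: capture data, 5: idle
    logic [7:0] q;           // registered ROM output

    always_ff @(posedge clk) begin
        done <= 1'b0;

        // Step counter: restart on start, otherwise count up to idle.
        if (start)            cnt <= 3'd0;
        else if (cnt != 3'd5) cnt <= cnt + 3'd1;

        // Synchronous read; only the reads issued while cnt is 0..3 are used.
        q <= rom[row_in[cnt[1:0]]];

        // Data for the address issued at step k is captured at step k+1.
        if (cnt >= 3'd1 && cnt <= 3'd4)
            row_out[cnt[1:0] - 2'd1] <= q;

        if (cnt == 3'd4) done <= 1'b1;
    end
endmodule
```

The main FSM would hold its current state until `done` is seen, trading clock cycles for three fewer ROM ports.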
