Effect of ROM port width on LE usage?

Altera_Forum

I created a single-port ROM with a port width of 237 bits to meet my application's needs. However, this seemed to add around 15,000 logic elements. I don't think the data is being stored in logic elements, since the compilation manager shows the memory bits as being used, but is it possible that synthesis is doing something wacky to create a 237-bit port? Is there an upper limit on port width before this happens? I'm working on a DE2 board.

Altera_Forum

It sounds like it's using the LEs as LUT RAM. Why do you need such a wide bus?

Is it synchronous? Have you tried attributes to put it into M4Ks?
Altera_Forum

--- Quote Start ---
It sounds like it's using the LEs as LUT RAM. Why do you need such a wide bus?

Is it synchronous? Have you tried attributes to put it into M4Ks?
--- Quote End ---

The basic idea is that I'm implementing a certain number of functional units on the FPGA. The number I can fit isn't enough to fully parallelize my process, so I'm multiplexing them. Each unit has a certain amount of fixed data associated with it. Say I can fit 50 on the board, but I need 100 computations done. For a given input, I need to load the fixed data for the first 50 functional units, latch it into flip-flops, and do the computation with the input; then load the fixed data for the other 50 that don't fit and do the computation on the same input. 237 bits just happens to be the amount of fixed data for one functional unit. I guess I could use a narrower port and have the transfer take a few cycles. Is that how something like this would normally be done?

Thanks!
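
To make the "narrower port, a few cycles" idea concrete, here is a minimal behavioural sketch in Python (not HDL). It assumes a 16-bit port and that each 237-bit word is stored least significant slice first in consecutive addresses; `PORT_BITS`, `load_word` and the sample data are made up for illustration.

```python
from math import ceil

WORD_BITS = 237   # fixed data per functional unit (from the post)
PORT_BITS = 16    # assumed narrow ROM port width
CYCLES = ceil(WORD_BITS / PORT_BITS)  # reads needed per wide word

def load_word(rom, base):
    """Model the multi-cycle load: one narrow ROM read per cycle,
    each read placed into its slot of a wide holding register."""
    reg = 0
    for cycle in range(CYCLES):
        reg |= rom[base + cycle] << (cycle * PORT_BITS)
    return reg & ((1 << WORD_BITS) - 1)

# Hypothetical fixed data for two functional units.
words = [(1 << 236) | 0x1234, (1 << 237) - 1]

# Narrow ROM image: each wide word occupies CYCLES consecutive addresses.
mask = (1 << PORT_BITS) - 1
rom = [(w >> (i * PORT_BITS)) & mask for w in words for i in range(CYCLES)]

unit1 = load_word(rom, base=1 * CYCLES)
```

With these assumptions each reload costs `CYCLES` reads, so the trade-off is port width against reload latency.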
Altera_Forum

Normally you would try to do the processing as the data arrives, so there's no need to store it explicitly (it's stored in the processing pipeline).

What you describe isn't a ROM, it's a RAM.
Altera_Forum

What if you use a small ROM and instantiate it as many times as you like, then connect the 237 bits manually, all sharing the same address? I don't see why it would need any extra logic.

Altera_Forum

--- Quote Start ---
Normally you would try to do the processing as the data arrives, so there's no need to store it explicitly (it's stored in the processing pipeline).

What you describe isn't a ROM, it's a RAM.
--- Quote End ---

Perhaps I didn't explain the situation clearly enough, or my understanding of what the two terms mean isn't accurate. To simplify, let's say we only have space on the board for one unit, and there are two in total: FuncUnit 0 and FuncUnit 1. FuncUnit 0 represents a physical entity that has 237 bits of fixed data associated with it. These 237 bits never change.

Now the machine gets 10 bits of input from the outside world (Input A). It does a computation on the 10 bits of input with the 237 static bits it has and gives a result. Then the multiplexing happens: the FPGA needs to load the 237 bits associated with FuncUnit 1 and perform the calculation on the same 10 bits. When the NEXT 10-bit input comes in, the FPGA reloads the SAME 237 bits it originally had for FuncUnit 0, does the same computation on the new 10 bits of input, and continues alternating with FuncUnit 1 in this fashion. The 237 bits never change; they are only ever read. Whether they're switched on every single input or buffered to handle chunks at a time doesn't really matter. I guess my question is: what is the most efficient way to periodically load 237 bits from a memory source?

Does this make more sense? Thanks again!
Altera_Forum

--- Quote Start ---
What if you use a small ROM and instantiate it as many times as you like, then connect the 237 bits manually, all sharing the same address? I don't see why it would need any extra logic.
--- Quote End ---

I guess my only problem with this is that it will make creating the initialization files kind of a pain, since I'd have to split data that is logically contiguous into 15 or 20 files to get the width down to 16 bits.
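
The splitting itself can be scripted rather than done by hand. The following is a minimal Python sketch, assuming the wide words are available as integers and that each narrow ROM takes one hex value per address; `split_rom` and `hex_lines` are hypothetical helpers, and the actual memory initialization file syntax the tools expect is not reproduced here.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)

def split_rom(words, word_bits=237, slice_bits=16):
    """Split wide ROM contents into one list per narrow ROM.
    Slice 0 holds bits 15..0 of every word, slice 1 bits 31..16, and so on."""
    mask = (1 << slice_bits) - 1
    n_slices = ceil_div(word_bits, slice_bits)
    return [[(w >> (s * slice_bits)) & mask for w in words]
            for s in range(n_slices)]

def hex_lines(slice_words, slice_bits=16):
    """Render one slice as zero-padded hex strings, one per address."""
    digits = ceil_div(slice_bits, 4)
    return [format(v, "0{}x".format(digits)) for v in slice_words]

# Hypothetical contents: two 237-bit words.
words = [(1 << 236) | 0x1234, 0xABCD0000BEEF]
slices = split_rom(words)
first_file = hex_lines(slices[0])  # contents for the lowest 16-bit ROM
```

At 16 bits per slice this yields 15 slices for a 237-bit word, with the top slice carrying only 13 meaningful bits.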
Altera_Forum

It's not really clear from your description what's going on. Perhaps a description of the project goals and maybe some code would help?

Altera_Forum

I'm trying to keep things general so as not to get tied up in code questions. I'm not really concerned with syntax, just the basic architecture: I have a bunch of static, fixed data, split into 237-bit chunks, and I'm accepting input from the outside world. Each input has to be checked against every 237-bit chunk using some relatively straightforward linear algebra. Each of these tests requires something like 24 multipliers, so I can't fit as many functional units on the FPGA as I have 237-bit chunks. My goal is to implement as many as I can; then, when input comes in, do the computation, load the next 237 bits, check the same input against them, and so on until all 237-bit chunks have been checked. The details are unimportant; my point is that at compile time I have a large number of 237-bit chunks of data, and this data is fixed and never changes. My question is whether there's an easy way to load 237 bits at a time from one of the memory systems, whether I'll need to do it over a number of cycles 16 bits at a time, or whether there's a better way.

Thanks again!
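
As a rough illustration of the time-multiplexed schedule described above, here is a small Python sketch. It assumes chunks are handed to the physical units in index order, one batch per pass; `schedule` is a hypothetical helper, and the numbers are the 100-chunk, 50-unit example from earlier in the thread.

```python
from math import ceil

def schedule(num_chunks, num_units):
    """Return, for each pass, the chunk indices loaded into the physical
    units. Every pass checks the same input against a new batch of chunks."""
    passes = ceil(num_chunks / num_units)
    plan = []
    for p in range(passes):
        start = p * num_units
        stop = min(start + num_units, num_chunks)
        plan.append(list(range(start, stop)))
    return plan

plan = schedule(num_chunks=100, num_units=50)
```

Under this assumption the number of reloads per input is the number of passes, so how long each 237-bit reload takes feeds directly into throughput.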
Altera_Forum

The problem you're having sounds like what Tricky said in the first response: Quartus isn't mapping your HDL code to M4K blocks, it's mapping it to LEs. Did you write the code yourself, or did you use one of the templates or the recommended coding guidelines?

If you use the MegaWizard to create a 256-bit ROM and just connect the 237 bits you care about, you'll get zero LEs used.
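
For a rough idea of how many M4K blocks a ROM this wide would take, here is a back-of-the-envelope Python estimate. The `M4K_MODES` table is an assumption based on the depth x width configurations listed for Cyclone II M4K blocks (the DE2's device family) and should be checked against the device handbook; the estimate tiles the ROM in a plain grid and ignores whatever the fitter actually chooses.

```python
# (depth, width) modes of one M4K block; assumed, verify against the handbook.
M4K_MODES = [(4096, 1), (2048, 2), (1024, 4), (512, 8), (512, 9),
             (256, 16), (256, 18), (128, 32), (128, 36)]

def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)

def m4k_blocks(width, depth):
    """Smallest block count over all modes for a width x depth ROM,
    tiling blocks side by side for width and stacking them for depth."""
    return min(ceil_div(width, w) * ceil_div(depth, d)
               for d, w in M4K_MODES)

narrow = m4k_blocks(237, 128)   # ROM exactly as wide as the data
padded = m4k_blocks(256, 128)   # ROM padded to 256 bits
```

Under these assumptions a wide port costs memory blocks rather than logic, which is consistent with the zero-LE expectation above.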
Altera_Forum

I was really trying to understand the architecture, i.e. what interface are you using for the data transfer? What is the data? What is the end goal?

From your description, it sounds highly inefficient. You would usually try to avoid ever doing what you're doing: load data, process data, save data. It all sounds a bit like software to me, and not how you would architect a system. This is why I asked for the architecture and the code, so I can better understand the problem.