Re: Matrix Multiplication optimization

Altera_Forum · ‎05-06-2015

Hello, I am using the EP3C40Q240C8 FPGA. I have written code for a Kalman filter in which I have to do matrix multiplication and matrix inversion. The code has no errors, but the Fitter fails: the total combinational functions of my design exceed the maximum that this device allows, so the Fitter is unable to place the design. I need suggestions on how to optimize my matrix code or otherwise overcome this problem.

Altera_Forum · ‎05-07-2015

Hi,

I've more questions than answers:

- Is your code functionally correct? Is it working in simulation?

- Are you using large "for" loops?

- Is your implementation pipelined at all, or are the outputs one giant combinatorial function of the inputs?

- How many logic levels are reported by the fitter?

Thanks,

Evgeni

Altera_Forum · ‎05-07-2015

What exactly is the fitter running out of? Logic, routing resources, DSPs?

Why not post your code?

Altera_Forum · ‎05-07-2015

--- Quote Start ---

Hi,

I've more questions than answers:

- Is your code functionally correct? Is it working in simulation?

- Are you using large "for" loops?

- Is your implementation pipelined at all, or are the outputs one giant combinatorial function of the inputs?

- How many logic levels are reported by the fitter?

Thanks,

Evgeni

--- Quote End ---

I have not checked the code in simulation. However, I first implemented it in MATLAB, and there it works fine.

Yes, I am using a 'for' loop.

Can you please tell me what a 'pipelined implementation' is?

The number of combinational functions reported during analysis and synthesis is 132000. I have attached a snapshot as well. I can also upload the code if you want.

Altera_Forum · ‎05-07-2015

--- Quote Start ---

What exactly is the fitter running out of? Logic, routing resources, DSPs?

Why not post your code?

--- Quote End ---

This is what I get during compilation. Please see the attachment.

Altera_Forum · ‎05-07-2015

TO_BE_DONE

Altera_Forum · ‎05-07-2015

I deleted part of my code because of the size limit when uploading; the deleted part is where I do the matrix multiplication. I tried building the code step by step, i.e. commenting out all of it at first and then uncommenting it piece by piece to see which part causes the problem. In doing so, I found that the problem appears when I save the updated result of the calculation into an already-defined matrix. I have marked those lines in red font. When I leave those lines out, everything works fine and analysis compilation takes less than 10 seconds, but when I include those two lines, analysis compilation takes approximately 40 minutes. This is driving me crazy. I just have to store the value in the matrix so that I can use it in the next clock cycle.

Please give me a solution or tell me where I am making a mistake.

Altera_Forum · ‎05-07-2015

Please attach bigger images and the project as a QAR archive.

Altera_Forum · ‎05-08-2015

Was this code generated from MATLAB using HDL Coder or DSP Builder?

The heavy use of variables suggests not, and I get the feeling that even if the resource problem were solved, you would get a very low FMax, as the design is trying to do an entire matrix multiply in a single clock cycle.

Altera_Forum · ‎05-08-2015

No, I have written the code myself; I did not generate it from MATLAB. What is FMax? And how do you suggest I do the multiplication?

Altera_Forum · ‎05-08-2015

I highly suggest you get a VHDL textbook and read up on how to write VHDL for digital designs. The current code just looks like you're trying to write some software inside a single process. This will not give a very efficient design, as there is no pipelining, so you will only be able to run the design at a very low clock speed (low FMax - max frequency).
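To give a rough idea of what "not doing everything in a single clock" can look like, here is a minimal sketch of a sequential multiply-accumulate that builds up one dot product (one element of the result matrix) at a rate of one multiplication per clock cycle. The entity name, port names and bit widths are made up for illustration and are not taken from the posted code; it also assumes the operands are streamed in one pair per clock.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: accumulates a(k) * b(k) over successive clock cycles.
-- Only one multiplier and one adder are built, however long the vectors are.
entity dot_product_seq is
  port (
    clk   : in  std_logic;
    clear : in  std_logic;             -- '1' starts a new dot product
    valid : in  std_logic;             -- '1' when a and b hold a new operand pair
    a, b  : in  signed(15 downto 0);
    acc   : out signed(39 downto 0)    -- 32-bit product plus 8 guard bits
  );
end entity dot_product_seq;

architecture rtl of dot_product_seq is
  signal acc_reg : signed(39 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if clear = '1' then
        acc_reg <= (others => '0');
      elsif valid = '1' then
        -- One multiply and one add per clock edge.
        acc_reg <= acc_reg + resize(a * b, 40);
      end if;
    end if;
  end process;

  acc <= acc_reg;
end architecture rtl;
```

An N x N matrix product then takes roughly N cycles per result element instead of one huge combinatorial block. A small controller (a counter or state machine, not shown) would have to feed the operands and pulse `clear` between elements.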

Altera_Forum · ‎05-08-2015

For a start, as a beginner, you should never use variables. As in the above code, they just give poor results (because unless you know what you're doing, you just end up with one combinatorial logic chain). Much better to stick with signals.

Altera_Forum · ‎05-08-2015

--- Quote Start ---

I highly suggest you get a VHDL textbook and read up on how to write VHDL for digital designs. The current code just looks like you're trying to write some software inside a single process. This will not give a very efficient design, as there is no pipelining, so you will only be able to run the design at a very low clock speed (low FMax - max frequency).

--- Quote End ---

Thanks for your suggestion. I will look for a VHDL book. Can you please explain pipelining in a sentence or two? What is it? And why are signals preferred over variables?

Altera_Forum · ‎05-09-2015

Pipelining means inserting registers into your algorithm to reduce the amount of logic between registers. It increases the latency, but massively increases the FMax. Latency is usually not important; it's the throughput that counts (and throughput increases with clock speed).
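As a small illustration (a sketch only, with a made-up entity name and widths), here is y = a*b + c*d split into two pipeline stages. The multipliers and the adder each get their own clock cycle, so the longest path between registers is one multiplier or one adder, not both in series.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: two-stage pipelined multiply-add.
entity pipelined_mac is
  port (
    clk        : in  std_logic;
    a, b, c, d : in  signed(15 downto 0);
    y          : out signed(32 downto 0)
  );
end entity pipelined_mac;

architecture rtl of pipelined_mac is
  -- Pipeline registers between the multipliers and the adder.
  signal prod_ab, prod_cd : signed(31 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Stage 1: register the two products.
      prod_ab <= a * b;
      prod_cd <= c * d;
      -- Stage 2: add the products that were registered on the previous edge.
      y <= resize(prod_ab, 33) + resize(prod_cd, 33);
    end if;
  end process;
end architecture rtl;
```

The result appears two clock edges after the inputs are applied, but a new set of inputs can be accepted on every clock.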

As for signals vs variables: signals are only updated when a process suspends, so in a clocked process every signal assignment will become a register. Variables are updated immediately, so the logic that is produced depends on the order of the assignments in the code.

There is nothing you can do with variables that you cannot do with signals. And as variables can be more unpredictable, using signals is far safer for a beginner.
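A minimal side-by-side sketch of the difference (the entity and all names are invented for the example): the same two "add 1" steps written once with a signal and once with a variable.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example comparing a signal and a variable in a clocked process.
entity sig_vs_var is
  port (
    clk   : in  std_logic;
    x     : in  unsigned(7 downto 0);
    y_sig : out unsigned(7 downto 0);
    y_var : out unsigned(7 downto 0)
  );
end entity sig_vs_var;

architecture rtl of sig_vs_var is
  signal s1 : unsigned(7 downto 0) := (others => '0');
begin
  process (clk)
    variable v1 : unsigned(7 downto 0);
  begin
    if rising_edge(clk) then
      -- Signal version: s1 only takes its new value when the process
      -- suspends, so y_sig is computed from the OLD s1.
      -- Result: two registers in series, one adder between each.
      s1    <= x + 1;
      y_sig <= s1 + 1;

      -- Variable version: v1 changes immediately, so y_var is computed
      -- from the NEW v1. Result: one register fed by two chained adders.
      v1    := x + 1;
      y_var <= v1 + 1;
    end if;
  end process;
end architecture rtl;
```

Here `y_sig` reaches x + 2 two clock edges after x is applied, while `y_var` reaches it after one edge but with a longer combinatorial path. With a long chain of variable assignments inside loops, that path keeps growing.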

But I highly recommend you take a step back. Do you have a drawing of how this algorithm will look in hardware? One that you drew before you wrote any code? HDL is a hardware description language: if you don't know what hardware you want, how do you expect to describe it?