Re: How to infer BRAM fan-out

Altera_Forum · ‎10-24-2017

I am doing an OpenCL project of vector multiplication of VecA (M * 1) and VecB (1 * N) which produces a matrix MatC (M * N). I want to use a fan-out design which can support a 2-D processing engine array. Can I go like this to infer fan-our? :

https://alteraforum.com/forum/attachment.php?attachmentid=14222&stc=1

__kernel

void matMult() {

......

# pragma unroll

for(int x = 0; x < M; x++) {# pragma unroll

for(int y = 0; y < N; y++) {

MatC[x][y] += VecA[x] * VecB[y];

}

......

}

Any advice would be much appreciated!!

Altera_Forum · ‎10-24-2017

You can achieve this type of systolic array design using the autorun kernel type and num_compute_units (Section 2.3 and 2.4 of Intel FPGA SDK for OpenCL Programming Guide). However, I would expect the same thing to be also achievable in a single kernel using loop unrolling, where the local memory buffers are automatically replicated by the compiler.

Altera_Forum · ‎10-25-2017

Hi, thanks for your reply! Do you know any OpenCL systolic array design examples? (with code)

Altera_Forum · ‎10-25-2017

There are some small code snippets in Altera's documents in the sections I mentioned above, but other than that, I do not know of any other public code showing the systolic array design.

Altera_Forum · ‎10-27-2017

Hi, I tried the systolic array and it takes massive amount of BRAM and registesr (mostly for control overhead) which causes my design to be severely memory-bounded. But if I do the fan-out design, the way I unroll the loop cannot work out, it produces wrong output in hardware run. Do you have any idea how the loops should be unrolled?

Altera_Forum · ‎10-27-2017

--- Quote Start ---

There are some small code snippets in Altera's documents in the sections I mentioned above, but other than that, I do not know of any other public code showing the systolic array design.

--- Quote End ---

Intel's FPGA systolic array example is a controlled material(using public code may not able to get best performance as not optimized for FPGA), and in the event user wish to have a copy that need to contact Altera representative separately.

Regards,

CloseCL

(This message was posted on behalf of Intel Corporation)

Altera_Forum · ‎10-27-2017

Hi, Sir/madam,

may I ask who should I contact if I would like to request for a copy?

Regards,

Lancer Chiang

Altera_Forum · ‎10-27-2017

Hi Lancer,

You can contact our sales/FAE as NDA is required.

Thanks,

Regards,

CloseCL

(This message was posted on behalf of Intel Corporation)

Altera_Forum · ‎10-27-2017

Hi Sir/Madam,

Many thanks! Is the copy an OpenCL implementation?

Regards,

Lancer Chiang

Altera_Forum · ‎10-31-2017

Do you know any OpenCL systolic array design examples? (with code)