- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I am doing an OpenCL project of vector multiplication of VecA (M * 1) and VecB (1 * N) which produces a matrix MatC (M * N). I want to use a fan-out design which can support a 2-D processing engine array. Can I go like this to infer fan-our? :
https://alteraforum.com/forum/attachment.php?attachmentid=14222&stc=1 __kernel void matMult() { ...... # pragma unroll for(int x = 0; x < M; x++) {# pragma unroll for(int y = 0; y < N; y++) { MatC[x][y] += VecA[x] * VecB[y]; } } ...... } Any advice would be much appreciated!!Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You can achieve this type of systolic array design using the autorun kernel type and num_compute_units (Section 2.3 and 2.4 of Intel FPGA SDK for OpenCL Programming Guide). However, I would expect the same thing to be also achievable in a single kernel using loop unrolling, where the local memory buffers are automatically replicated by the compiler.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi, thanks for your reply! Do you know any OpenCL systolic array design examples? (with code)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
There are some small code snippets in Altera's documents in the sections I mentioned above, but other than that, I do not know of any other public code showing the systolic array design.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi, I tried the systolic array and it takes massive amount of BRAM and registesr (mostly for control overhead) which causes my design to be severely memory-bounded. But if I do the fan-out design, the way I unroll the loop cannot work out, it produces wrong output in hardware run. Do you have any idea how the loops should be unrolled?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
--- Quote Start --- There are some small code snippets in Altera's documents in the sections I mentioned above, but other than that, I do not know of any other public code showing the systolic array design. --- Quote End --- Intel's FPGA systolic array example is a controlled material(using public code may not able to get best performance as not optimized for FPGA), and in the event user wish to have a copy that need to contact Altera representative separately. Regards, CloseCL (This message was posted on behalf of Intel Corporation)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi, Sir/madam,
may I ask who should I contact if I would like to request for a copy? Regards, Lancer Chiang- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Lancer,
You can contact our sales/FAE as NDA is required. Thanks, Regards, CloseCL (This message was posted on behalf of Intel Corporation)- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Sir/Madam,
Many thanks! Is the copy an OpenCL implementation? Regards, Lancer Chiang- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Do you know any OpenCL systolic array design examples? (with code)
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page