I have been trying to perform panel-by-panel matrix multiplication at the block level. The pseudo algorithm is as follows:
__attribute__((reqd_work_group_size(1, 1, 1)))
__kernel void matmul_panel_fpga_cl(
    __global float *a,
    __global float *b,
    __global float *c,
    const int m,
    const int n,
    const int k,
    const int num_of_m_blocks,
    const int num_of_n_blocks)
{
    // Walk the blocks of panel A, M_STEP blocks at a time
    for (int mb = 0; mb < num_of_m_blocks; mb += M_STEP) {
        for (int it = 0; it < M_STEP; it++) {
            pack_a_matrix();
        }
        // Walk the blocks of panel B
        for (int bb = 0; bb < num_of_n_blocks; bb++) {
            pack_b_matrix();
            for (int ab = 0; ab < M_STEP; ab++) {
                pack_c_matrix();
                packed_matrix_multiply_c_a_b(); // multiply packed a by packed b into packed c
                return_pack_c();
            }
        }
    }
}
The above kernel works fine when the number of kernel invocations (in other words, the number of panels) is equal to num_of_n_blocks. When the two differ, packed_c contains garbage values. I call clFinish() after every invocation, but I do not understand how num_of_n_blocks is related to the number of kernel invocations.
I am using the Intel FPGA SDK for OpenCL, version 20.3.
Please help us to understand.
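For comparison, here is a minimal host-side sketch in plain C of the same loop nest, which could serve as a reference to check the FPGA output against. It is not the kernel from this post: the block size BS, the helpers pack_block and unpack_block, the function name matmul_panel_ref, the row-major layout, the loop over k blocks, and the choice to compute C = A * B (rather than accumulating into C) are all assumptions made for illustration. Unlike the pseudocode, it packs A per block rather than once per panel, and it clamps the last panel so that it never touches blocks beyond num_of_m_blocks when that count is not a multiple of M_STEP.

```c
#include <stddef.h>
#include <string.h>

#define BS 4      /* hypothetical block edge length (assumption) */
#define M_STEP 2  /* blocks of A handled per panel, as in the post */

/* Copy the BS x BS block at (row0, col0) of a row-major rows x cols matrix
   into a contiguous buffer; elements outside the matrix are packed as 0. */
static void pack_block(const float *src, int rows, int cols,
                       int row0, int col0, float *dst)
{
    for (int i = 0; i < BS; i++)
        for (int j = 0; j < BS; j++) {
            int r = row0 + i, c = col0 + j;
            dst[i * BS + j] =
                (r < rows && c < cols) ? src[(size_t)r * cols + c] : 0.0f;
        }
}

/* Write a packed BS x BS block back, skipping out-of-range elements. */
static void unpack_block(float *dst, int rows, int cols,
                         int row0, int col0, const float *src)
{
    for (int i = 0; i < BS; i++)
        for (int j = 0; j < BS; j++) {
            int r = row0 + i, c = col0 + j;
            if (r < rows && c < cols)
                dst[(size_t)r * cols + c] = src[i * BS + j];
        }
}

/* Reference C = A * B with A: m x k, B: k x n, C: m x n (all row-major). */
void matmul_panel_ref(const float *a, const float *b, float *c,
                      int m, int n, int k)
{
    int num_of_m_blocks = (m + BS - 1) / BS;
    int num_of_n_blocks = (n + BS - 1) / BS;
    int num_of_k_blocks = (k + BS - 1) / BS;
    float pa[BS * BS], pb[BS * BS], pc[BS * BS];

    for (int mb = 0; mb < num_of_m_blocks; mb += M_STEP) {
        /* Clamp the panel so a partial last panel stays in range. */
        int m_end = mb + M_STEP;
        if (m_end > num_of_m_blocks)
            m_end = num_of_m_blocks;

        for (int bb = 0; bb < num_of_n_blocks; bb++) {
            for (int ab = mb; ab < m_end; ab++) {
                memset(pc, 0, sizeof pc); /* start each C block from zero */
                for (int kb = 0; kb < num_of_k_blocks; kb++) {
                    pack_block(a, m, k, ab * BS, kb * BS, pa);
                    pack_block(b, k, n, kb * BS, bb * BS, pb);
                    for (int i = 0; i < BS; i++)
                        for (int j = 0; j < BS; j++)
                            for (int p = 0; p < BS; p++)
                                pc[i * BS + j] += pa[i * BS + p] * pb[p * BS + j];
                }
                unpack_block(c, m, n, ab * BS, bb * BS, pc);
            }
        }
    }
}
```

Running this with the same m, n, and k as the FPGA run and comparing the two results element by element should show which blocks of c differ, which may help narrow down whether the problem depends on the panel count or on num_of_n_blocks.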
- Tags:
- compiler
Hi @ash_apee,
Thank you for posting in the Intel community forum. I hope all is well.
I would recommend referring to the design example below.
My guess is that the loop structure is written incorrectly. Regarding the erroneous or anomalous behaviour you mention, please provide us with some screenshots of it so that we can better understand the situation.
Here is another good reference for matrix multiplication:
https://www.youtube.com/watch?v=dUET4OuXhL8
I hope that clarifies things.
Best Wishes
BB
Hi @ash_apee,
Greetings, just checking in to see whether you have any further doubts regarding this matter.
I hope we have clarified your doubts.
Best Wishes
BB
Hi there,
Thank you for your reply. Actually, we still do not understand why the code did not work with OpenCL on the FPGA while it works fine with OpenCL on the CPU. We have since changed our code, and it is now working fine.
Thank you.
Hi @ash_apee,
Great! Good to know that you managed to overcome that. As there are no further clarifications needed on this thread, it will be transitioned to community support, where you can get further help with any remaining doubts.
Thank you for the questions and, as always, it is a pleasure having you here.
Best Wishes
BB
