OpenCL Matrix Multiplication

Altera_Forum · ‎12-16-2015

Hi,

I was going through the Altera matrix multiplication example and couldn't fully understand the block concept it uses, especially the following lines of code:

A_local[local_y][local_x] = A[a + A_width * local_y + local_x];

B_local[local_x][local_y] = B[b + B_width * local_y + local_x];

These are local storage for a block of the input matrices A and B. Let's say the block size is 2 and local_x and local_y are both 0. In that case, A_local[0][0] will be A[0] and B_local[0][0] will be B[0] (assuming a and b are also 0). So it looks as if the local block matrices A_local and B_local each get only one element and not the four (2 x 2) I would expect. This might be a silly question, but I couldn't find any explanation of it online.
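To make the scenario concrete, here is a minimal plain-C sketch (not the Altera kernel itself). It assumes the usual OpenCL model in which every work-item of a work-group runs those two assignments once with its own (local_x, local_y). The two loops stand in for the work-items of one work-group. The helper name load_block and the BLOCK_SIZE value of 2 are my own assumptions for illustration.

```c
#define BLOCK_SIZE 2   /* assumed block size, as in the question above */

/* Hypothetical helper: emulates one work-group filling its local tiles.
   Each (local_y, local_x) iteration plays the role of one work-item
   executing the two assignments from the kernel exactly once. */
static void load_block(const float *A, int A_width, int a,
                       const float *B, int B_width, int b,
                       float A_local[BLOCK_SIZE][BLOCK_SIZE],
                       float B_local[BLOCK_SIZE][BLOCK_SIZE])
{
    for (int local_y = 0; local_y < BLOCK_SIZE; ++local_y) {
        for (int local_x = 0; local_x < BLOCK_SIZE; ++local_x) {
            /* Same indexing as the two kernel lines; note that B_local
               is written with its indices swapped (stored transposed). */
            A_local[local_y][local_x] = A[a + A_width * local_y + local_x];
            B_local[local_x][local_y] = B[b + B_width * local_y + local_x];
        }
    }
}
```

With a = b = 0 and 4-wide matrices, the iteration with local_x = local_y = 0 writes only A_local[0][0] and B_local[0][0], but the other three iterations fill the remaining entries, so after all four the tiles hold A[0], A[1], A[4], A[5] and the matching elements of B.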

Also, would a separate thread/core be allotted for each (local_x, local_y) pair? I assume that would be the case for all work-groups.

Let me know if you want me to post the whole code (it's available on the Altera website).