Matrix Multiplication via ND-Range and USM

SimonWang · 02-09-2021

// data allocation
int* w_device = malloc_device<int>(M * N, Q);
int* b_device = malloc_device<int>(M, Q);
int* temp = malloc_device<int>(SIZE, Q);
int* x_shared = malloc_shared<int>(N * K, Q);
int* result_shared = malloc_shared<int>(M * K, Q);

// two events: copy data to the device
auto e1 = ... cgh.memcpy(w_device...)..;
auto e2 = ..cgh.memcpy(b_device...)...;

// compute graph dependencies
cgh.depends_on({e1, e2});

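// kernel body: the two for-loops below are alternatives (only one is used at a time)
// dimension: nd_range<2>{{M, K}, {1, GROUP_SIZE}}
int m = item.get_global_id(0);  // row of w / result
int n = item.get_global_id(1);  // column of x / result
int i = item.get_local_id(1);   // index within the work-group
int sum = 0;

// first: tiled version, staging a strip of row m of w in temp
for (int k = 0; k < N; k += GROUP_SIZE) {
  temp[i] = w_device[m * N + k + i];
  item.barrier();
  for (int kk = 0; kk < GROUP_SIZE; kk++) {
    sum += temp[kk] * x_shared[(k + kk) * K + n];
    item.barrier();
  }
}

// second: straightforward version, reading w directly
for (int k = 0; k < N; k++) {
  sum += w_device[m * N + k] * x_shared[k * K + n];
}

// add the bias and store the result
result_shared[m * K + n] = sum + b_device[m];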

I am trying to learn SYCL and decided to write matrix multiplication from scratch. While testing my code, I ran into some questions. I am not sure whether these problems are bugs or mistakes on my part, so I have posted the key parts of my code above.

With the first for-loop, the result occasionally contains wrong elements (only some of them, not all), while the second for-loop never gives a wrong result.
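One way to pin down which elements are wrong is to compare the device result against a plain host-side reference. The sketch below assumes the row-major layouts implied by the indexing in the posted code (w is M x N, x is N x K, b has M entries). The helper names `reference_matmul` and `count_mismatches` are hypothetical and not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: host-side reference for result = w * x + b,
// with w (M x N), x (N x K) and result (M x K) stored row-major.
std::vector<int> reference_matmul(const std::vector<int>& w,
                                  const std::vector<int>& x,
                                  const std::vector<int>& b,
                                  int M, int N, int K) {
  assert(w.size() == static_cast<std::size_t>(M) * N);
  assert(x.size() == static_cast<std::size_t>(N) * K);
  assert(b.size() == static_cast<std::size_t>(M));
  std::vector<int> out(static_cast<std::size_t>(M) * K, 0);
  for (int m = 0; m < M; ++m) {
    for (int n = 0; n < K; ++n) {
      int sum = 0;
      for (int k = 0; k < N; ++k) {
        sum += w[m * N + k] * x[k * K + n];
      }
      out[m * K + n] = sum + b[m];  // same bias handling as the kernel
    }
  }
  return out;
}

// Hypothetical helper: number of positions where the device result
// (for example the contents of result_shared) differs from the reference.
std::size_t count_mismatches(const int* device_result,
                             const std::vector<int>& reference) {
  std::size_t bad = 0;
  for (std::size_t idx = 0; idx < reference.size(); ++idx) {
    if (device_result[idx] != reference[idx]) ++bad;
  }
  return bad;
}
```

Running the kernel several times and logging the mismatch count (and, if useful, the mismatching indices) can show whether the errors cluster in particular rows or work-groups.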

If I did not make a mistake, is there a problem with using item.barrier() to synchronize data accessed through USM?
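One thing worth checking before suspecting the barrier itself: as posted, `temp` is allocated with `malloc_device`, so it lives in global device memory and the same `temp[0 .. GROUP_SIZE)` entries are written by every work-group, whereas `item.barrier()` only synchronizes work-items within a single work-group. If work-groups run concurrently, one group can overwrite the strip that another group is still reading, which would be consistent with occasional wrong elements. The plain C++ sketch below is a sequential emulation of that interleaving, not SYCL code. The function `tiled_row_dot` and its `interleave_other_group` flag are hypothetical names made up for illustration, and the sketch assumes N is a multiple of the group size.

```cpp
#include <cassert>
#include <vector>

// Sequential emulation (not SYCL): computes dot(w_row, x_col) strip by strip,
// staging each strip in `tile` the way the first loop stages it in temp.
// If interleave_other_group is true, a second "work-group" overwrites the
// same tile with its own row between the load and the compute phase,
// mimicking two groups that share one global buffer.
int tiled_row_dot(const std::vector<int>& w_row,
                  const std::vector<int>& other_row,
                  const std::vector<int>& x_col,
                  int group_size, bool interleave_other_group) {
  const int N = static_cast<int>(w_row.size());
  assert(group_size > 0 && N % group_size == 0);
  assert(other_row.size() == w_row.size() && x_col.size() == w_row.size());
  std::vector<int> tile(group_size);
  int sum = 0;
  for (int k = 0; k < N; k += group_size) {
    // Load phase of "our" group.
    for (int i = 0; i < group_size; ++i) tile[i] = w_row[k + i];
    // A work-group barrier would sit here, but it only orders our own group;
    // nothing stops another group from writing the shared buffer now.
    if (interleave_other_group) {
      for (int i = 0; i < group_size; ++i) tile[i] = other_row[k + i];
    }
    // Compute phase of "our" group.
    for (int kk = 0; kk < group_size; ++kk) sum += tile[kk] * x_col[k + kk];
  }
  return sum;
}
```

With a tile that no other group touches, the function returns the correct dot product; with the interleaved overwrite it silently returns a value based on the other row. In SYCL, the usual place for such a tile is work-group local memory (for example a `sycl::local_accessor` in SYCL 2020), so that each work-group gets its own copy.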

RahulV_intel · 02-11-2021

Hi,

Could you please share the complete source code if possible?

Also, specify your oneAPI base toolkit version and OS.

Thanks,

Rahul

RahulV_intel · 02-18-2021

Hi @SimonWang,

Just a quick reminder to share your complete source code.

Thanks,

Rahul

RahulV_intel · 02-25-2021

Hi,

I have not heard back from you. So, I will go ahead and close this thread from my end. Feel free to post a new query if you require further assistance from Intel.

Thanks,

Rahul