Intel® oneAPI Data Parallel C++

## Matrix Multiplication via ND-Range and USM

```
// data allocation
int* w_device = malloc_device<int>(M * N, Q);
int* b_device = malloc_device<int>(M, Q);
int* temp = malloc_device<int>(SIZE, Q);   // scratch tile, one allocation for the whole device
int* x_shared = malloc_shared<int>(N * K, Q);
int* result_shared = malloc_shared<int>(M * K, Q);

// two events: copy data to device
auto e1 = ... cgh.memcpy(w_device...)..;
auto e2 = ..cgh.memcpy(b_device...)...;

// compute graph dependencies
cgh.depends_on({e1, e2});

// two for-loops in kernel function
// dimension: nd_range<2>{{M, K}, {1, GROUP_SIZE}}
int m = item.get_global_id(0);   // row of the result
int n = item.get_global_id(1);   // column of the result
int i = item.get_local_id(1);    // position inside the work-group

// first: tiled version, stages GROUP_SIZE elements of row m in temp
for (int k = 0; k < N; k += GROUP_SIZE) {
    temp[i] = w_device[m * N + k + i];   // w_device is a flat row-major array
    item.barrier();
    for (int kk = 0; kk < GROUP_SIZE; kk++) {
        sum += temp[kk] * x_shared[(k + kk) * K + n];
        item.barrier();
    }
}

// second: naive version, reads w_device directly
for (int k = 0; k < N; k++) {
    sum += w_device[m * N + k] * x_shared[k * K + n];
}

result_shared[m * K + n] = sum + b_device[m];
```

I am trying to learn SYCL and decided to write matrix multiplication from scratch. While testing my code I ran into some questions. I am not sure whether the problem is a bug in the toolchain or a mistake on my side, so I have posted the key part of my code.

With the first for-loop, the result occasionally contains wrong elements (only some of them, not all), while the second for-loop never gives a wrong result.

If I did not make a mistake, is there a problem with using item.barrier() to synchronize data accessed through USM?
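
One possible explanation, assuming `temp` really is a single `malloc_device` allocation indexed only by the local id as in the snippet above: `item.barrier()` synchronizes the work-items of one work-group only, so different work-groups (a different `m`, or a different block of `n`) can overwrite each other's `temp[i]` while it is still being read. The second loop never touches `temp`, which would be consistent with its results always being correct. The sketch below is a plain C++ host-side model, not SYCL device code, and the function names `matmul_naive` and `matmul_tiled` are hypothetical. It gives every emulated work-group its own tile and shows the structure under which the tiled scheme agrees with the naive loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive version, mirroring the "second" loop in the post:
// out[m][n] = sum_k w[m][k] * x[k][n] + b[m], all matrices row-major.
std::vector<int> matmul_naive(const std::vector<int>& w, const std::vector<int>& x,
                              const std::vector<int>& b, int M, int N, int K) {
    std::vector<int> out(static_cast<std::size_t>(M) * K);
    for (int m = 0; m < M; ++m) {
        for (int n = 0; n < K; ++n) {
            int sum = 0;
            for (int k = 0; k < N; ++k) {
                sum += w[m * N + k] * x[k * K + n];
            }
            out[m * K + n] = sum + b[m];
        }
    }
    return out;
}

// Host emulation of the "first" loop for nd_range<2>{{M, K}, {1, group_size}}.
// Assumption: N and K are multiples of group_size, as the post's indexing implies.
std::vector<int> matmul_tiled(const std::vector<int>& w, const std::vector<int>& x,
                              const std::vector<int>& b, int M, int N, int K,
                              int group_size) {
    assert(group_size > 0 && N % group_size == 0 && K % group_size == 0);
    std::vector<int> out(static_cast<std::size_t>(M) * K);
    for (int m = 0; m < M; ++m) {
        for (int g = 0; g < K / group_size; ++g) {  // one emulated work-group
            std::vector<int> tile(group_size);      // private to this work-group
            std::vector<int> sum(group_size, 0);    // one accumulator per work-item
            for (int k = 0; k < N; k += group_size) {
                // Phase 1: work-item i stages one element of row m.
                for (int i = 0; i < group_size; ++i) {
                    tile[i] = w[m * N + k + i];
                }
                // (a work-group barrier would go here in a real kernel)
                // Phase 2: every work-item reads the whole tile.
                for (int i = 0; i < group_size; ++i) {
                    const int n = g * group_size + i;
                    for (int kk = 0; kk < group_size; ++kk) {
                        sum[i] += tile[kk] * x[(k + kk) * K + n];
                    }
                }
                // (and a second barrier here, before the tile is overwritten)
            }
            for (int i = 0; i < group_size; ++i) {
                out[m * K + g * group_size + i] = sum[i] + b[m];
            }
        }
    }
    return out;
}
```

In an actual kernel, the per-group tile would normally live in work-group local memory, for example a `sycl::local_accessor<int, 1>` of `GROUP_SIZE` elements created in the command group, rather than in one buffer shared by the whole device. The barrier inside the inner `kk` loop could also move to just after that loop, since the tile is only rewritten at the next `k` step. Without the complete source, this diagnosis remains an assumption.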


Hi,

Could you please share the complete source code if possible?

Also, please specify your oneAPI Base Toolkit version and OS.

Thanks,

Rahul

Hi @SimonWang,

Just a quick reminder to share your complete source code.

Thanks,

Rahul

Hi,

I have not heard back from you, so I will go ahead and close this thread from my end. Feel free to post a new query if you require further assistance from Intel.

Thanks,

Rahul 