Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

_GFX_offload weird behaviour

Mathieu_D

Hi,

I'm targeting Intel Graphics Technology and using the API-based offload interface for asynchronous offloading. As a first step, I'm trying to offload this loop:

for (int i = 0; i < size; i++){
  A[i] = i;
}

So I wrote this code:

__declspec(target(gfx_kernel))
void fill(int * A, int size){
  _Cilk_for(int i = 0; i < size; i++){
    A[i] = i;
  }
}

int main() {
  int N = 1024;
  int * A = malloc(sizeof(int) * N);

  _GFX_share(A,N);

  _GFX_offload((void*)fill, A, N);
  _GFX_wait(0,-1);

  _GFX_unshare(A);
  free(A);

  return 0;
}

This code compiles and runs, but only the first 780 elements of A are actually changed. I guess this is related to the maximum number of groups and threads, but the number seems odd to me (_GFX_get_device_hardware_thread_count() returns 336).

So I have two questions: why 780, and how can I write a kernel that I can call with

_GFX_offload((void *)fill, A, N);

and have it do what I want?

Thanks, and have a nice day,

Mathieu

1 Solution
Konstantin_B_Intel

_GFX_share takes a byte count, not an element count, so you should have written:

_GFX_share(A,sizeof(int)*N);
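To make the element-count versus byte-count distinction concrete, here is a minimal sketch in portable C++ that does not need the GFX runtime. The helpers bytes_for and elements_in are hypothetical names introduced only for illustration; they are not part of the GFX API.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the GFX API): number of bytes
// occupied by `count` elements of type T.
template <typename T>
constexpr std::size_t bytes_for(std::size_t count) {
    return count * sizeof(T);
}

// Hypothetical helper: how many whole elements of type T fit in `bytes` bytes.
template <typename T>
constexpr std::size_t elements_in(std::size_t bytes) {
    return bytes / sizeof(T);
}

// Sharing N ints requires N * sizeof(int) bytes.
static_assert(bytes_for<int>(1024) == 1024 * sizeof(int),
              "byte count scales with the element size");

// Round trip: the byte count for N ints covers exactly N ints.
static_assert(elements_in<int>(bytes_for<int>(1024)) == 1024,
              "byte count covers every element");
```

With such a helper, the share call from the original program would read _GFX_share(A, bytes_for<int>(N)); which is the same as the sizeof(int)*N form above. Passing N directly describes a region of only N bytes, which is smaller than the array whenever sizeof(int) is greater than 1.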
