Trouble passing a structure to a kernel

Hello,

I have a problem. I am writing a motion estimation algorithm using OpenCL, and it uses fairly complex nested structures like these:

HOST SIDE:

typedef struct ctu_info_t ctu_info_t;
typedef struct cu_info_t cu_info_t;
typedef struct thread_t thread_t;
struct cu_info_t
{
  cl_int array[100];
  /*and so on*/
};

struct ctu_info_t
{
  cu_info_t cu_info;
  cl_int array_1[1000];
  cl_int array_2[2000];
  /*and so on*/
};

struct thread_t
{
  ctu_info_t ctu_rd;
  /* other fields */
};

DEVICE SIDE:

typedef struct ctu_info_t ctu_info_t;
typedef struct cu_info_t cu_info_t;
typedef struct thread_t thread_t;
struct cu_info_t
{
  int array[100];
  /*and so on*/
};

struct ctu_info_t
{
  cu_info_t cu_info;
  int array_1[1000];
  int array_2[2000];
  /*and so on*/
};

struct thread_t
{
  ctu_info_t ctu_rd;
  /* other fields */
};

I have no trouble setting the kernel's arguments (the kernel function looks like this: "__kernel void my_func(ctu_info_t *ctu_info, thread_t *thread) { /*here is the code...*/ }"), but when I call "clEnqueueNDRangeKernel()" I get this:

[Attached screenshot: 1KUez26eo_w.jpg]

Can anybody help me solve this? I don't know what I did wrong.

2 Replies

Hello,

The crash is coming from nvopencl.dll, which looks like the NVIDIA OpenCL driver.


AFAIK, OpenCL is best served by passing plain arrays and building structs on the device. When structs do need to be sent between host and device, both sides have to use exactly the same alignment and size for every struct defined.

OpenCL versions <= 1.2 do not handle this for you; it is left to the mercy of the drivers and is the developer's responsibility.

Some of the rules I remember:

  • struct size = a power of 2 (add dummy fields as padding where necessary)
  • biggest fields (inner structs, arrays, vectors) at the top, smallest fields at the bottom
  • arrays of structs aligned to an address that is a multiple of the struct size
  • host-side structs packed the same way the device packs them
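
As a rough illustration of the first two rules, here is a minimal host-side sketch in C. Everything in it is an assumption made for illustration: the `demo_` names are hypothetical, `demo_int` stands in for `cl_int` (assumed to be a 32-bit signed integer) so that it compiles without the OpenCL headers, and the structs are cut-down versions of the ones in the question. It orders fields largest first, pads each struct to a power-of-two size with dummy fields, and uses compile-time checks so the build fails if the layout ever drifts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for cl_int (assumption: 32-bit signed integer). */
typedef int32_t demo_int;

/* Hypothetical cut-down inner struct: 400 bytes of data padded to 512. */
typedef struct demo_cu_info_t {
    demo_int array[100];   /* 400 bytes */
    demo_int pad[28];      /* 112 dummy bytes -> 512 total */
} demo_cu_info_t;

/* Hypothetical cut-down outer struct, biggest fields first. */
typedef struct demo_ctu_info_t {
    demo_int       array_2[2000]; /* 8000 bytes, offset 0     */
    demo_int       array_1[1000]; /* 4000 bytes, offset 8000  */
    demo_cu_info_t cu_info;       /*  512 bytes, offset 12000 */
    demo_int       pad[968];      /* 3872 dummy bytes -> 16384 total */
} demo_ctu_info_t;

/* Returns 1 if n is a non-zero power of two, 0 otherwise. */
int is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Compile-time layout checks: these fail the build if the compiler
   inserts unexpected padding or a field is moved. */
_Static_assert(sizeof(demo_cu_info_t) == 512, "cu_info size changed");
_Static_assert(sizeof(demo_ctu_info_t) == 16384, "ctu_info size changed");
_Static_assert(offsetof(demo_ctu_info_t, array_1) == 8000, "array_1 moved");
_Static_assert(offsetof(demo_ctu_info_t, cu_info) == 12000, "cu_info moved");
```

The device-side definitions would need the same field order and the same padding. One possible cross-check is to evaluate `sizeof` in a small kernel and compare the result with the host value before running the real kernel.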

Also, using pure arrays gives the best performance, since each work-item reads only the fields it needs from the arrays instead of pulling in whole structs, which use the memory banks sub-optimally.
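
A minimal sketch of the "pure arrays" idea on the host side, in C. The `demo_mv_t` struct and `split_fields` function are hypothetical names invented for this example and do not come from the original code. An array of structs is split into one plain array per field, so that each array could then go into its own buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-block record, e.g. a motion vector and its cost. */
typedef struct demo_mv_t {
    int32_t x;
    int32_t y;
    int32_t cost;
} demo_mv_t;

/* Split an array of n structs into three plain arrays, one per field.
   Each output array must have room for n elements. */
void split_fields(const demo_mv_t *in, size_t n,
                  int32_t *x, int32_t *y, int32_t *cost)
{
    for (size_t i = 0; i < n; ++i) {
        x[i]    = in[i].x;
        y[i]    = in[i].y;
        cost[i] = in[i].cost;
    }
}

/* A kernel taking the split arrays might then look like this
   (hypothetical signature, shown only as a comment):

   __kernel void my_func(__global const int *mv_x,
                         __global const int *mv_y,
                         __global int *mv_cost);
*/
```

With this layout there is no struct to keep in sync between host and device, and a kernel that only needs `cost` never touches `x` or `y`.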