OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU

Blocking clEnqueueWriteBuffer returns before value is written

Jakub_S_
Beginner

Hi,

I've encountered a problem when using OpenCL on an Intel CPU:

When I do a blocking write (using clEnqueueWriteBuffer) to a buffer b using command queue q1, and right after that I enqueue a blocking read (using clEnqueueReadBuffer) from buffer b using a different command queue q2, what I read is not what I've just written (some garbage or the previous value). I was writing and reading one integer value (4 bytes). I can read the correct value if I wait on the event associated with the clEnqueueWriteBuffer operation, or if I call clFinish() on q1 after clEnqueueWriteBuffer. Also, this problem does not occur when I use a single command queue.

This problem does not occur on my Intel iGPU, on the AMD platform (both CPU and dGPU), or on the NVIDIA platform (GPU).

Environment: 
* i7 6700K (but also Intel(R) Xeon(R) CPU E5-2680 v3)
* latest Intel SDK and latest drivers, but older drivers behave the same.
Michal_M_Intel
Employee

This behaviour is actually in line with the spec (though it is very confusing).

For clEnqueueWriteBuffer, "blocking" guards the host memory at "ptr", not the buffer update itself. That is why additional synchronization (through an event or clFinish) is needed to make sure the buffer has been updated.

Here is a quote from the spec:

If blocking_write is CL_TRUE, the OpenCL implementation copies the data referred to by ptr and enqueues the write operation in the command-queue. The memory pointed to by ptr can be reused by the application after the clEnqueueWriteBuffer call returns.

Blocking EnqueueWriteBuffer may actually be split into 2 steps:

- allocate temporary memory, copy the data under the "ptr", return from the blocking call.

- copy the data from temporary memory into the buffer. ( this is done asynchronously )
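The two steps above can be modelled with a small single-threaded toy in plain C. This is only an illustration of the idea, not the runtime's actual code: every name here (toy_queue_state, toy_blocking_write, toy_finish, toy_read_other_queue) is hypothetical, and the asynchronous second step is replaced by an explicit call so the behaviour is deterministic.

```c
#include <assert.h>
#include <string.h>

/* Toy model only; all names are hypothetical. */
typedef struct {
    int device_buffer; /* stands in for the contents of the OpenCL buffer */
    int staging;       /* temporary copy made during the blocking call */
    int pending;       /* 1 while the staged value has not reached the buffer */
} toy_queue_state;

/* Step 1: copy the data under ptr into temporary memory and return.
 * After this returns, the caller may reuse *ptr freely. */
void toy_blocking_write(toy_queue_state *s, const int *ptr)
{
    assert(s != NULL && ptr != NULL);
    memcpy(&s->staging, ptr, sizeof *ptr);
    s->pending = 1;
}

/* Step 2: copy from temporary memory into the buffer. In a real runtime this
 * would happen asynchronously; here it is triggered explicitly, playing the
 * role of clFinish or an event wait on the writing queue. */
void toy_finish(toy_queue_state *s)
{
    assert(s != NULL);
    if (s->pending) {
        s->device_buffer = s->staging;
        s->pending = 0;
    }
}

/* A read issued through a different queue sees only the buffer itself,
 * never the staging copy. */
int toy_read_other_queue(const toy_queue_state *s)
{
    assert(s != NULL);
    return s->device_buffer;
}
```

In this model, a toy_read_other_queue call made right after toy_blocking_write still returns the old buffer value, even though the host variable may already have been overwritten safely; the new value becomes visible only after toy_finish.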

The other way to implement clEnqueueWriteBuffer is to perform the actual data copy into the buffer before returning from the call; that is what the GPU driver does.


Jakub_S_
Beginner

You're right. Thanks.