OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU.

Blocking clEnqueueWriteBuffer returns before value is written

Jakub_S_

Hi,

I've encountered a problem when using OpenCL on an Intel CPU:

When I do a blocking write (using clEnqueueWriteBuffer) to a buffer b using command queue q1, and right after that enqueue a blocking read (using clEnqueueReadBuffer) from buffer b using a different command queue q2, what I read is not what I've just written (garbage or the previous value). I was writing and reading a single integer value (4 bytes). I can read the correct value if I wait on the event associated with the clEnqueueWriteBuffer operation, or if I call clFinish() on q1 after clEnqueueWriteBuffer. The problem also does not occur when I use a single command queue.

This problem does not occur on my Intel iGPU, on the AMD platform (both CPU and dGPU), or on the NVIDIA platform (GPU).

Environment: 
* i7 6700K (but also Intel(R) Xeon(R) CPU E5-2680 v3)
* latest Intel SDK and latest drivers, but older drivers behave the same.
Michal_M_Intel

This behaviour is actually in line with the spec (though it is very, very confusing).

For a write-buffer operation, "blocking" guards the memory behind "ptr", not the update of the buffer itself. That is why additional synchronization (through an event or clFinish) is needed to make sure the buffer has been updated.

Here is a quote from the spec:

If blocking_write is CL_TRUE, the OpenCL implementation copies the data referred to by ptr and enqueues the write operation in the command-queue. The memory pointed to by ptr can be reused by the application after the clEnqueueWriteBuffer call returns.

A blocking clEnqueueWriteBuffer may actually be split into 2 steps:

- allocate temporary memory, copy the data behind "ptr" into it, and return from the blocking call.

- copy the data from the temporary memory into the buffer (this is done asynchronously).

The other way to implement clEnqueueWriteBuffer is to perform the actual data copy before leaving the call; that is what the GPU driver does.
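The two strategies can be illustrated with a toy model in plain C. This is not OpenCL and not the actual Intel runtime: toy_queue, toy_blocking_write, toy_finish and the eager flag are all made-up names, and the asynchronous copy is modelled as a deferred copy that only runs when toy_finish is called. It only shows why the host pointer is safe to reuse while the buffer may still hold the old value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_STAGING_BYTES 64

/* Hypothetical toy queue; NOT an OpenCL object. */
typedef struct {
    unsigned char staging[TOY_STAGING_BYTES]; /* runtime-owned temporary copy */
    size_t pending_bytes;                     /* 0 means nothing is queued */
    void *pending_dst;                        /* buffer to update later */
} toy_queue;

/* Step 2: the deferred copy from temporary memory into the buffer.
   Stands in for the asynchronous part that clFinish or an event wait
   would force to complete. */
static void toy_finish(toy_queue *q)
{
    if (q->pending_bytes != 0) {
        memcpy(q->pending_dst, q->staging, q->pending_bytes);
        q->pending_bytes = 0;
    }
}

/* Step 1: snapshot the host data and return.
   eager == 0: staged write, dst is untouched until toy_finish().
   eager != 0: the copy into dst completes before returning. */
static void toy_blocking_write(toy_queue *q, void *dst, const void *ptr,
                               size_t size, int eager)
{
    assert(size <= TOY_STAGING_BYTES);
    toy_finish(q);               /* keep earlier writes in order */
    if (eager) {
        memcpy(dst, ptr, size);
        return;
    }
    memcpy(q->staging, ptr, size);
    q->pending_dst = dst;
    q->pending_bytes = size;
}
```

In the staged mode, a read of the destination right after toy_blocking_write returns still sees the previous contents, which mirrors the stale value seen through the second command queue in the question.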

 

 

Jakub_S_

You're right. Thanks.
