OpenCL* for CPU

Which one to use: clEnqueueWriteBuffer or clEnqueueMapBuffer?

Hello all,
Can you please suggest how I should proceed after this?
[cpp]
cl_image_format image;
// Set the image channel data type and channel order.
image.image_channel_data_type = CL_FLOAT;
image.image_channel_order = CL_RGBA;

cl_mem srcimg, dstimg;
// Create the 2D image and the destination buffer.
srcimg = clCreateImage2D(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, &image, 4, 4, sizeof(cl_float4)*4, input_data, &error);
dstimg = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, sizeof(cl_float4)*4*4, output_data, &error);
[/cpp]
Just for convenience, I have taken a float array of size 4x4 as the input image. Assume that "input_data" is not NULL. output_data is of type float*, and I haven't allocated any memory for it yet. I guess I should allocate memory for it using malloc. Correct me if I am wrong.
Should I use clEnqueueWriteBuffer or clEnqueueMapBuffer after the statements above? Please explain.
Yes, you need to allocate memory for output_data.

With clEnqueueWriteBuffer you are enqueuing a command to write to a cl_mem buffer object from host memory. You would use clEnqueueMapBuffer() to map a region of a cl_mem object into the host address space. Once you are done executing the kernel you can use the clEnqueueUnmapMemObject() call to unmap the mapped region.

So at this point you need to set your kernel arguments, enqueue the kernel, map the output buffer (on an in-order queue, a blocking map returns only after the kernel has finished), read the results, and unmap the output buffer.


Thanks for such a succinct reply. I will try it out.

I have one more query. Say I use CL_MEM_USE_HOST_PTR when creating the 2D image. Then nothing is copied to the device; instead, the GPU takes the mapped memory from clEnqueueMapBuffer, does the processing, and we can write the results to some other location.

On the other hand, if I use CL_MEM_COPY_HOST_PTR, it will create a copy of the data pointed to by the host pointer on the device (I guess it creates a separate copy, not just a cache). The processing is then done on the data that was copied to the device, and the results are copied back to the host. I hope I am correct so far.

How about this? It's just out of curiosity that I want to do it this way. I will use CL_MEM_USE_HOST_PTR, and even though the device can access the host memory, I want the GPU kernel to create a separate copy on the device (not using CL_MEM_COPY_HOST_PTR, because that copy is again done on the host itself). How can this be done?