Hello! I have restarted some of my experiments on an Intel Haswell processor, and some of them stopped working, namely the examples meant to run on the GPU.
My main question is: if I use clCreateBuffer with the flag CL_MEM_USE_HOST_PTR, what does that flag actually do?
1. I create the array on the host, and the GPU uses the same address that was given to the array allocated on the host. If so, then in theory I can compute something on the GPU, and when the kernel terminates (a synchronization point), the data written by the GPU should be inside the region allocated on the host?
2. I create the array on the host, but clCreateBuffer allocates a second memory region even though I am using CL_MEM_USE_HOST_PTR, and the data is implicitly copied between the host and device allocations.
The reason I am asking is mainly the L3 cache, which is shared between the CPU and the GPU. If both operate on the same address, then the data in the L3 cache can be seen by both, so cooperation between the two may be possible.
We have a dedicated OpenCL forum named 'Intel® SDK for OpenCL* Applications', where your question will be answered.
Intel Developer Support
On newer generations of Intel platforms, the host (CPU) and device (GPU) share the same physical memory. So when you create an OpenCL buffer with CL_MEM_USE_HOST_PTR, there is no copy involved (no CPU->GPU transfer), so yes, (1) above is true. You can create the buffer on the CPU, modify it on the GPU, and then consume the result on the CPU (though you have to follow the OpenCL specification and use map/unmap when you access the buffer back on the CPU).
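One caveat that is not stated in this thread: as far as I know, Intel's optimization guidance for processor graphics says the driver can only skip the copy when the host pointer is aligned to 4096 bytes and the buffer size is a multiple of 64 bytes (one cache line). Treat both constants as assumptions and check them against the documentation for your driver. Below is a minimal sketch in plain C of allocating a host array that meets those assumed conditions. The helper names `alloc_zero_copy_host` and `is_zero_copy_candidate` are hypothetical, made up for illustration, and are not part of OpenCL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed values from Intel's zero-copy guidance, not from this thread. */
#define ZERO_COPY_ALIGNMENT 4096u /* page size */
#define CACHE_LINE_SIZE 64u

/* Round n up to the next multiple of m. */
static size_t round_up(size_t n, size_t m) {
    assert(m > 0);
    return ((n + m - 1) / m) * m;
}

/* Hypothetical helper: allocate a zeroed, page-aligned host array.
 * The size is padded to a whole number of pages, which also makes it a
 * multiple of the cache line size. Returns NULL on failure; on success
 * *padded_bytes holds the size to pass to clCreateBuffer. Release with free(). */
static void *alloc_zero_copy_host(size_t bytes, size_t *padded_bytes) {
    size_t padded = round_up(bytes ? bytes : 1, ZERO_COPY_ALIGNMENT);
    void *p = aligned_alloc(ZERO_COPY_ALIGNMENT, padded); /* C11 */
    if (p != NULL) {
        memset(p, 0, padded);
        *padded_bytes = padded;
    }
    return p;
}

/* Hypothetical check: does (p, bytes) satisfy the assumed conditions? */
static int is_zero_copy_candidate(const void *p, size_t bytes) {
    return p != NULL && bytes > 0 &&
           ((uintptr_t)p % ZERO_COPY_ALIGNMENT) == 0 &&
           (bytes % CACHE_LINE_SIZE) == 0;
}
```

The returned pointer and padded size would then go to clCreateBuffer together with CL_MEM_USE_HOST_PTR. After the kernel has run, the CPU should read the results through the pointer returned by clEnqueueMapBuffer and release it with clEnqueueUnmapMemObject, rather than touching the original array directly while the GPU may still own it.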