Hello! I have restarted some of my experiments on an Intel Haswell processor, and some of them have stopped working, namely the examples meant to run on the GPU.
My main question: when I call clCreateBuffer with the flag CL_MEM_USE_HOST_PTR, what does that flag actually do? I can see two possibilities:
1. I create the array on the host, and the GPU uses the same address as the host allocation. If so, then in theory I can compute something on the GPU and, once the kernel has finished (a synchronization point), the data written by the GPU should be in the region allocated on the host?
2. I create the array on the host, but clCreateBuffer still allocates a second memory region even though I am using CL_MEM_USE_HOST_PTR, and the data is implicitly copied between the host and device allocations.
The reason I am asking is mainly the L3 cache, which is shared between the CPU and the GPU. If both operate on the same address, then both can see the data in the L3 cache, so cooperation between the two may be possible.
Thanks
Hello Thom,
We have a dedicated OpenCL forum named 'Intel® SDK for OpenCL* Applications', where your question will be answered.
Thank you.
--
QIAOMIN.Q
Intel Developer Support
Please participate in our redesigned community support web site:
User forums: http://software.intel.com/en-us/forums/
On newer generations of Intel platforms, the host (CPU) and device (GPU) share the same physical memory. So when you create an OpenCL buffer with CL_MEM_USE_HOST_PTR, no copy is involved (no CPU->GPU transfer), so yes, (1) above is true. You can create the buffer on the CPU, modify it on the GPU, and then consume the result on the CPU, though you have to follow the OpenCL specification: map the buffer before you access it on the CPU again, and unmap it when you are done.
Raghu