I have a problem with allocation.
I need to allocate on the GPU a matrix whose total size is greater than the maximum size that can be allocated in a single buffer.
I solved it by allocating the matrix with the malloc_shared() function, but I was wondering whether it is possible
to do the same with SYCL buffers.
I assume that your GPU device is an iGPU (integrated GPU).
A point to note here is that an iGPU shares most of its memory with the host.
As per the OpenCL/SYCL standards, a single memory allocation on a device cannot exceed the device's maximum allocatable size (the value reported by the max_mem_alloc_size device info query). Since buffer/accessor memory allocation is part of the SYCL standard, a single buffer cannot exceed that limit either.
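One way to stay within the per-allocation limit while still using buffers is to split the matrix into row blocks and give each block its own buffer. The following is only a minimal sketch of the planning step in plain C++: the RowBlock type and the plan_row_blocks() helper are hypothetical names introduced for illustration, and the limit is passed in as a parameter (in a SYCL program it would typically come from device.get_info<sycl::info::device::max_mem_alloc_size>()).

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper type: one contiguous block of matrix rows that is
// meant to live in its own buffer.
struct RowBlock {
    std::size_t first_row;  // index of the first row in this block
    std::size_t num_rows;   // number of rows in this block
};

// Split a rows x cols matrix into row blocks whose size in bytes does not
// exceed max_alloc_bytes (assumed to be the device's per-allocation limit).
std::vector<RowBlock> plan_row_blocks(std::size_t rows, std::size_t cols,
                                      std::size_t elem_bytes,
                                      std::size_t max_alloc_bytes) {
    if (cols == 0 || elem_bytes == 0)
        throw std::invalid_argument("cols and elem_bytes must be non-zero");

    const std::size_t row_bytes = cols * elem_bytes;
    const std::size_t rows_per_block = max_alloc_bytes / row_bytes;
    if (rows_per_block == 0)
        throw std::invalid_argument("a single row exceeds the allocation limit");

    std::vector<RowBlock> blocks;
    for (std::size_t first = 0; first < rows; first += rows_per_block) {
        // The last block may hold fewer rows than the others.
        blocks.push_back({first, std::min(rows_per_block, rows - first)});
    }
    return blocks;
}
```

Each RowBlock could then back one sycl::buffer of num_rows * cols elements, with kernels submitted per block. Note that the total memory available on the device still bounds how many blocks can be resident at once.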
DPC++ provides "Unified Shared Memory (USM)", which started as an Intel extension on top of SYCL and is now part of SYCL 2020. With USM (malloc_shared()), the actual memory allocation takes place on the host and is shared between the host and the device. Hence it is possible to allocate memory that exceeds the device's maximum allocatable size (the limit being the total available host memory).
Here's a definition from the DPC++ book for shared allocations:
Shared allocations are allocations that are accessible on both the host and the device. In this regard they are very similar to host allocations, but they differ in that data can now migrate between host memory and device-local memory. This means that accesses on a device, after the migration has occurred, happen from much faster device local memory instead of remotely accessing host memory. Typically, this is accomplished through mechanisms inside the DPC++ runtime and lower-level drivers that are mostly hidden from the programmer.
Hope this helps.