What USM access is possible without usm_atomic_host/shared_allocations

MikeDB · ‎06-17-2024

Hi,

I'm working on the 11th gen integrated Iris Xe GPU. This GPU does *not* advertise the aspects aspect::usm_atomic_host_allocations or aspect::usm_atomic_shared_allocations.

Does this mean that any concurrent access at all from CPU and GPU to USM shared or host memory is unsafe, unportable or could crash?

Is it even safe to do concurrent atomic access between different SYCL kernel work-items to USM host or shared memory?

Or does it mean, more narrowly, that the specific atomic and fence guarantees in the SYCL 2020 spec are not supported? See https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#_interaction_with_host_code

One possible way for a workgroup to signal that its data has been processed and is ready to read from the CPU would be to synchronize all of the workgroup's work-items with a barrier and then, before the workgroup finishes, write a flag into the output data to show that the output is complete.

output_data = sycl::malloc_shared(); // output data for the kernel, with a separate range for each workgroup
// zero-initialise output_data before the kernel runs

const auto e = q.parallel_for(sycl::nd_range{global_range, local_range},
    [=](sycl::nd_item<2> idx)
    {
        // write N results into output_data starting at workgroup_offset

        // wait until every work-item in this workgroup has written its results
        sycl::group_barrier(idx.get_group());

        if (idx.get_group().leader())
        {
            // flag to the CPU that this workgroup has finished writing its results
            output_data[workgroup_offset + N] = 1;
        }
    });

Is this likely to crash, or could it fail to guarantee that the output_data results are visible to the CPU once the flag is set, leading to race conditions?

Alex_Y_Intel · ‎06-26-2024

It's true that the 11th generation integrated Iris Xe GPU does not support the SYCL aspects "aspect::usm_atomic_host_allocations" and "aspect::usm_atomic_shared_allocations", so there are indeed potential limitations or incompatibilities when relying on the behavior those aspects describe. Please check https://github.khronos.org/SYCL_Reference/iface/interaction-with-host-code.html and https://github.com/IntelPython/dpctl/issues/309
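
To illustrate what the flag pattern in the question depends on, here is a minimal host-only sketch in standard C++ (no SYCL). A producer thread stands in for one workgroup and the calling thread stands in for the CPU reader. The function name publish_and_consume and the variables results and ready are hypothetical and are not part of the original code. The point is that the flag write needs release ordering and the flag read needs acquire ordering before the plain result writes are guaranteed to be visible. In SYCL the closest counterpart would presumably be a sycl::atomic_ref with system scope, and as I read the SYCL 2020 spec, concurrent host/device access of that kind is exactly what the usm_atomic_* aspects advertise. On a device without them, a plain store to the flag as in the question should not be assumed to give this guarantee.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: the producer thread plays the role of one workgroup,
// the calling thread plays the role of the CPU that polls the flag.
long publish_and_consume(std::size_t n) {
    assert(n > 0);
    std::vector<int> results(n, 0);  // stands in for the workgroup's slice of output_data
    std::atomic<int> ready{0};       // stands in for output_data[workgroup_offset + N]

    std::thread producer([&] {
        for (std::size_t i = 0; i < n; ++i) {
            results[i] = static_cast<int>(i) + 1;  // plain, non-atomic result writes
        }
        // Release store: the writes above are visible to any thread that
        // observes ready == 1 through an acquire load.
        ready.store(1, std::memory_order_release);
    });

    // Acquire load: spin until the flag is set. After that, reading
    // results does not race with the producer's writes.
    while (ready.load(std::memory_order_acquire) == 0) {
        std::this_thread::yield();
    }
    long sum = std::accumulate(results.begin(), results.end(), 0L);

    producer.join();
    return sum;
}
```

Calling publish_and_consume(8) sums the values 1 through 8 after the flag has been observed. If the device does not report the usm_atomic_* aspects, the conservative option is to skip polling from the host entirely and only read output_data after the kernel's event has completed (for example after e.wait()).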