This post is a condensed duplicate of everything that happened on this post on the Codeplay forum.
I have an issue where my SYCL kernel hangs and never finishes after it has been computing for some time (~10 s). This only happens when sycl::gpu_selector_v is selected for the sycl::queue command queue. Using sycl::cpu_selector_v shows no issues.
The kernel only hangs when there is a lot of computation to be done (see the details about LOOP_ITERATION and N below). One detail that might be relevant: at first, just after launching the program, my laptop is barely usable because the integrated GPU is running at its maximum. After a few seconds, though, my laptop becomes usable again (as if it weren't computing anything anymore), but the program is still running (and will never stop). At that point, my CPU shows a usage of ~50% (when using sycl::gpu_selector_v) until I manually stop the program.
I managed to reproduce this issue on a simple example:
#include <sycl/sycl.hpp>
#include <iostream> // std::cout
#include <vector>

#define LOOP_ITERATION 10000000
#define N 1000000

int main()
{
    std::vector<float> v(N);
    sycl::queue q{sycl::gpu_selector_v};
    sycl::buffer buf{v};

    q.submit([&](sycl::handler& cgh)
    {
        auto acc {buf.get_access(cgh, sycl::read_write)};
        cgh.parallel_for(N, [=](sycl::id<1> id)
        {
            // Each work-item runs LOOP_ITERATION additions.
            // Note: i / 2 is an integer division, converted to float afterwards.
            float x = 0.0f;
            for (int i = 0; i < LOOP_ITERATION; i++)
                x += i / 2;
            acc[id] = x;
        });
    }).wait();

    std::cout << "Done!" << std::endl;
    return 0;
}
With the code posted above, I can only get the kernel to hang when N * LOOP_ITERATION is > 1 000 000 * 10 000 000. However, the kernel can still hang with lower LOOP_ITERATION (or N) values if we increase the complexity of the code inside the for (int i = 0; i < LOOP_ITERATION; i++) loop:
#define LOOP_ITERATION (10000000 / 10) // 10 times fewer iterations
#define N 1000000

cgh.parallel_for(N, [=](sycl::id<1> id)
{
    float x = 0.0f;
    for (int i = 0; i < LOOP_ITERATION; i++)
    {
        x += i / 2;
        float cosine = sycl::cos(sycl::sqrt(x));
        float sine = sycl::sin(x);
        float length = sycl::sqrt(cosine * cosine + sine * sine);
        x /= sycl::cos(length) * sycl::sin(length);
    }
    acc[id] = x;
});
With LOOP_ITERATION divided by 10, the kernel never hangs unless the LOOP_ITERATION loop becomes more computationally demanding.
Sometimes but not always, this runtime error is thrown:
terminate called after throwing an instance of 'sycl::_V1::runtime_error'
what(): Native API failed. Native API returns: -14 (PI_ERROR_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST) -14 (PI_ERROR_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST)
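The "terminate called" prefix indicates that the exception escaped main() uncaught. In SYCL 2020, sycl::exception derives from std::exception, so wrapping the submission in a try/catch should at least let the program report the failure and exit cleanly. The sketch below uses plain C++ and a hypothetical helper name (run_and_report); it does not address the hang itself.

```cpp
#include <exception>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical helper: runs `work` and reports any std::exception
// instead of letting it reach std::terminate.
// Returns true on success; on failure, stores e.what() in `error_message`.
template <typename Work>
bool run_and_report(Work&& work, std::string& error_message)
{
    try {
        work();
        return true;
    } catch (const std::exception& e) {
        error_message = e.what();
        std::cerr << "Kernel submission failed: " << error_message << '\n';
        return false;
    }
}
```

In the reproducer, the callable passed as `work` would be a lambda containing the q.submit(...).wait() call.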
I tried disabling the GPU hangcheck using this guide but to no avail. I also tried writing N to /sys/module/i915/parameters/enable_hangcheck and the solutions proposed here but none of this changed anything either…
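To confirm that the write to the hangcheck parameter actually took effect, the value can be read back before launching the kernel. This is a minimal sketch: the helper name read_first_token is hypothetical, and it assumes the file holds a single short token (such as N or Y).

```cpp
#include <fstream>
#include <string>

// Path taken from the post above.
const char* const kHangcheckPath = "/sys/module/i915/parameters/enable_hangcheck";

// Hypothetical helper: returns the first whitespace-delimited token of the
// file at `path`, or an empty string if the file cannot be opened or is empty.
std::string read_first_token(const std::string& path)
{
    std::ifstream in(path);
    std::string token;
    if (in)
        in >> token;
    return token;
}
```

Calling read_first_token(kHangcheckPath) at program start and printing the result shows which setting is in effect for that run.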
For my immediate use case (a ray tracing application where rendering an image with too many samples takes too long and hangs the kernel), I can get around this issue by launching multiple smaller kernels (fewer samples each), but this isn't really a satisfactory solution, more of a workaround.
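The workaround of launching several smaller kernels can be sketched as a batch planner that splits the total iteration (or sample) count into bounded pieces. The names IterationBatch, plan_batches, and max_per_launch are hypothetical, and the thread does not establish what batch size is safe; that would have to be found by experiment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One batch of loop iterations covering [begin, begin + count).
struct IterationBatch {
    std::int64_t begin;
    std::int64_t count;
};

// Split `total` iterations into batches of at most `max_per_launch` each.
// The last batch holds the remainder.
std::vector<IterationBatch> plan_batches(std::int64_t total,
                                         std::int64_t max_per_launch)
{
    assert(total >= 0 && max_per_launch > 0);
    std::vector<IterationBatch> batches;
    for (std::int64_t begin = 0; begin < total; begin += max_per_launch)
        batches.push_back({begin, std::min(max_per_launch, total - begin)});
    return batches;
}
```

Each batch would then be submitted as its own kernel that loops from begin to begin + count and accumulates into acc[id], waiting for completion before submitting the next one so that no single launch runs long enough to hang.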
If relevant, I attached the output of the execution of the `clinfo` command on my system.
Hi,
Thanks for posting on Intel communities.
We are working on it internally. We will get back to you soon.
Thanks & Regards,
Vankudothu Vaishnavi.
Hi,
Thanks for your patience and understanding.
Could you please try upgrading the GPU driver? A new version of oneAPI, 2024.0, will be available by the end of November. It would be great if you could test the new version and let us know whether the issue remains.
Thanks & Regards,
Vankudothu Vaishnavi.
Hi,
How can I upgrade my drivers? I'm not sure how to do it on Ubuntu 20.04. I tried adding a PPA following this post but it didn't change anything regarding my kernel execution.
Tom Clabault.
Hi,
Could you please try following the steps at the link below:
https://dgpu-docs.intel.com/driver/installation.html
Then try the new oneAPI 2024.0.0 version, which is due to be released this month.
Thanks & Regards,
Vankudothu Vaishnavi.
Hi,
I followed these installation instructions (for Intel client GPUs) on Ubuntu 20.04, which I am running, but my kernel still hangs.
So I just have to wait for the oneAPI version 2024.0.0 to be released then?
Hi,
The latest version, oneAPI 2024.0.0, is now available. Please download the Intel Base Toolkit from the following link:
https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html
After installing it, please test with the latest version and let us know if you are still facing the same issue.
Thanks & Regards,
Vankudothu Vaishnavi.
Hi,
I upgraded to oneAPI 2024.0.0, but even after recompiling an example that was hanging before the upgrade, it still hangs.
I'm afraid the upgrade to 2024.0.0 didn't solve my issue.
Hi,
Thanks for testing it on the latest oneAPI version (2024.0.0).
Could you please provide us with the following details?
- Output of sycl-ls
- Output of clinfo after upgrading the graphics driver
- A screenshot of the output after executing the code
Thanks & Regards,
Vankudothu Vaishnavi.
Hi,
Could you please share additional information with us?
Please set SYCL_PI_TRACE=2 and capture the output logs. You can find details about SYCL_PI_TRACE options at https://github.com/intel/llvm/blob/sycl/sycl/doc/EnvironmentVariables.md#sycl_pi_trace-options.
This would greatly help us.
Thanks & Regards,
Vankudothu Vaishnavi.
Hi,
Here is a pastebin of the output I get when running
SYCL_PI_TRACE=2 ./test_hang_2024.0
The last line of the Pastebin:
UR <--- UrQueue->executeAllOpenCommandLists()(UR_RESULT_SUCCESS)
is the output line I get before the program hangs. No more output is generated afterwards but the program is still running, hung up.
Tom
Hi,
We are working on your issue internally. We'll get back to you soon.
Thanks & Regards,
Vankudothu Vaishnavi.
Hi,
I have not heard back from you. Could you please give me an update?
Tom
