I am trying to run SYCL code compiled with dpcpp, using the Level Zero backend. I wrote a very simple kernel to check that the node can execute code, but it hangs every time and never exits. Furthermore, the process enters the "D" (uninterruptible sleep) state, so it cannot be stopped by any kill signal.
Upon execution, the program prints "Driver does not support the 0x4905 PCI ID" before hanging and going into the "D" state. I've tried a few Iris nodes and I get the same issue every time.
My script is below, but the issue doesn't appear to be unique to this code.
We are trying to reproduce the issue on our end. Meanwhile, could you please try running the same program on a gen9 node? We were able to execute it on gen9 without errors. To request a gen9 node, please run the following command:
qsub -I -l nodes=1:gen9:ppn=2 -d .
Hope this helps
I am taking ownership of this issue and we are trying to reproduce it on DevCloud for further debugging. In the meantime, let me know if the suggestion above to try Gen9 was helpful.
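If the hang shows up again, it may help to confirm the "D" state from a second terminal and include it in your reply. The snippet below is only a minimal sketch, assuming a Linux node where /proc is readable; the helper name proc_state is made up for illustration and is not part of any DevCloud or oneAPI tooling.

```shell
#!/bin/sh
# Hypothetical helper: print the one-letter state of a PID
# (R = running, S = sleeping, D = uninterruptible sleep, Z = zombie, ...).
# It reads the "State:" line from /proc/<pid>/status using only shell builtins.
proc_state() {
    while read -r key val rest; do
        if [ "$key" = "State:" ]; then
            echo "$val"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Example: inspect the current shell. Replace $$ with the PID of the hung program.
pid=$$
echo "PID $pid state: $(proc_state "$pid")"
```

A "D" here means the process is blocked inside the kernel, typically in a driver call, which is why kill signals are not delivered until that call returns.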
I can confirm that this issue has been resolved and the application now executes without hanging. Could you try this on DevCloud again to confirm?