Intel® oneAPI Data Parallel C++
Support for Intel® oneAPI DPC++ Compiler, Intel® oneAPI DPC++ Library, Intel® DPC++ Compatibility Tool, and GDB*

Iris_xe_max nodes hang on dpcpp/sycl execution

Samantha3077
Beginner
1,457 Views

I am trying to run code written in SYCL, compiled with dpcpp and using the Level Zero backend. I wrote a very simple kernel to check that the node can execute code, but it hangs every time and never exits. The process also enters the "D" (uninterruptible sleep) state, so no kill signal can stop it.
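For anyone who wants to confirm the "D" state from code rather than from ps or top, here is a minimal sketch, assuming a Linux node where /proc/<pid>/stat is readable. The helper names (parse_proc_state, read_proc_state, is_uninterruptible) are hypothetical and are not part of SYCL or any Intel tool.

```cpp
#include <fstream>
#include <string>

// Extract the one-letter state field from a /proc/<pid>/stat line.
// The line has the form "1234 (comm) D 1 ..."; comm may itself contain
// spaces or parentheses, so we search for the LAST ')'.
// Returns '?' if the line cannot be parsed.
char parse_proc_state(const std::string& stat_line) {
    const std::size_t close = stat_line.rfind(')');
    if (close == std::string::npos || close + 2 >= stat_line.size()) {
        return '?';
    }
    if (stat_line[close + 1] != ' ') {
        return '?';
    }
    return stat_line[close + 2];
}

// Read the current state of a process by PID (Linux only).
// Returns '?' if the file is missing or unreadable.
char read_proc_state(int pid) {
    std::ifstream stat_file("/proc/" + std::to_string(pid) + "/stat");
    std::string line;
    if (!std::getline(stat_file, line)) {
        return '?';
    }
    return parse_proc_state(line);
}

// 'D' is uninterruptible sleep: the process is blocked inside the kernel
// and signals, including SIGKILL, are not acted on until it leaves that state.
bool is_uninterruptible(char state) {
    return state == 'D';
}
```

Calling read_proc_state with the PID of the hung program and passing the result to is_uninterruptible should report true while the process is stuck, which matches the fact that kill signals have no effect.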

 

Upon execution, the program prints "Driver does not support the 0x4905 PCI ID" before hanging and entering the "D" state. I've tried a few Iris Xe MAX nodes and get the same result every time.

 

My test program is below, but the issue doesn't appear to be specific to this code.

#include <iostream>
#include <CL/sycl.hpp>

#define N 512

void test_kernel(sycl::id<1> id, sycl::accessor<int, 1, sycl::access::mode::read> in, sycl::accessor<int, 1, sycl::access::mode::write> out) {
    unsigned i = id[0];
    out[i] = 2 * in[i];
}

int main(void) {
    int a[N];
    int b[N];
    for (int i = 0; i < N; i++) {
        a[i] = i;
    }
    auto queue = sycl::queue(sycl::gpu_selector());

    std::cout << "starting kernel" << std::endl;
    {
        sycl::buffer<int, 1> a_buff((int *) a, sycl::range(N));
        sycl::buffer<int, 1> b_buff((int *) b, sycl::range(N));
        queue.submit([&](sycl::handler& cgh) {
            auto a_acc = a_buff.get_access<sycl::access::mode::read>(cgh);
            auto b_acc = b_buff.get_access<sycl::access::mode::write>(cgh);
            cgh.parallel_for<class Tester>(sycl::range(N), [=](sycl::item<1> item) {
                test_kernel(item.get_id(), a_acc, b_acc);
            });
        });
    }
    queue.wait_and_throw();
    std::cout << "done" << std::endl;
    return 0;
}
5 Replies
Gopika_Intel
Moderator
1,420 Views

Hi,

Thank you for posting in Intel Communities. We are checking this with the internal team and will get back to you as soon as we have an update.

Regards

Gopika


Gopika_Intel
Moderator
1,397 Views

Hi,

We are trying to reproduce the issue on our end. Meanwhile, could you please try running the same program on a gen9 node? We were able to execute it on gen9 without errors. To request a gen9 node, please run the following command:

qsub -I -l nodes=1:gen9:ppn=2 -d .

Hope this helps

Regards

Gopika

 

Johny_P_Intel
Employee
1,284 Views

Hi Samantha,


I am taking ownership of this issue and we are trying to reproduce it on DevCloud for further debugging. In the meantime, let me know if the suggestion above to try Gen9 was helpful.


Regards,

Johny.



Johny_P_Intel
Employee
1,218 Views

Hi Samantha,


I can confirm that this issue has been resolved and the application now executes without hanging. Could you try it on DevCloud again to confirm?


Regards,

Johny



Gopika_Intel
Moderator
1,196 Views

Hi,

We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.

Regards

Gopika

