I'm trying to verify that the available devices behave consistently. Instead of using the default selector in the vector addition example, I run the same function on each device in turn. At first it seems to work correctly for FPGA emulation, the Intel CPU and the Intel GPU. Then the Intel GPU is listed a second time and gives an incorrect result (zero), the Nvidia GPU also gives zero, and on the Intel CPU the program crashes. Attached are the source, the program output, the gdb and valgrind output, and the device information listed by the ComputeCpp implementation. The system is an up-to-date Ubuntu Eoan.
Command to build: mkdir build && cd build && cmake .. -DCMAKE_CXX_COMPILER=`which dpcpp` -G Ninja -DCMAKE_BUILD_TYPE=Debug && cmake --build .
- Tags:
- General Support
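Editor's sketch: when the same vector addition is run on several devices, a host-side check against a CPU reference makes the "all zeros" symptom described above easy to tell apart from other mismatches. The code below is plain C++ and only an illustration. `CheckResult` and `check_vector_add` are hypothetical names, not part of the attached source or of any SYCL API, and integer element types are an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical summary of one device's run; not part of any SYCL API.
struct CheckResult {
    bool ok;                // true when every element matches the host reference
    bool all_zero;          // true when a non-empty output contains only zeros
    std::size_t first_bad;  // index of the first mismatch; equals the size when ok
};

// Compare a device's output for c = a + b against a reference computed on the host.
CheckResult check_vector_add(const std::vector<int>& a,
                             const std::vector<int>& b,
                             const std::vector<int>& device_out) {
    if (a.size() != b.size() || a.size() != device_out.size()) {
        return CheckResult{false, false, 0};  // a size mismatch counts as a failure
    }
    CheckResult r{true, !device_out.empty(), device_out.size()};
    for (std::size_t i = 0; i < device_out.size(); ++i) {
        if (device_out[i] != 0) {
            r.all_zero = false;
        }
        if (r.ok && device_out[i] != a[i] + b[i]) {
            r.ok = false;
            r.first_bad = i;  // remember only the first mismatch
        }
    }
    return r;
}
```

An output that is entirely zero usually suggests the kernel never wrote to the buffer, whereas a partial mismatch points elsewhere. Calling the check once per device after the result has been copied back to the host keeps the comparison identical across devices.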
Hi,
Thanks for reaching out to us.
We are working on it and will get back to you.
-Abhishek
I gained access to the Intel DevCloud and modified the CMake file to work with an older version. The attached output on a node without a GPU is as expected.
On a node with a GPU the program gets stuck when submitting the vector addition task and produces no output until the job is killed with qdel.
On an FPGA the result is incorrect (zero), while FPGA emulation gives the correct result.
Hi,
We ran your code on our DevCloud and got the same output as you did.
Note that running code on "Intel(R) FPGA SDK for OpenCL(TM)" requires some additional steps; for details regarding FPGA, refer to the oneAPI Programming Guide. For other devices, the same flow of execution will give you the correct output.
We also got the correct output when running on the iGPU, so try re-executing the code and you should get the correct result.
Currently, NVIDIA GPUs are not supported by our toolkit; you will find updates on this soon.
The attachment shows the correct output on Intel(R) FPGA SDK for OpenCL(TM). The steps to run on Intel(R) FPGA SDK for OpenCL(TM) are:
- $ dpcpp -fintelfpga main.cpp -c -o mainfpga.o
- $ dpcpp -fintelfpga mainfpga.o -Xshardware
- $ ./a.out
You have to do all of this while on the fpga_runtime node.
Get back to us if you face any issues.
- Abhishek
Hi,
Regarding your first issue, we can see that you have two different drivers for the same device (iGPU). One of these drivers doesn't support our toolkit, which is why it's not giving the correct result.
- Abhishek
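Editor's sketch: one way to spot the situation described in this reply is to collect the name, driver version and platform of every listed device and flag names that appear more than once. The code below is plain C++ and only an illustration. `DeviceEntry` and `duplicated_devices` are hypothetical, and it assumes both drivers report the same device name, which may not hold in practice.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical record. In a SYCL 1.2.1 program the fields could be filled from
//   device.get_info<cl::sycl::info::device::name>()
//   device.get_info<cl::sycl::info::device::driver_version>()
//   device.get_platform().get_info<cl::sycl::info::platform::name>()
struct DeviceEntry {
    std::string name;
    std::string driver_version;
    std::string platform;
};

// Return the device names that are listed more than once, in sorted order.
std::vector<std::string> duplicated_devices(const std::vector<DeviceEntry>& devices) {
    std::map<std::string, int> count;
    for (const DeviceEntry& d : devices) {
        ++count[d.name];
    }
    std::vector<std::string> duplicates;
    for (const auto& kv : count) {
        if (kv.second > 1) {
            duplicates.push_back(kv.first);
        }
    }
    return duplicates;
}
```

Printing the driver version and platform next to each duplicated name would then show which entry belongs to which driver.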
Hi Abhishek,
Thank you for the explanation. What is the correct way to replace the default iGPU driver? The Nvidia result was not a surprise, but what is the correct way to determine explicitly, at run time, whether a given binary-driver-device combination is supposed to work?
I will try a different GPU node a bit later.
- Dmitry.
Hi Dmitry,
You can include an asynchronous exception handler in your code to check whether your code is running on a particular device. If you pass default_selector to the queue instead of passing platform IDs at each execution, it will automatically select your default iGPU, which has the highest score. You can embed the following code snippet into your code:
cl::sycl::queue queue(cl::sycl::default_selector{}, exception_handler);
-Abhishek
The GPU run hangs on the first node chosen when the queue is empty (job 448419.v-qsvr-1 right now), but completes successfully in 4 seconds on another node. Should I report it to the DevCloud-specific forum?
It didn't actually hang. It is waiting for a task to be assigned; if the queue is empty, it waits until it gets tasks to execute. After a threshold time it is killed automatically, so you need not worry about it.
Hi Dmitry,
I am closing this thread. We will make a note of your findings.
-Abhishek