Intel® oneAPI Data Parallel C++
Support for Intel® oneAPI DPC++ Compiler, Intel® oneAPI DPC++ Library, Intel® DPC++ Compatibility Tool, and GDB*

Segfault when running with multiple GPUs

sjunior
Beginner
467 Views

Hi all,
I receive this error when I try to run my code with two distinct GPUs:

Abort was called at 1632 line in file
/opt/src/vpg-compute-neo/level_zero/core/source/cmdlist/cmdlist_hw.inl
Command terminated by signal 6

I am running this code on DevCloud.
When I run it with a CPU and a GPU selected, it runs successfully.

You can see my code in this GitHub repository:
https://github.com/sncimatec/rtm-domain-division

7 Replies
VarshaS_Intel
Moderator
440 Views

Hi,


Thanks for posting in Intel Communities.


>> I try to run my code with two distinct GPUs

Could you please provide us with the two distinct GPU details you are using?


Could you please let us know the Intel Compiler and its version? Could you please provide us with the steps to reproduce the issue at our end?


Thanks & Regards,

Varsha



sjunior
Beginner
434 Views

Hi @VarshaS_Intel , 

>> Could you please provide us with the two distinct GPU details you are using?
I run this code on the s013-n001 node in DevCloud.
When I run "sycl-ls", I see the following information:

u134150@s013-n001:~$ sycl-ls
[opencl:0] ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2021.13.11.0.23_160000]
[opencl:0] CPU : Intel(R) OpenCL 3.0 [2021.13.11.0.23_160000]
[level_zero:0] GPU : Intel(R) Level-Zero 1.1 [1.1.20495]
[level_zero:1] GPU : Intel(R) Level-Zero 1.1 [1.1.20495]

And when I run clinfo, the device name is: Intel(R) Graphics [0x020a]
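
As a quick check that both GPUs are visible to the runtime, the listing above can be counted with a small shell helper. This is only a sketch: the function name `count_level_zero_gpus` is hypothetical, and the pattern assumes the `[level_zero:N] GPU` line format that this version of sycl-ls prints.

```shell
#!/bin/sh
# Count Level Zero GPU entries in sycl-ls output read from stdin.
# The pattern assumes the "[level_zero:N] GPU" line format shown above.
count_level_zero_gpus() {
    grep -c '^\[level_zero:[0-9]*\] GPU'
}

# Demo on the listing from this post; on the node itself one would run
#   sycl-ls | count_level_zero_gpus
count_level_zero_gpus <<'EOF'
[opencl:0] ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2021.13.11.0.23_160000]
[opencl:0] CPU : Intel(R) OpenCL 3.0 [2021.13.11.0.23_160000]
[level_zero:0] GPU : Intel(R) Level-Zero 1.1 [1.1.20495]
[level_zero:1] GPU : Intel(R) Level-Zero 1.1 [1.1.20495]
EOF
```

For the listing in this post the helper prints 2, matching the two Level Zero GPU entries.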

>> Could you please let us know the Intel Compiler and its version? Could you please provide us with the steps to reproduce the issue at our end?

The compiler version is: 

u134150@s013-n001:~$ dpcpp --version
Intel(R) oneAPI DPC++/C++ Compiler 2022.0.0 (2022.0.0.20211123)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /nda/development-tools/versions/oneapi/2022.1.0.nda/oneapi/compiler/2022.0.1-prerelease/linux/bin-llvm

 

And to reproduce this issue, you can follow these steps: 

git clone https://github.com/sncimatec/rtm-domain-division

cd rtm-domain-division

git checkout gpu

cd lib/cwp

vim install.sh (comment out line 5, because DevCloud doesn't have the X11 package)

sh install.sh 

cd ../../build/3lay_mod 

sh run.sh 

 

These are all the instructions needed to reproduce the error I described.

Thanks for the help =D

 

 

 

VarshaS_Intel
Moderator
410 Views

Hi,


Thanks for providing the information.


Could you please provide us with the output when you are able to run without any errors?


Thanks & Regards,

Varsha



sjunior
Beginner
402 Views

Hi @VarshaS_Intel

Before anything else, I need to correct the instructions for running the code: I forgot one step. The correct flow is:

git clone https://github.com/sncimatec/rtm-domain-division

cd rtm-domain-division

git checkout gpu

cd lib/cwp

vim install.sh (comment out line 5, because DevCloud doesn't have the X11 package)

sh install.sh

cd ../../src/ 

make 

cd ../build/3lay_mod 

../mod_main par=input.dat (this is necessary to create the input data so that we can run the next script)

sh run.sh 

When the code runs to completion, it generates one file inside this folder:

dir.image

This file is 90 KB. It is an image and needs X11 to be displayed.
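
The flow above could also be captured in a single script, which makes it easier to see which step fails. This is a sketch, not a tested recipe. The `step` helper and the `RUN` variable are hypothetical names introduced here. The `sed` call is assumed to be equivalent to commenting out line 5 of install.sh by hand (it prefixes that line with `#`). The `cd rtm-domain-division` before `git checkout gpu` is added on the assumption that the checkout must run inside the clone.

```shell
#!/bin/sh
# Sketch of the reproduction flow described in this post.
# By default the steps are only printed; set RUN=1 to actually execute them.

# Print a command, and run it only when RUN=1. Stop at the first failure
# so the last printed line identifies the failing step.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@" || exit 1
    fi
}

reproduce() {
    step git clone https://github.com/sncimatec/rtm-domain-division
    step cd rtm-domain-division
    step git checkout gpu
    step cd lib/cwp
    # Assumed non-interactive equivalent of editing install.sh in vim:
    # comment out line 5 (the X11-dependent line mentioned in the post).
    step sed -i '5s/^/#/' install.sh
    step sh install.sh
    step cd ../../src
    step make
    step cd ../build/3lay_mod
    # Creates the input data that run.sh needs.
    step ../mod_main par=input.dat
    step sh run.sh
}

reproduce
```

Saved under a name of your choice, the script prints the eleven steps when run as is. Running it with `RUN=1` set in the environment executes them and stops at the first command that fails.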

 

VarshaS_Intel
Moderator
336 Views

Hi,


Thanks for providing the information.


Could you please let us know how you are trying to run the code on two different GPUs?  


Could you please let us know after running which step/command you are getting the error mentioned in the original post? And also, could you please confirm whether you are facing an issue only when running in this particular node "s013-n001" in NDA Devcloud? 
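
One way to narrow this down is to run the final step once per GPU, with only that GPU visible to the runtime. The sketch below only prints the command lines. `SYCL_DEVICE_FILTER` is the DPC++ runtime's device filter variable in the 2022 releases, using the form `backend:device_type:device_num`. The helper name `per_gpu_commands` is hypothetical, and whether the application runs meaningfully with a single visible GPU is an assumption.

```shell
#!/bin/sh
# Print one run.sh invocation per Level Zero GPU index, each restricted
# to a single device through SYCL_DEVICE_FILTER.
# $1: number of Level Zero GPUs reported by sycl-ls.
per_gpu_commands() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "SYCL_DEVICE_FILTER=level_zero:gpu:$i sh run.sh"
        i=$((i + 1))
    done
}

# Two GPUs are listed by sycl-ls on s013-n001.
per_gpu_commands 2
```

If each single-GPU run completes and only the two-GPU run aborts, that would point at the multi-device path rather than at either device on its own.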


Thanks & Regards,

Varsha



VarshaS_Intel
Moderator
319 Views

Hi,


We have not heard back from you. Could you please provide us with the details mentioned in the previous reply?


Thanks & Regards,

Varsha


VarshaS_Intel
Moderator
284 Views

Hi,


We have not heard back from you. This thread will no longer be monitored by Intel. If you need additional information, please post a new question.


Thanks & Regards,

Varsha

