Intel® oneAPI Math Kernel Library

oneMKL ScaLAPACK with gdb

GPU_AI
Novice

Hello.

I'm using oneMKL ScaLAPACK.

I wrote some code and managed to run it on my Linux machine with Sun Grid Engine.

The compile command is:

mpiicc FT_13.100.c -o FT_13.100.x -L/opt/intel/oneapi/mkl/2022.1.0/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_ilp64 -lpthread -lm -ldl -DMKL_ILP64

After compiling, I use the qsub command to run it on the compute nodes.

 

For one particular parameter I get a "free(): invalid pointer" error, so I want to use a debugger.

I tried Valgrind, but it doesn't seem to work properly.

I want to use gdb to find out where and why I get this error.

In particular, can you tell me how to run gdb when submitting the job through Sun Grid Engine?

It looks like the oneAPI gdb is already installed in "/opt/intel/oneapi/debugger/2021.6.0/gdb/" and its subfolders.

Thanks.

6 Replies
VarshaS_Intel
Moderator

Hi,

Thanks for posting in Intel Communities.

Could you please try the commands below:

 mpiicc -DMKL_ILP64 -I"${MKLROOT}/include" scalapack.cpp ${MKLROOT}/lib/intel64/libmkl_scalapack_ilp64.a -Wl,--start-group ${MKLROOT}/lib/intel64/libmkl_intel_ilp64.a ${MKLROOT}/lib/intel64/libmkl_sequential.a ${MKLROOT}/lib/intel64/libmkl_core.a ${MKLROOT}/lib/intel64/libmkl_blacs_intelmpi_ilp64.a -Wl,--end-group -lpthread -lm -ldl -g

mpirun -np 1 gdb-oneapi ./a.out

Alternatively, you can use the Intel Link Line Advisor to get the exact link line for your setup.

To debug your code, you need to compile it with the '-g' option.
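
Applied to the dynamic link line from the original question, a debug build might look like the sketch below. The -O0 flag and the fallback MKL path are assumptions added here for illustration (optimization is turned off so that source lines map cleanly in gdb); only -g is required by the advice above. The script compiles only when mpiicc and the source file are present, and otherwise prints the command it would run.

```shell
#!/bin/bash
# Debug build of the source file named in the question.
SRC=${SRC:-FT_13.100.c}

# Assumption: fall back to the MKL path from the question if MKLROOT is unset.
MKL_LIB=${MKLROOT:-/opt/intel/oneapi/mkl/2022.1.0}/lib/intel64

# -g adds debug symbols; -O0 is an assumption to keep stepping predictable.
CFLAGS="-g -O0 -DMKL_ILP64"
LIBS="-L$MKL_LIB -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_ilp64 -lpthread -lm -ldl"

OUT="${SRC%.c}.x"   # FT_13.100.c -> FT_13.100.x

if command -v mpiicc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # CFLAGS and LIBS are intentionally unquoted so they split into words.
    mpiicc $CFLAGS "$SRC" -o "$OUT" $LIBS
else
    echo "would run: mpiicc $CFLAGS $SRC -o $OUT $LIBS"
fi
```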

 

The screenshot below shows our run of an MKL ScaLAPACK example (/opt/intel/oneapi/mkl/2023.1.0/examples/c_mpi/scalapack/source/pcgetrf_example.c):

[Screenshot: VarshaS_Intel_0-1685638486462.png]

Could you also try this and let us know whether it runs? If not, please share your OS and processor details, a sample reproducer, and the steps you followed to investigate the issue.

 

Thanks & Regards,

Varsha

GPU_AI
Novice

Thanks for your reply.

You seem to suggest running gdb through the mpirun command.

There are three points I would like to clarify.

1) When I tried "mpirun -np 1 gdb-oneapi ./a.out", I got:

"[proxy:0:0@dollygo] HYD_spawn (../../../../../src/pm/i_hydra/libhydra/spawn/intel/hydra_spawn.c:151): execvp error on file gdb-oneapi (No such file or directory)"

Server OS : Rocky Linux release 8.5 (Green Obsidian)

Processor : Intel(R) Xeon(R) CPU X5660 @ 2.80GHz (processor for master node)

I think this is a problem with my $PATH environment variable. I searched online but was not able to fix it.

2) Can I also run gdb with multiple processes (for example -np 4)?

3) As I wrote in the question, I'd like to know whether there is a way to run gdb with the qsub command.
Do you think that "qsub mpirun -np 4 gdb-oneapi ./a.out" will work properly?
I think the qsub command will submit my job to the compute nodes of my Linux server.
(I hope to work with gdb on my master node, as shown in the screenshot.)

I would like some advice from an expert before I try it myself.

Thanks.

GPU_AI
Novice

After that, I searched further and tried

"mpirun -np 1 {gdbpath}/gdb-oneapi ./a.out".

Now I get a different error: "libipt.so.2: cannot open shared object file: No such file or directory".

 

When I run "ldd {gdbpath}/gdb-oneapi", I get:

 

linux-vdso.so.1 (0x000014962be77000)

libdl.so.2 => /lib64/libdl.so.2 (0x000014962942a000)

libpthread.so.0 => /lib64/libpthread.so.0 (0x000014962920a000)

libutil.so.1 => /lib64/libutil.so.1 (0x0000149629006000)

libm.so.6 => /lib64/libm.so.6 (0x0000149628c84000)

libipt.so.2 => not found

libiga64.so => not found

libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00001496288ef000)

libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00001496286d7000)

libc.so.6 => /lib64/libc.so.6 (0x0000149628312000)

/lib64/ld-linux-x86-64.so.2 (0x000014962bc4d000)

 

So I think I need to do some work to locate libipt.so.2 (and possibly libiga64.so too).

There is a similar thread with exactly the same error (https://community.intel.com/t5/Intel-C-Compiler/How-to-use-GDB/m-p/1164515).

Unfortunately, that thread ended without a confirmed solution.

I hope you can answer this question, as well as questions 2) and 3) from my post on 06-03-2023.

 

Thanks. 

GPU_AI
Novice

I was able to run gdb-oneapi after typing the following line:

 

source /opt/intel/oneapi/debugger/2021.6.0/env/vars.sh
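
One way to confirm that sourcing vars.sh resolved both problems (gdb-oneapi missing from PATH, and the unresolved libipt.so.2 and libiga64.so) is sketched below. The helper name missing_libs is hypothetical; it only filters ldd output for "not found" entries, so an empty result means every dependency was resolved.

```shell
#!/bin/bash
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output on stdin, so it can be tried without gdb-oneapi installed.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Path taken from this thread; adjust it to the installed debugger version.
VARS=/opt/intel/oneapi/debugger/2021.6.0/env/vars.sh
if [ -r "$VARS" ]; then
    . "$VARS"
fi

if command -v gdb-oneapi >/dev/null 2>&1; then
    # No output here means all libraries were found.
    ldd "$(command -v gdb-oneapi)" | missing_libs
else
    echo "gdb-oneapi is not on PATH" >&2
fi
```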

 

The only remaining question now is how to run gdb-oneapi with the "qsub" command.

The "-gtool" option looks promising.

Can you help me with this?

 

Thanks.

VarshaS_Intel
Moderator

Hi,

 

Thanks for the details.

 

>> Can I do this with multiple processors (like -np 4) even for running gdb?

To do this, you can use the -gtool option, which attaches the debugger to specific processes (or to all processes).

To attach to specific processes, use a command like the one below:

mpirun -n 4 -gtool "gdb-oneapi:2,3=attach" ./a.out

To attach to all processes, use a command like the one below:

mpirun -n 4 -gtool "gdb-oneapi:all=attach" ./a.out

For more information, please refer to the link below:

https://www.intel.com/content/www/us/en/docs/mpi-library/developer-guide-linux/2021-6/using-gtool-for-debugging.html
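
The two -gtool forms above differ only in the rank list, so a small wrapper can build the command for any selection. This is just a sketch: the function name gtool_cmd is hypothetical, and it only prints the command rather than running it, so it can be checked before use.

```shell
#!/bin/bash
# Build the mpirun command line for a -gtool debugging session.
# Usage: gtool_cmd <nprocs> <ranks> <executable>
#   <ranks> is a rank list such as "2,3", or "all", as in the examples above.
gtool_cmd() {
    printf 'mpirun -n %s -gtool "gdb-oneapi:%s=attach" %s\n' "$1" "$2" "$3"
}

# Print the command that attaches gdb-oneapi to ranks 2 and 3 of 4.
gtool_cmd 4 2,3 ./a.out
```

To run the printed command rather than just display it, it can be passed to the shell, for example with eval "$(gtool_cmd 4 2,3 ./a.out)".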

 

>>Do you think that "qsub mpirun -np 4 gdb-oneapi ./a.out" will work properly?

No. With qsub, we can only submit a script that runs the MPI job.

Example: qsub jobscript.sh

The above command produces output and error files once the job script has run.

Because debugging with gdb requires manual interaction, using qsub is not recommended for debugging.

Instead, you can start an interactive session on a compute node using qsub (for example: qsub -I) and then debug there with the commands above.
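
If an interactive session is not available, one alternative that was not suggested in this thread is to run gdb non-interactively, so that it needs no manual input and the backtrace ends up in the job's output file. The job script below is a sketch under several assumptions: the parallel environment name "mpi" is hypothetical and site-specific, the vars.sh path is the one mentioned earlier in this thread, and the standard gdb options -batch, -ex and --args are assumed to be accepted by gdb-oneapi. Note also that on Sun Grid Engine an interactive session is usually requested with qrsh or qlogin; qsub -I is the PBS form.

```shell
#!/bin/bash
# jobscript.sh: run every rank under gdb in batch mode and log a backtrace.
# The "#$" lines are Sun Grid Engine directives.
#$ -cwd
#$ -j y
#$ -o gdb_backtrace.log
# Assumption: "mpi" is a placeholder parallel environment name for this site.
#$ -pe mpi 4

EXE=${EXE:-./a.out}
NP=${NSLOTS:-4}   # SGE sets NSLOTS for parallel jobs; default to 4 otherwise

# -batch: exit when done; "run" starts the program; "bt" prints the call
# stack once the program stops (for example on the abort raised by free()).
GDB_ARGS="-batch -ex run -ex bt --args"

# Path taken from this thread; adjust it to the installed debugger version.
VARS=/opt/intel/oneapi/debugger/2021.6.0/env/vars.sh
if [ -r "$VARS" ]; then
    . "$VARS"
fi

if command -v mpirun >/dev/null 2>&1 && command -v gdb-oneapi >/dev/null 2>&1; then
    # GDB_ARGS is intentionally unquoted so it splits into separate options.
    mpirun -np "$NP" gdb-oneapi $GDB_ARGS "$EXE"
else
    echo "would run: mpirun -np $NP gdb-oneapi $GDB_ARGS $EXE"
fi
```

Submitted with qsub jobscript.sh, this writes each rank's gdb output to gdb_backtrace.log. When glibc detects the invalid free() it aborts the process, gdb stops on the resulting SIGABRT, and the bt command prints the call stack at that point.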

 

Thanks & Regards,

Varsha

 

VarshaS_Intel
Moderator

Hi,


We have not heard back from you. Could you please let us know if you have any other queries?


Thanks & Regards,

Varsha

