Hello.
I'm using oneMKL ScaLAPACK.
I wrote some code and managed to run it on my Linux machine with Sun Grid Engine.
I compile it with:
mpiicc FT_13.100.c -o FT_13.100.x -L/opt/intel/oneapi/mkl/2022.1.0/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_ilp64 -lpthread -lm -ldl -DMKL_ILP64
After this I use the qsub command to execute it on the compute nodes.
For one particular parameter I get a "free(): invalid pointer" error, so I want to use a debugger.
I tried Valgrind, but it doesn't seem to work properly.
I want to use gdb to find out where and why I get this error.
In particular, could you tell me how to run gdb when submitting a job through Sun Grid Engine?
It looks like the oneAPI gdb is already installed in "/opt/intel/oneapi/debugger/2021.6.0/gdb/" and its subfolders.
Thanks.
Hi,
Thanks for posting in Intel Communities.
Could you please try the commands below:
mpiicc -DMKL_ILP64 -I"${MKLROOT}/include" scalapack.cpp ${MKLROOT}/lib/intel64/libmkl_scalapack_ilp64.a -Wl,--start-group ${MKLROOT}/lib/intel64/libmkl_intel_ilp64.a ${MKLROOT}/lib/intel64/libmkl_sequential.a ${MKLROOT}/lib/intel64/libmkl_core.a ${MKLROOT}/lib/intel64/libmkl_blacs_intelmpi_ilp64.a -Wl,--end-group -lpthread -lm -ldl -g
mpirun -np 1 gdb-oneapi ./a.out
Alternatively, you can use the Intel Link Line Advisor to get the exact link line for your setup.
To debug your code, you need to compile it with the '-g' option.
The screenshot below shows our run of an MKL ScaLAPACK example (/opt/intel/oneapi/mkl/2023.1.0/examples/c_mpi/scalapack/source/pcgetrf_example.c):
Could you please try this and let us know whether it runs? If not, please share your OS and processor details, a sample reproducer, and the steps you have followed to investigate the issue.
Thanks & Regards,
Varsha
Thanks for your reply.
It seems that you suggest running gdb through the mpirun command.
There are three points I want to clarify.
1) When I tried "mpirun -np 1 gdb-oneapi ./a.out" I got
"[proxy:0:0@dollygo] HYD_spawn (../../../../../src/pm/i_hydra/libhydra/spawn/intel/hydra_spawn.c:151): execvp error on file gdb-oneapi (No such file or directory)"
Server OS : Rocky Linux release 8.5 (Green Obsidian)
Processor : Intel(R) Xeon(R) CPU X5660 @ 2.80GHz (processor for master node)
I think it is a problem with the $PATH environment setting. I googled it but was not able to fix it.
2) Can I do this with multiple processors (like -np 4) even for running gdb?
3) As I wrote in the question, I'd like to know if there is a way to run gdb with the qsub command.
Do you think that "qsub mpirun -np 4 gdb-oneapi ./a.out" will work properly?
As I understand it, the qsub command submits my jobs to the compute nodes of my Linux server.
(I hope to work with gdb on my master node as shown in the screenshot.)
I would like some expert advice before I try it myself.
Thanks.
After that, I searched around and tried
"mpirun -np 1 {gdbpath}/gdb-oneapi ./a.out".
Now I get an error saying "libipt.so.2: cannot open shared object file: No such file or directory".
When I run "ldd {gdbpath}/gdb-oneapi" I get:
linux-vdso.so.1 (0x000014962be77000)
libdl.so.2 => /lib64/libdl.so.2 (0x000014962942a000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x000014962920a000)
libutil.so.1 => /lib64/libutil.so.1 (0x0000149629006000)
libm.so.6 => /lib64/libm.so.6 (0x0000149628c84000)
libipt.so.2 => not found
libiga64.so => not found
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00001496288ef000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00001496286d7000)
libc.so.6 => /lib64/libc.so.6 (0x0000149628312000)
/lib64/ld-linux-x86-64.so.2 (0x000014962bc4d000)
So, I think I need to do some work to locate libipt.so.2 (and probably libiga64.so too).
There is a similar thread with exactly the same error. (https://community.intel.com/t5/Intel-C-Compiler/How-to-use-GDB/m-p/1164515)
Unfortunately, that thread ended without any confirmed solution.
I hope you can answer this question as well as points 2) and 3) from my post on 06-03-2023.
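To see at a glance which libraries the loader cannot resolve, the ldd output above can be filtered for its "not found" entries. This is only a minimal sketch; missing_libs is a hypothetical helper name, not something shipped with oneAPI.

```shell
#!/bin/bash
# Hypothetical helper: read ldd output on stdin and print the names of
# libraries that the dynamic loader could not resolve.
missing_libs() {
    # Unresolved entries look like: "libipt.so.2 => not found"
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}
```

It could be used as `ldd {gdbpath}/gdb-oneapi | missing_libs`; for the output shown above this would list libipt.so.2 and libiga64.so.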
Thanks.
I was able to run gdb-oneapi after typing the following line:
source /opt/intel/oneapi/debugger/2021.6.0/env/vars.sh
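Sourcing vars.sh only changes the shell it is run in, so a new login or a batch job would need it again. A minimal sketch of a check that sources the environment script only when the tool is not already on PATH; ensure_tool is a hypothetical helper name, and it assumes the script can be sourced with no arguments.

```shell
#!/bin/bash
# Hypothetical helper: make sure a tool is on PATH, sourcing an
# environment script first if it is not found.
ensure_tool() {
    local tool=$1 vars=$2
    # Already available: nothing to do.
    command -v "$tool" >/dev/null 2>&1 && return 0
    # No readable environment script: report failure.
    [ -r "$vars" ] || return 1
    # Clear positional parameters so the sourced script sees no arguments.
    set --
    . "$vars" >/dev/null
    # Succeeds only if the tool is now found.
    command -v "$tool" >/dev/null 2>&1
}
```

For the setup described here the call would be `ensure_tool gdb-oneapi /opt/intel/oneapi/debugger/2021.6.0/env/vars.sh`.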
The only remaining question now is how to run gdb-oneapi with the "qsub" command.
The "-gtool" option looks promising.
Can you help me with this?
Thanks.
Hi,
Thanks for the details.
>> Can I do this with multiple processors (like -np 4) even for running gdb?
Yes. You can use the -gtool option, which attaches the debugger to specific processes (or to all processes).
To attach to specific processes (here, ranks 2 and 3), use a command like this:
mpirun -n 4 -gtool "gdb-oneapi:2,3=attach" ./a.out
To attach to all processes, use a command like this:
mpirun -n 4 -gtool "gdb-oneapi:all=attach" ./a.out
For more information, please refer to the below link:
>>Do you think that "qsub mpirun -np 4 gdb-oneapi ./a.out" will work properly?
No. With qsub, MPI jobs can only be run by submitting a job script.
Example: qsub jobscript.sh
This command runs the job script and writes the job's output and error streams to files.
Since debugging with gdb requires manual interaction, a non-interactive qsub job is not recommended for debugging.
Instead, you can request an interactive session on a compute node (example: qsub -I; on Sun Grid Engine, qlogin or qrsh typically serves this purpose) and then debug there using the commands above.
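If an interactive session is not available, one possible workaround is to run gdb non-interactively inside the job script, so that a backtrace ends up in the job's output file. The sketch below generates such a script. It rests on several assumptions: make_debug_job is a hypothetical helper, the parallel environment name "mpi" and the names gdb_debug and gdb_debug.log are placeholders for site-specific values, mpirun is assumed to be available in the job environment already, and gdb-oneapi is assumed to accept the standard GDB options -batch and -ex.

```shell
#!/bin/bash
# Hypothetical helper: print an SGE job script that runs every MPI rank
# under gdb in batch mode.
# Usage: make_debug_job <executable> <slots> [parallel_environment]
make_debug_job() {
    local exe=$1 slots=$2 pe=${3:-mpi}   # "mpi" is a placeholder PE name
    cat <<EOF
#!/bin/bash
#\$ -N gdb_debug
#\$ -cwd
#\$ -j y
#\$ -o gdb_debug.log
#\$ -pe ${pe} ${slots}

# Batch jobs may not inherit the login environment, so set up the debugger here.
source /opt/intel/oneapi/debugger/2021.6.0/env/vars.sh

# Each rank runs under its own gdb; bt prints a backtrace once a rank stops on a signal.
mpirun -np ${slots} gdb-oneapi -batch -ex run -ex bt ${exe}
EOF
}
```

For example, `make_debug_job ./FT_13.100.x 4 > jobscript.sh` followed by `qsub jobscript.sh` would submit a four-slot run. The output of all ranks is interleaved in gdb_debug.log, so this is better suited to locating a crash than to stepping through the code.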
Thanks & Regards,
Varsha
Hi,
We have not heard back from you. Could you please let us know if you have any other queries?
Thanks & Regards,
Varsha