Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

idb parallel debugging "no main" problem

sdettrick
Beginner
Hi,

I am trying to debug a parallel program in idb, but the debugger cannot find the main program and I cannot set breakpoints.

I have compiled with mpiifort -g -debug all -check all, and am using these versions:

/opt/intel/mpi/3.0/bin64/mpiifort
/opt/intel/fce/10.0.023/bin/ifort
/opt/intel/idbe/10.0.023/bin/idb
/opt/intel/mpi-rt/3.0/bin64/mpirun

I try to invoke idb like this:

mpirun -idb -np 2 ../../test_MCdriver
or:
mpd
mpiexec -idb -n 2 ../../test_MCdriver

This opens idb in a separate window as expected. It finds two processes (focus=[0:1]), but I can't set a breakpoint in the main program or in my module procedure. If I let the program continue in idb by typing "c", the program successfully initializes MPI and runs for a bit before crashing on the error that I'm trying to fix. So in some sense the program is loaded; I just can't access its symbols in idb. The output of the session follows. Any help would be appreciated!

Thanks,
Sean

Intel Debugger for applications running on Intel 64, Version 10.0-29, Build 20070405
Attaching to program: /usr/bin/python2.4, process 21469
Reading symbols from /usr/bin/python2.4...(no debugging symbols found)...done.
[New Thread 47221900373968 (LWP 21469)]
__select_nocancel () in /lib64/libc-2.4.so

Info: Optimized variables show as when no location is allocated.
Continuing.
MPIR_Breakpoint () at mtv.c:101
No source file named mtv.c.
(idb)
[0:1] Intel Debugger for applications running on Intel 64, Version 10.0-29, Build 20070405
%1 [0:1] Attaching to program: /home/sdettrick/codes/converter/test_MCdriver, process [21478;21479]
[0:1] Reading symbols from /home/sdettrick/codes/converter/test_MCdriver...done.
%2 [0:1] [New Thread [47585757269312;47878512934208] (LWP [21478;21479])]
[0:1] __read_nocancel () in /lib64/libpthread-2.4.so

(idb)
[0:1] error: cannot return to function main
[0:1] No file.

(idb)
(idb)
(idb) break main
No symbol "main" in current context.
(idb)
(idb) break mc_init
No symbol "mc_init" in current context.
(idb)
[0:1] No symbol "mc_init" in current context.
[0:1] No symbol "mc_init" in current context.
[0:1] mc_init has no valid breakpoint address
[0:1] Breakpoint 2 (mc_init) pending

(idb) break mpi_stuff%%mc_init
No symbol "mpi_stuff" in current context.
break mpi_stuff%%mc_init
^
Unable to parse input as legal command or C expression.
(idb) set $cmdset="idb"
(idb)
(idb) stop in mpi_stuff%%mc_init
(idb)
[0:1] Symbol "mpi_stuff" is not defined.
[0:1] Symbol "mc_init" is not defined.
[0:1] mpi_stuff % %mc_init has no valid breakpoint address
[0:1] Warning: Breakpoint not set

(idb) stop in MPI_BCAST
(idb)
[0:1] [#3: stop in MPI_BCAST(...)]

(idb) c
(idb)
[1] Process has exited with status 153
[0] Thread received signal ABRT
[0] stopped at [ raise(...) 0x00002b476bba8aa5]

(idb)
(idb)





5 Replies
srimks
New Contributor II
Hello Sean.

A few suggestions:
(a) Try idb on a simple serial code and check whether it works.

(b) Then try the same with the parallel program, but before you start debugging it, check the following:
- Which MPI are you using?
- Was that MPI compiled with the -g option?


~BR
TimP
Honored Contributor III
Quoting - sdettrick

Attaching to program: /usr/bin/python2.4, process 21469
Reading symbols from /usr/bin/python2.4...(no debugging symbols found)...done.
[New Thread 47221900373968 (LWP 21469)]
__select_nocancel () in /lib64/libc-2.4.so

Info: Optimized variables show as when no location is allocated.
Continuing.
MPIR_Breakpoint () at mtv.c:101
No source file named mtv.c.


When you attach idb to python, it's hardly surprising that your C main() isn't present. Are you trying to debug python or your own C code?
sdettrick
Beginner
Quoting - tim18
When you attach idb to python, it's hardly surprising that your C main() isn't present. Are you trying to debug python or your own C code?


I am trying to debug MPI Fortran code.

The python in question is the Intel MPI mpiexec command. I don't know how to avoid using it when launching multiple processes.

Is it possible to break out of this python script, or continue through it, to arrive in my Fortran code?
sdettrick
Beginner
Quoting - srimks

Hello Sean.

A few suggestions:
(a) Try idb on a simple serial code and check whether it works.

(b) Then try the same with the parallel program, but before you start debugging it, check the following:
- Which MPI are you using?
- Was that MPI compiled with the -g option?


~BR

(a) Yes, it works.
(b) Intel MPI. My code is compiled with -g; I don't know about the library, which came as is.
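
One quick check outside the debugger is to look up the names idb cannot find in the executable's symbol table with nm. The following is only a sketch. It assumes the usual Linux name-mangling conventions (ifort: `<module>_mp_<procedure>_`, gfortran: `__<module>_MOD_<procedure>`, and `MAIN__` for the Fortran main program), which should be confirmed against the real nm output. The helper names `fortran_symbol_candidates` and `check_symbols` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any Intel tool.

# Print candidate linker names for a Fortran module procedure.
# ASSUMED conventions (verify against your own nm output):
#   ifort on Linux: <module>_mp_<procedure>_
#   gfortran:       __<module>_MOD_<procedure>
fortran_symbol_candidates() {
    mod=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    proc=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    printf '%s_mp_%s_\n' "$mod" "$proc"
    printf '__%s_MOD_%s\n' "$mod" "$proc"
}

# Report which candidates (plus MAIN__, the usual linker name of the
# Fortran main program) appear in the symbol table of executable $1.
check_symbols() {
    exe=$1
    symbols=$(nm "$exe") || return 1
    for cand in MAIN__ $(fortran_symbol_candidates "$2" "$3"); do
        if printf '%s\n' "$symbols" | grep -q -w -F -- "$cand"; then
            echo "found:   $cand"
        else
            echo "missing: $cand"
        fi
    done
}

# Usage: check_symbols ../../test_MCdriver mpi_stuff mc_init
```

A name reported as found only shows that the procedure was linked in. It does not prove that debug information for it is present, so this narrows the problem rather than settling it.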
TimP
Honored Contributor III
If you are using an MPI, such as Intel MPI, whose mpiexec or mpirun has an option specifically meant to start parallel debugging of your code, try it. Otherwise, set up the debugger environment (not trivial for idb, as it has some gotchas) and do something like
mpiexec -n 2 idb yourapp
which would likely attempt to open an idb GUI window for each process.
When you ask an ordinary debugger to start up and debug the MPI launcher, all you get is an attempt to do what you said, which may not be what you meant.
We have been waiting for months for promised training on idb, so I don't feel like an expert.
TotalView is the first-class route for MPI debugging; there are noises about a serious effort to set up the Allinea product specifically for Intel MPI.
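
The `mpiexec -n 2 idb yourapp` idea can be narrowed so that only one rank runs under the debugger while the others start normally. The following is a rough sketch, not a tested recipe. It assumes the launcher exports each process's rank in the `PMI_RANK` environment variable (check this for your MPI version, for example with `mpiexec -n 2 env`) and that the debugger takes the program name as its argument. `run_rank`, `DEBUG_RANK`, `DEBUGGER`, and the script name `debug_rank.sh` are hypothetical names chosen here.

```shell
#!/bin/sh
# Hypothetical wrapper: run the given command under a debugger on one
# MPI rank only, and run it directly on every other rank.
#   PMI_RANK   - rank of this process (ASSUMED to be set by the launcher)
#   DEBUG_RANK - rank to debug (default 0)
#   DEBUGGER   - debugger command (default idb); may contain several
#                words, e.g. "xterm -e idb"
run_rank() {
    rank=${PMI_RANK:-0}
    if [ "$rank" = "${DEBUG_RANK:-0}" ]; then
        # Unquoted on purpose so a multi-word DEBUGGER splits into words.
        ${DEBUGGER:-idb} "$@"
    else
        "$@"
    fi
}

# As the body of a script named debug_rank.sh, this would end with:
#   run_rank "$@"
# and be launched as:
#   mpiexec -n 2 ./debug_rank.sh ../../test_MCdriver
```

Keeping the debugger on a single rank avoids one window per process, at the cost of not seeing what the other ranks are doing.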