Software Archive
Read-only legacy content

IDB does not exit

Pierpaolo_M_
New Contributor I

Hi,

I am trying to use idb to debug a simple MPI program (Fortran) that writes the hostname of each rank, just to see how the debugger works, but I have a problem.

First, I set IDB_HOME, IDB_PARALLEL_SHELL, and MPIEXEC_DEBUG=1.
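Something like this (the IDB_HOME and IDB_PARALLEL_SHELL values below are only placeholders for the real installation-specific paths; MPIEXEC_DEBUG=1 is exact):

export IDB_HOME=/opt/intel/composer_xe_2013_sp1/bin/intel64   # placeholder: directory containing the idb binary
export IDB_PARALLEL_SHELL=/usr/bin/ssh                        # placeholder: remote shell used to reach the nodes
export MPIEXEC_DEBUG=1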

Then I try to start idb:

[cscppm59@imip15 MPI-INTEL]$ mpiexec.hydra -idb -f ./dbg.hosts -n 2 a.out 
mpiexec: idb -pid 17765 -mpi2 -parallel mpiexec.hydra
Intel(R) Debugger for applications running on Intel(R) 64, Version 13.0, Build [80.483.23]
Attaching to program: /opt/intel/impi/4.1.3.048/intel64/bin/mpiexec.hydra, process 17765
[New Thread 17765 (LWP 17765)]
Reading symbols from /opt/intel/impi/4.1.3.048/intel64/bin/mpiexec.hydra...done.
__select_nocancel () in /lib64/libc-2.12.so
Continuing.
MPIR_Breakpoint () at /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/pm/hydra/tools/debugger/debugger.c:24
No source file named /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/pm/hydra/tools/debugger/debugger.c.
(idb) 
   [0:1] Intel(R) Debugger for applications running on Intel(R) 64, Version 13.0, Build [80.483.23]
%1 [0:1] Attaching to program: /home/users/cscppm59/Prove/MPI-INTEL/a.out, process [17770;17771]
%2 [0:1] [New Thread [17770;17771] (LWP [17770;17771])]
   [0:1] Reading symbols from /home/users/cscppm59/Prove/MPI-INTEL/a.out...done.
   [0:1] MPIR_WaitForDebugger () at /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/debugger/dbginit.c:270
   [0:1] No source file named /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/debugger/dbginit.c.

(idb) 
   [0:1] error: cannot return to function main

(idb) 
   [0:1] Source file not found or not readable, tried...
   [0:1]     /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/debugger/dbginit.c
   [0:1]     ./dbginit.c
   [0:1]     /home/users/cscppm59/Prove/MPI-INTEL/dbginit.c

(idb)

Using the commands 'where' and 'up', I am able to set a breakpoint, and it works fine:

(idb) where
(idb) 
%3 [0:1] #0  0x00007f05ab36234b in MPIR_WaitForDebugger () at /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/debugger/dbginit.c:270
%4 [0:1] #1  0x00007f05ab3e1870 in MPIR_Init_thread (argc=0x0, argv=0x0, required=0, provided=0x7fff8b16f25c) at /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/init/initthread.c:733
%5 [0:1] #2  0x00007f05ab3ce290 in PMPI_Init (argc=0x0, argv=0x0) at /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/init/init.c:195
%6 [0:1] #3  0x00007f05ab99531f in mpi_init_ () in /opt/intel/impi/4.1.3.048/intel64/lib/libmpigf.so.4.1
   [0:1] #4  0x0000000000401140 in mpitest () at /home/users/cscppm59/Prove/MPI-INTEL/mpi.f:9

(idb) up
(idb) 
%7 [0:1] #1  0x00007f63613ad870 in MPIR_Init_thread (argc=0x0, argv=0x0, required=0, provided=0x7fff16c203dc) at /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/init/initthread.c:733
   [0:1] No source file named /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/init/initthread.c.

(idb) 
(idb) 
%8 [0:1] #2  0x00007f05ab3ce290 in PMPI_Init (argc=0x0, argv=0x0) at /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/init/init.c:195
   [0:1] No source file named /tmp/7b663e0dc22b2304e487307e376dc132.xtmpdir.nnlmpicl211.25617_32e/mpi4.32e.nnlmpibld11.20140124/dev/src/mpi/init/init.c.

(idb) 
(idb) 
%9 [0:1] #3  0x00007f636196131f in mpi_init_ () in /opt/intel/impi/4.1.3.048/intel64/lib/libmpigf.so.4.1

(idb) 
(idb) 
    [0:1] #4  0x0000000000401140 in mpitest () at /home/users/cscppm59/Prove/MPI-INTEL/mpi.f:9
    [0:1] 9	      call MPI_INIT(ierr)

(idb) b 15
(idb) 
    [0:1] Breakpoint 1 at 0x401316: file /home/users/cscppm59/Prove/MPI-INTEL/mpi.f, line 15.

(idb) c
(idb) 
    [0:1] Continuing.
    [0:1] 
    [0:1] Breakpoint 1, mpitest () at /home/users/cscppm59/Prove/MPI-INTEL/mpi.f:15
    [0:1] 15	      aus=0.87

(idb) where
(idb) 
    [0:1] #0  0x0000000000401316 in mpitest () at /home/users/cscppm59/Prove/MPI-INTEL/mpi.f:15

(idb) n 
(idb) 
    [0:1] 17	      do j=1,1

(idb) p aus
(idb) 
    [0:1] $1 = 0.870000005

(idb)

This is a simple program that contains no errors.
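For reference, mpi.f is essentially the following. This is only a minimal sketch: apart from the statements visible in the debugger output above (call MPI_INIT(ierr), aus=0.87, do j=1,1), the surrounding code is just representative of a program that prints the hostname of each rank.

      program mpitest
c     Minimal MPI test: every rank prints its host name.
c     Sketch only; lines not shown by idb above are illustrative.
      implicit none
      include 'mpif.h'
      integer ierr, rank, nprocs, j, hlen
      character*(MPI_MAX_PROCESSOR_NAME) hostname
      real aus

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

      aus=0.87

      do j=1,1
         call MPI_GET_PROCESSOR_NAME(hostname, hlen, ierr)
         write(*,*) hostname(1:hlen)
      end do

      call MPI_FINALIZE(ierr)
      end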

If I simply quit idb, the program exits normally:

(idb) q
imip15.ba.imip.cnr.it
imip15.ba.imip.cnr.it
[cscppm59@imip15 MPI-INTEL]$

Instead, if I continue, I get 'Program exited normally' just as with gdb, but then idb starts writing '(idb)' to the screen endlessly, like this:

(idb) c
(idb) 
    [0:1] Continuing.
imip15.ba.imip.cnr.it
imip15.ba.imip.cnr.it
    [0:1] Program exited normally.

(idb) (idb) (idb) [cscppm59@imip15 MPI-INTEL]$ (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb) (idb)

To stop this, I have to kill the process.

1) Is there a way to avoid this?

2) Is there a way to start an MPI parallel debug session in GUI mode?

Thanks 

Pierpaolo

 

Rob_Mueller-Albrecht

Dear Pierpaolo,

Q1:

Could you let me know which exact version of MPI you are using, and whether there is anything special about the mpiexec.hydra script you are using?

In principle, the syntax with which you invoke IDB with MPI looks correct, and I don't see anything obvious that should lead to the problem you observe. It may therefore be a product defect or a compatibility issue. Before drawing conclusions, though, I'd like to understand the details of your installed versions of all involved components.

As you may know, development on the IDB debugger is deprecated. Fortran Composer XE 2013 SP1 does include an Intel-built version of GDB with Fortran debug improvements over stock GDB. It may be worthwhile for you to have a look at it as well.

To start a new MPI job under the debugger's control:

  • If you use MPICH, enter the following command in a shell:

    mpirun -dbg=idb -np number_of_processes [ other_MPICH_options ] executable_filename [ application_arguments ]

  • If you use Intel® MPI 3.0 (a concrete example follows the list):

    mpiexec -idb -n number_of_processes [ other_Intel_MPI_options ] executable_filename [ application_arguments ]

  • If you use prun:

    idb [ idb_options ] -parallel `which prun` -n number_of_processes -N number_of_nodes [ other_prun_options ] application [ application_arguments ]
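For the Intel® MPI case, with the two-rank a.out from your post this would be roughly (host file and working-directory options omitted):

    mpiexec -idb -n 2 ./a.out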

Q2:

Sorry, there is no GUI-assisted MPI debugging available with the Intel(R) Debugger (IDB).

Thanks, Rob

 

Rob_Mueller-Albrecht

Dear Pierpaolo,

I just noticed that the problem you describe had already been identified and fixed on our side. We will check the latest status and let you know whether it is only a problem with your specific IDB version.

Thanks, Rob

Pierpaolo_M_
New Contributor I

Dear Rob,

Thanks for your reply. I am using Intel® Cluster Studio XE 2013 SP1 Update 1:

Intel® C++ Compiler XE 14.0 Update 2
Intel® Fortran Compiler XE 14.0 Update 2
Intel® Debugger 13.0 Update 1 (for Linux* OS only)
GNU* Project Debugger (GDB*) 7.5
Intel® Integrated Performance Primitives (IPP) 8.1
Intel® Threading Building Blocks (TBB) 4.2 Update 3
Intel® Math Kernel Library (MKL) 11.1 Update 2
Intel® MPI Library 4.1 Update 3
Intel® Trace Analyzer and Collector 8.1 Update 4
Intel® MPI Benchmarks 3.2 Update 4
Intel® Advisor XE 2013 Update 5
Intel® Inspector XE 2013 Update 9
Intel® VTune™ Amplifier XE 2013 Update 15

Thanks in advance for your time.

Pierpaolo

 

Rob_Mueller-Albrecht

Dear Pierpaolo,

According to our development team, the exit behavior should be normal if you use the regular mpiexec that we validated IDB against. They suspect that the issue is somehow related to the use of mpiexec.hydra, which we have not tested.

Can you try to see what happens if you use plain mpiexec?

Thanks, Rob

 

Pierpaolo_M_
New Contributor I

Dear Rob,

I tried using mpiexec instead of mpiexec.hydra; in this case too it works fine with gdb, while idb is not able to start at all, as you can see in the output below.

GDB

[cscppm59@imip15 MPI-INTEL]$ mpdboot -v -n 1 -f ./dbg.hosts -r ssh
running mpdallexit on imip15.ba.imip.cnr.it
LAUNCHED mpd on imip15.ba.imip.cnr.it  via  
RUNNING: mpd on imip15.ba.imip.cnr.it
[cscppm59@imip15 MPI-INTEL]$ mpiexec -gdb -n 2 -wdir /raid0/users/cscppm59/Prove/MPI-INTEL a.out 
0-1:  (gdb) r
0-1:  Continuing.
0:  Detaching after fork from child process 22324.
1:  Detaching after fork from child process 22323.
0-1:  imip15.ba.imip.cnr.it
0-1:  
0-1:  Program exited normally.
0-1:  (gdb) 0-1:  (gdb) q
0-1:  MPIGDB ENDING
[cscppm59@imip15 MPI-INTEL]$ 

IDB

[cscppm59@imip15 MPI-INTEL]$ mpdboot -v -n 1 -f ./dbg.hosts -r ssh
running mpdallexit on imip15.ba.imip.cnr.it
LAUNCHED mpd on imip15.ba.imip.cnr.it  via  
RUNNING: mpd on imip15.ba.imip.cnr.it
[cscppm59@imip15 MPI-INTEL]$ mpiexec -idb -n 2 -wdir /raid0/users/cscppm59/Prove/MPI-INTEL a.out 
/opt/intel//impi/4.1.3.048/intel64/bin/mpiexec:1269: RuntimeWarning: Python C API version mismatch for module mtv: This Python has API version 1013, module mtv has version 1012.
  import mtv
sh: xterm: command not found

In this last case, I need to kill the process manually to end it; otherwise it stays blocked like this. Ctrl-C is not able to stop the process.

Thanks again for your suggestions

Pierpaolo
