(bash) niwot.pts/13% mpirun -np 3 wrapper_ifort_omp_g.ex
wrapper: start
wrapper: start
wrapper: start
[niwot.cr.usgs.gov:21906] *** An error occurred in MPI_Comm_set_errhandler
[niwot.cr.usgs.gov:21906] *** on communicator MPI_COMM_WORLD
[niwot.cr.usgs.gov:21906] *** MPI_ERR_ARG: invalid argument of some other kind
[niwot.cr.usgs.gov:21906] *** MPI_ERRORS_ARE_FATAL (goodbye)
[niwot.cr.usgs.gov:21908] *** An error occurred in MPI_Comm_set_errhandler
[niwot.cr.usgs.gov:21908] *** on communicator MPI_COMM_WORLD
[niwot.cr.usgs.gov:21908] *** MPI_ERR_ARG: invalid argument of some other kind
[niwot.cr.usgs.gov:21908] *** MPI_ERRORS_ARE_FATAL (goodbye)
[niwot.cr.usgs.gov:21907] *** An error occurred in MPI_Comm_set_errhandler
[niwot.cr.usgs.gov:21907] *** on communicator MPI_COMM_WORLD
[niwot.cr.usgs.gov:21907] *** MPI_ERR_ARG: invalid argument of some other kind
[niwot.cr.usgs.gov:21907] *** MPI_ERRORS_ARE_FATAL (goodbye)
mpirun noticed that job rank 2 with PID 21908 on node niwot.cr.usgs.gov exited on signal 41 (Real-time signal 7).
These messages are not produced by the MPI application itself. When I execute under the TotalView debugger as follows, I get the impression that the job isn't attaching properly to the processes:
(bash) niwot.pts/13% mpirun -tv -np 3 wrapper_ifort_omp_g.ex
Copyright 2007-2009 by TotalView Technologies, LLC. ALL RIGHTS RESERVED.
Copyright 1999-2007 by Etnus, LLC.
Copyright 1999 by Etnus, Inc.
Copyright 1996-1998 by Dolphin Interconnect Solutions, Inc.
Copyright 1989-1996 by BBN Inc.
TotalView Technologies ReplayEngine
Copyright 2009 TotalView Technologies
ReplayEngine uses the UndoDB Reverse Execution Engine
Copyright 2005-2009 Undo Limited
Reading symbols for process 1, executing "mpirun"
Library /usr/bin/orterun, with 2 asects, was linked at 0x08048000, and initially loaded at 0x10000000
.
.
.
wrapper: start
wrapper: start
wrapper: start
Can't attach to group member - perhaps because the executable was not found: process not found
Couldn't attach to process 22141 in cluster 0, node 1 -- skipping it
Couldn't attach to process 22142 in cluster 0, node 1 -- skipping it
Couldn't attach to process 22143 in cluster 0, node 1 -- skipping it
[niwot.cr.usgs.gov:22131] [0,0,0]-[0,1,0] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
These messages are emitted after I attempt to attach to all processes in TotalView and then start the execution.
Is there something else, or something in addition, that I should be doing here?
-- Rich Naff
Hi Rich,
Thanks for getting in touch with us. Are you able to successfully run OpenMPI with the Intel Fortran Compiler for a simpler application, such as an MPI Hello World program, just to make sure that your installation of OpenMPI and the Intel Fortran Compiler is okay?
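A minimal Fortran sketch of such a test program (the file name hello.f90 is only an example; compile it with the mpif90 wrapper from your OpenMPI installation) would be along these lines:

program hello
  ! Minimal MPI test: each rank reports its rank and the total number of ranks.
  implicit none
  include 'mpif.h'
  integer :: ierr, rank, nprocs
  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  write (*, '(a,i0,a,i0)') 'Hello, world! I am ', rank, ' of ', nprocs
  call MPI_FINALIZE(ierr)
end program hello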
Additionally, have you compiled OpenMPI itself using the Intel Compilers? More information on how that's done is available here.
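If not, the usual approach is to point OpenMPI's configure script at the Intel compilers and rebuild; a minimal sketch (the install prefix below is only an example) is:

  ./configure CC=icc CXX=icpc F77=ifort FC=ifort --prefix=/opt/openmpi-intel
  make all install

Afterwards, make sure the mpif90 and mpirun wrappers from that prefix come first in your PATH.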
If you believe this might be an issue with the Intel Fortran Compilers, you can submit a request to the development team via the Intel Premier Support site.
Thanks and regards,
~Gergana
Gergana: My Hello program appears to be functioning:
(bash) niwot.pts/13% mpirun -np 3 hello.ex
Hello, world! I am 1 of 3
Hello, world! I am 2 of 3
Hello, world! I am 0 of 3
We have -not- compiled OpenMPI itself using the Intel Compilers; our version of OpenMPI is the one that comes with the Mandriva release. I can ask our system administrator to do so if you believe this to be the problem; please advise me accordingly.
--Rich
Gergana: Okay, we followed your suggestion and rebuilt the OpenMPI installation using the Intel compilers; the MPI application now works. Now I can move on to the TotalView testing.
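(As a quick sanity check after such a rebuild, OpenMPI's ompi_info utility lists the compilers the library was built with; running

  ompi_info | grep -i compiler

should now report the Intel compilers rather than the system GNU compilers.)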
Thanks, Rich
Hi Rich,
I'm glad to hear everything worked out. If you do have further problems, in addition to the OpenMPI resources Tim mentioned, I can also suggest visiting the Intel Fortran Compiler forums.
Regards,
~Gergana