Hi everyone,
I currently have a job running under MPI, written in Fortran 90 (compiled with the Intel compiler).
It uses a master-slave scheme, but the slaves don't communicate
with each other. Each one executes a task independent of the others.
The only difference between two tasks is a single parameter,
say k, which can take values from 1 to 20.
I have 6 nodes executing the program: one is the master and the other
five are the slaves. In every run I attempted (5 different jobs) the program never
stops running, which means it stays trapped in an infinite loop for some
reason. What I notice is that when k=15, the node in charge
of the task never returns its result to the master node; for some reason
it doesn't "execute":
Call MPI_Send( masterBuf, 3, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, ierrMPI )
I'm almost sure that the problem is not in the program that the node executes,
because I have an echo that tells me when the script files close (at the end of the program),
and it also appears for this bad process.
1 Reply
You'll need a good MPI-aware debugger, such as TotalView, and to run this under it to find out what's going on with the slave processes. Or perhaps put in a clever and useful set of WRITE(*,*) statements to determine what is going wrong with the logic in the handoff/return of the task.
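A rough sketch of the WRITE(*,*) approach, in case it helps. This is not the original program: the task loop, the TAG_WORK/TAG_STOP tags, the program name and the contents of masterBuf are all assumptions made for illustration. Only the MPI_Send call with tag 0 is taken from the question. The idea is to print and flush a line immediately before and after each handoff, so the last line printed by the rank that received k=15 shows whether it ever reached MPI_Send.

```fortran
program trace_handoff
  ! Hypothetical master/worker skeleton used only to show where the
  ! trace statements go; the real task code is not known.
  implicit none
  include 'mpif.h'

  integer, parameter :: NTASKS = 20        ! k runs from 1 to 20, as in the question
  integer, parameter :: TAG_RESULT = 0     ! tag used by the MPI_Send in the question
  integer, parameter :: TAG_WORK = 1       ! assumed tag: "here is a k to process"
  integer, parameter :: TAG_STOP = 2       ! assumed tag: "no more work"
  integer :: ierrMPI, rank, nprocs
  integer :: k, nextK, nActive, w, dummy
  integer :: masterBuf(3), status(MPI_STATUS_SIZE)

  call MPI_Init(ierrMPI)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierrMPI)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierrMPI)

  if (rank == 0) then
     ! Master: give every worker a first k, or stop it if none is left.
     nextK = 1
     nActive = 0
     dummy = 0
     do w = 1, nprocs - 1
        if (nextK <= NTASKS) then
           call MPI_Send(nextK, 1, MPI_INTEGER, w, TAG_WORK, MPI_COMM_WORLD, ierrMPI)
           nextK = nextK + 1
           nActive = nActive + 1
        else
           call MPI_Send(dummy, 1, MPI_INTEGER, w, TAG_STOP, MPI_COMM_WORLD, ierrMPI)
        end if
     end do
     ! Collect results; log each one so a missing k is obvious in the output.
     do while (nActive > 0)
        call MPI_Recv(masterBuf, 3, MPI_INTEGER, MPI_ANY_SOURCE, TAG_RESULT, &
                      MPI_COMM_WORLD, status, ierrMPI)
        w = status(MPI_SOURCE)
        write(*,*) 'master: got k =', masterBuf(1), ' from rank', w
        flush(6)   ! Fortran 2003 FLUSH; unit 6 is assumed to be stdout
        if (nextK <= NTASKS) then
           call MPI_Send(nextK, 1, MPI_INTEGER, w, TAG_WORK, MPI_COMM_WORLD, ierrMPI)
           nextK = nextK + 1
        else
           call MPI_Send(dummy, 1, MPI_INTEGER, w, TAG_STOP, MPI_COMM_WORLD, ierrMPI)
           nActive = nActive - 1
        end if
     end do
  else
     ! Worker: receive a k, run the task, report back, repeat until told to stop.
     do
        call MPI_Recv(k, 1, MPI_INTEGER, 0, MPI_ANY_TAG, MPI_COMM_WORLD, status, ierrMPI)
        if (status(MPI_TAG) == TAG_STOP) exit
        write(*,*) 'rank', rank, ': starting k =', k
        flush(6)
        ! ... the real task for this k would run here ...
        masterBuf = (/ k, rank, 0 /)   ! placeholder result
        write(*,*) 'rank', rank, ': before MPI_Send, k =', k
        flush(6)
        call MPI_Send(masterBuf, 3, MPI_INTEGER, 0, TAG_RESULT, MPI_COMM_WORLD, ierrMPI)
        write(*,*) 'rank', rank, ': after MPI_Send, k =', k
        flush(6)
     end do
  end if

  call MPI_Finalize(ierrMPI)
end program trace_handoff
```

If the "before MPI_Send" line for k=15 never shows up, the task itself is probably not returning. If it shows up but the master never logs that k, the receive side (source, tag or count) would be the next thing to check. The flush after each WRITE matters because buffered output from a hung rank can otherwise be lost.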
ron