I have an MPI code that works fine on my Windows machine (VS2010). It has one master process that has used MPI_COMM_ACCEPT to accept a connection with another job that is running two MPI processes. This setup also works when the process runs on my Intel cluster node, as long as the accepted job has only one process. But when I try two, I get the message:
Internal Error: invalid error code 489e0e (Ring ids do not match) in MPIR_Barrier_impl:712
Fatal error in PMPI_Barrier: Other MPI error, error stack:
PMPI_Barrier(949).....: MPI_Barrier(comm=0x84000000) failed
MPIR_Barrier_impl(720): Failure during collective
MPIR_Barrier_impl(712):
I note that there have been some complaints of 'Ring ids do not match' for the latest MPICH2 release.
Any help would be appreciated.
I am running the Intel13 level of software. It is a Fortran code.
ifort version 13.0.1
MPI_INCLUDE=/opt/lic/intel13/impi/4.1.0.024/include64
LIBRARY_PATH=/opt/lic/intel13/impi/4.1.0.024/lib64:/opt/lic/intel13/composer_xe_2013.1.117/tbb/lib/intel64:/opt/lic/intel13/composer_xe_2013.1.117/mkl/lib/intel64:/opt/lic/intel13/composer_xe_2013.1.117/ipp/lib/intel64:/opt/lic/intel13/composer_xe_2013.1.117/compiler/lib/intel64
I have since also tried MPICH-3.0.2, with the same result.
Any ideas out there?
Dave
Hi Dave,
I don't think this will solve the problem, but try running both jobs with I_MPI_ADJUST_BARRIER=1. It is possible that the jobs are selecting different algorithms.
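As a rough sketch of what that could look like, assuming both jobs are launched from shells where the variable can be exported beforehand. The program names ./server and ./client are placeholders, not names from your setup, and the launch lines are left commented out:

```shell
# Pin the barrier algorithm so both jobs use the same one.
export I_MPI_ADJUST_BARRIER=1

# Hypothetical launch lines; ./server and ./client are placeholder names.
# Start the accepting job first, then the connecting job.
# mpirun -n 1 ./server &
# mpirun -n 2 ./client

# Confirm the value the launched processes would inherit.
echo "I_MPI_ADJUST_BARRIER=${I_MPI_ADJUST_BARRIER}"
```

If the two jobs start from different shells or nodes, the variable needs to be set in each environment separately.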
Do you have a reproducer you can share?
Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
