Hello.
I am developing a fault-tolerant communication layer (MPI-like), and it is currently slow.
Can I learn something about the internal architecture of Intel MPI, so that I can understand the basic principles of transferring large amounts of data over different networks?
Could I also contact the developers of the Intel MPI Library to share experiences?
Nicolay.
Hi Nicolay,
Since our implementation is proprietary, I doubt we'll be able to share any internal details with you. But I'll forward your request to the developers.
One thing I can recommend is taking a look at the MPICH source base, since Intel MPI is based on it. Furthermore, we work closely with and support the OpenFabrics Alliance, whose networking stack is also open source.
Hope this helps.
Regards,
~Gergana
Hi,
I have compiled espresso with Intel MPI and the MKL library, but I get a "Failure during collective" error, whereas the same code works fine with Open MPI.
Is there a problem with Intel MPI?
Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(2112)........: MPI_Bcast(buf=0x516f460, count=96, MPI_DOUBLE_PRECISION, root=4, comm=0x84000004) failed
MPIR_Bcast_impl(1670)...:
I_MPIR_Bcast_intra(1887): Failure during collective
MPIR_Bcast_intra(1524)..: Failure during collective
Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(2112)........: MPI_Bcast(buf=0x5300310, count=96, MPI_DOUBLE_PRECISION, root=4, comm=0x84000004) failed
MPIR_Bcast_impl(1670)...:
I_MPIR_Bcast_intra(1887): Failure during collective
MPIR_Bcast_intra(1524)..: Failure during collective
Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(2112)........: MPI_Bcast(buf=0x6b295c0, count=96, MPI_DOUBLE_PRECISION, root=4, comm=0x84000004) failed
MPIR_Bcast_impl(1670)...:
I_MPIR_Bcast_intra(1887): Failure during collective
MPIR_Bcast_intra(1524)..: Failure during collective
Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(2112)........: MPI_Bcast(buf=0x67183d0, count=96, MPI_DOUBLE_PRECISION, root=4, comm=0x84000004) failed
MPIR_Bcast_impl(1670)...:
I_MPIR_Bcast_intra(1887): Failure during collective
MPIR_Bcast_intra(1524)..: Failure during collective
Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(2112)........: MPI_Bcast(buf=0x4f794c0, count=96, MPI_DOUBLE_PRECISION, root=4, comm=0x84000004) failed
MPIR_Bcast_impl(1670)...:
I_MPIR_Bcast_intra(1887): Failure during collective
MPIR_Bcast_intra(1524)..: Failure during collective
[0:n125] unexpected disconnect completion event from [22:n122]
Assertion failed in file ../../dapl_conn_rc.c at line 1128: 0
internal ABORT - process 0
Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(2112)........: MPI_Bcast(buf=0x56bfe30, count=96, MPI_DOUBLE_PRECISION, root=4, comm=0x84000004) failed
MPIR_Bcast_impl(1670)...:
I_MPIR_Bcast_intra(1887): Failure during collective
MPIR_Bcast_intra(1524)..: Failure during collective
/var/spool/PBS/mom_priv/
Kindly help us resolve this.
Thanks
sanjiv
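The six error stacks in the log above differ only in the buffer address: each one reports the same MPI_Bcast call (count=96, MPI_DOUBLE_PRECISION, root=4, comm=0x84000004). For longer logs, grouping the failures by their call arguments makes that pattern easy to see. The following is a minimal Python sketch, assuming the log lines have exactly the format shown above; the helper name `summarize_bcast_failures` and the regular expression are illustrative and not part of Intel MPI.

```python
import re
from collections import Counter

# Matches the MPI_Bcast argument line of an error stack, in the format
# seen in the log above (assumption: other Intel MPI versions may differ).
BCAST_RE = re.compile(
    r"MPI_Bcast\(buf=(?P<buf>0x[0-9a-fA-F]+), count=(?P<count>\d+), "
    r"(?P<dtype>\w+), root=(?P<root>\d+), comm=(?P<comm>0x[0-9a-fA-F]+)\) failed"
)


def summarize_bcast_failures(log_text):
    """Count failed MPI_Bcast calls grouped by (count, datatype, root, comm).

    The buffer address is ignored on purpose, because it differs per process.
    """
    groups = Counter()
    for match in BCAST_RE.finditer(log_text):
        key = (
            int(match["count"]),
            match["dtype"],
            int(match["root"]),
            match["comm"],
        )
        groups[key] += 1
    return groups


# Two of the stacks from the log above, used as sample input.
SAMPLE_LOG = """\
Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(2112)........: MPI_Bcast(buf=0x516f460, count=96, MPI_DOUBLE_PRECISION, root=4, comm=0x84000004) failed
MPIR_Bcast_impl(1670)...:
I_MPIR_Bcast_intra(1887): Failure during collective
Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(2112)........: MPI_Bcast(buf=0x5300310, count=96, MPI_DOUBLE_PRECISION, root=4, comm=0x84000004) failed
"""

if __name__ == "__main__":
    for key, occurrences in summarize_bcast_failures(SAMPLE_LOG).items():
        print(occurrences, key)
```

If every failure falls into a single group, as it does here, the processes are all failing in the same broadcast, which suggests looking at the connection error reported next to the stacks (the `dapl_conn_rc.c` assertion) rather than at different call sites in the application.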
