
Buggy MPICH 3.3b2 used in Parallel Studio 2019 initial release

Hi,

I just realized that the Parallel Studio 2019 initial release is using MPICH 3.3b2, which is a buggy release, as reported here: https://lists.mpich.org/pipermail/discuss/2018-April/005447.html

I confirm that the tag upper limit initialization is not fixed in this release (see mpich commit: c597c8d79deea22), and this is a problem for all PETSc users. What a bad choice for an official release!!! Please consider releasing with MPICH 3.3b3 instead, or no MPI at all!

Eric
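P.S. The tag upper limit in question is the MPI_TAG_UB attribute on MPI_COMM_WORLD; here is a minimal sketch of how PETSc-style code queries it (illustrative only, not PETSc's actual code):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int *tag_ub = NULL, flag = 0;
    MPI_Init(&argc, &argv);
    /* MPI_TAG_UB is a predefined attribute; the standard guarantees a value of at least 32767. */
    MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, &tag_ub, &flag);
    if (flag && tag_ub)
        printf("MPI_TAG_UB = %d\n", *tag_ub);
    else
        printf("MPI_TAG_UB attribute not set correctly (the symptom the mailing-list thread describes)\n");
    MPI_Finalize();
    return 0;
}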
James_T_Intel
Moderator

Eric,

We actually don't include MPICH in Intel® Parallel Studio XE.  We include our own implementation, Intel® MPI Library.  Can you please confirm which MPI you are using?

mpirun -v

Our engineering team has implemented a fix for the bug mentioned above, and suspects you may be encountering a different problem.

James_T_Intel
Moderator

Can you also provide the exact configuration and the tests that are failing?  Our engineering team ran the test reported on the mpich mailing list and did not get any errors.

$ mpirun -n 2 src/ksp/ksp/examples/tutorials/ex2
Norm of error 0.000411674 iterations 7
$ I_MPI_FABRICS=ofi mpirun -n 2 src/ksp/ksp/examples/tutorials/ex2
Norm of error 0.000411674 iterations 7

 


Ok, give me some time to re-install everything since I completely wiped out everything...

I may only get back to you on Oct. 15, since I am out next week...

Eric

 


BTW,

please help me understand:

Why is the mpi.h file that is distributed the MPICH one?

This makes it impossible to identify the MPI flavor at *compile time*! (see https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/topic/797705)

Thanks!

Eric
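P.S. The usual workaround is to key off implementation-specific macros in mpi.h, along these lines (a rough sketch; it assumes the macros below are actually defined by the headers you have, which is exactly what is in question here, and I_MPI_VERSION in particular may or may not be present):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
/* Check Intel MPI first: its mpi.h is MPICH-derived, so the MPICH macros may also be defined. */
#if defined(I_MPI_VERSION)
    printf("Intel MPI detected (I_MPI_VERSION is defined)\n");
#elif defined(OPEN_MPI)
    printf("Open MPI %d.%d.%d\n", OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION, OMPI_RELEASE_VERSION);
#elif defined(MPICH_VERSION)
    printf("MPICH %s\n", MPICH_VERSION);
#else
    printf("Unknown MPI flavor\n");
#endif
    MPI_Finalize();
    return 0;
}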

James_T_Intel
Moderator

I'm checking with our engineering team regarding the information in your other thread.


Ok, I reinstalled everything...

To confirm the version I used:

-bash-4.4$ which mpirun
/opt/intel/compilers_and_libraries_2019.0.117/linux/mpi/intel64/bin/mpirun

-bash-4.4$ mpirun -v

[mpiexec@thorin] main (../../../../../src/pm/hydra2/mpiexec/mpiexec.c:1645): assert (pg->total_proc_count * sizeof(int)) failed

-bash-4.4$ mpirun -V

Intel(R) MPI Library for Linux* OS, Version 2019 Build 20180829 (id: 15f5d6c0c) Copyright 2003-2018, Intel Corporation.

Also, I tried the PETSc example, and it works! :/

So my problem is different from the one I thought...

I have an error returned by a "simple" MPI_Isend. I am trying to produce an MWE to give you.

Eric


Ok,

this is still a "tag" bug.

I have a simple MWE attached, which gives the following explicit error:

Abort(67744516) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Irecv: Invalid tag, error stack:
PMPI_Irecv(156): MPI_Irecv(buf=0x7ffff2268900, count=10, MPI_INT, src=0, tag=123454321, MPI_COMM_WORLD, request=0x7ffff2268974) failed
PMPI_Irecv(100): Invalid tag, value is 123454321
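In essence, the attached MWE boils down to something like this (an illustrative sketch, not the exact attached file; each rank just posts a self Irecv/Isend pair with a large tag):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, sendbuf[10] = {0}, recvbuf[10];
    MPI_Request reqs[2];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* The large tag is the whole point: it is only legal if MPI_TAG_UB allows it. */
    int tag = 123454321;
    MPI_Irecv(recvbuf, 10, MPI_INT, rank, tag, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, 10, MPI_INT, rank, tag, MPI_COMM_WORLD, &reqs[1]);
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    if (rank == 0) printf("done\n");
    MPI_Finalize();
    return 0;
}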

 

If you compile the attached example with Open MPI or MPICH, it works!

Hope this helps!

Thanks,

Eric

 


Doh!

I just realized that the MPI standard only guarantees a tag upper bound of 32767... :/

Other MPI flavours happen to allow larger tags, so I never hit the limit there...

So it is *my* error, sorry!

Eric
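P.S. For anyone else landing here: the portable pattern is to query MPI_TAG_UB at runtime and keep application tags within it, roughly like this (only a sketch; the modulo fold is one possible choice and can obviously collide):

#include <mpi.h>

/* Fold a non-negative application tag into the range the implementation supports. */
static int safe_tag(int app_tag)
{
    int *tag_ub = NULL, flag = 0, ub = 32767;  /* 32767 is the minimum the standard guarantees */
    MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, &tag_ub, &flag);
    if (flag && tag_ub)
        ub = *tag_ub;
    return (app_tag <= ub) ? app_tag : app_tag % ub;
}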

James_T_Intel
Moderator

No worries.  Since this is resolved, I'm going to go ahead and close this thread.
