Christoph_F_
Beginner

one-sided communication and shared memory (single process)

Hello all,

I have come across the following problem with one-sided communication using MPI_ACCUMULATE. The versions are:
ifort (IFORT) 19.0.3.199 20190206
Intel(R) MPI Library for Linux* OS, Version 2019 Update 3 Build 20190214 (id: b645a4a54)

The attached program does a very basic calculation using one-sided communication with MPI_ACCUMULATE (synchronized with MPI_WIN_FENCE). Compile it with

mpif90 test.f donothing.f

The program accepts a command line argument. For example,

mpiexec -np 1 ./a.out 10

simply runs the calculation ten times (on a single process).

When I run the program, it crashes with a segmentation fault in MPI_WIN_FENCE if the argument is larger than roughly 8615. But only if one (!) process is used. For any other number of processes, the run is successful!

When I set FI_PROVIDER to tcp (it was unset before), the behavior is different: the program hangs for any argument larger than 12, and for very large arguments it crashes with "Fatal error in PMPI_Win_fence: Other MPI error".

(The dummy routine "donothing" is a substitution for "mpi_f_sync_reg", which does not exist in this version of IntelMPI.)
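(The attachment itself is not visible in the thread. For context, a minimal sketch of the pattern described above might look as follows; this is my own reconstruction, not the attached test.f, and the variable names and the specific calculation are assumptions. A fence epoch encloses an MPI_ACCUMULATE into a window on rank 0, repeated as many times as the command-line argument specifies:)

```fortran
      program fence_test
      use mpi
      implicit none
      integer :: ierr, win, rank, nprocs, i, n
      integer(kind=MPI_ADDRESS_KIND) :: winsize, disp
      character(len=32) :: arg
      double precision :: buf, acc

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

c     number of iterations from the command line
      call get_command_argument(1, arg)
      read (arg, *) n

c     expose "acc" on every rank as a one-sided window
      acc = 0d0
      winsize = 8
      call MPI_WIN_CREATE(acc, winsize, 8, MPI_INFO_NULL,
     &                    MPI_COMM_WORLD, win, ierr)

      buf = 1d0
      disp = 0
      do i = 1, n
         call MPI_WIN_FENCE(0, win, ierr)
c        every rank adds buf into the window on rank 0
         call MPI_ACCUMULATE(buf, 1, MPI_DOUBLE_PRECISION, 0, disp,
     &                       1, MPI_DOUBLE_PRECISION, MPI_SUM,
     &                       win, ierr)
         call MPI_WIN_FENCE(0, win, ierr)
c        dummy call in place of mpi_f_sync_reg (see note below)
         call donothing(acc)
      end do

      if (rank .eq. 0) write (*,*) 'result:', acc
      call MPI_WIN_FREE(win, ierr)
      call MPI_FINALIZE(ierr)
      end
```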

Thank you.

Best wishes
Christoph

5 Replies
PrasanthD_intel
Moderator

Hi,

I have tried to reproduce your issue with the given code, and the program ran smoothly even with an input as large as 100000.

Could you please give more details about your issue and the environment you are using?

Thanks

Prasanth

Christoph_F_
Beginner

Hi Prasanth,

Thank you very much! Could you tell me which versions you used?

The environment is
- Two Intel Xeon E5-2680 v3 Haswell CPUs per node
- 2 x 12 cores, 2.5 GHz
- Intel Hyperthreading Technology (Simultaneous Multithreading)
- AVX 2.0 ISA extension

Perhaps the problem is related to our installation or hardware instead.

Best wishes
Christoph

PrasanthD_intel
Moderator

Hi Christoph,

First, I checked with IMPI 2019.6, the latest version available, and it seems to work fine.

I have now also checked with your version, IMPI 2019.3, and ran into similar errors. I will raise this issue with the concerned team.

Meanwhile, could you update your IMPI to the latest version and confirm that it works without any errors?

Also, could you please tell us which libfabric provider is the default before you set FI_PROVIDER to tcp? You can check by setting the environment variable I_MPI_DEBUG=5.
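(For reference, a run to check this could look like the following; the binary name is taken from the compile step above, and the exact wording of the startup lines may vary by version:)

```shell
# I_MPI_DEBUG=5 makes Intel MPI print startup information,
# including the libfabric version and the selected provider
I_MPI_DEBUG=5 mpiexec -np 1 ./a.out 1
# look for startup lines of the form:
#   [0] MPI startup(): libfabric provider: ...
```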

Thanks

Prasanth

Christoph_F_
Beginner

Hi Prasanth,

Thank you. Indeed, with 2019.6, the program runs successfully.

So, it seems to be a library bug. Here is the libfabric information you asked for:

[0] MPI startup(): libfabric version: 1.8.0a1
[0] MPI startup(): libfabric provider: verbs;ofi_rxm

Best wishes
Christoph

PrasanthD_intel
Moderator

Hi Christoph,

Glad to hear that your code runs.

The issue appears to have been fixed in the newer version.

We are closing this thread now. Please raise a new thread if you face any further issues.

Thanks

Prasanth
