Intel® MPI Library

INTEL-MPI-5.0: Bug in MPI-3 shared-memory allocation (MPI_WIN_ALLOCATE_SHARED, MPI_WIN_SHARED_QUERY)

Michael_R_2
Beginner

Dear developers of Intel-MPI,

First of all, congratulations that INTEL-MPI now also supports MPI-3!

However, I found a bug in INTEL-MPI-5.0 when using the MPI-3 shared-memory feature (calling MPI_WIN_ALLOCATE_SHARED and MPI_WIN_SHARED_QUERY) from a Fortran 95 CFD code on a Linux cluster (NEC Nehalem).

I isolated the problem in a small Fortran 95 example program. It allocates a shared integer*4 array of dimension N, lets the MPI processes (on the same node) use it, and then repeats the same steps for the next shared allocation. The number of shared windows therefore accumulates during the run, because I do not free the shared windows allocated so far. This allocation of shared windows works, but only until the total amount of allocated memory exceeds a limit of about 30 million integer*4 values (~120 MB).

When that limit is reached, the next calls to MPI_WIN_ALLOCATE_SHARED and MPI_WIN_SHARED_QUERY, which allocate one more shared window, do not return an error message, but the first attempt to use the newly allocated shared array results in a bus error (because the shared array has not been allocated correctly).

 

The problem is independent of the number of MPI processes started by mpirun on the node (I used only 1 node).

   Example:

       N =     100 000   →  bus error occurred at iwin = 288   (i.e. the allocation of the 288th shared window failed)
       N =   1 000 000   →  bus error occurred at iwin =  30
       N =   5 000 000   →  bus error occurred at iwin =   6
       N =  28 000 000   →  bus error occurred at iwin =   2
       N =  30 000 000   →  bus error occurred at iwin =   1   (i.e. the very first allocation failed)

 

The cluster node has 8 Nehalem cores and had 10 GB of free memory, and I was the only user on it. I compiled the example program with both the INTEL-13 and the INTEL-14 compiler and ran it as follows:

       mpiifort -O0 -debug -traceback -check -fpe0 sharedmemtest.f90

       mpirun -binding -prepend-rank -ordered-output -np 4 ./a.out

If it is helpful for you, I could send you the source code of the program.

It seems to me that there is an internal storage limit in the implementation of the MPI-3 shared-memory feature in INTEL-MPI-5.0. I cannot use INTEL-MPI in my real CFD code with that limitation, because for very large grids the total storage held simultaneously in shared windows can exceed 10 GB.

Greetings to you all

Michael R.

James_T_Intel
Moderator

This is a known bug at this time, and we are working to correct it.

Qinghua_W_
Beginner

James,

Is there a limit on the size of MPI-3 shared memory? In my tests, it seems that the maximum size is about 4 GB for each shared window, and that the total number of windows cannot exceed 16381. Please confirm. I used Intel MPI 2017.1.143, and all the tests were performed on an HP workstation with 16 cores and 64 GB of RAM.

Thanks,

Qinghua

James_T_Intel
Moderator

I'm checking with our engineering team regarding any limits set.

James_T_Intel
Moderator

We do not have any specific limits set for the size of a window.  There is a limit of 16381 for the number of windows, due to a limitation on the number of communicators.

DavidM1
Beginner

Hello James,

I have been trying to debug a crash in my application code when using MPI_Win_allocate_shared.  The crash seems to occur when the number of allocations gets too large.  I created a simple C++ test program on Linux, and it appears to crash after approximately 32,000 allocations (all still in memory).  This number seems to be fairly consistent, regardless of the size of the window I allocate or the number of processes I run on.  I am using Intel MPI Library Version 2021.1 Build 20201112.

Is there still a limit for the number of shared memory windows using Intel MPI?  Is this limit larger in more recent versions of the library, or are there plans to increase it?

Please let me know if you need further details of my implementation, hardware architecture, etc.

Many thanks,

David

TobiasK
Moderator

@DavidM1, please do not reply to such an old post.
We can only provide help here for the latest release. If you upgrade to 2021.12 and the issue still exists, please open a new thread and provide all necessary details about your environment.

Rafael_N_
Beginner

James,

Sorry for insisting on the topic of shared windows' individual size limit, raised by Qinghua W., but I've been facing similar difficulties. According to my tests, when using MPI_Win_allocate_shared, I'm not able to allocate anything beyond the limit of 4GB for a single MPI shared window, just like what was previously reported by Qinghua W.

In my test code, I'm just trying to allocate an array of (mySize x nProcs) floats in total, where "mySize" is the number of floats allocated by each MPI process and nProcs is the number of processes.

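    float* myPtrA;
    MPI_Win winA;
    /* first argument: this process's share of the window, in bytes; second: displacement unit */
    MPI_Win_allocate_shared( mySize*sizeof(float), sizeof(float), MPI_INFO_NULL, MPI_COMM_WORLD, &myPtrA, &winA );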

No matter how many MPI processes I have, the total amount of memory allocated in the window "winA" cannot exceed the limit of 4 GB, or I'll get an error. In this example, I can allocate up to 1073741823 floats (i.e. mySize x nProcs = 1073741823). When trying to allocate 1073741824 floats, I get the following error:

Fatal error in PMPI_Win_allocate_shared: Other MPI error, error stack:
PMPI_Win_allocate_shared(168).........: MPI_Win_allocate_shared(size=536870912, MPI_INFO_NULL, MPI_COMM_WORLD, baseptr=00000000002DFA00, win=00000000002DFA10) failed
MPID_Win_allocate_shared(232).........:
MPIDI_CH3U_Win_allocate(230)..........:
MPIDI_CH3I_Win_allocate_shm(217)......:
MPIU_SHMW_Seg_create_and_attach(902)..:
MPIU_SHMW_Seg_create_attach_templ(783): unable to attach to shared memory - MapViewOfFile (...).

I'm using Intel(R) MPI Library for Windows* OS, Version 5.0 Update 1 Build 20140709.

Thanks in advance.

James_T_Intel
Moderator

Using Intel® MPI Library 2017 Update 2, I am able to allocate up to the maximum memory my system has available.  Please try with this version.
