
Dear Intel support team,

I have a problem with the MPI_File_read_all and MPI_File_write_all subroutines. I have a Fortran code that should read a large binary file (~2 TB). The file contains a few 2D matrices, the largest of which is ~0.5 TB. I read this file using MPI I/O subroutines, something like this:

call MPI_TYPE_CREATE_SUBARRAY(2,dim,loc_sizes,loc_starts,MPI_ORDER_FORTRAN,MPI_DOUBLE_PRECISION,my_subarray,ierr)

call MPI_Type_commit(my_subarray,ierr)

call MPI_File_set_view(filehandle, disp, MPI_DOUBLE_PRECISION, my_subarray, "native", MPI_INFO_NULL, ierr)

call MPI_File_read_all(filehandle, float2d, loc_sizes(1)*loc_sizes(2),MPI_DOUBLE_PRECISION,status, ierr)

The problem occurs in the MPI_File_read_all call. The number of elements in each submatrix, loc_sizes(1)*loc_sizes(2), multiplied by the element size (8 bytes for double precision) cannot exceed the largest default integer, 2147483647 (~2 GB). In my case each submatrix will be 10-20 GB or larger. I tried using integer*8 instead of integer*4 for the count, but it did not help; I think the MPI subroutine converts it back to integer*4. Is there a solution to this problem, similar to what was done for MPI_File_set_view, where the displacement type was changed from integer to INTEGER(KIND=MPI_OFFSET_KIND), INTENT(IN) :: disp? The program works fine if the submatrix size is smaller than 2147483647 bytes.

Here is the error message that I got:

forrtl: severe (174): SIGSEGV, segmentation fault occurred

Image              PC                Routine            Line     Source
libifcore.so.5     00002ADA8C450876  for__signal_handl  Unknown  Unknown
libc-2.17.so       00002ADA928C8670  Unknown            Unknown  Unknown
libmpi.so.12.0     00002ADA91AAEB06  Unknown            Unknown  Unknown
libmpi.so.12.0     00002ADA91AAF780  Unknown            Unknown  Unknown
libmpi.so.12.0     00002ADA91AA3039  Unknown            Unknown  Unknown
libmpi.so.12.0     00002ADA91AA49E4  Unknown            Unknown  Unknown
libmpi.so.12.0     00002ADA91727370  Unknown            Unknown  Unknown
libmpi.so.12.0     00002ADA919A1C00  Unknown            Unknown  Unknown
libmpi.so.12.0     00002ADA91971B90  Unknown            Unknown  Unknown
libmpi.so.12       00002ADA9193EFF8  MPI_Isend          Unknown  Unknown
libmpi.so.12.0     00002ADA91695A61  Unknown            Unknown  Unknown
libmpi.so.12       00002ADA916943B8  ADIOI_GEN_ReadStr  Unknown  Unknown
libmpi.so.12       00002ADA91A6DDF5  PMPI_File_read_al  Unknown  Unknown
libmpifort.so.12.  00002ADA912AB4CB  mpi_file_read_all  Unknown  Unknown
jorek_model199     000000000044E747  vacuum_response_m  519      vacuum_response.f90
jorek_model199     000000000044B770  vacuum_response_m  986      vacuum_response.f90
jorek_model199     000000000044A6F4  vacuum_response_m  90       vacuum_response.f90
jorek_model199     000000000041134E  MAIN__             486      jorek2_main.f90
jorek_model199     000000000040C95E  Unknown            Unknown  Unknown
libc-2.17.so       00002ADA928B4B15  __libc_start_main  Unknown  Unknown

Thank you in advance,

Mochalskyy Serhiy


This sounds like a request for a change to the MPI standard itself, which is perhaps more appropriate for mpi-forum.org.

Have you already looked into using the ILP64 library? http://software.intel.com/en-us/node/528842


First of all, I would like to ask whether what I found is correct. Is it possible with Intel MPI to read a subarray larger than 2 GB using the MPI_File_read_all subroutine? For example, can 2 MPI tasks read a 5 GB unformatted file, each holding a 2.5 GB distributed subarray? Can anyone confirm this issue?

If what I wrote above is true, I think Intel could change the implementation of the MPI_File_read_all subroutine, since the restriction is inside the subroutine itself. As I wrote above, the limitation is that count_of_subarray_elements * array_type_size < 2147483647 bytes, and this multiplication happens inside the subroutine. Without changing the standard, Intel could at least modify the implementation so that the count is not multiplied by the array type size. That would raise the maximum subarray size for double precision to 16 GB.
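The arithmetic behind the 2 GB and 16 GB figures can be checked with a small standalone program. This is only a sketch: it assumes, as described above, that the limit comes from a 32-bit signed integer, and the 10 GB submatrix is a hypothetical example. It does not call MPI at all.

```fortran
program count_limit
  use iso_fortran_env, only: int32, int64
  implicit none
  integer(int64), parameter :: elem_bytes = 8_int64                   ! double precision
  integer(int64), parameter :: int32_max  = int(huge(0_int32), int64) ! 2147483647
  integer(int64) :: n_elems

  ! Largest count if count*elem_bytes must fit in a 32-bit integer (~2 GiB of data).
  print *, 'max elements, byte-limited :', int32_max / elem_bytes
  ! Largest transfer if only the count must fit in 32 bits (~16 GiB of data).
  print *, 'max bytes, count-limited   :', int32_max * elem_bytes

  ! Hypothetical 10 GB submatrix (1.25e9 doubles): the count fits, the byte size does not.
  n_elems = 1250000000_int64
  print *, 'count fits in 32 bits      :', n_elems <= int32_max
  print *, 'byte size fits in 32 bits  :', n_elems * elem_bytes <= int32_max
end program count_limit
```

So a 10 GB submatrix would still have a valid default-integer count, while a 20 GB one (2.5e9 doubles) would overflow the count argument itself.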

And again, if what I wrote above is true, will there be any attempt to overcome this restriction in the future? In modern big data applications, 2 GB per MPI task is really not much, and it is not enough for many applications.

Thank you, Gregg S., for your suggestion to use the ILP64 library. I will take a look at it.


Gregg S. (Intel) wrote:

Have you looked already into using ILP64 library? http://software.intel.com/en-us/node/528842

I tried to use this library; however, my code has to be compiled with 4-byte integers (the -i4 compiler option). Since the ILP64 library requires compiling the code with -i8, it cannot solve my problem.


It is indeed a sledgehammer solution, but let me say this is what commercial software vendors are doing to solve 2 GB limits in MPI.

Alternatively, break the I/O into chunks smaller than 2 GB.
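A minimal sketch of the chunked approach, reusing the file view from the original post. The subroutine name, the `comm` argument (the communicator the file was opened with) and the 1 GiB per-call budget are assumptions for illustration, and this has not been tested against the Intel MPI version in question. It relies on the individual file pointer advancing within the file view, so each MPI_File_read_all continues where the previous one stopped. It also assumes a single local column (loc_sizes(1)*8 bytes) fits within the budget.

```fortran
! Sketch only: read the local block a few whole columns at a time so that
! no single collective call moves more than max_bytes.
subroutine read_block_in_chunks(filehandle, comm, float2d, loc_sizes, ierr)
  use mpi
  implicit none
  integer, intent(in)  :: filehandle, comm
  integer, intent(in)  :: loc_sizes(2)
  double precision, intent(inout) :: float2d(loc_sizes(1), loc_sizes(2))
  integer, intent(out) :: ierr

  ! Assumed per-call budget (1 GiB), well below 2147483647 bytes.
  integer(kind=8), parameter :: max_bytes = 1073741824_8
  integer :: status(MPI_STATUS_SIZE)
  integer :: cols_per_chunk, my_chunks, all_chunks, ichunk, col0, ncols
  double precision :: dummy(1)

  ! Whole columns per call, at least one.
  cols_per_chunk = int(max(1_8, max_bytes / (8_8 * max(1_8, int(loc_sizes(1), 8)))))

  ! Number of calls this rank needs (rounded up).
  my_chunks = loc_sizes(2) / cols_per_chunk
  if (mod(loc_sizes(2), cols_per_chunk) /= 0) my_chunks = my_chunks + 1

  ! The read is collective, so every rank must make the same number of calls.
  call MPI_Allreduce(my_chunks, all_chunks, 1, MPI_INTEGER, MPI_MAX, comm, ierr)

  col0 = 1
  do ichunk = 1, all_chunks
    ncols = max(0, min(cols_per_chunk, loc_sizes(2) - col0 + 1))
    if (ncols > 0) then
      ! Columns are contiguous in memory (Fortran order), so pass the first element.
      call MPI_File_read_all(filehandle, float2d(1, col0), loc_sizes(1)*ncols, &
                             MPI_DOUBLE_PRECISION, status, ierr)
    else
      ! This rank is done but still takes part in the collective with a zero count.
      call MPI_File_read_all(filehandle, dummy, 0, MPI_DOUBLE_PRECISION, status, ierr)
    end if
    col0 = col0 + ncols
  end do
end subroutine read_block_in_chunks
```

The call would replace the single MPI_File_read_all after MPI_File_set_view. Chunking by whole columns keeps the buffer offsets simple; if one column alone exceeds the budget, the rows would need to be split as well.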
