Hi,
With both an older (17.0.2) and the newest (19.0.3) version of the Intel Fortran Compiler and Intel MPI, I have problems with the function MPI_File_set_view when using the 64-bit integer Fortran interface of Intel MPI (aka ILP64, enabled with the compiler switch -i8).
Whenever I set the write offset argument "disp" of this function outside of the int32 range, the call fails with the error code 201389836. However, the argument is of kind MPI_OFFSET_KIND, so it is supposed to support large values without any problems.
Curiously, this failure happens only with the ILP64 version, not with the LP64 version, which is the opposite of what one would expect. It is as if there were an erroneous range check somewhere in the ILP64 interface before it calls the underlying LP64 implementation, which actually supports large offsets.
Below is an example program that demonstrates the issue. The program writes an integer array of exactly 2 GiB to a file, starting at offset 0. Then it attempts to position the next view at the end of the chunk just written and to write one extra integer. It works well with Open MPI 4.0.0 ILP64 and Intel MPI 17.0.2/19.0.3 LP64 (prints 0) but fails with Intel MPI 17.0.2/19.0.3 ILP64 (prints 201389836). NB: This sample program is intended to be run as a single process only.
Did I hit a bug in Intel MPI?
program mpi_io_offset
use iso_fortran_env, only: int32
use mpi
implicit none
integer(int32), parameter :: mpiint = kind(MPI_COMM_WORLD)
integer(int32), parameter :: mpiofs = MPI_OFFSET_KIND
integer(mpiint) :: ierr, fh, stat(MPI_STATUS_SIZE), one = 1, num = 2**29
integer(mpiofs) :: zero = 0, two_GiB_bytes
integer(int32) :: four_B_int = -1
integer(int32), allocatable :: two_GiB_array(:)
allocate (two_GiB_array(num))
two_GiB_array(:) = 0
two_GiB_bytes = num * 4_mpiofs
call MPI_Init(ierr)
call MPI_File_open(MPI_COMM_WORLD, 'file.bin', MPI_MODE_CREATE + MPI_MODE_WRONLY, MPI_INFO_NULL, fh, ierr)
call MPI_File_set_size(fh, zero, ierr)
call MPI_File_set_view(fh, zero, MPI_INTEGER4, MPI_INTEGER4, 'native', MPI_INFO_NULL, ierr)
call MPI_File_write_all(fh, two_GiB_array, num, MPI_INTEGER4, stat, ierr)
call MPI_File_set_view(fh, two_GiB_bytes, MPI_INTEGER4, MPI_INTEGER4, 'native', MPI_INFO_NULL, ierr)
print *, ierr
call MPI_File_write_all(fh, four_B_int, one, MPI_INTEGER4, stat, ierr)
call MPI_File_close(fh, ierr)
call MPI_Finalize(ierr)
end program mpi_io_offset
This problem still persists in Intel oneAPI 2021 (or more specifically Intel(R) MPI Library for Linux* OS, Version 2021.1 Build 20201112).
The program still fails in ILP64 mode with Intel(R) MPI Library for Linux* OS, Version 2021.5 Build 20211102.
This program still fails in ILP64 mode with Intel(R) MPI Library for Linux* OS, Version 2021.7 Build 20220909.
$ mpiexec --version
Intel(R) MPI Library for Linux* OS, Version 2021.7 Build 20220909 (id: 6b6f6425df)
Copyright 2003-2022, Intel Corporation.
$ ifx --version
ifx (IFORT) 2022.2.0 20220730
Copyright (C) 1985-2022 Intel Corporation. All rights reserved.
$ mpifc -fc=ifx test.f90 -o test
$ ./test
0
$ mpifc -fc=ifx test.f90 -o test -i8
$ ./test
201385484
To make some progress on this issue I have been debugging the library "/opt/intel/oneapi/mpi/2021.10.0/lib/release/libmpi_ilp64.so.4.1" in my installation of the latest Intel MPI. This is the disassembly of the broken function:
(gdb) disassemble mpi_file_set_view_
Dump of assembler code for function mpi_file_set_view__:
0x000000000001f9e0 <+0>: push r12
0x000000000001f9e2 <+2>: push r13
0x000000000001f9e4 <+4>: push r14
0x000000000001f9e6 <+6>: push r15
0x000000000001f9e8 <+8>: push rbx
0x000000000001f9e9 <+9>: push rbp
0x000000000001f9ea <+10>: sub rsp,0x28
0x000000000001f9ee <+14>: mov rbx,r8
0x000000000001f9f1 <+17>: mov r15,QWORD PTR [rip+0x2225e8] # 0x241fe0
0x000000000001f9f8 <+24>: mov rbp,rcx
0x000000000001f9fb <+27>: mov r12,rdx
0x000000000001f9fe <+30>: mov r13,rsi
0x000000000001fa01 <+33>: mov r14,rdi
0x000000000001fa04 <+36>: cmp DWORD PTR [r15],0x0
0x000000000001fa08 <+40>: jne 0x1facf <mpi_file_set_view__+239>
0x000000000001fa0e <+46>: movsxd rdx,DWORD PTR [r13+0x0]
0x000000000001fa12 <+50>: mov QWORD PTR [rsp],rdx
0x000000000001fa16 <+54>: mov rdx,QWORD PTR [r12]
0x000000000001fa1a <+58>: mov eax,DWORD PTR [r14]
0x000000000001fa1d <+61>: mov DWORD PTR [rsp+0x8],eax
0x000000000001fa21 <+65>: lea rcx,[rdx-0x4c000405]
0x000000000001fa28 <+72>: cmp rcx,0x40
0x000000000001fa2c <+76>: jae 0x1fa48 <mpi_file_set_view__+104>
0x000000000001fa2e <+78>: mov eax,0x1
0x000000000001fa33 <+83>: shl rax,cl
0x000000000001fa36 <+86>: test rax,0x1400001
0x000000000001fa3c <+92>: je 0x1fa48 <mpi_file_set_view__+104>
0x000000000001fa3e <+94>: mov DWORD PTR [rsp+0xc],0x4c000809
0x000000000001fa46 <+102>: jmp 0x1fa4c <mpi_file_set_view__+108>
0x000000000001fa48 <+104>: mov DWORD PTR [rsp+0xc],edx
0x000000000001fa4c <+108>: mov rdx,QWORD PTR [rbp+0x0]
0x000000000001fa50 <+112>: lea rcx,[rdx-0x4c000405]
0x000000000001fa57 <+119>: cmp rcx,0x40
0x000000000001fa5b <+123>: jae 0x1fa77 <mpi_file_set_view__+151>
0x000000000001fa5d <+125>: mov eax,0x1
0x000000000001fa62 <+130>: shl rax,cl
0x000000000001fa65 <+133>: test rax,0x1400001
0x000000000001fa6b <+139>: je 0x1fa77 <mpi_file_set_view__+151>
0x000000000001fa6d <+141>: mov DWORD PTR [rsp+0x10],0x4c000809
0x000000000001fa75 <+149>: jmp 0x1fa7b <mpi_file_set_view__+155>
0x000000000001fa77 <+151>: mov DWORD PTR [rsp+0x10],edx
0x000000000001fa7b <+155>: mov r9d,DWORD PTR [r9]
0x000000000001fa7e <+158>: lea rbp,[rsp+0x18]
0x000000000001fa83 <+163>: movsxd rax,DWORD PTR [rsp+0x68]
0x000000000001fa88 <+168>: mov r8,rbx
0x000000000001fa8b <+171>: mov DWORD PTR [rbp-0x4],r9d
0x000000000001fa8f <+175>: push rax
0x000000000001fa90 <+176>: push rbp
0x000000000001fa91 <+177>: lea rdi,[rsp+0x18]
0x000000000001fa96 <+182>: lea rsi,[rsp+0x10]
0x000000000001fa9b <+187>: lea rdx,[rsp+0x1c]
0x000000000001faa0 <+192>: lea rcx,[rsp+0x20]
0x000000000001faa5 <+197>: lea r9,[rsp+0x24]
0x000000000001faaa <+202>: call 0x19600 <pmpi_file_set_view_@plt>
0x000000000001faaf <+207>: add rsp,0x10
0x000000000001fab3 <+211>: mov rdx,QWORD PTR [rsp+0x60]
0x000000000001fab8 <+216>: movsxd rax,DWORD PTR [rsp+0x18]
0x000000000001fabd <+221>: mov QWORD PTR [rdx],rax
0x000000000001fac0 <+224>: add rsp,0x28
0x000000000001fac4 <+228>: pop rbp
0x000000000001fac5 <+229>: pop rbx
0x000000000001fac6 <+230>: pop r15
0x000000000001fac8 <+232>: pop r14
0x000000000001faca <+234>: pop r13
0x000000000001facc <+236>: pop r12
0x000000000001face <+238>: ret
0x000000000001facf <+239>: mov QWORD PTR [rsp],r9
0x000000000001fad3 <+243>: call 0x19100 <ilp64_mpirinitf_@plt>
0x000000000001fad8 <+248>: mov r9,QWORD PTR [rsp]
0x000000000001fadc <+252>: mov DWORD PTR [r15],0x0
0x000000000001fae3 <+259>: jmp 0x1fa0e <mpi_file_set_view__+46>
0x000000000001fae8 <+264>: nop DWORD PTR [rax+rax*1+0x0]
The problem, as I understand it, is in the instructions at offsets +46 and +50. The register `r13` (a copy of `rsi`, see +30) holds the address of the second argument, `disp`, which is an 8-byte `MPI_Offset`. The `movsxd` at +46 dereferences that address as if it were an `int32_t*`: it loads only the low 4 bytes and sign-extends them to 8 bytes, and the `mov` at +50 stores the result in the 8-byte slot at the address in the stack pointer `rsp`. This narrowing obviously discards the high half of the offset. The other arguments arrive as `rdi` = `fh`, `rdx` = `etype`, `rcx` = `filetype` and `r9` = `info`.
To prove that there is a narrowing of an 8-byte integer to a 4-byte integer I used the MPI profiling interface and wrote an interception implementation of `pmpi_file_set_view_`:
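#include <mpi.h>
#include <stdio.h>
/* Symbol provided by the MPI library; the call is forwarded to it below. */
void pmpi_file_set_view (int* fh, MPI_Offset* disp, int* etype, int* filetype, const char* datarep, int* info, int* err, int datarep_len);
/* Interception: print the displacement as received, then forward the call unchanged. */
void pmpi_file_set_view_ (int* fh, MPI_Offset* disp, int* etype, int* filetype, const char* datarep, int* info, int* err, int datarep_len)
{
    /* The cast makes the argument match the %llo conversion. */
    printf("[pmpi_file_set_view_] disp: o%llo\n", (unsigned long long) *disp);
    pmpi_file_set_view(fh, disp, etype, filetype, datarep, info, err, datarep_len);
}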
I used it together with a modified reproducer from the very first post:
program mpi_io_offset
use iso_fortran_env, only: int32, int64
use mpi
implicit none
integer(int32), parameter :: mpiint = kind(MPI_COMM_WORLD)
integer(int32), parameter :: mpiofs = MPI_OFFSET_KIND
integer(mpiint) :: ierr, stat(MPI_STATUS_SIZE), one = 1, nelem = int(o'12345671234', mpiint)
integer(mpiofs) :: zero = 0, bytes
integer(mpiint), target :: fh
integer(int64) :: number = -1
integer(int64), allocatable :: array(:)
allocate (array(nelem))
array = 0
bytes = nelem
bytes = bytes * bit_size(number)/8
call MPI_Init(ierr)
call MPI_File_open(MPI_COMM_WORLD, 'file.bin', MPI_MODE_CREATE + MPI_MODE_WRONLY, MPI_INFO_NULL, fh, ierr)
call MPI_File_set_size(fh, zero, ierr)
call MPI_File_set_view(fh, zero, MPI_INTEGER8, MPI_INTEGER8, 'native', MPI_INFO_NULL, ierr)
call MPI_File_write_all(fh, array, nelem, MPI_INTEGER8, stat, ierr)
call MPI_File_set_view(fh, bytes, MPI_INTEGER8, MPI_INTEGER8, 'native', MPI_INFO_NULL, ierr)
print *, ierr
call MPI_File_write_all(fh, number, one, MPI_INTEGER8, stat, ierr)
call MPI_File_close(fh, ierr)
call MPI_Finalize(ierr)
end program mpi_io_offset
This time, the program creates a large (~10 GiB) array of 8-byte integers. The byte offset (`disp`) of the second call to `MPI_File_set_view` is (in octal) `0123456712340`, so that one can easily see where it is cropped. I compile the codes with the following Makefile:
all:
	icx -fPIC -c -debug full -O0 -traceback pmpi_file_set_view.c -o pmpi_file_set_view.o
	icx -shared pmpi_file_set_view.o -o libpmpi_file_set_view.so
	ifort -i8 test_intel_mpi.f90 -L/opt/intel/oneapi/mpi/2021.10.0/lib/release -o test_ilp64.x -lmpi_ilp64 -debug full
I can then run the test as
LD_PRELOAD=$(pwd)/libpmpi_file_set_view.so ./test_ilp64.x
The output is
[pmpi_file_set_view_] disp: o0
[pmpi_file_set_view_] disp: o1777777777763456712340
201385484
Apparently, the low 32 bits of the `disp` parameter are passed correctly to the interception function, but the high bits are not: they come out as all ones, which turns the offset into a negative number. The original high bits are most likely lost due to a bug in the ILP64-to-LP64 translation in `mpi_file_set_view_` from "libmpi_ilp64.so".
