Hello, I have been playing around with a simple program in the MPI framework. The idea is to construct a row-wise partitioned matrix and then call MPI_ALLGATHERV to collect the complete matrix on all CPUs (the matrix is not particularly large, but the evaluation of its individual elements is independent and fairly expensive). One way to collect the data would be to iterate over the columns of the matrix and call MPI_ALLGATHERV on each column independently. However, I tried to do it in a more MPI-like fashion: I defined a custom MPI type using MPI_TYPE_VECTOR (as shown in the minimal example below) so that a single call to MPI_ALLGATHERV suffices.
This program works (or seems to work) correctly when compiled in a straightforward fashion:
[bash]
mpiifort -o gather.lp64 gather.f90
mpirun -n 2 ./gather.lp64
[/bash]
Moreover, the results are independent of the optimization level.
However, for certain reasons, I would need to use the ILP64 interface. Following the instructions from the MPI manual, I compiled and executed the program like this:
[bash]
mpiifort -f90=ifort -fc=ifort -c -warn all -O1 -i8 -I$MKLROOT/include/intel64/ilp64 -I${MKLROOT}/include -I${I_MPI_ROOT}/include64 -o gather.o gather.f90
mpiifort -f90=ifort -fc=ifort -ilp64 -warn all -i8 -o gather.ilp64 gather.o ${MKLROOT}/lib/intel64/libmkl_blas95_ilp64.a -L${MKLROOT}/lib/intel64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_ilp64 -lpthread -lm
mpirun -ilp64 -n 2 ./gather.ilp64
[/bash]
Now, the strange thing is that this produces the expected results with -O0, -O2, and -O3, but a segmentation fault appears with -O1. Strangely enough, the segmentation fault disappears when ${MKLROOT}/lib/intel64/libmkl_blas95_ilp64.a is removed from the link line; however, I need this library for certain BLAS ILP64 calls (not used in the minimal example below).
I am using ifort Version 13.0.1.117 Build 20121010 and Intel MPI library v. 4.0.3.008.
Any ideas what might be wrong? Perhaps some arguments of the MPI calls are still supposed to be INTEGER(KIND=4) even with -i8? In the case of MKL, for example, the manual recommends checking the header files to find out the correct kinds, but the mpif.h header (recommended for ILP64) didn't provide any additional insight...
[fortran]
PROGRAM gather
  IMPLICIT NONE
  INCLUDE 'mpif.h'
  !
  INTEGER, PARAMETER :: dp = KIND(1D0)
  INTEGER, PARAMETER :: number_of_states = 2
  INTEGER, PARAMETER :: number_of_points = 7
  !
  INTEGER :: i
  INTEGER :: nproc, my_id, ierr
  INTEGER(KIND = MPI_ADDRESS_KIND) :: lb, extent
  INTEGER :: ROW_TYPE, ROW_TYPE_RESIZED
  !
  REAL(dp), DIMENSION(:, :), ALLOCATABLE :: psi, psi_local, psi_local_tr
  INTEGER, ALLOCATABLE :: number_of_points_per_proc(:), gather_displ_points(:)
  INTEGER :: points_start_index, points_end_index
  !
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, my_id, ierr)
  !
  ALLOCATE(gather_displ_points(0:nproc-1), number_of_points_per_proc(0:nproc-1))
  !
  ! Distribute the points (rows) over the processes; the last process
  ! takes the remainder.
  IF (nproc > 1) THEN
    number_of_points_per_proc(0:nproc-2) = number_of_points / nproc
    number_of_points_per_proc(nproc-1) = number_of_points - SUM(number_of_points_per_proc(0:nproc-2))
  ELSE
    number_of_points_per_proc(0) = number_of_points
  END IF
  gather_displ_points(0) = 0
  DO i = 0, nproc - 2
    gather_displ_points(i + 1) = gather_displ_points(i) + number_of_points_per_proc(i)
  END DO
  !
  points_start_index = gather_displ_points(my_id) + 1
  points_end_index = points_start_index + number_of_points_per_proc(my_id) - 1
  !
  ALLOCATE(psi_local(points_start_index:points_end_index, number_of_states))
  ALLOCATE(psi_local_tr(number_of_states, points_start_index:points_end_index))
  ALLOCATE(psi(number_of_points, number_of_states))
  !
  ! One "row" of psi: number_of_states elements strided by the column length,
  ! resized to the extent of a single REAL8 so that consecutive rows interleave.
  CALL MPI_TYPE_VECTOR(number_of_states, 1, number_of_points, MPI_REAL8, ROW_TYPE, ierr)
  CALL MPI_TYPE_COMMIT(ROW_TYPE, ierr)
  CALL MPI_TYPE_GET_EXTENT(MPI_REAL8, lb, extent, ierr)
  CALL MPI_TYPE_CREATE_RESIZED(ROW_TYPE, lb, extent, ROW_TYPE_RESIZED, ierr)
  CALL MPI_TYPE_COMMIT(ROW_TYPE_RESIZED, ierr)
  !
  psi = 0
  psi_local = 8 + my_id
  psi_local_tr = TRANSPOSE(psi_local)
  !
  IF (my_id .EQ. 0) THEN
    WRITE(*, *) "calling ALLGATHER"
    WRITE(*, *) number_of_points_per_proc
    WRITE(*, *) gather_displ_points
  END IF
  !
  CALL MPI_ALLGATHERV( &
    psi_local_tr, number_of_points_per_proc(my_id)*number_of_states, MPI_REAL8, &
    psi, number_of_points_per_proc, gather_displ_points, ROW_TYPE_RESIZED, MPI_COMM_WORLD, ierr)
  !
  IF (my_id .EQ. 0) THEN
    DO i = 1, number_of_points
      WRITE(*, *) psi(i, :)
    END DO
  END IF
  !
  DEALLOCATE(psi, psi_local, psi_local_tr)
  CALL MPI_FINALIZE(ierr)
END PROGRAM
[/fortran]
Your suspicion is correct: you cannot use -i8 to force all integers to KIND=8 and then pass them to MPI calls; MPI expects KIND=4 integers.
Doing so leads to stack corruption and odd segmentation faults like the one you see.
ron
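If the MPI library that actually gets linked expects 32-bit integers (the LP64 interface), one way to keep the rest of the code compiled with -i8 is to give the MPI-facing variables an explicit KIND=4. The following is only a sketch of that idea, not a confirmed fix; whether it applies here depends on which interface mpirun -ilp64 really selects:

[fortran]
PROGRAM kinds_test
  IMPLICIT NONE
  INCLUDE 'mpif.h'
  ! Explicit 32-bit kinds: unaffected by -i8, so they match an
  ! LP64 MPI interface even when the default INTEGER is 8 bytes.
  INTEGER(KIND=4) :: nproc, my_id, ierr
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, my_id, ierr)
  IF (my_id == 0) WRITE(*, *) 'nproc =', nproc
  CALL MPI_FINALIZE(ierr)
END PROGRAM
[/fortran]

The downside is that every count, displacement array, and handle passed to MPI then needs an explicit kind, which is exactly the bookkeeping the ILP64 interface is meant to avoid.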
Perhaps I am missing something, but judging by the example from the Intel MPI manual, I would say that the arguments in the MPI calls are 8-byte (in the ILP64 case), aren't they? Numeric literals, e.g., in MPI_SEND, will also be 8-byte with -i8, or am I mistaken?
http://software.intel.com/sites/products/documentation/hpc/ics/impi/41/lin/Reference_Manual/6_1_Using_ILP64.htm
Also, section 3.5.6.2 of the Intel MPI Reference Manual mentions: "Use the mpif.h file instead of the MPI module in Fortran90* applications. The Fortran module supports 32-bit INTEGER size only." So I was wondering what the point would be if the MPI calls supported exclusively 4-byte integers...
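Independently of MPI, it is easy to verify what -i8 actually does to the default INTEGER kind. This small check (no MPI needed, so it can be compiled with or without -i8 to compare) prints the storage sizes:

[fortran]
PROGRAM check_kinds
  IMPLICIT NONE
  INTEGER :: i_default            ! kind depends on -i8
  INTEGER(KIND=4) :: i_explicit   ! always 4 bytes
  WRITE(*, *) 'default INTEGER bits:', BIT_SIZE(i_default)   ! 64 with -i8, 32 without
  WRITE(*, *) 'KIND=4 INTEGER bits :', BIT_SIZE(i_explicit)  ! 32 either way
END PROGRAM
[/fortran]

Comparing this against the kinds declared in the ILP64 mpif.h (e.g. MPI_ADDRESS_KIND) at least shows which integer sizes the compiled interface expects.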