The Intel MPI Library reference manual says the following about MPI_TAG_UB (under the section Using ILP64 - Known Issues and Limitations):
Predefined communicator attributes ... MPI_TAG_UB ... are returned by the functions MPI_GET_ATTR and MPI_COMM_GET_ATTR as 4-byte integers
I've attached a short test case (test.f90) that I'm running under 64-bit Linux using ifort 16.0.1 and Intel MPI 5.1.2. When I enable the '-check all' compiler flag (test.sh), the run-time checks fail whenever the MPI_TAG_UB value is passed to the MPI_COMM_GET_ATTR function as a 4-byte integer, but pass whenever it is passed as an 8-byte integer. If I do not enable the '-check all' compiler flag (test_nocheck.sh), the code seems to run OK with either a 4-byte or an 8-byte integer.
Is the manual correct regarding this or are the run-time checks hitting some other issue?
Thanks,
John
John,
You provided two Shell scripts called "test_nocheck.sh1" and "test.sh1". Each Shell script references the driver called "mpif90". The "mpif90" driver uses "gfortran". Error messages of the form:
f951: error: unrecognized command line option "-i8"
f951: error: unrecognized command line option "-fpp"
are appearing during compilation.
-Steve
Steve,
I believe this is an issue with your build environment. In my environment, mpif90 is most definitely calling ifort and not gfortran. The problem should reproduce if you change the build script so that the Intel Fortran compiler is used.
Thanks,
John
John,
You are incorrect. There is nothing wrong with the build environment. By default, with the Intel MPI Library, one can simply call "mpiifort" to use the Intel Fortran compiler for the various Fortran language standards (this assumes that the user has the Intel Fortran Compiler installed). There are two sets of MPI compilation drivers in the Intel MPI Library ".../bin" directory. The first set, by default, uses the GNU compilers. The second set, whose names begin with the prefix "mpii...", uses the Intel compilers.
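To settle which compiler a given driver calls on a particular machine, the driver itself can be asked. The following is a minimal sketch, assuming the drivers accept the MPICH-style "-show" option (which prints the underlying command line without running it). The function name "wrapper_backend" is hypothetical and only for illustration. A differing result between two machines may come from a site-specific override such as the I_MPI_F90 environment variable, but that is an assumption and not something confirmed in this thread.

```shell
#!/bin/sh
# Report which underlying compiler an MPI wrapper driver would invoke.
# Assumption: the wrapper accepts "-show" and prints the full command line
# it would run, with the compiler name as the first word.
wrapper_backend() {
    driver="$1"
    if ! command -v "$driver" >/dev/null 2>&1; then
        echo "$driver: not found"
        return 1
    fi
    # Keep only the first word of the first output line.
    "$driver" -show | awk 'NR == 1 { print $1 }'
}

# Compare the generic driver with the Intel-specific one.
for d in mpif90 mpiifort; do
    printf '%s -> ' "$d"
    wrapper_backend "$d" || true
done
```

If the two drivers report different compilers, calling "mpiifort" explicitly in the test scripts removes the ambiguity.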
Questions:
1) Could you please verify that you have the following MPI compiler drivers in your ".../bin" directory for Intel MPI Library:
mpicc mpicxx mpif77 mpif90 mpifc mpigcc mpigxx mpiicc mpiicpc mpiifort
2) For the two scripts that you provided: "test_nocheck.sh1" and "test.sh1", should there be an "mpirun" command at about line 17?
13 echo
14 echo "====================="
15 echo "4-byte mpi interface test, 4 byte system integer, 8-byte ATTRIB "
16 mpif90 -fpp -check all -DMPI_MPI_INTEGER_TYPE=4 -DMPI_SYS_INTEGER_TYPE=4 test.f90
17
Thank you,
-Steve
John,
For the two scripts "test_nocheck.sh1" and "test.sh1", would you please check my work? The scripts were modified from something like (line numbers are included):
12 echo
13 echo "====================="
14 echo "4-byte mpi interface test, 4 byte system integer, 8-byte ATTRIB "
15 mpif90 -fpp -DMPI_MPI_INTEGER_TYPE=4 -DMPI_SYS_INTEGER_TYPE=4 test.f90
16
to:
12 echo
13 echo "====================="
14 echo "4-byte mpi interface test, 4 byte system integer, 8-byte ATTRIB "
15 mpif90 -fpp -DMPI_MPI_INTEGER_TYPE=4 -DMPI_SYS_INTEGER_TYPE=4 test.f90
16 mpirun -n 1 ./a.out
17
Does the modified excerpt look proper? Please note that for the "test.sh1" script, the "-check all" command-line option is part of the "mpif90" driver compilation.
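The compile-then-run pattern in the modified excerpt can be wrapped in a small helper so that every test case runs the same way. This is only a sketch: the function name "build_and_run" and the variables MPIFC and SRC are hypothetical and do not appear in the original scripts. It runs the executable only when compilation succeeds, so a stale a.out from an earlier case is never executed by mistake.

```shell
#!/bin/sh
# Hypothetical helper: compile the test source with the given flags,
# then run the result on a single rank.
# MPIFC and SRC are illustrative names, not taken from the original scripts.
MPIFC="${MPIFC:-mpif90}"
SRC="${SRC:-test.f90}"

build_and_run() {
    label="$1"
    shift
    echo
    echo "====================="
    echo "$label"
    # Run only if compilation succeeded.
    if "$MPIFC" -fpp "$@" "$SRC"; then
        mpirun -n 1 ./a.out
    else
        echo "compile failed: $label" >&2
        return 1
    fi
}

# Example call, using the flags quoted in this thread:
# build_and_run "4-byte mpi interface test, 4-byte system integer, 4-byte ATTRIB" \
#     -check all -DMPI_MPI_INTEGER_TYPE=4 -DMPI_SYS_INTEGER_TYPE=4 -DFOUR_BYTE_ATTRIB
```

Dropping "-check all" from the example call gives the "test_nocheck.sh1" variant of the same case.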
Secondly, here is an excerpt of what I am seeing for the "4-byte mpi interface test, 4-byte system integer, 4-byte ATTRIB " test-case via "test.sh1":
#-----------------------------
echo
echo "====================="
echo "4-byte mpi interface test, 4-byte system integer, 4-byte ATTRIB "
mpif90 -fpp -check all -DMPI_MPI_INTEGER_TYPE=4 -DMPI_SYS_INTEGER_TYPE=4 -DFOUR_BYTE_ATTRIB test.f90
mpirun -n 1 ./a.out
The output is:
mpif90 for the Intel(R) MPI Library 5.1.3 for Linux*
Copyright(C) 2003-2015, Intel Corporation. All rights reserved.
ifort version 16.0.3
=====================
4-byte mpi interface test, 4-byte system integer, 4-byte ATTRIB
NPROC: 1
RANK: 1
I8B I4V syst_int_kind MPI_ADDRESS_KIND kind(attrib_val)
8 4 4 8 4
ATTRIB_VAL: 2147483647
Boundary Run-Time Check Failure for variable 'test_$ATTRIB_VAL'
forrtl: error (76): Abort trap signal
Image PC Routine Line Source
a.out 0000000000478575 Unknown Unknown Unknown
a.out 0000000000476197 Unknown Unknown Unknown
a.out 00000000004293B4 Unknown Unknown Unknown
a.out 00000000004291C6 Unknown Unknown Unknown
a.out 0000000000405506 Unknown Unknown Unknown
a.out 0000000000409318 Unknown Unknown Unknown
libpthread.so.0 00000033A460F710 Unknown Unknown Unknown
libc.so.6 00000033A3A32625 Unknown Unknown Unknown
libc.so.6 00000033A3A33E05 Unknown Unknown Unknown
a.out 0000000000478943 Unknown Unknown Unknown
a.out 0000000000403DB4 MAIN__ 64 test.f90
a.out 000000000040351E Unknown Unknown Unknown
libc.so.6 00000033A3A1ED5D Unknown Unknown Unknown
a.out 0000000000403429 Unknown Unknown Unknown
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 28434 RUNNING AT impic.clusterlab.intel.com
= EXIT CODE: 6
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 28434 RUNNING AT impic.clusterlab.intel.com
= EXIT CODE: 6
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
I believe that the key components from the "test.f90" source are:
Integer, parameter :: I4B = selected_int_kind(9) ! 4-byte integer
integer, parameter, public :: mpi_int_kind = I4B
integer(I4B) :: attrib_val ! According to the Intel MPI manual, the MPI_TAG_UB attribute is always a 4-byte integer ??
call MPI_COMM_GET_ATTR(MPI_COMM_WORLD,MPI_TAG_UB,attrib_val,flag,ierr)
where MPI_COMM_GET_ATTR has the prototype:
MPI_COMM_GET_ATTR(INTEGER COMM, INTEGER COMM_KEYVAL, INTEGER(KIND=MPI_ADDRESS_KIND) ATTRIBUTE_VAL, LOGICAL FLAG, INTEGER IERROR)
The Fortran parameter MPI_ADDRESS_KIND is defined as:
INTEGER MPI_ADDRESS_KIND
PARAMETER (MPI_ADDRESS_KIND=8)
I believe that when the "-check all" compilation option is used, the Fortran runtime system is catching a size (kind) mismatch between the 4-byte variable "attrib_val" and the corresponding formal argument of "MPI_COMM_GET_ATTR", which is declared as "INTEGER(KIND=MPI_ADDRESS_KIND) ATTRIBUTE_VAL", that is, as an 8-byte integer. For the script "test.sh1", you are consistently getting the line of output:
ATTRIB_VAL: 2147483647
for every experiment, whether or not it ends with the runtime error.
At the URL:
https://software.intel.com/sites/default/files/managed/be/4e/intel-mpi-5.1.3-developer-reference-linux.pdf
under the section titled, "Using ILP64" and the subsection titled, "Known Issues and Limitations", you pointed out that the manual states: "Predefined communicator attributes MPI_APPNUM, MPI_HOST, MPI_IO, MPI_LASTUSEDCODE, MPI_TAG_UB, MPI_UNIVERSE_SIZE, and MPI_WTIME_IS_GLOBAL are returned by the functions MPI_GET_ATTR and MPI_COMM_GET_ATTR as 4-byte integers."
As you mentioned in your initial inquiry, the script "test_nocheck.sh1" completes without error, and I believe that the manual is correct with regard to the functions MPI_GET_ATTR and MPI_COMM_GET_ATTR returning 4-byte integers. I hope that this information helps.
Thank you,
-Steve
Steve,
1. Your modifications look good. Thanks for confirming the behavior.
2. As I understand it, then, the variable that receives the MPI_TAG_UB attribute should be declared as INTEGER(KIND=MPI_ADDRESS_KIND), which is an 8-byte integer on a 64-bit system. So, when the manual says MPI_TAG_UB is "returned by the functions MPI_GET_ATTR and MPI_COMM_GET_ATTR as 4-byte integers", it is only talking about the value and not the data type.
Honestly, I find the manual misleading on this issue because it says 'returned by ... as 4-byte integers'. To me this indicates that the data type itself is a 4-byte integer. I suggest that, if possible, the manual be modified to clarify that the actual data type of the returned value is not necessarily a 4-byte integer, only that the maximum value returned is that of a 4-byte integer.
Thanks for all your help. I believe the issue is clear now.
John