
Conflict between IMSL and MPI


I am trying to divide my Fortran code into several parts and parallelize each part with MPI. In each part, I use the IMSL library to solve an optimization problem (using BCONF). However, the IMSL library has its own MPI subroutines, and it does not let me call the standard MPI initialization routine, "call MPI_INIT(ierror)". It just gives a fatal error and ends the program.

I give two examples to illustrate the issue. 

 

Example 1: print "Hello, world!" from each process:

program main
  use mpi

  implicit none

  integer ( kind = 4 ) error   ! MPI error code
  integer ( kind = 4 ) id      ! rank of this process
  integer ( kind = 4 ) p       ! total number of processes

  ! Start MPI, then query the number of processes and this process's rank.
  call MPI_Init ( error )
  call MPI_Comm_size ( MPI_COMM_WORLD, p, error )
  call MPI_Comm_rank ( MPI_COMM_WORLD, id, error )

  write ( *, * ) '  Process ', id, ' says "Hello, world!"'

  call MPI_Finalize ( error )

end program

When I compile and run it without the IMSL library, it gives the correct output:

mpif90 -o a.out hello_mpi.f90

mpiexec -n 4 ./a.out


   Process            3  says "Hello, world!"
   Process            0  says "Hello, world!"
   Process            2  says "Hello, world!"
   Process            1  says "Hello, world!"

 

 

Now, if I leave the code unchanged and just link against the IMSL library, I get this error:

mpif90 -o a.out hello_mpi.f90 $LINK_FNL_STATIC_IMSL $F90FLAGS

mpiexec  -n 4 ./a.out

 *** FATAL    ERROR 1 from MPI_INIT.   A CALL was executed using the IMSL
 *** FATAL    ERROR 1 from MPI_INIT.   A CALL was executed using the IMSL
 *** FATAL    ERROR 1 from MPI_INIT.   A CALL was executed using the IMSL
 ***          dummy routine.  Parallel performance needs a functioning MPI
 ***          library.
 ***          dummy routine.  Parallel performance needs a functioning MPI
 ***          library.
 ***          dummy routine.  Parallel performance needs a functioning MPI
 ***          library.
 *** FATAL    ERROR 1 from MPI_INIT.   A CALL was executed using the IMSL
 ***          dummy routine.  Parallel performance needs a functioning MPI
 ***          library.
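The message says the call landed in a dummy MPI_INIT inside IMSL rather than in MPICH, so what the link variables pull in matters. A minimal sketch for printing what each variable expands to, assuming they are exported environment variables as in the command above (the function name `show_link_vars` is only illustrative):

```shell
# Print the expansion of each named environment variable, or note that it is unset.
# The names passed in are trusted (they go through eval), so only pass
# plain variable names such as the ones used in this post.
show_link_vars() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      printf '%s = %s\n' "$name" "$value"
    else
      printf '%s is not set\n' "$name"
    fi
  done
}

show_link_vars LINK_FNL_STATIC_IMSL F90FLAGS
```

Comparing the printed library lists should show whether a given variable links any MPI library at all or only the IMSL libraries.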

 

In the first example, changing "$LINK_FNL_STATIC_IMSL" to "$LINK_MPI" fixes the problem, but that does not work in this more realistic example:

Example 2: use MPI, with each process calling the IMSL library to compute quadrature nodes:

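program main
  use GQRUL_INT
  use mpi

  implicit none

  integer ( kind = 4 ) error        ! MPI error code
  integer ( kind = 4 ) id           ! rank of this process
  integer ( kind = 4 ) p            ! total number of processes
  real ( kind = 8 ) QW(10), QX(10)  ! quadrature weights and points

  ! Start MPI, then query the number of processes and this process's rank.
  call MPI_Init ( error )
  call MPI_Comm_size ( MPI_COMM_WORLD, p, error )
  call MPI_Comm_rank ( MPI_COMM_WORLD, id, error )

  write ( *, * ) '  Process ', id, ' says "Hello, world!"'

  ! Each process computes a 10-point Gauss quadrature rule with IMSL.
  call GQRUL ( 10, QX, QW )

  call MPI_Finalize ( error )

end program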

When I compile and run it, the program stops at "MPI_INIT":

mpif90 -o a.out hello_mpi.f90 $LINK_FNL_STATIC_IMSL $F90FLAGS

 

mpiexec -n 4 ./a.out 


 *** FATAL    ERROR 1 from MPI_INIT.   A CALL was executed using the IMSL
 ***          dummy routine.  Parallel performance needs a functioning MPI
 ***          library.
 *** FATAL    ERROR 1 from MPI_INIT.   A CALL was executed using the IMSL
 ***          dummy routine.  Parallel performance needs a functioning MPI
 ***          library.
 *** FATAL    ERROR 1 from MPI_INIT.   A CALL was executed using the IMSL
 *** FATAL    ERROR 1 from MPI_INIT.   A CALL was executed using the IMSL
 ***          dummy routine.  Parallel performance needs a functioning MPI
 ***          library.
 ***          dummy routine.  Parallel performance needs a functioning MPI
 ***          library.

 

 

If I change the link option to $LINK_MPI, the program gets past MPI_INIT but crashes in the IMSL library subroutine:

mpif90 -o a.out hello_mpi.f90 $LINK_MPI $F90FLAGS

 

mpiexec -n 4 ./a.out

   Process            1  says "Hello, world!"
   Process            0  says "Hello, world!"
   Process            3  says "Hello, world!"
   Process            2  says "Hello, world!"
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source      
a.out              00000000018D5C75  Unknown               Unknown  Unknown
a.out              00000000018D3A37  Unknown               Unknown  Unknown
a.out              000000000188ADC4  Unknown               Unknown  Unknown
a.out              000000000188ABD6  Unknown               Unknown  Unknown
a.out              000000000184BCB9  Unknown               Unknown  Unknown
a.out              000000000184F410  Unknown               Unknown  Unknown
libpthread.so.0    00007EFC178C67E0  Unknown               Unknown  Unknown
a.out              000000000178E634  Unknown               Unknown  Unknown
a.out              000000000178A423  Unknown               Unknown  Unknown
a.out              0000000000430491  Unknown               Unknown  Unknown
a.out              000000000042AACD  Unknown               Unknown  Unknown
a.out              00000000004233D2  Unknown               Unknown  Unknown
a.out              0000000000422FEA  Unknown               Unknown  Unknown
a.out              0000000000422DD0  Unknown               Unknown  Unknown
a.out              0000000000422C9E  Unknown               Unknown  Unknown
libc.so.6          00007EFC16F7BD1D  Unknown               Unknown  Unknown
a.out              0000000000422B29  Unknown               Unknown  Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source      
a.out              00000000018D5C75  Unknown               Unknown  Unknown
a.out              00000000018D3A37  Unknown               Unknown  Unknown
a.out              000000000188ADC4  Unknown               Unknown  Unknown
a.out              000000000188ABD6  Unknown               Unknown  Unknown
a.out              000000000184BCB9  Unknown               Unknown  Unknown
a.out              000000000184F410  Unknown               Unknown  Unknown
libpthread.so.0    00007EFDE2A037E0  Unknown               Unknown  Unknown
a.out              000000000178E634  Unknown               Unknown  Unknown
a.out              000000000178A423  Unknown               Unknown  Unknown
a.out              0000000000430491  Unknown               Unknown  Unknown
a.out              000000000042AACD  Unknown               Unknown  Unknown
a.out              00000000004233D2  Unknown               Unknown  Unknown
a.out              0000000000422FEA  Unknown               Unknown  Unknown
a.out              0000000000422DD0  Unknown               Unknown  Unknown
a.out              0000000000422C9E  Unknown               Unknown  Unknown
libc.so.6          00007EFDE20B8D1D  Unknown               Unknown  Unknown
a.out              0000000000422B29  Unknown               Unknown  Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source      
a.out              00000000018D5C75  Unknown               Unknown  Unknown
a.out              00000000018D3A37  Unknown               Unknown  Unknown
a.out              000000000188ADC4  Unknown               Unknown  Unknown
a.out              000000000188ABD6  Unknown               Unknown  Unknown
a.out              000000000184BCB9  Unknown               Unknown  Unknown
a.out              000000000184F410  Unknown               Unknown  Unknown
libpthread.so.0    00007FBF21C277E0  Unknown               Unknown  Unknown
a.out              000000000178E634  Unknown               Unknown  Unknown
a.out              000000000178A423  Unknown               Unknown  Unknown
a.out              0000000000430491  Unknown               Unknown  Unknown
a.out              000000000042AACD  Unknown               Unknown  Unknown
a.out              00000000004233D2  Unknown               Unknown  Unknown
a.out              0000000000422FEA  Unknown               Unknown  Unknown
a.out              0000000000422DD0  Unknown               Unknown  Unknown
a.out              0000000000422C9E  Unknown               Unknown  Unknown
libc.so.6          00007FBF212DCD1D  Unknown               Unknown  Unknown
a.out              0000000000422B29  Unknown               Unknown  Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source      
a.out              00000000018D5C75  Unknown               Unknown  Unknown
a.out              00000000018D3A37  Unknown               Unknown  Unknown
a.out              000000000188ADC4  Unknown               Unknown  Unknown
a.out              000000000188ABD6  Unknown               Unknown  Unknown
a.out              000000000184BCB9  Unknown               Unknown  Unknown
a.out              000000000184F410  Unknown               Unknown  Unknown
libpthread.so.0    00007F8084FD67E0  Unknown               Unknown  Unknown
a.out              000000000178E634  Unknown               Unknown  Unknown
a.out              000000000178A423  Unknown               Unknown  Unknown
a.out              0000000000430491  Unknown               Unknown  Unknown
a.out              000000000042AACD  Unknown               Unknown  Unknown
a.out              00000000004233D2  Unknown               Unknown  Unknown
a.out              0000000000422FEA  Unknown               Unknown  Unknown
a.out              0000000000422DD0  Unknown               Unknown  Unknown
a.out              0000000000422C9E  Unknown               Unknown  Unknown
libc.so.6          00007F808468BD1D  Unknown               Unknown  Unknown
a.out              0000000000422B29  Unknown               Unknown  Unknown

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 174
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================

 

I am running this code on a UNIX system on my school's supercomputer, using the Intel compiler and MPICH version 3.0.1. My actual code is very similar to the second example: it calls some IMSL subroutines on each process. Can you please help me make it work? Thank you!

 

 

 


Accepted Solutions

I finally found the solution


I finally found the solution to this problem. I just need to change the link flag to $LINK_MPIS instead of $LINK_MPI, and my second example runs without any problem.
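As a rough sketch, the build for the second example then looks like the earlier commands with only the link variable swapped. The wrapper function `build_example2` and its guard are illustrative additions, and the sketch assumes that the IMSL environment setup exports LINK_MPIS and F90FLAGS:

```shell
# Build the second example with $LINK_MPIS (the variable named in this solution).
# Usage: build_example2 && mpiexec -n 4 ./a.out
build_example2() {
  # Fail early if the IMSL environment has not been set up in this shell.
  if [ -z "${LINK_MPIS:-}" ]; then
    echo "LINK_MPIS is not set; source the IMSL environment setup first" >&2
    return 1
  fi
  # The variables are left unquoted on purpose so that each flag
  # becomes a separate argument to the compiler wrapper.
  mpif90 -o a.out hello_mpi.f90 $LINK_MPIS ${F90FLAGS:-}
}
```

The guard is there because an unset link variable expands to nothing, which would silently build the program without IMSL at all.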

7 Replies

If you're running this on a UNIX (or Linux, more likely) system, you're not using an IMSL supplied by Intel. You'll need to discuss this issue with RogueWave. Intel sells IMSL only for Windows.

Retired 12/31/2016

Hi Steve,

Thank you for your reply! I do realize that the IMSL library is slightly different on Linux and Windows, and I can make my code work on my Windows laptop. From what I found, the IMSL forum at RogueWave is not very active, with few people asking or answering questions, so I thought someone here might have experience with this issue. Thank you anyway; I will also post the same question on their forum to see if anyone can answer it.

Hewei

 

Steve Lionel (Intel) wrote:

If you're running this on a UNIX (or Linux, more likely) system, you're not using an IMSL supplied by Intel. You'll need to discuss this issue with RogueWave. Intel sells IMSL only for Windows.


There's a separate forum for Linux. 

You must make sure that you set up the MPICH environment after ifort's, so that ifort's coarray support doesn't block MPICH.


Thank you, Tim. The code runs without any problem on my Windows laptop with IMSL and Intel MPI. Maybe things are a little different when I run it on Linux. I just took a look at the forums, but I am not sure which one is appropriate for my question. Do you have any suggestions? Thank you!

 

Tim P. wrote:

There's a separate forum for Linux. 

You must make sure that you set up the MPICH environment after ifort's, so that ifort's coarray support doesn't block MPICH.


ifort's compilervars.bat checks whether Intel MPI is set up and avoids breaking the PATH. compilervars.sh doesn't check for MPICH et al., so you must take more care in that case.
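One way to check what the shell actually picks up after sourcing compilervars.sh is to see where the MPI wrapper and launcher resolve from. A minimal sketch, assuming the tool names `mpif90` and `mpiexec` used earlier in this thread (the function name `check_mpi_tools` is only illustrative):

```shell
# Report where the MPI compiler wrapper and launcher resolve from on the current PATH.
check_mpi_tools() {
  for tool in mpif90 mpiexec; do
    if location=$(command -v "$tool" 2>/dev/null); then
      printf '%s -> %s\n' "$tool" "$location"
    else
      printf '%s not found on PATH\n' "$tool"
    fi
  done
}

check_mpi_tools
```

If the two tools resolve to different installations, for example a launcher from the compiler's directory and a wrapper from MPICH, the PATH order may need adjusting.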

If your "supercomputer" was set up by a sysadmin, they should be aware of such issues (e.g., they might install modules).

More suitable forum pages would be:


https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x

https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology

 


However, the error messages suggest to me that IMSL is assuming something that it isn't seeing, and you're most likely to get help for that from RogueWave.

Retired 12/31/2016