Intel® Fortran Compiler

Intel ifort 15 + OpenMPI 1.8.4 + OpenMP = instantaneous segfault

schaefer__brandon
Beginner

Hi,

Using Intel ifort 15, I compiled OpenMPI 1.8.4 with the following configure line:

../configure --prefix=<path to installdir>  --with-openib --with-sge CC=icc FC=ifort CXX=icpc

Unfortunately, compiling our hybrid MPI + OpenMP code with the resulting MPI compiler wrapper results in a binary which segfaults instantaneously after startup.

Please consider the following example which shows the same behavior as our large code:
program test
    use mpi
    integer :: ierr
    real(8) :: a
    call mpi_init(ierr)
    call random_number(a)
    write(*,*)"hello"
    call mpi_finalize(ierr)
end program test


Compiler Versions:
 >mpif90 --version
ifort (IFORT) 15.0.1 20141023
Copyright (C) 1985-2014 Intel Corporation.  All rights reserved.
>icc --version
icc (ICC) 15.0.1 20141023
Copyright (C) 1985-2014 Intel Corporation.  All rights reserved.
>icpc --version
icpc (ICC) 15.0.1 20141023
Copyright (C) 1985-2014 Intel Corporation.  All rights reserved.

How to compile:
>mpif90 -openmp test.f90

Error:
>./a.out
Segmentation fault
 

Please note that this bug has already been reported to the OpenMPI team, and they seem to have concluded that it is on the ifort side (http://www.open-mpi.org/community/lists/users/2014/11/25834.php).
 

Best,
Bastian

26 Replies
pbkenned1
Employee

I have to retract my previous claim that I couldn't reproduce the issue with the original test case.  I missed that you have to compile with -openmp, although the test case contains no OpenMP directives.

So, I've now reproduced the SEGV and reported the issue to the developers, internal tracking ID DPD200367787. 

I was able to get this backtrace with my debug build:

Starting program: /home/hacman/CQ367787.f90-mvapich-omp.x
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x000000000040b154 in init_resource ()
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6.x86_64
(gdb) where
#0  0x000000000040b154 in init_resource ()
#1  0x000000000040b0ba in reentrancy_init ()
#2  0x000000000040afd8 in for__reentrancy_init ()
#3  0x00002aaaaaabe6b5 in for_rtl_init_ () from /usr/local/MVAPICH2-2.0.1/lib/libmpichf90.so.12
#4  0x00000000004032b9 in main ()
(gdb)

 

Patrick

crtierney42
New Contributor I

Can anyone provide information about the status of this issue?  

Thanks - Craig

pbkenned1
Employee

The MVAPICH2 2.0.1 build causes the bodies of a few Fortran run-time library (FRTL) routines to be included in the shared libraries (e.g. for_rtl_init in libmpichf90.so.12). The actual code is included, not just references to be resolved later. That code is out of date and so does not run correctly.

The compiler legitimately puts a call to for_rtl_init, etc. into the generated code.  The link step for MVAPICH2 must be resolving that to a static FRTL library which is not the actual up-to-date FRTL static library.

If we are correct, then this is not a Fortran compiler problem, not a Fortran run-time library problem and not an Intel MPI problem.  It is an MVAPICH2 build problem.
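One way to check this diagnosis on a given installation is to look at the library's dynamic symbol table: an entry of type T means the routine's code is embedded in the library, while U means it is only referenced and resolved at load time. The following is a minimal sketch, not a verified procedure. The helper name classify_symbol is hypothetical, the symbol name and library path are taken from the backtrace above, and it assumes plain (unversioned) `nm -D` output.

```shell
#!/bin/sh
# Sketch: decide whether a shared library DEFINES a symbol (code embedded,
# possibly stale) or only REFERENCES it (resolved later by the loader).

# classify_symbol SYMBOL   (hypothetical helper; reads `nm -D` output on stdin)
# Prints "defined" for a T/t/W/w entry, "undefined" for U, otherwise "absent".
classify_symbol() {
    awk -v sym="$1" '
        $NF == sym && $(NF-1) ~ /^[TtWw]$/            { found = "defined" }
        $NF == sym && $(NF-1) == "U" && found == ""   { found = "undefined" }
        END { print (found == "" ? "absent" : found) }
    '
}

# Intended use (path from the backtrace above; adjust to your installation):
#   nm -D /usr/local/MVAPICH2-2.0.1/lib/libmpichf90.so.12 | classify_symbol for_rtl_init_
```

If the result is "defined", the MPI library carries its own copy of the routine, which would be consistent with the explanation above; "undefined" would point elsewhere.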

Patrick

Kai1
Beginner

Try this:

mpifort -o test test.f90 (this assumes the executable test ends up in the current directory, alongside test.f90)

mpirun -np $N ./test

I do not know why, but

mpirun -np $N test

gives wrong results (bad termination errors, etc.).

TimP
Honored Contributor III

Presumably you are telling mpirun to run the system's test command, incompletely formed (no arguments), rather than your own executable. If you care to, read about the reasons why the current working directory isn't included in the default PATH.
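To see which file a bare name such as test would resolve to, you can walk PATH by hand. This is a rough sketch that assumes the launcher looks up names without a slash through PATH, much as a shell does for external commands. The function name first_in_path is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the first executable regular file called NAME found in $PATH.
# The body runs in a subshell ( ... ) so the IFS and globbing changes stay local.
first_in_path() (
    set -f                      # do not glob-expand PATH entries
    IFS=:                       # split PATH on colons
    for dir in $PATH; do
        candidate="${dir:-.}/$1"    # an empty PATH entry means the current directory
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            exit 0
        fi
    done
    exit 1
)

# A bare "test" is searched for in PATH (on many systems this finds a system
# utility), whereas "./test" names the file in the current directory directly.
first_in_path test || echo "no 'test' found in PATH"
```

If this prints a system path rather than your build directory, that is the program being launched by `mpirun -np $N test`.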

Difu__Sun
Beginner

Hi,

I'm having this problem too. Is there any workaround?

Thanks
