Intel® Fortran Compiler

OpenMPI, intel 16, undefined references

Nick_Papior
Beginner

I have also posted this on the OpenMPI devel list (http://www.open-mpi.org/community/lists/devel/2015/10/18311.php); see that post for more details.

I can build OpenMPI and use its mpif90 wrapper without problems with the Intel 15 compiler. After switching to the Intel 16 compiler, however, every executable fails at run time with an undefined symbol, no matter the Fortran code; even the simplest program errors out:

intel: symbol lookup error: */XeonX5550/openmpi/1.10.1/intel-16.0.0/lib/libmpi_mpifh.so.12: undefined symbol: mpi_fortran_argvs_null__ 

The symbol tables of the OpenMPI libraries are identical whether they are built with Intel 15 or Intel 16. Yet executables built with Intel 16 force a lookup of this symbol at startup, and that lookup always fails. The symbol is defined in the library's uninitialized data (BSS) section (nm reports type B).
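The symbol-table check described above (is a given symbol listed by nm, and with which type letter) can be sketched roughly as follows. This is only an illustration: the helper names `parse_nm` and `is_defined` are made up, the two B entries are the ones quoted elsewhere in this thread, and the U entry is a hypothetical placeholder.

```python
def parse_nm(text):
    """Map symbol name -> nm type letter, given the text output of `nm`."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:       # address, type, name
            kind, name = parts[1], parts[2]
        elif len(parts) == 2:     # undefined symbols carry no address
            kind, name = parts
        else:
            continue
        symbols[name] = kind
    return symbols


def is_defined(symbols, name):
    """True if the symbol is listed with a type other than U (undefined)."""
    return symbols.get(name, "U") != "U"


# The two B entries are quoted elsewhere in this thread; the U entry is a
# made-up placeholder showing how an undefined symbol would be listed.
SAMPLE_NM = """\
00000000002fb160 B mpi_fortran_argv_null
00000000002fb080 B mpi_fortran_argv_null_
                 U hypothetical_missing_symbol
"""

if __name__ == "__main__":
    table = parse_nm(SAMPLE_NM)
    for name in ("mpi_fortran_argv_null_", "hypothetical_missing_symbol"):
        print(name, "defined" if is_defined(table, name) else "not defined")
```

To compare two real builds, the same function could be fed the output of `nm -D` for each library and the resulting dictionaries compared.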

Steps to reproduce: 1) compile OpenMPI with the Intel compiler, 2) compile any Fortran program with the resulting mpif90 wrapper, 3) run the executable.

What has changed in the way Intel runs its executables? Are all symbols now required to be found at initialization?

Lorri_M_Intel
Employee

What does your LD_LIBRARY_PATH look like?

            --Lorri

Nick_Papior
Beginner

Here:

LD_LIBRARY_PATH=/opt/intel/2016//compilers_and_libraries_2016.0.109/linux/mpi/intel64/lib:/opt/intel/2016///itac/9.1.1.017/mic/slib:/opt/intel/2016///itac/9.1.1.017/intel64/slib:/opt/intel/2016///itac/9.1.1.017/mic/slib:/opt/intel/2016///itac/9.1.1.017/intel64/slib:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/mpi/intel64/lib:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/ipp/../compiler/lib/intel64:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/ipp/lib/intel64:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/mkl/lib/intel64:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/tbb/lib/intel64/gcc4.4:/opt/intel/2016/debugger_2016/libipt/intel64/lib:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/daal/lib/intel64_lin:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/daal/../compiler/lib/intel64_lin:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/mkl/lib/intel64:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/ipp/../compiler/lib/intel64:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/ipp/lib/intel64:/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/tbb/lib/intel64/gcc4.4:/apps/dcc/lib/opengl

I cannot see how LD_LIBRARY_PATH could have anything to do with this. Would you care to elaborate?

Doing `ldd <exec>` reveals:

	linux-vdso.so.1 =>  (0x00007fff257cb000)
	libmpi.so.12 => /opt/intel/2016//compilers_and_libraries_2016.0.109/linux/mpi/intel64/lib/libmpi.so.12 (0x00002b8e551bd000)
	libmpi_mpifh.so.12 => /zdata/groups/common/nicpa/2015-oct/XeonX5550/openmpi/1.10.1/intel-16.0.0/lib/libmpi_mpifh.so.12 (0x00002b8e55975000)
	libmpi_usempif08.so.11 => /zdata/groups/common/nicpa/2015-oct/XeonX5550/openmpi/1.10.1/intel-16.0.0/lib/libmpi_usempif08.so.11 (0x00002b8e55bdc000)
	libmpi_usempi_ignore_tkr.so.6 => /zdata/groups/common/nicpa/2015-oct/XeonX5550/openmpi/1.10.1/intel-16.0.0/lib/libmpi_usempi_ignore_tkr.so.6 (0x00002b8e55eac000)
	libm.so.6 => /lib64/libm.so.6 (0x00002b8e56179000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b8e563fd000)
	libc.so.6 => /lib64/libc.so.6 (0x00002b8e5661b000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003082a00000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00002b8e569af000)
	librt.so.1 => /lib64/librt.so.1 (0x00002b8e56bb4000)
	/lib64/ld-linux-x86-64.so.2 (0x00002b8e54f9b000)
	libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x0000003078e00000)
	libopen-rte.so.12 => /zdata/groups/common/nicpa/2015-oct/XeonX5550/openmpi/1.10.1/intel-16.0.0/lib/libopen-rte.so.12 (0x00002b8e56dbd000)
	libopen-pal.so.13 => /zdata/groups/common/nicpa/2015-oct/XeonX5550/openmpi/1.10.1/intel-16.0.0/lib/libopen-pal.so.13 (0x00002b8e57052000)
	libutil.so.1 => /lib64/libutil.so.1 (0x00002b8e57316000)
	libhwloc.so.5 => /zdata/groups/common/nicpa/2015-oct/XeonX5550/hwloc/1.11.1/intel-16.0.0/lib/libhwloc.so.5 (0x00002b8e5751a000)
	libnuma.so.1 => /zdata/groups/common/nicpa/2015-oct/generic/numactl/2.0.10/lib/libnuma.so.1 (0x00002b8e5776b000)
	libpciaccess.so.0 => /usr/lib64/libpciaccess.so.0 (0x000000307ae00000)
	libxml2.so.2 => /zdata/groups/common/nicpa/2015-oct/XeonX5550/libxml2/2.9.2/intel-16.0.0/lib/libxml2.so.2 (0x00002b8e57977000)
	libz.so.1 => /zdata/groups/common/nicpa/2015-oct/XeonX5550/zlib/1.2.8/intel-16.0.0/lib/libz.so.1 (0x00002b8e57e37000)
	libimf.so => /opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64/libimf.so (0x00002b8e58057000)
	libsvml.so => /opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64/libsvml.so (0x00002b8e58550000)
	libirng.so => /opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64/libirng.so (0x00002b8e5940f000)
	libintlc.so.5 => /opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64/libintlc.so.5 (0x00002b8e59617000)
	libifport.so.5 => /opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64/libifport.so.5 (0x00002b8e59877000)
	libirc.so => /opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64/libirc.so (0x00002b8e59aa6000)
	libifcoremt.so.5 => /opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64/libifcoremt.so.5 (0x00002b8e59d06000)

All libraries are accounted for, and the only differences are the environment variables provided by Intel in version 15 vs. 16 and the compiled mpif90. The linking step of the executable finishes successfully; the error occurs only on execution.
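One way to double-check an ldd listing like the one above is to parse it and report every MPI library that resolves outside the OpenMPI install prefix. The sketch below is only an illustration: `parse_ldd` and `outside_prefix` are made-up helper names, and the sample text is the first three lines of the listing above.

```python
def parse_ldd(text):
    """Return {soname: resolved_path} for lines of the form 'name => /path (addr)'."""
    resolved = {}
    for line in text.splitlines():
        if "=>" not in line:
            continue
        name, _, rest = line.partition("=>")
        fields = rest.split()
        # Skip entries without a path, e.g. 'linux-vdso.so.1 =>  (0x...)'.
        if fields and fields[0].startswith("/"):
            resolved[name.strip()] = fields[0]
    return resolved


def outside_prefix(resolved, family, prefix):
    """Libraries whose name starts with `family` but resolve outside `prefix`."""
    return {name: path for name, path in resolved.items()
            if name.startswith(family) and not path.startswith(prefix)}


# Install prefix and ldd lines copied from this thread.
OPENMPI_PREFIX = "/zdata/groups/common/nicpa/2015-oct/XeonX5550/openmpi/1.10.1/intel-16.0.0/lib"

SAMPLE_LDD = """\
    linux-vdso.so.1 =>  (0x00007fff257cb000)
    libmpi.so.12 => /opt/intel/2016//compilers_and_libraries_2016.0.109/linux/mpi/intel64/lib/libmpi.so.12 (0x00002b8e551bd000)
    libmpi_mpifh.so.12 => /zdata/groups/common/nicpa/2015-oct/XeonX5550/openmpi/1.10.1/intel-16.0.0/lib/libmpi_mpifh.so.12 (0x00002b8e55975000)
"""

if __name__ == "__main__":
    for name, path in outside_prefix(parse_ldd(SAMPLE_LDD), "libmpi", OPENMPI_PREFIX).items():
        print(name, "resolves to", path)
```

Any library reported here is a candidate for a mixed-MPI mismatch and worth checking by hand.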

 

TimP
Honored Contributor III

Each rank opens a new shell. If the library paths aren't consistent with your compiler update, this causes trouble.

Nick_Papior
Beginner

I know. As I said, I follow exactly the same steps with the Intel 15 and Intel 16 compilers, so I do not expect the library path to be the culprit (even running <exec> directly shows the error; mpirun is not required).

Again, every symbol is accounted for. Every library has the same symbol table with Intel 15 and Intel 16 (and also with GNU 5.2.0). Running `ldd` on the executable shows that every library is found; none is "missing".

However, Intel 16 complains at run time.

Let me stress that the code need not use MPI at all. The code below triggers the error (as does any other code):

program e
 integer i 
 i = 0
 print *,i
end program

Has Intel 16 changed its behaviour regarding symbol lookups?

 

 

 

Lorri_M_Intel
Employee

LD_LIBRARY_PATH is used to find shared libraries at *image activation* time, not at link time.

So, it would be relevant to a problem that presented itself at runtime, but not at link time.
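A simplified model of that run-time search: directories are tried in the order they are listed, and the first one containing the requested library wins. The sketch below deliberately ignores RPATH/RUNPATH and the ld.so cache, the name `first_match` is made up, and the two-entry search path is an assumption for illustration (the file check is injectable so the example runs without the real directories).

```python
import os
import posixpath


def first_match(soname, ld_library_path, exists=os.path.exists):
    """Return the first 'directory/soname' that exists, scanning the
    colon-separated list left to right; None if nothing matches."""
    for directory in ld_library_path.split(":"):
        if not directory:
            continue  # empty entries are skipped in this sketch
        candidate = posixpath.join(directory, soname)
        if exists(candidate):
            return candidate
    return None


# Directory names are taken from this thread; the set of files is a stand-in
# for the real file system so that the example is self-contained.
INTEL_MPI_DIR = "/opt/intel/2016//compilers_and_libraries_2016.0.109/linux/mpi/intel64/lib"
OPENMPI_DIR = "/zdata/groups/common/nicpa/2015-oct/XeonX5550/openmpi/1.10.1/intel-16.0.0/lib"
FAKE_FILES = {INTEL_MPI_DIR + "/libmpi.so.12", OPENMPI_DIR + "/libmpi.so.12"}

if __name__ == "__main__":
    search_path = INTEL_MPI_DIR + ":" + OPENMPI_DIR  # assumed ordering
    print(first_match("libmpi.so.12", search_path, exists=FAKE_FILES.__contains__))
```

With the Intel MPI directory listed first, its libmpi.so.12 shadows any library of the same name further down the list; swapping the order flips the result.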

I see one thing that's interesting.  In the other post, you'd said that you'd found the undefined symbol in libmpi.so:

libmpi.so
00000000002fb160 B mpi_fortran_argv_null
00000000002fb080 B mpi_fortran_argv_null_

 

Which libmpi.so was that?  Is it one from this directory:

     /zdata/groups/common/nicpa/2015-oct/XeonX5550/openmpi/1.10.1/intel-16.0.0/lib

or from this directory?

     /opt/intel/2016//compilers_and_libraries_2016.0.109/linux/mpi/intel64/lib

My concern is that, at link time, the libmpi.so from the '/opt/intel/2016//compilers_and_libraries_2016.0.109/linux/mpi/intel64/lib' directory is being found instead of the libmpi.so from the '/zdata/groups/common/nicpa/2015-oct/XeonX5550/openmpi/1.10.1/intel-16.0.0/lib' directory, and that this wrong libmpi.so is pulling in our (Intel MPI) libmpi.so.12 instead of the one in the Open-MPI directory.

Did that make sense?

To answer your other question, no, Intel Fortran has not changed its handling of symbols between 15  and 16.

It is, however, new to have Intel MPI installed by default, so I'm wondering if it's a configuration-type problem and thus resolvable.

Please note that your ldd output resolves libmpi.so.12 to the Intel MPI library, not the OpenMPI one. Again, this leads me to believe it's a configuration problem.

             --Lorri

 

 

 

Nick_Papior
Beginner

Oh, how embarrassing. I should have seen that libmpi.so was resolving against Intel MPI and not OpenMPI.

I just assumed that Intel set up the library paths the same way as in older versions. In my opinion this is too aggressive for the end user. Well, thanks for the help!

Lorri_M_Intel
Employee

Nick Papior A. wrote:

 

I just assumed that Intel set up the library paths the same way as in older versions. In my opinion this is too aggressive for the end user. Well, thanks for the help!

Just to confirm - is your application working now?

In the 2016 product there was a major shift in the directory structure, as you've probably seen now.   This was done for a number of reasons, involving several products.

And finally, I'm not sure what you're referring to with this statement:

               In my opinion this is too aggressive for the end user

The directory structure change?   Something else?

I'd like to understand what you think should be changed/improved so we can share that internally.   Always interested in improving!

                 thanks -

                                          --Lorri

 

 

Nick_Papior
Beginner

Thanks Lorri,

1) Yes, the application works. It was the link against the Intel libmpi that was blocking the application.

2) I think that adding MPI to LD_LIBRARY_PATH, PATH and the MIC path should be the user's choice. If one uses intel-mpi this poses no problem, but as soon as you want to use your own MPI (openmpi, mvapich, mpich, etc.) with the Intel compiler, you have a cluttered env var. :(

3) In my opinion the mpi-compiler is a separate compiler (yes, this is not entirely true, but you catch my drift ;) ) and should be treated as such.

I can see that this may require users to have two source statements, one for the Intel compiler and one for the MPI. However, you could abstract this away by letting the MPI script source the Intel compiler it requires, so it would only be a matter of choosing between the Intel compiler (icc, ifort) and the intel-mpi compiler (icc, ifort, mpicc, mpifort).
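As a stopgap, the MPI entries can be filtered out of LD_LIBRARY_PATH after the compiler environment has been set up. A rough sketch, assuming the Intel MPI directory shown earlier in this thread; the function name `strip_entries` is made up for illustration, and the sample value is a shortened subset of the LD_LIBRARY_PATH posted above.

```python
import posixpath


def strip_entries(ld_library_path, unwanted_dir):
    """Drop every entry at or below `unwanted_dir`, and drop duplicates.

    Entries are normalised first, so '2016//compilers...' and
    '2016/compilers...' compare equal. Note that normpath also collapses
    '..' components textually, which can differ from the real target if
    symlinks are involved.
    """
    unwanted = posixpath.normpath(unwanted_dir)
    kept, seen = [], set()
    for entry in ld_library_path.split(":"):
        if not entry:
            continue
        norm = posixpath.normpath(entry)
        if norm == unwanted or norm.startswith(unwanted + "/"):
            continue  # inside the directory tree we want to remove
        if norm in seen:
            continue  # already kept once
        seen.add(norm)
        kept.append(norm)
    return ":".join(kept)


# Shortened subset of the LD_LIBRARY_PATH posted in this thread.
SAMPLE_PATH = ":".join([
    "/opt/intel/2016//compilers_and_libraries_2016.0.109/linux/mpi/intel64/lib",
    "/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64",
    "/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/mpi/intel64/lib",
    "/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64",
    "/apps/dcc/lib/opengl",
])
INTEL_MPI_ROOT = "/opt/intel/2016/compilers_and_libraries_2016.0.109/linux/mpi"

if __name__ == "__main__":
    print(strip_entries(SAMPLE_PATH, INTEL_MPI_ROOT))
```

The cleaned value would then be exported before running the executable; PATH could be treated the same way.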
