Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Issue migrating to Intel MPI

jburri
Beginner

I manage a legacy code that has been built with the Intel compiler and the MPI/Pro library for years, but over the last couple of years we have been trying to convert from MPI/Pro to Intel MPI.  To date, we have attempted the migration 3 times using 3 different versions of Intel MPI, and every time we have hit a different roadblock.  I am trying again, have hit yet another roadblock, and have run out of ideas as to how to resolve it.  The code appears to compile fine, but when I run it I get the following runtime error:

/home/<username>/src/rtpmain: symbol lookup error: /home/<username>/src/rtpmain: undefined symbol: DftiCreateDescriptor_s_md

This error occurs the first time an FFT is performed.  I built and ran the code on RHEL 5.  Everything about the code is the same except for the MPI library; the only other changes were how the code is built and how it is submitted to the scheduler (PBS Pro).
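
For reference, DftiCreateDescriptor_s_md is one of the MKL DFTI entry points rather than an MPI symbol, so a quick sanity check (a sketch; the MKL path is the one from the ldd output further down and may differ on other systems) is to confirm that the symbol is actually exported by the MKL libraries being linked:

for lib in libmkl_intel_lp64 libmkl_core libmkl_sequential; do
    echo "== $lib =="
    # nm -D lists the dynamic symbol table of the shared library
    nm -D /opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64/$lib.so | grep -i dfticreatedescriptor
done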

Since there was an unresolved symbol, I suspected the environment wasn't set up correctly, so I included an "ldd" of the executable in the submit script to confirm that the environment on the executing node was set up correctly, and everything looks fine:

libdl.so.2 => /lib64/libdl.so.2
libmkl_intel_lp64.so => /opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64/libmkl_intel_lp64.so
libmkl_core.so => /opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64/libmkl_core.so
libmkl_sequential.so => /opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64/libmkl_sequential.so
libpthread.so.0 => /lib64/libpthread.so.0
libm.so.6 => /lib64/libm.so.6
librt.so.1 => /lib64/librt.so.1
libmpi_mt.so.4 => /opt/intel/impi/4.0.3.008/lib64/libmpi_mt.so.4
libmpigf.so.4 => /opt/intel/impi/4.0.3.008/lib64/libmpigf.so.4
libgcc_s.so.1 => /lib64/libgcc_s.so.1

Since the only thing that changed in the code was the MPI library, and the actual error has nothing to do with MPI (we are using the sequential version of MKL), I thought the issue might have something to do with mpicc and what it passes to the compiler/linker.  Here is the output from the make:

/opt/intel/impi/4.0.3.008/bin64/mpicc -cc=/opt/intel/composer_xe_2011_sp1.6.233/bin/intel64/icc -mt_mpi -echo -I../include -I../libs/vlib/include -I../libs/util/include -I/usr/local/hdf5-1.8.10/64_intel121_threadsafe_include -I/opt/intel/impi/4.0.4.008/include64 -I/opt/intel/composer_xe_2011_sp1.6.233/mkl/include -O3 -ip -axSSE4.2 -mssse3 -D_GNU_SOURCE -D H5_USE_16_API -L/opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64 -L/opt/intel/impi/4.0.3.008/lib64 -o rtpmain rtpmain.c srcFile1.o … srcFileN.o ../shared/shared.a ../libs/util/lib/libvec.a ../libs/util/lib/libutil.a -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lrt -lm
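
For reference, one hedged variant (not established as the cause here): pinning the intended MKL directory into the executable's rpath, so that the run-time symbol lookup does not depend on whatever LD_LIBRARY_PATH happens to be set to on the compute node.  Relative to the link line above, this only adds one linker option; everything elided is unchanged:

/opt/intel/impi/4.0.3.008/bin64/mpicc <flags, sources and objects as above> -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lrt -lm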

Using the mpicc -echo option, here is what mpicc adds to the build process (the additions appear at the start and end of the command line):

/opt/intel/composer_xe_2011_sp1.6.233/bin/intel64/icc -ldl -ldl -ldl -ldl -I../include -I../libs/vlib/include -I../libs/util/include -I/usr/local/hdf5-1.8.10/64_intel121_threadsafe_include -I/opt/intel/impi/4.0.4.008/include64 -I/opt/intel/composer_xe_2011_sp1.6.233/mkl/include -O3 -ip -axSSE4.2 -mssse3 -D_GNU_SOURCE -D H5_USE_16_API -L/opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64 -L/opt/intel/impi/4.0.3.008/lib64 -o rtpmain rtpmain.c srcFile1.o … srcFileN.o ../shared/shared.a ../libs/vlib/lib/libvec.a ../libs/util/lib/libutil.a -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lrt -lm -I/opt/intel/impi/4.0.3.008/intel64/include -L/opt/intel/impi/4.0.3.008/intel64/lib -Xlinker -enable-new-dtags -Xlinker -rpath -Xlinker /opt/intel/impi/4.0.3.008/intel64/lib -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/4.0.3 -lmpi_mt -lmpigf -lmpigi -lpthread -lpthread -lpthread -lpthread -lrt

As a comparison, here is what the MPI/Pro mpicc script adds to the build process:

/opt/intel/composer_xe_2011_sp1.6.233/bin/intel64/icc -I../include -I../libs/vlib/include -I../libs/util/include -I/usr/local/hdf5-1.8.10/64_intel121_threadsafe_include -I/usr/local/mpipro-2.2.0-rh4-64/include -I/opt/intel/composer_xe_2011_sp1.6.233/mkl/include -O3 -ip -axSSE4.2 -mssse3 -D_GNU_SOURCE -D H5_USE_16_API -L/opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64 -L/usr/local/mpipro-2.2.0-rh4-64/lib64 -o rtpmain rtpmain.c srcFile1.o … srcFileN.o ../shared/shared.a ../libs/vlib/lib/libvec.a ../libs/util/lib/libutil.a -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lrt -I/usr/local/mpipro-2.2.0-rh4-64/include -L/usr/local/mpipro-2.2.0-rh4-64/lib64 -lmpipro -lpthread -lm

I have made numerous changes to the build process, including bypassing mpicc entirely (sketched below) and adding various compiler/linker options, but nothing has made any difference.  At this point I am at a loss as to what to do next.  Does anyone have any ideas that I can try?
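
For reference, bypassing mpicc amounts to invoking icc directly and adding the Intel MPI include and library flags by hand.  A rough sketch only; the MPI library names are the ones shown in the -echo output above, and the other flags, sources and objects are elided:

/opt/intel/composer_xe_2011_sp1.6.233/bin/intel64/icc -I/opt/intel/impi/4.0.3.008/include64 <other flags, sources and objects as above> -L/opt/intel/impi/4.0.3.008/lib64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lmpi_mt -lmpigf -lmpigi -ldl -lrt -lpthread -lm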

Any feedback you can provide would be greatly appreciated.

James_T_Intel
Moderator

How are you launching your program?  When you ran ldd, did you run it in a job on a node, or directly on the head node?  If you haven't already, try running ldd the same way you are launching your program, and see if the linkage is correct in the job.
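
For example (a sketch; <N> is a placeholder for your rank count), launching ldd itself through mpirun makes the check run on the same nodes, and with the same environment, as the MPI ranks:

mpirun -np <N> ldd /home/<username>/src/rtpmain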

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

jburri
Beginner

I am launching the program using mpirun from within a script that is submitted to PBS.  The ldd command is in the submit script and runs when the job starts executing on the compute node.
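
For completeness, a rough sketch of what the relevant part of such a submit script could look like, assuming the usual compilervars.sh/mpivars.sh environment scripts are present under the install directories quoted earlier (PBS resource directives omitted):

#!/bin/bash
# (PBS resource directives omitted)
# set up the compiler/MKL and Intel MPI environment on the compute node
source /opt/intel/composer_xe_2011_sp1.6.233/bin/compilervars.sh intel64
source /opt/intel/impi/4.0.3.008/bin64/mpivars.sh
cd $PBS_O_WORKDIR
# check the linkage in the same environment the job runs in
ldd -r ./rtpmain
mpirun -np <N> ./rtpmain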
