Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Symbol lookup error while running a program

HaoranZhou
Beginner

Hi all,

I've compiled a program on Linux using Intel Visual Fortran and MKL. When I ran the program directly on the cluster, I got the following error:

'./FSSICAS20220211: symbol lookup error: /usr/lib/x86_64-linux-gnu/libmpi_mpifh.so.40: undefined symbol: mpi_conversion_fn_null_'

However, when I run the program through ssh, no such error occurs. How can this happen, and how can I fix the error so that the program runs successfully on our cluster?

Thanks in advance!

 

9 Replies
VarshaS_Intel
Moderator

Hi,

 

Thanks for reaching out to us.

 

Could you please let us know which Intel oneAPI version and which MPI library (along with its version) you are using?

 

If you are using the Intel MPI Library, could you please initialize the Intel MPI environment and let us know if you are still facing issues?

Use the following command to initialize the Intel MPI Library environment:

source /opt/intel/oneapi/mpi/2021.5.0/env/vars.sh

Could you also provide the debug information by running the following command?

I_MPI_DEBUG=30 FI_LOG_LEVEL=debug mpirun -n <no-of-proc> -ppn <proc-per-node> ./a.out

Also, could you please provide a sample reproducer along with the steps to reproduce the issue, so that we can investigate it on our end?

 

>>However, when I run the program through ssh, no such error occurs.

Could you please elaborate more on this statement?

How did you run the program successfully through ssh? Could you please provide the commands to try at our end?

 

Thanks & Regards,

Varsha

 

HaoranZhou
Beginner

Hi Varsha,

Sorry for the late reply.

The Intel oneAPI toolkit and the MPI library are both version 2022.0.1. Usually, I initialize the Intel MPI environment with the following command:

source /opt/intel/oneapi/setvars.sh intel64

and I get the output shown in the uploaded figure. It seems that the Intel MPI environment has been initialized, but the same issue occurs. The issue also persists when I initialize the Intel MPI environment using the command you suggested; please see the uploaded figure.

As for the debug information, I don't know what <proc-per-node> means; could you please tell me what I should input for <proc-per-node> to collect the debug information?

Since the program is an in-house solver right now, I'm sorry that we cannot provide the code.

 

Over ssh, I ran the program successfully with the same commands:

source /opt/intel/oneapi/setvars.sh intel64
ulimit -s unlimited
ulimit -d unlimited
ulimit -m unlimited
./program

 

Best

Haoran


VarshaS_Intel
Moderator

Hi,

 

Apologies for the inconvenience caused to you.

 

>>could you please tell me what I should input for <proc-per-node> to collect the debug information?

The -ppn (processes per node) option sets how many of the processes are placed on each node.

The -ppn value must be an integer greater than 0 and less than or equal to n (the number of processes).
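
As a rough illustration of how -n and -ppn relate, the sketch below computes how many nodes a launch would occupy. It assumes processes are placed node by node, and it applies the range check described above; the helper name nodes_needed is hypothetical and is not part of Intel MPI.

```shell
# Hypothetical helper: number of nodes occupied when N processes are
# placed PPN per node (ceiling division). Assumes node-by-node filling.
nodes_needed() {
    n=$1
    ppn=$2
    # Range check taken from the description above: 0 < ppn <= n.
    if [ "$ppn" -le 0 ] || [ "$ppn" -gt "$n" ]; then
        echo "ppn must satisfy 0 < ppn <= n" >&2
        return 1
    fi
    echo $(( (n + ppn - 1) / ppn ))
}

nodes_needed 4 2   # prints 2
# Corresponding launch: mpirun -n 4 -ppn 2 ./a.out
```

With -n 4 -ppn 2, two processes would go to each of two nodes; with -n 2 -ppn 2, both processes stay on a single node.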

 

Please see the link below for more information:

https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-guide-linux/top/running-applications/running-an-mpi-program.html

 

Below are the command and a screenshot showing how we generated the debug information on our end with a sample hello-world code. Please generate the debug file for your code and share it with us.

I_MPI_DEBUG=30 FI_LOG_LEVEL=debug mpirun -n 2 -ppn 2 ./hello &> hellodebug.txt

(You can use ./a.out in place of ./hello, and any name for the output file.)

mpidebug.png

>>Since the program is an in-house solver right now, I'm sorry that we cannot provide the code.

You have a choice to send your source code by private message. So if you are willing to send it, please do let us know so that we can contact you privately.

 

Thanks & Regards,

Varsha

 

HaoranZhou
Beginner

Hi,

The debug file is attached.

I discussed this with my colleague, who did most of the coding, and found that the current program uses OpenMP rather than Intel MPI. I'm sorry for bothering you for so long about Intel MPI. At present, we compile the program with a Makefile and use '-qopenmp' to enable OpenMP in Intel Fortran:

FC = ifort
SOURCE = xxx.f90 ...
OBJS = $(SOURCE:.f90=.o)
TARGET = FSSICAS20220211

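# Link step. In a Makefile the variable must be written $(LD_LIBRARY_PATH);
# $LD_LIBRARY_PATH would expand $(L) followed by the text D_LIBRARY_PATH.
$(TARGET):$(OBJS)
	$(FC) -qopenmp -qmkl -o $(TARGET) UserDefined_SoilModel.so UserDefined_BoundaryValue.so $(OBJS) -L$(LD_LIBRARY_PATH) -lprecice
# Compile step (recipe lines must start with a tab).
$(OBJS):$(SOURCE)
	$(FC) -O -c $(SOURCE)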

 

By the way, I saw a topic that was similar to my problem. Could the problem be that our program mixes the Fortran library libmpi_mpifh.so with the library libmpi.so?
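
One way to test that guess is to check which of the loaded libraries, if any, actually defines the missing symbol. The following is a minimal sketch, assuming GNU binutils nm is available; the helper name has_defined_symbol is hypothetical, and the library path in the usage comment is only an example that should be replaced with the paths reported by ldd.

```shell
# Hypothetical helper: succeeds if the `nm -D` output on stdin contains
# a defined (non-"U") entry for the symbol named in $1.
has_defined_symbol() {
    awk -v sym="$1" '
        NF >= 2 {
            name = $NF
            sub(/@.*/, "", name)      # drop any @VERSION suffix
            if (name == sym && $(NF-1) != "U") found = 1
        }
        END { exit !found }
    '
}

# Usage (example path; take the real one from `ldd ./FSSICAS20220211`):
#   nm -D /usr/lib/x86_64-linux-gnu/libmpi.so.40 \
#       | has_defined_symbol mpi_conversion_fn_null_ \
#       && echo "defined" || echo "not defined"
```

If none of the libraries that the program loads defines the symbol, the libraries most likely come from different MPI installations.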

 

Best

Haoran

VarshaS_Intel
Moderator

Hi,


Thanks for providing the information.


Could you please confirm which MPI library(along with the version) you are using?


Could you please provide us with the cluster configuration/details, how the nodes are connected(example: Ethernet) and what commands you are using to launch the multiple nodes?


Also, could you please confirm whether this is the only debug information generated by the debug command (I_MPI_DEBUG=30 FI_LOG_LEVEL=debug mpirun -n 2 -ppn 2 ./hello &> hellodebug.txt)? If not, could you please share the complete debug log?


>>Since the program is an in-house solver right now, I'm sorry that we cannot provide the code.

You have a choice to send your source code by private message. So if you are willing to send it, please do let us know so that we can contact you privately.


Thanks & Regards,

Varsha


HaoranZhou
Beginner

Hi,

>>Could you please confirm which MPI library(along with the version) you are using?

If I'm not mistaken, the MPI library we are using is libmpi.so.40. The following figure shows the output of 'ldd /usr/lib/x86_64-linux-gnu/libmpi_mpifh.so.40'. I guess it may not be a library problem, since the program runs through ssh.

ldd_libmpi_mpifh.png
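
Because the same binary behaves differently in the two sessions, it may help to compare what the dynamic loader resolves in each one. The following is a minimal sketch; the helper name mpi_libs and the output file names are hypothetical.

```shell
# Hypothetical helper: keep only the MPI-related lines of `ldd` output
# read from stdin.
mpi_libs() {
    grep 'libmpi' || true
}

# Usage: run once in the direct cluster session and once over ssh
# (changing "direct" to "ssh" in the file names), then diff the files:
#   ldd ./FSSICAS20220211 | mpi_libs > mpi_libs_direct.txt
#   printf '%s\n' "$LD_LIBRARY_PATH" | tr ':' '\n' > ld_path_direct.txt
```

A difference in the resolved libmpi paths or in LD_LIBRARY_PATH between the two sessions would point to the environment rather than the program itself.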

>>Could you please provide us with the cluster configuration/details, how the nodes are connected(example: Ethernet) and what commands you are using to launch the multiple nodes?

The cluster we are using is Kunlen9016 produced by Huawei. It has 256 cores and 512 GB of memory in total. I'm sorry, but I don't know how the nodes are connected.

 

>>Also, could you please confirm whether this is the only debug information generated by the debug command (I_MPI_DEBUG=30 FI_LOG_LEVEL=debug mpirun -n 2 -ppn 2 ./hello &> hellodebug.txt)? If not, could you please share the complete debug log?

Yes, that is the only debug information I got.

 

>>You have a choice to send your source code by private message. So if you are willing to send it, please do let us know so that we can contact you privately.

I'm sorry, but I do not have the right to send it; I hope you understand.

 

Best,

Haoran

VarshaS_Intel
Moderator

Hi,

 

Thanks for sharing the information.

 

Sorry for the inconvenience caused to you. These forums are intended to support queries related to Intel products, and we observed that your issue is related to OpenMPI. Since the error you are getting is not specific to the Intel MPI Library, we cannot assist you further.

 

However, if you wish to build and run your application using Intel MPI, we would be happy to assist you with any issues. Please use the command below to initialize the Intel MPI environment.

source /opt/intel/oneapi/mpi/2021.5.0/env/vars.sh

 

Thanks & Regards,

Varsha

 

HaoranZhou
Beginner

Hi Varsha,

Sorry for my late reply, and for bothering you for so long.

Yes, the problem was due to OpenMPI, and I finally found the time to solve it.

Thanks again for your generous help!

Best regards,

Haoran

VarshaS_Intel
Moderator

Hi Haoran,


Glad to know that your issue is resolved. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.


Have a Good Day!


Thanks & Regards,

Varsha

