Intel® oneAPI HPC Toolkit
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Symbol lookup error while running a program

HaoranZhou
Beginner

Hi all,

I've compiled a program on Linux using Intel Fortran and MKL. When I ran the program directly on the cluster, I got the following error:

'./FSSICAS20220211: symbol lookup error: /usr/lib/x86_64-linux-gnu/libmpi_mpifh.so.40: undefined symbol: mpi_conversion_fn_null_'

However, when I run the program through ssh, no such error occurs. How can this happen? And how can I fix this error so that the program runs successfully on our cluster?
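
One way to narrow down a difference like this is to record, in each session, the loader search path and the libraries the binary actually resolves to, and then compare the two records. The following is only a sketch: the helper name snapshot_linkage and the output file names are made up for illustration and are not part of any Intel tool.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): record the loader search path
# and the shared libraries a binary resolves to in the current session.
# Usage: snapshot_linkage <binary> <output-file>
snapshot_linkage() {
    bin=$1
    out=$2
    {
        echo "LD_LIBRARY_PATH:"
        # One directory per line keeps the later diff readable.
        printf '%s\n' "${LD_LIBRARY_PATH:-}" | tr ':' '\n'
        echo "ldd:"
        # Drop the load addresses, which differ from run to run.
        ldd "$bin" 2>&1 | sed 's/ (0x[0-9a-f]*)$//'
    } > "$out"
}
```

For example, running "snapshot_linkage ./FSSICAS20220211 direct.txt" in the failing session and "snapshot_linkage ./FSSICAS20220211 ssh.txt" in the ssh session, followed by "diff direct.txt ssh.txt", would show whether the two sessions pick up different MPI libraries.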

Thanks in advance!

 

VarshaS_Intel
Moderator

Hi,

 

Thanks for reaching out to us.

 

Could you please let us know the Intel oneAPI version and the MPI library (along with its version) you are using?

 

If you are using the Intel MPI Library, could you please initialize the Intel MPI environment and let us know if you are still facing issues?

Please use the command below to initialize the Intel MPI Library environment:

source /opt/intel/oneapi/mpi/2021.5.0/env/vars.sh
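
After sourcing the script, it can help to confirm which mpirun is found first on PATH and which implementation it belongs to. The sketch below guesses the implementation from the launcher's version banner. The function names are hypothetical, and the matched banner substrings are assumptions about typical output, so treat the result as a hint only.

```shell
#!/bin/sh
# Hypothetical helper: guess the MPI implementation from a launcher's
# version banner. The matched substrings are assumptions about typical
# banner text, not a documented interface.
classify_mpi_banner() {
    case "$1" in
        *"Intel(R) MPI"*) echo "Intel MPI" ;;
        *"Open MPI"*)     echo "Open MPI" ;;
        *)                echo "unknown" ;;
    esac
}

# Report the launcher found first on PATH and what its banner suggests.
report_mpi_launcher() {
    launcher=$(command -v mpirun) || { echo "mpirun not found on PATH"; return 1; }
    echo "launcher: $launcher"
    echo "implementation: $(classify_mpi_banner "$("$launcher" --version 2>&1)")"
}
```

Calling report_mpi_launcher in both the failing session and the working one would show whether they use the same launcher.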

Could you please provide us with the debug information by using the command below?

I_MPI_DEBUG=30 FI_LOG_LEVEL=debug mpirun -n <no-of-proc> -ppn <proc-per-node> ./a.out

Also, could you please provide a sample reproducer along with the steps to reproduce the issue, so that we can investigate it on our end?

 

>>However, when I run the program through ssh, no such error occurs.

Could you please elaborate on this statement?

How did you run the program successfully through ssh? Could you please provide the commands so that we can try them on our end?

 

Thanks & Regards,

Varsha

 

HaoranZhou
Beginner

Hi Varsha,

Sorry for the late reply.

Both the Intel oneAPI version and the MPI library version are 2022.0.1. Usually, I initialize the Intel MPI environment with the following command:

source /opt/intel/oneapi/setvars.sh intel64

and I get the output shown in the uploaded figure. The Intel MPI environment seems to be initialized, but the same issue occurs. The issue also persists when I initialize the Intel MPI environment with the command you suggested; please see the uploaded figure.

As for the debug information, I don't know what <proc-per-node> means. Could you please tell me what value I should use for <proc-per-node> to collect the debug information?

Since the program is currently an in-house solver, I'm sorry that we cannot provide the code.

 

Through ssh, I ran the program successfully with the same commands:

source /opt/intel/oneapi/setvars.sh intel64
ulimit -s unlimited
ulimit -d unlimited
ulimit -m unlimited
./program

 

Best

Haoran


VarshaS_Intel
Moderator

Hi,

 

Apologies for the inconvenience.

 

>>could you please tell me what I should input for <proc-per-node> to collect the debug information?

The -ppn (processes per node) option sets how many processes are launched on each node.

The -ppn value must be an integer greater than 0 and less than or equal to n (the number of processes).
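
To illustrate the arithmetic, the sketch below prints how many nodes a given -n/-ppn pair would fill and which node each rank would land on. It assumes consecutive ranks are packed onto a node until it holds ppn of them; actual placement depends on the host list and launcher settings. The function name ppn_layout is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: show how many nodes -n/-ppn would fill, assuming
# consecutive ranks are packed onto a node until it holds ppn of them.
ppn_layout() {
    n=$1
    ppn=$2
    if [ "$ppn" -le 0 ] || [ "$ppn" -gt "$n" ]; then
        echo "ppn must be between 1 and n" >&2
        return 1
    fi
    # Ceiling division: nodes needed to hold n ranks at ppn ranks each.
    echo "nodes: $(( (n + ppn - 1) / ppn ))"
    rank=0
    while [ "$rank" -lt "$n" ]; do
        echo "rank $rank -> node $(( rank / ppn ))"
        rank=$(( rank + 1 ))
    done
}
```

For instance, "ppn_layout 4 2" reports two nodes, with ranks 0 and 1 on node 0 and ranks 2 and 3 on node 1.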

 

Please see the link below for more information:

https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-guide-linux/top/running-...

 

Please find below the command we used to generate the debug information with a sample hello-world code, along with a screenshot. Please generate the debug file for your code in the same way and share it with us.

I_MPI_DEBUG=30 FI_LOG_LEVEL=debug mpirun -n 2 -ppn 2 ./hello &> hellodebug.txt

(You can replace ./hello with ./a.out or your own executable, and hellodebug.txt with any file name.)

mpidebug.png
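
Note that "&>" is a bash extension. In a shell that does not support it, the same capture can be written with the portable "> file 2>&1" form. A minimal sketch follows, with a hypothetical wrapper name run_with_log:

```shell
#!/bin/sh
# Hypothetical helper: run a command and send both stdout and stderr
# to a log file, using the portable "> file 2>&1" redirection.
# Usage: run_with_log <log-file> <command> [args...]
run_with_log() {
    log=$1
    shift
    "$@" > "$log" 2>&1
}
```

With this wrapper, the debug run could be written as "run_with_log hellodebug.txt env I_MPI_DEBUG=30 FI_LOG_LEVEL=debug mpirun -n 2 -ppn 2 ./hello", where env passes the two debug variables to the launcher.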

>>Since the program is an in-house solver right now, I'm sorry that we cannot provide the code.

You can send your source code by private message. If you are willing to do so, please let us know so that we can contact you privately.

 

Thanks & Regards,

Varsha

 

HaoranZhou
Beginner

Hi,

The debug file has been uploaded as an attachment.

I discussed this with my colleague, who did most of the coding, and found that the current program uses OpenMP rather than Intel MPI. I'm sorry for bothering you for so long about Intel MPI. At present, we compile the program with a Makefile and use '-qopenmp' to enable OpenMP in Intel Fortran:

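FC = ifort
SOURCE = xxx.f90 ...
OBJS = $(SOURCE:.f90=.o)
TARGET = FSSICAS20220211

# Link step: -qopenmp pulls in the OpenMP runtime, -qmkl links MKL.
# Recipe lines must start with a tab character.
# $(LD_LIBRARY_PATH) needs the parentheses: make reads $LD_LIBRARY_PATH
# as $(L) followed by the literal text D_LIBRARY_PATH.
# Note that -L expects one directory, while LD_LIBRARY_PATH may hold a
# colon-separated list.
$(TARGET):$(OBJS)
	$(FC) -qopenmp -qmkl -o $(TARGET) UserDefined_SoilModel.so UserDefined_BoundaryValue.so $(OBJS) -L$(LD_LIBRARY_PATH) -lprecice
# Compile step: rebuilds every object from all sources in one command.
$(OBJS):$(SOURCE)
	$(FC) -O -c $(SOURCE)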

 

By the way, I saw a topic that was similar to my problem. Could the problem be that our program mixes the Fortran library libmpi_mpifh.so with the library libmpi.so?
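
A quick way to check for this kind of mixing is to see whether the MPI libraries a binary resolves to come from more than one directory. The sketch below filters ldd output for that purpose. The function name mpi_lib_dirs is hypothetical, and matching on "mpi" in the library name is a heuristic, not a definitive test.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print every distinct
# directory that supplies a library with "mpi" in its name.
mpi_lib_dirs() {
    awk '$2 == "=>" && $1 ~ /mpi/ && $3 ~ /^\// {
             dir = $3
             sub("/[^/]*$", "", dir)   # strip the file name, keep the directory
             print dir
         }' | sort -u
}
```

Running "ldd ./FSSICAS20220211 | mpi_lib_dirs" and getting more than one directory back would suggest that MPI libraries from different installations are being loaded together.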

 

Best

Haoran

VarshaS_Intel
Moderator

Hi,


Thanks for providing the information.


Could you please confirm which MPI library (along with its version) you are using?


Could you please provide the cluster configuration details, how the nodes are connected (for example, Ethernet), and the commands you use to launch on multiple nodes?


Also, could you please confirm whether this is the only debug information generated by the debug command below? If not, could you please share the complete debug log?

I_MPI_DEBUG=30 FI_LOG_LEVEL=debug mpirun -n 2 -ppn 2 ./hello &> hellodebug.txt


>>Since the program is an in-house solver right now, I'm sorry that we cannot provide the code.

You can send your source code by private message. If you are willing to do so, please let us know so that we can contact you privately.


Thanks & Regards,

Varsha


HaoranZhou
Beginner

Hi,

>>Could you please confirm which MPI library(along with the version) you are using?

If I'm not mistaken, the MPI library we are using is libmpi.so.40. The following figure shows the output of 'ldd /usr/lib/x86_64-linux-gnu/libmpi_mpifh.so.40'. I suspect the library itself is not the problem, since the program runs through ssh.

ldd_libmpi_mpifh.png

>>Could you please provide us with the cluster configuration/details, how the nodes are connected(example: Ethernet) and what commands you are using to launch the multiple nodes?

The cluster we are using is a Kunlen9016 produced by Huawei. It has 256 cores and 512 GB of memory in total. I'm sorry, I don't know how the nodes are connected.

 

>>Also, could you please confirm that the debug command (I_MPI_DEBUG=30 FI_LOG_LEVEL=debug mpirun -n 2 -ppn 2 ./hello &> hellodebug.txt) generated only this debug information? If not, could you please share the complete debug log?

Yes, that is the only debug information I got.

 

>>You have a choice to send your source code by private message. So if you are willing to send it, please do let us know so that we can contact you privately.

I'm sorry, but I do not have the right to send it. I hope you understand.

 

Best,

Haoran

VarshaS_Intel
Moderator

Hi,

 

Thanks for sharing the information.

 

Sorry for the inconvenience. These forums are intended to support queries related to Intel products, and we observed that your issue is related to Open MPI. Since the error you are getting is not specific to the Intel MPI Library, we cannot assist you further.

 

However, if you wish to build and run your application using Intel MPI, we would be happy to assist you with any issues. Please use the command below to initialize the Intel MPI environment.

source /opt/intel/oneapi/mpi/2021.5.0/env/vars.sh

 

Thanks & Regards,

Varsha

 

HaoranZhou
Beginner

Hi Varsha,

Sorry for my late reply, and I'm sorry for bothering you for so long.

Yes, the problem was due to Open MPI, and I finally found the time to solve it.

Thanks again for your generous help!

Best regards,

Haoran

VarshaS_Intel
Moderator

Hi Haoran,


Glad to know that your issue is resolved. If you need any additional information, please post a new question, as this thread will no longer be monitored by Intel.


Have a Good Day!


Thanks & Regards,

Varsha

