Intel® MPI Library

Re: MPIDI_OFI_handle_cq_error(1042): OFI poll failed

Lumos
Beginner

Hi,

 

I am using the latest mpiicx (2024.0.1) to compile HDF5 1.14.3. Running make produces this warning:

icx: warning: argument unused during compilation: '-fno-alias' [-Wunused-command-line-argument]

How can I remove this warning? Running mpiicx -v produces a similar warning: icx: warning: argument unused during compilation: '-I/home/Compiler/intel2024/oneapi/intelpython3/include' [-Wunused-command-line-argument]

 

Building HDF5 1.14.3 with make also fails:

make[1]: *** [Makefile:1132: t_bigio.o] Error 1
make[1]: Leaving directory '/home/LIBRARIES/intel2024/hdf5-1.14.3/testpar'
make: *** [Makefile:730: all-recursive] Error 1

 

My system is CentOS 7. Do you have any suggestions?

 

My steps:

export CC=mpiicx
export CXX=mpiicpx
export FC=mpiifx
export F90=mpiifx
export F77=mpiifx
export CPPFLAGS=-I/home/Compiler/intel2024/oneapi/mpi/2021.11/include
export LDFLAGS="-L/home/Compiler/intel2024/oneapi/mpi/2021.11/lib"
export CFLAGS="-O3 -fPIC"
export FCFLAGS="-O3 -fPIC"
export CXXFLAGS="-O3 -fPIC"

./configure --prefix=$DIR/hdf51.14.3 --enable-fortran --enable-shared --enable-parallel --with-pic CC=mpiicx FC=mpiifx CXX=mpiicpx CFLAGS="-fPIC -O3 -xHost -fno-alias -align" FFLAGS="-fPIC -O3 -xHost -fno-alias -align" CXXFLAGS="-fPIC -O3 -xHost -fno-alias -align" FFLAGS="-I/home/Compiler/intel2024/oneapi/mpi/2021.11/include -L/home/Compiler/intel2024/oneapi/mpi/2021.11/lib" --with-szlib=$DIR/szip2.1.1 --with-zlib=$DIR/zlib1.3

 

Separately, I could not use oneAPI 2024.0 in a Rocky 9 virtual machine: it also left the system unable to boot and run.

TobiasK
Moderator

@Lumos please open a new thread for a new issue.

There seems to be a problem compiling t_bigio.o with 2024.0.2 icx at optimization level -O3. However, the next release fixes that bug.

Please consider using -O2 for this release.

 

"-fno-alias" is still not supported, unfortunately.

 

Also, your configure line and your export statements define and overwrite several of the same flags. For reference, I successfully built HDF5 with these flags:

 

./configure CFLAGS="-fPIC -O2 -xHost -fno-alias -align" FFLAGS="-fPIC -O3 -xHost -fno-alias -align" CXXFLAGS="-fPIC -O3 -xHost -fno-alias -align" CC=mpiicx CXX=mpiicpx F90=mpiifx FC=mpiifx --enable-shared --enable-parallel --enable-fortran --with-zlib=yes --with-szlib=no

make -j

For the issue with Rocky Linux 9.3 in a virtual machine, please post this in the proper forum here:

oneAPI Registration, Download, Licensing and Installation

 

Lumos
Beginner

Thank you very much for your timely reply. I rebuilt with your flags, but there is a new problem:

 

ptest.F90(19): error #7002: Error in opening the compiled module file. Check INCLUDE paths. [MPI]
USE MPI
------^
ptest.F90(42): warning #8889: Explicit interface or EXTERNAL declaration is required. [MPI_INIT]
CALL mpi_init(mpierror)
-------^
ptest.F90(43): error #6404: This name does not have a type, and must have an explicit type. [MPI_SUCCESS]
IF (mpierror .NE. MPI_SUCCESS) THEN
--------------------^
ptest.F90(46): warning #8889: Explicit interface or EXTERNAL declaration is required. [MPI_COMM_RANK]
CALL mpi_comm_rank( MPI_COMM_WORLD, mpi_rank, mpierror )
-------^
ptest.F90(46): error #6404: This name does not have a type, and must have an explicit type. [MPI_COMM_WORLD]
CALL mpi_comm_rank( MPI_COMM_WORLD, mpi_rank, mpierror )
----------------------^
ptest.F90(50): warning #8889: Explicit interface or EXTERNAL declaration is required. [MPI_COMM_SIZE]
CALL mpi_comm_size( MPI_COMM_WORLD, mpi_size, mpierror )
-------^
ptest.F90(67): warning #8889: Explicit interface or EXTERNAL declaration is required. [HYPER]
CALL hyper(length, do_collective(j), do_chunk(i), mpi_size, mpi_rank, ret_total_error)
-------------^
ptest.F90(78): warning #8889: Explicit interface or EXTERNAL declaration is required. [MULTIPLE_DSET_WRITE]
CALL multiple_dset_write(length, do_collective(1), do_chunk(1), mpi_size, mpi_rank, ret_total_error)
-------^
ptest.F90(87): warning #8889: Explicit interface or EXTERNAL declaration is required. [PMULTIPLE_DSET_HYPER_RW]
CALL pmultiple_dset_hyper_rw(do_collective(j), do_chunk(i), mpi_size, mpi_rank, ret_total_error)
-------------^
ptest.F90(98): warning #8889: Explicit interface or EXTERNAL declaration is required. [MPI_ALLREDUCE]
CALL MPI_ALLREDUCE(total_error, sum, 1, MPI_INTEGER, MPI_SUM, MPI_COMM_WORLD, mpierror)
-------^
ptest.F90(98): error #6404: This name does not have a type, and must have an explicit type. [MPI_INTEGER]
CALL MPI_ALLREDUCE(total_error, sum, 1, MPI_INTEGER, MPI_SUM, MPI_COMM_WORLD, mpierror)
------------------------------------------^
ptest.F90(98): error #6404: This name does not have a type, and must have an explicit type. [MPI_SUM]
CALL MPI_ALLREDUCE(total_error, sum, 1, MPI_INTEGER, MPI_SUM, MPI_COMM_WORLD, mpierror)
-------------------------------------------------------^
ptest.F90(106): warning #8889: Explicit interface or EXTERNAL declaration is required. [MPI_FINALIZE]
CALL mpi_finalize(mpierror)
----------^
ptest.F90(112): warning #8889: Explicit interface or EXTERNAL declaration is required. [MPI_ABORT]
CALL mpi_abort(MPI_COMM_WORLD, 1, mpierror)
----------^
compilation aborted for ptest.F90 (code 1)
make[2]: *** [Makefile:1322: ptest.o] Error 1
make[2]: Leaving directory '/home/LIBRARIES/intel2024/hdf5-1.14.3/fortran/testpar'
make[1]: *** [Makefile:902: all-recursive] Error 1
make[1]: Leaving directory '/home/LIBRARIES/intel2024/hdf5-1.14.3/fortran'
make: *** [Makefile:730: all-recursive] Error 1

 

Do you have any suggestions?

 

Best,

Lumos

TobiasK
Moderator

Please do not set the variables via export.

Set them in only one way, not both. The path to your MPI include directory is probably wrong; the mpi* compiler wrappers set these paths automatically.
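
The advice above can be sketched as follows. This is a minimal sketch, not a verified fix: it assumes the variables were exported in the current shell (the names are taken from the export lines in the first post) and that the Intel MPI wrapper option -show, which prints the underlying compile command without running it, is available in this release.

```shell
# Clear build variables left over from earlier `export` lines so that only
# the values passed on the ./configure command line take effect.
for var in CC CXX FC F90 F77 CPPFLAGS LDFLAGS CFLAGS FCFLAGS CXXFLAGS; do
    unset "$var"
done

# Show which include and library paths the wrapper adds by itself.
# Guarded so the snippet is harmless when the wrapper is not on PATH.
if command -v mpiicx >/dev/null 2>&1; then
    mpiicx -show
fi
```

If the -show output already lists the MPI include and lib directories, there should be no need to repeat them in CPPFLAGS or LDFLAGS.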


Lumos
Beginner

Actually, I didn't set the variables via export.

 

My flags:

./configure --prefix=$DIR/hdf51.14.3 CFLAGS="-fPIC -O2 -xHost -fno-alias -align" FFLAGS="-fPIC -O3 -xHost -fno-alias -align" CXXFLAGS="-fPIC -O3 -xHost -fno-alias -align" CC=mpiicx CXX=mpiicpx F90=mpiifx FC=mpiifx --enable-shared --enable-parallel --enable-fortran --with-zlib=yes --with-szlib=no
make


I get the same error. What's going on?

TobiasK
Moderator

@Lumos


I just realized that you are using CentOS 7.9 which is not supported anymore.

https://www.intel.com/content/www/us/en/developer/articles/system-requirements/mpi-library-system-requirements.html


The last thing I can advise is to start from scratch: delete the entire folder and extract it again in a clean environment.
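
One way to get a "clean environment" is to run the build steps through env -i, which starts a command with an empty environment. The sketch below is only an illustration: clean_run is a hypothetical helper, and the minimal PATH of /usr/bin:/bin is an assumption that may need adjusting on your machine.

```shell
# Run a command with an almost empty environment: env -i clears everything,
# and only HOME and a minimal PATH are passed through.
# clean_run is a hypothetical helper name, not part of oneAPI or HDF5.
clean_run() {
    env -i HOME="${HOME:-/}" PATH="/usr/bin:/bin" "$@"
}

# Demonstration in a subshell: a stray CFLAGS from the login shell
# is not inherited by the command started through clean_run.
(
    export CFLAGS="-O3 -fPIC"
    clean_run sh -c 'echo "CFLAGS in clean environment: ${CFLAGS:-<unset>}"'
)
```

Inside such an environment you would then load oneAPI again (for example by sourcing its setvars.sh script), re-extract the HDF5 sources, and rerun configure.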


Lumos
Beginner

Thank you for your help. I do use CentOS 7, but I am only a user of the machine and cannot change the system. I tried your suggestion, but it did not solve my problem.

Lumos
Beginner

Good news: I found the problem. I had Intel Python (intelpython) installed, and make succeeds once I remove it from my environment. However, make check now fails with a new error:

*** UNEXPECTED RETURN from H5Oget_native_info is -1 at line 5438 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.14.3) thread 0:
  #000: H5O.c line 1325 in H5Oget_native_info(): invalid location identifier
    major: Invalid arguments to routine
    minor: Inappropriate type
  #001: H5VLint.c line 1741 in H5VL_vol_object(): invalid identifier type to function
    major: Invalid arguments to routine
    minor: Inappropriate type
*** UNEXPECTED VALUE from H5Oget_native_info should be 2, but is 6520800 at line 5439 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.14.3) thread 0:
  #000: H5O.c line 1325 in H5Oget_native_info(): invalid location identifier
    major: Invalid arguments to ro0.07user 0.09system 0:00.68elapsed 23%CPU (0avgtext+0avgdata 14112maxresident)k
1000inputs+3072outputs (0major+7824minor)pagefaults 0swaps
make[4]: *** [Makefile:3982: testhdf5.chkexe_] Error 1
make[4]: Leaving directory '/LIBRARIES/intel2024/hdf5-1.14.3/test'
make[3]: *** [Makefile:3968: build-check-s] Error 2
make[3]: Leaving directory '/LIBRARIES/intel2024/hdf5-1.14.3/test'
make[2]: *** [Makefile:3962: test] Error 2
make[2]: Leaving directory '/LIBRARIES/intel2024/hdf5-1.14.3/test'
make[1]: *** [Makefile:3401: check-am] Error 2
make[1]: Leaving directory '/LIBRARIES/intel2024/hdf5-1.14.3/test'
make: *** [Makefile:730: check-recursive] Error 1
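
For the Intel Python cleanup mentioned at the top of this post, a minimal sketch is shown here. It assumes the intelpython directories reach the compiler through colon-separated variables such as PATH or CPATH, which the thread does not confirm; drop_entries is a hypothetical helper, not an Intel tool.

```shell
# Print a colon-separated list ($1) without the entries that contain
# the substring given in $2. drop_entries is a hypothetical helper.
drop_entries() {
    printf '%s' "$1" | tr ':' '\n' | grep -v -F -- "$2" | paste -sd ':' -
}

# Remove Intel Python directories from the search paths of this shell only.
PATH=$(drop_entries "$PATH" intelpython)
export PATH

# CPATH is only touched if it is already set and non-empty.
if [ -n "${CPATH:-}" ]; then
    CPATH=$(drop_entries "$CPATH" intelpython)
    export CPATH
fi

echo "$PATH"
```

The change affects only the current shell session, so it can be tried without uninstalling anything.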
TobiasK
Moderator

@Lumos


I did not run make check.


It seems to be related to the "-align" compiler flag that you use. Can you please try again with:


./configure CFLAGS="-fPIC -O3 -xHost" FFLAGS="-fPIC -O3 -xHost" CXXFLAGS="-fPIC -O3 -xHost" CC=mpiicx CXX=mpiicpx F90=mpiifx FC=mpiifx --enable-shared --enable-parallel --enable-fortran --with-zlib=yes --with-szlib=no
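
If the old flag set lives in a variable or script, the change above (dropping -fno-alias and -align) can be sketched with a small filter. filter_flags is a hypothetical helper written for this illustration; the flag list is the one quoted earlier in the thread.

```shell
# Print the flags from "$1" with every word listed in the remaining
# arguments removed. filter_flags is a hypothetical helper.
filter_flags() {
    flags=$1
    shift
    out=
    for f in $flags; do
        keep=yes
        for bad in "$@"; do
            if [ "$f" = "$bad" ]; then
                keep=no
            fi
        done
        if [ "$keep" = yes ]; then
            out="${out:+$out }$f"
        fi
    done
    printf '%s\n' "$out"
}

# Flag set carried over from the older build, minus the two problem flags.
OLD_FLAGS="-fPIC -O3 -xHost -fno-alias -align"
NEW_FLAGS=$(filter_flags "$OLD_FLAGS" -fno-alias -align)
echo "$NEW_FLAGS"
```

The result can then be passed as CFLAGS, FFLAGS, and CXXFLAGS on the configure line.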


Where did you get the compiler flags from?



Lumos
Beginner

I tried your suggestion and it seems to solve my problem. The compiler flags are the ones I used with oneAPI 2022.

Lumos
Beginner

What a bummer. make check gets a new error:

Testing  -- Collective I/O with Independent metadata writes (COLLIO_INDMDWR)
===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 0 PID 30288 RUNNING AT pam2
=   KILLED BY SIGNAL: 14 (Alarm clock)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 1 PID 30289 RUNNING AT pam2
=   KILLED BY SIGNAL: 14 (Alarm clock)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 2 PID 30290 RUNNING AT pam2
=   KILLED BY SIGNAL: 14 (Alarm clock)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 3 PID 30291 RUNNING AT pam2
=   KILLED BY SIGNAL: 14 (Alarm clock)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 4 PID 30292 RUNNING AT pam2
=   KILLED BY SIGNAL: 14 (Alarm clock)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 5 PID 30293 RUNNING AT pam2
=   KILLED BY SIGNAL: 14 (Alarm clock)
===================================================================================
64.74user 26.77system 21:38.86elapsed 7%CPU (0avgtext+0avgdata 511028maxresident)k
7143448inputs+1667208outputs (0major+599930minor)pagefaults 0swaps
make[4]: *** [Makefile:1671: testphdf5.chkexe_] Error 1
make[4]: Leaving directory '/LIBRARIES/intel2024/hdf5-1.14.3/testpar'
make[3]: *** [Makefile:1804: build-check-p] Error 1
make[3]: Leaving directory '/LIBRARIES/intel2024/hdf5-1.14.3/testpar'
make[2]: *** [Makefile:1652: test] Error 2
make[2]: Leaving directory '/LIBRARIES/intel2024/hdf5-1.14.3/testpar'
make[1]: *** [Makefile:1393: check-am] Error 2
make[1]: Leaving directory '/LIBRARIES/intel2024/hdf5-1.14.3/testpar'
make: *** [Makefile:730: check-recursive] Error 1
TobiasK
Moderator

@Lumos


That seems to be an MPI problem. However, since you are running on an unsupported OS, I cannot help you much here.

Maybe the HDF5 group can provide more insights here.


Lumos
Beginner

Thank you for your reply.

 

In fact, I encountered similar problems when using Intel oneAPI 2022. I asked about it in the HDF5 forum but have not received a reply.

 

make[4]: *** [t_2Gio.chkexe_] Error 1 - HDF5 Library - HDF Forum (hdfgroup.org)

 

In addition, I have suggested that the administrator of the machine upgrade the system, but this cannot be done right away. Is there an open-source OS you would recommend?

 

TobiasK
Moderator

@Lumos

Sorry to hear that. As for the OS, I would of course try to stick to one of the operating systems listed in our support matrix.

https://www.intel.com/content/www/us/en/developer/articles/system-requirements/mpi-library-system-requirements.html


Lumos
Beginner

Thank you very much for your help and advice. I'll keep trying. When I used oneAPI 2022.3 to compile HDF5, make check did not pass, but make install succeeded; I do not know whether this will cause problems in use. I will also pass your suggestion on to the administrator of our machine.

Thanks again!
