Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Run-time errors for Fortran coarrays

calcaneo
Beginner
3,877 Views

We have successfully installed the Fortran compiler and are able to compile code. The problem is that we cannot seem to run the code on more than 15 processors. Our server has 56 processors available to us, but the code hangs if we try to run on more than 15.

 

Does anyone have an idea what we could be doing wrong?

 

Thanks in advance.

0 Kudos
19 Replies
MWind2
New Contributor III
3,867 Views

What server OS?

 

0 Kudos
calcaneo
Beginner
3,835 Views

Oooops! Thank you for your patience!

We are running Debian 11 (bullseye).

The code compiles fine and will run on up to (but not including) 16 processors.

0 Kudos
Steve_Lionel
Honored Contributor III
3,848 Views

Your title says errors, but the text says hangs. Could it be that your application isn't coded in such a way to scale beyond 15 images? What happens if you try a simple program such as this?

program caftest
print *, "Hello from image ", this_image()
end

0 Kudos
calcaneo
Beginner
3,835 Views

The code works fine with gfortran on up to 28 processors; it is a simple program such as the one you mention.

The code compiles fine and will run on up to (but not including) 16 processors.

Thanks for your help!

 

0 Kudos
jimdempseyatthecove
Honored Contributor III
3,823 Views

>>Our server has 56 processors available to us

>>The code works fine with gfortran on up to 28 processors

Does this mean gfortran fails using 29 or more (logical) processors?

 

Jim Dempsey

0 Kudos
jimdempseyatthecove
Honored Contributor III
3,822 Views

Also, it wouldn't hurt to run Steve's test program. If that works, then your "simple program" has an issue with the code.

Conversely, if Steve's program hangs (16 or above logical processors), then the issue is elsewhere.

 

Note that coarrays are implemented using MPI. The system manager can (and often does) restrict the number of processes an application can use, and this may differ between different vendors' versions of MPI. If your code is written (as an example) to expect 16 processes but the system supplies only 15, then poorly written code might hang.
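One quick way to look for this kind of restriction before suspecting the code is to compare the number of images you plan to request with the number of CPUs your shell is actually allowed to use. This is only a sketch: the function name check_images is made up for illustration, and nproc (GNU coreutils) only reflects CPU affinity limits, not limits configured inside a particular MPI installation.

```shell
#!/bin/sh
# Sketch only: check_images is a hypothetical helper, not part of any toolchain.
check_images() {
    wanted=$1
    allowed=$(nproc)          # CPUs this process may use (honours affinity masks)
    installed=$(nproc --all)  # CPUs installed in the machine
    echo "installed=$installed allowed=$allowed wanted=$wanted"
    if [ "$wanted" -gt "$allowed" ]; then
        echo "warning: $wanted images requested, only $allowed CPUs allowed"
    fi
}

# Example: the first image count that hangs in this thread.
check_images 16
```

If "allowed" comes out lower than "installed", the session is being restricted in some way, and that would be worth raising with the system manager.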

 

Jim Dempsey

0 Kudos
Steve_Lionel
Honored Contributor III
3,809 Views

You still haven't said what exactly goes wrong. If there is an error message, please show us the complete and exact text.

0 Kudos
calcaneo
Beginner
3,785 Views

Thank you so much for your time on this subject.

 

This is our code:

!
!! test.f90
!!
!! Made by (Carlos Calcaneo Roldan)
!! Login <calcaneo@acf01>
!!
!! Started on Mon Aug 1 12:42:15 2022 Carlos Calcaneo Roldan
!! Last update Time-stamp: <01-ago-2022 12:42:43 calcaneo>
!

program caftest
print *, "Hello from image ", this_image()
end program caftest

 

 

And this is how we compile:

ifort -coarray=distributed -coarray-num-images=8 test.f90 -o test  (e.g., for 8 processors).

I am attaching the result. When we use more than 15 processors, the program does not respond and we have to do a hard break.
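To pin down exactly where the hang starts without needing a hard break each time, something like the following could help. It is a rough sketch under assumptions: it relies on the Intel runtime reading FOR_COARRAY_NUM_IMAGES to choose the image count at run time (as I understand it, this takes precedence over the compile-time value), on timeout from GNU coreutils, and on the executable being named ./test as above. The probe function, the EXE variable, the 30 second limit, and the log file names are invented for illustration.

```shell
#!/bin/sh
# Sketch only: probe, EXE and the out.N.log names are hypothetical.
exe=${EXE:-./test}

probe() {
    n=$1
    # FOR_COARRAY_NUM_IMAGES requests n images at run time (Intel Fortran).
    # timeout stops the run after 30 s; GNU timeout then exits with status 124.
    if FOR_COARRAY_NUM_IMAGES=$n timeout 30 "$exe" >"out.$n.log" 2>&1; then
        echo "$n images: ok"
    else
        echo "$n images: failed or timed out (exit $?)"
    fi
}

# Image counts on both sides of the reported limit.
for n in 8 15 16 28; do
    probe "$n"
done
```

Each run leaves its output in out.N.log, which can be posted as-is. After a timed-out run it may still be necessary to check for leftover MPI processes.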

Thank you very much for your time.

 

 

0 Kudos
jimdempseyatthecove
Honored Contributor III
3,781 Views

Try setting the environment variable I_MPI_DEBUG=5.

Then run the program (with more than 15 processes).
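In a POSIX shell that could look like the sketch below. I_MPI_DEBUG is the variable named above; the log file name mpi_debug.log and the choice of 16 images via FOR_COARRAY_NUM_IMAGES are assumptions for illustration, and the run is skipped if ./test is not present.

```shell
#!/bin/sh
# Ask Intel MPI for startup diagnostics (level 5, as suggested above).
export I_MPI_DEBUG=5
# Assumption: request 16 images at run time, one past the last working count.
export FOR_COARRAY_NUM_IMAGES=16

if [ -x ./test ]; then
    # Keep stdout and stderr together so the MPI debug lines land in the log.
    ./test > mpi_debug.log 2>&1
    echo "exit status: $?"
else
    echo "./test not found; build it first"
fi
```

If the run hangs, whatever was written to mpi_debug.log before the hard break is still worth posting.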

 

Jim Dempsey

0 Kudos
calcaneo
Beginner
3,751 Views
0 Kudos
Barbara_P_Intel
Employee
3,774 Views

What version of the Intel Fortran compiler are you using?

If you are running on a single server, you can use -coarray=shared. Does that work for more than 15 processes?

What MPI fabric are you using? There is a known bug with OFI/mlx over IB when using distributed coarrays. As a workaround, try setting this environment variable: MPIR_CVAR_CH4_OFI_ENABLE_RMA=0. Another workaround is to use OFI/psm3.
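For reference, the suggestions above might be tried from the shell roughly as follows. The option and variable names in options 1 and 2 come from this thread; the output name test_shared and the image count of 16 are made up, the compile step is skipped when ifort or test.f90 is missing, and selecting psm3 through I_MPI_OFI_PROVIDER is an assumption about how to request that provider.

```shell
#!/bin/sh
# Option 1: single server, shared-memory coarrays (skipped if ifort is absent).
if command -v ifort >/dev/null 2>&1 && [ -f test.f90 ]; then
    ifort -coarray=shared -coarray-num-images=16 test.f90 -o test_shared
fi

# Option 2: distributed coarrays over OFI/mlx on IB, with RMA disabled
# as the suggested workaround.
export MPIR_CVAR_CH4_OFI_ENABLE_RMA=0

# Option 3 (alternative to option 2, assumption): ask for the psm3 provider.
# export I_MPI_OFI_PROVIDER=psm3
```

Trying one option at a time makes it easier to tell which change, if any, gets past 15 images.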

 

 

0 Kudos
calcaneo
Beginner
3,750 Views
0 Kudos
calcaneo
Beginner
3,740 Views

Sorry, I forgot to mention that we are using Intel Parallel Studio 2017.

0 Kudos
Barbara_P_Intel
Employee
3,734 Views

Can you please upgrade to the current compiler release? ifort is now at 2021.6.0. The Fortran compilers are available as part of the oneAPI HPC Toolkit. You can download it here.

I suspect there are bug fixes in the last 4 years that may impact your issue.

BTW, MPIR_CVAR_CH4_OFI_ENABLE_RMA=0 only impacts distributed coarrays using IB.

 

0 Kudos
calcaneo
Beginner
3,730 Views

Thank you so much Barbara!

 

We can now play with this compiler!!! We have succeeded in installing it and compiling the "hello world" code, so now the work begins! (Please see image.)

 

I cannot express how much I appreciate your time; you have helped us immensely.

 

Hope you have a wonderful day!

Attachment: Screenshot from 2022-08-02 10-49-21.png

0 Kudos
as14
Beginner
3,188 Views

Hi,

 

Thanks for mentioning the fix for the Coarray Fortran MLX over IB bug. I am currently trying to do this and have tried both workarounds you recommended, but I still cannot get it working. I am using intel-oneapi-compilers/2022.0.2 and intel-oneapi-mpi/2021.4.0.

UCX version 1.12.1 shows the following transports available:

# Transport: posix
# Transport: sysv
# Transport: self
# Transport: tcp
# Transport: tcp
# Transport: tcp
# Transport: rc_verbs
# Transport: rc_mlx5
# Transport: dc_mlx5
# Transport: ud_verbs
# Transport: ud_mlx5
# Transport: cma

However, when I set export I_MPI_OFI_PROVIDER=mlx I don't get anywhere. Do you know of any other fixes for using distributed coarrays over mlx?

Thanks!

0 Kudos
Barbara_P_Intel
Employee
3,133 Views

Can you please install the latest compilers that are part of oneAPI 2023.0, which was released in December 2022? Then compile and run again.

 

0 Kudos
Barbara_P_Intel
Employee
3,727 Views

GOOD NEWS!!  But please use ifort for now. It looks like you might have used ifx.

Be aware that ifx has limited co-array support; we sneaked it in there. With the next release co-array support is planned to be complete and official. See this article for information about the Fortran and OpenMP implementations in ifx available today.

0 Kudos
calcaneo
Beginner
3,720 Views

Oops. Thanks for the heads-up. The reality is that we are still exploring, but now at least we know the compiler is working.

 

Thanks again!

 

0 Kudos