We have successfully installed the Fortran compiler and are able to compile code. The problem is that we cannot seem to run the code on more than 15 processors. Our server has 56 processors available to us, but the code hangs if we try to run on more than 15.
Does anyone have an idea what we could be doing wrong?
Thanks in advance.
What server OS?
Oooops! Thank you for your patience!
We are running Debian 11 (bullseye).
The code compiles fine and will run on up to (but not including) 16 processors.
Your title says errors, but the text says hangs. Could it be that your application isn't coded in such a way to scale beyond 15 images? What happens if you try a simple program such as this?
program caftest
print *, "Hello from image ", this_image()
end
The code works fine with gfortran on up to 28 processors. It is a simple program such as the one you mention.
With ifort, the code compiles fine and will run on up to (but not including) 16 processors.
Thanks for your help!
>>Our server has 56 processors available to us
>>The code works fine with gfortran on up to 28 processors
Does this mean gfortran fails using 29 or more (logical) processors?
Jim Dempsey
Also, it wouldn't hurt to run Steve's test program. If that works, then your "simple program" has an issue in its code.
Conversely, if Steve's program hangs (with 16 or more logical processors), then the issue is elsewhere.
Note that coarrays are implemented using MPI. The system manager can (and often does) restrict the number of processes an application can use, and this may differ between different vendors' versions of MPI. If your code is written (as an example) to expect 16 processes but the system supplies only 15, then poorly written code might hang.
Jim Dempsey
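As a rough way to check the point about process restrictions, the following sketch prints two limits that can cap how many images (MPI processes) are able to start. It only uses standard tools (`nproc`, `getconf`); it is not an Intel-specific diagnostic, and a site may enforce other limits (cgroups, scheduler settings) that it will not show.

```shell
# Report limits that could cap the number of coarray images (MPI processes).
# Falls back to getconf if nproc is not installed.
cpus=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
# Maximum number of simultaneous processes for this user
# (may print "undefined" when there is no limit).
max_procs=$(getconf CHILD_MAX)

echo "logical CPUs usable by this shell: $cpus"
echo "max simultaneous user processes:   $max_procs"
```

If the first number is below the requested image count, the run is oversubscribed; if the second is very small, process creation itself may fail.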
You still haven't said what exactly goes wrong. If there is an error message, please show us the complete and exact text.
Thank you so much for your time on this subject.
This is our code:
!
!! test.f90
!!
!! Made by (Carlos Calcaneo Roldan)
!! Login <calcaneo@acf01>
!!
!! Started on Mon Aug 1 12:42:15 2022 Carlos Calcaneo Roldan
!! Last update Time-stamp: <01-ago-2022 12:42:43 calcaneo>
!
program caftest
print *, "Hello from image ", this_image()
end program caftest
And this is how we compile (e.g., for 8 processors):
ifort -coarray=distributed -coarray-num-images=8 test.f90 -o test
I am attaching the result. When we use more than 15 processors, the program stops responding and we have to kill it.
Thank you very much for your time.
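To narrow down where the hang starts without recompiling for each count, a loop like the one below may help. It is a sketch under assumptions: the executable is built without -coarray-num-images so that the Intel runtime's FOR_COARRAY_NUM_IMAGES environment variable sets the image count, and the path `./test`, the 30-second limit, and the list of counts are all illustrative choices, not values from this thread.

```shell
# Sketch: find the image count at which a coarray executable stops responding.
exe=./test     # hypothetical path to the compiled program
limit=30       # seconds to wait before treating the run as hung (assumption)

try_images() {
    # Run with $1 images; timeout(1) exits with status 124 if it had to kill the run.
    FOR_COARRAY_NUM_IMAGES=$1 timeout "$limit" "$exe" > "run_$1.log" 2>&1
    case $? in
        0)   echo "ok" ;;
        124) echo "hang" ;;
        *)   echo "error" ;;
    esac
}

if [ -f "$exe" ] && [ -x "$exe" ]; then
    for n in 8 15 16 28 56; do
        echo "$n images: $(try_images "$n")"
    done
else
    echo "$exe not found; build it first, e.g. ifort -coarray=distributed test.f90 -o test"
fi
```

Each run's output lands in `run_<n>.log`, so the last successful and first hanging counts can be compared side by side.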
Try setting the environment variable I_MPI_DEBUG=5.
Then run the program (with more than 15 processes).
Jim Dempsey
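A minimal sketch of that suggestion follows. The executable path `./test`, the log file name, the 60-second cap, and the use of FOR_COARRAY_NUM_IMAGES to request 16 images are assumptions for illustration; only I_MPI_DEBUG=5 comes from the post above.

```shell
# Enable Intel MPI startup diagnostics for everything run from this shell.
export I_MPI_DEBUG=5

exe=./test           # hypothetical path to the coarray executable
log=mpi_debug.log    # hypothetical log file name

if [ -f "$exe" ] && [ -x "$exe" ]; then
    # Cap the run at 60 s so a hang still leaves a log to inspect.
    FOR_COARRAY_NUM_IMAGES=16 timeout 60 "$exe" > "$log" 2>&1
    echo "exit status: $? (124 means the run was killed after 60 s)"
    # The MPI startup lines appear at the top of the output.
    head -n 40 "$log"
else
    echo "$exe not found; I_MPI_DEBUG is set to $I_MPI_DEBUG for later runs"
fi
```

The startup lines usually show which fabric or provider was selected, which is the information asked for later in this thread.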
What version of the Intel Fortran compiler are you using?
If you are running on a single server, you can use -coarray=shared. Does that work for more than 15 processes?
What MPI fabric are you using? There is a known bug when using distributed coarrays with OFI/mlx over InfiniBand (IB). As a workaround, try setting this environment variable: MPIR_CVAR_CH4_OFI_ENABLE_RMA=0. Another workaround is to use OFI/psm3.
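The workarounds above could be applied along these lines. This is a sketch: the helper name `apply_workaround` is made up for illustration, and selecting psm3 through I_MPI_OFI_PROVIDER is an assumption about how "use OFI/psm3" would be done with Intel MPI; the two settings are alternatives, not a pair to combine blindly.

```shell
# Single-node alternative (compile-time, shown as a comment only):
#   ifort -coarray=shared test.f90 -o test_shared

# Hypothetical helper: apply one of the two environment-variable workarounds.
apply_workaround() {
    case "$1" in
        rma_off) export MPIR_CVAR_CH4_OFI_ENABLE_RMA=0 ;;   # disable RMA in the OFI netmod
        psm3)    export I_MPI_OFI_PROVIDER=psm3 ;;          # assumption: pick the psm3 provider
        *)       echo "unknown workaround: $1" >&2; return 1 ;;
    esac
}

apply_workaround rma_off
echo "MPIR_CVAR_CH4_OFI_ENABLE_RMA=$MPIR_CVAR_CH4_OFI_ENABLE_RMA"
```

Run the program from the same shell afterwards so that it inherits the exported variable.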
Sorry, I forgot to mention: we are using Intel Parallel Studio 2017.
Can you please upgrade to the current compiler release? ifort is now at 2021.6.0. The Fortran compilers are available as part of the oneAPI HPC Toolkit. You can download it here.
I suspect there are bug fixes in the last 4 years that may impact your issue.
BTW, MPIR_CVAR_CH4_OFI_ENABLE_RMA=0 only impacts distributed coarrays using IB.
Thank you so much Barbara!
We can now play with this compiler! We have succeeded in installing it and compiling the "hello world" code, so now the work begins! (Please see the image.)
I cannot express how much I appreciate your time. You have helped us immensely.
Hope you have a wonderful day!
Hi,
Thanks for mentioning the workarounds for the Coarray Fortran bug with mlx over IB. I am currently hitting this and have tried both workarounds you recommended, but I still cannot get it working. I am using intel-oneapi-compilers/2022.0.2 and intel-oneapi-mpi/2021.4.0.
UCX version 1.12.1 shows the following transports available:
# Transport: posix
# Transport: sysv
# Transport: self
# Transport: tcp
# Transport: tcp
# Transport: tcp
# Transport: rc_verbs
# Transport: rc_mlx5
# Transport: dc_mlx5
# Transport: ud_verbs
# Transport: ud_mlx5
# Transport: cma
However, when I run export I_MPI_OFI_PROVIDER=mlx, I don't get anywhere. Do you know of any other fixes for using distributed coarrays over mlx?
Thanks!
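For comparing what UCX and libfabric each report on a node, something like the following may be useful. It is a sketch: `unique_transports` is a hypothetical helper name, and the script assumes the `ucx_info` and `fi_info` utilities are on the PATH (it skips whichever is missing).

```shell
# Hypothetical helper: reduce "# Transport: name" lines to a sorted, unique list of names.
unique_transports() {
    grep "Transport:" | sed 's/.*Transport: *//' | sort -u
}

# UCX side: transports seen by the installed UCX library.
if command -v ucx_info >/dev/null 2>&1; then
    ucx_info -d | unique_transports
else
    echo "ucx_info not found"
fi

# libfabric side: providers available for I_MPI_OFI_PROVIDER to choose from.
if command -v fi_info >/dev/null 2>&1; then
    fi_info -l
else
    echo "fi_info not found"
fi
```

If mlx does not appear in the provider list, setting I_MPI_OFI_PROVIDER=mlx has nothing to select, regardless of which UCX transports are present.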
Can you please install the latest compilers that are part of oneAPI 2023.0, which was released in December 2022? Then compile and run again.
GOOD NEWS!! But please use ifort for now. It looks like you might have used ifx.
Be aware that ifx has limited coarray support; we sneaked it in there.
Ooops. Thanks for the heads-up. The reality is that we are still exploring, but now at least we know the compiler is working.
Thanks again!
