Intel® Fortran Compiler

Coarray Fortran

homng
Beginner
Dear all,

There seems to be Fortran coarray support in Intel Fortran Composer XE 2011. Can someone please confirm whether it supports clusters or not?

Thank you very much.
14 Replies
Steven_L_Intel1
Employee
Yes, it does (on Linux), if you have an Intel Cluster Toolkit license as well. Please read the release notes for details.
homng
Beginner
great! thanks!
homng
Beginner
I am trying to evaluate coarray Fortran on a cluster, and I am a little confused. Let's say I want to use a total of 64 cores. I compile a program with the command: ifort -coarray -coarray-num-images=64 test.f90 -o test
Is running the program with just ./test enough, or should I invoke some MPI commands?
Thank you.
Steven_L_Intel1
Employee
If you want it distributed across a cluster, use -coarray=distributed and do not use -coarray-num-images. It will use your established MPI "ring", or you can specify an MPI configuration file as an option. Please read the documentation for information on use of these options.

Yes, you just start the program with ./test.
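As a quick sanity check, a minimal coarray program along the lines of the sketch below (an illustrative example, not code from this thread; the file name check_images.f90 is made up) can confirm how many images a distributed run actually starts, since with -coarray=distributed the image count comes from the MPI configuration rather than from a compile-time flag:

! check_images.f90 -- minimal sketch (hypothetical file, not from this thread)
! Build:  ifort -coarray=distributed check_images.f90 -o check_images
! Run:    ./check_images
program check_images
  implicit none
  ! Image 1 reports the total number of images; in distributed mode this
  ! is determined by the MPI configuration, not by a compile-time option.
  if (this_image() == 1) print *, 'Started ', num_images(), ' images'
  ! Every image also reports its own index.
  print *, 'Hello from image ', this_image()
end program check_images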
homng
Beginner
thank you so much!
Steven_L_Intel1
Employee
Please let us know how it works for you.
homng
Beginner
Certainly. For the shared memory case it worked very nicely. However, for the distributed memory model I have not yet figured out the right command (I think). I searched the documentation but didn't find the right information. I have access to a big cluster, and I have installed both Intel Fortran Composer XE and the Intel Cluster Toolkit.
First I compile my program with the command: ifort -coarray=distributed sem3dcaf.f90 -o sem3dcaf
Then I have the following lines in my batch file which I submit:
# Number of cores:
#SBATCH --nodes=8 --ntasks-per-node=8
## Set up job environment
source /site/bin/jobsetup
#start mpd:
mpdboot
## Run the program:
../bin/sem3dcaf ../input/test_nproc64_sf2.psem
I am trying to use 64 cores. The program runs fine, but it seems to be slower than I expected. Am I doing something wrong? I added the mpdboot command because otherwise it would give me an error that mpd has not been started.
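One thing worth verifying here (a hypothetical check, not part of the original post) is whether the 64 images are actually spread across the 8 nodes or are all landing on one node. A tiny coarray program that reports each image's host name makes this easy to see; the sketch below assumes the HOSTNAM routine from Intel's IFPORT module is available:

! where_am_i.f90 -- illustrative sketch; HOSTNAM from IFPORT is an assumption
program where_am_i
  use ifport              ! Intel-specific portability module providing HOSTNAM
  implicit none
  character(len=64) :: host
  integer :: istat
  host = ' '
  istat = hostnam(host)   ! returns 0 on success and fills host with the node name
  print *, 'Image ', this_image(), ' of ', num_images(), ' runs on ', trim(host)
end program where_am_i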
I would be grateful for any suggestions.
HNG
Steven_L_Intel1
Employee
Our expert on this is out of the office today - I'll ask her to help you here.
homng
Beginner
That would be great! thank you.
homng
Beginner
I went further to test with 16 cores. First I compile the program with:
ifort -coarray=distributed -coarray-num-images=16 -O3 test.f90 -o test
Now I have a job script which contains the following lines:
# Number of cores:
#SBATCH --nodes=2 --ntasks-per-node=8
## Set up job environment
source /site/bin/jobsetup
source ~/intel/bin/compilervars.sh intel64
mpdboot -n 2 -r ssh -f $PBS_NODEFILE
mpdtrace
## Do some work:
../bin/sem3dcaf ../input/slope3d_new_ngll5_leastt_nproc16.psem
The mpdtrace command shows the 2 nodes correctly, but the program runs utterly slowly, even in the serial part where I don't need any communication at all. I think something is wrong, but I don't know what it is. I also have an MPI version of the same code, which runs very fast without any problem under OpenMPI. I could not run the program correctly without the option -coarray-num-images=16, because only 8 images were detected at run time! Maybe something is wrong there.
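To narrow down where the time goes (again, an illustrative sketch only; sem3dcaf.f90 is not shown in this thread), timing a purely local section separately from the first coarray synchronization can show whether the slowdown is in the computation itself or in the runtime/communication layer:

! timing_check.f90 -- illustrative sketch, not the original sem3dcaf code
program timing_check
  implicit none
  integer(8) :: c0, c1, c2, rate
  integer    :: i
  real(8)    :: s

  call system_clock(count_rate=rate)

  call system_clock(c0)
  s = 0.0d0
  do i = 1, 50000000            ! purely local work: no coarrays, no synchronization
    s = s + sin(real(i, 8))
  end do
  call system_clock(c1)

  sync all                      ! first point where the images must communicate
  call system_clock(c2)

  ! s is printed so the compiler cannot optimize the local loop away
  print *, 'image', this_image(), ': local work', real(c1-c0,8)/real(rate,8), &
           's, sync all', real(c2-c1,8)/real(rate,8), 's (s =', s, ')'
end program timing_check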
thanks.
Lorri_M_Intel
Employee
Well ... I was going to give you an MPI-based program to compare with, but you beat me to it!

There are two things here.

First, I believe you when you say you saw a problem with only 8 images being started up. I saw that myself on one of our internal clusters, but then the problem "went away" before I could provide a reproducer to our MPI developers. I was never able to reproduce it again, and assumed that we had fixed something internally in the Fortran runtime support. It's quite interesting to me that you saw it too. I'll look at that one again.

Second, about the program being utterly slow. Would you be willing to share your program with us? The best way is through Premier Support, so that we can keep a good "paper" trail.

thank you --

-- Lorri
homng
Beginner
Sorry for the delayed response, and thank you for the suggestion. I have submitted an issue via Premier Support.
Thanks
Adrian_Tineo
Beginner
I have been having the same problem with mpd. I could not get my distributed memory coarray example program to run by just invoking it (as the manual states), with mpiexec, or with mpiexec.hydra. It does work with mpirun, or by manually starting mpd with mpdboot as homng did above. A full listing of the terminal commands and output is below.
However, I cannot choose a custom placement of the images on the cores. The -perhost and -ppn flags are ignored by mpirun. Is there a way to control the placement of images on cores?
Thank you.
=================== TERMINAL OUTPUT FOLLOWS:
tadrian@eiger201:~/codes/CAF/tests_caf> which ifort
/apps/eiger/Intel-CTK-2011/bin/ifort
tadrian@eiger201:~/codes/CAF/tests_caf> ifort -coarray=distributed -coarray-num-images=12 -o hi_caf_12 hi_caf.f90
tadrian@eiger201:~/codes/CAF/tests_caf> ifort -coarray=distributed -coarray-num-images=24 -o hi_caf_24 hi_caf.f90
=== THIS IS INSIDE AN INTERACTIVE PBS SESSION WITH TWO 12-CORE NODES ALLOCATED, eiger201 and eiger202, HENCE 24 CORES IN TOTAL
tadrian@eiger201:~/codes/CAF/tests_caf> cat $PBS_NODEFILE
eiger201
eiger202
=== JUST CALLING THE EXECUTABLE (AS THE MANUAL SUGGESTS) WILL GIVE THE mpd ERROR
tadrian@eiger201:~/codes/CAF/tests_caf> ./hi_caf_12
mpiexec_eiger201: cannot connect to local mpd (/tmp/pbs.10805.eiger170/mpd2.console_eiger201_tadrian); possible causes:
1. no mpd is running on this host
2. an mpd is running but was started without a "console" (-n option)
==== mpiexec DOES NOT WORK EITHER
tadrian@eiger201:~/codes/CAF/tests_caf> which mpiexec
/apps/eiger/Intel-CTK-2011/impi/4.0.1.007/bin64/mpiexec
tadrian@eiger201:~/codes/CAF/tests_caf> mpiexec ./hi_caf_12
mpiexec_eiger201: cannot connect to local mpd (/tmp/pbs.10805.eiger170/mpd2.console_eiger201_tadrian); possible causes:
1. no mpd is running on this host
2. an mpd is running but was started without a "console" (-n option)
==== mpiexec.hydra DOES NOT WORK EITHER
tadrian@eiger201:~/codes/CAF/tests_caf> which mpiexec.hydra
/apps/eiger/Intel-CTK-2011/impi/4.0.1.007/bin64/mpiexec.hydra
tadrian@eiger201:~/codes/CAF/tests_caf> mpiexec.hydra ./hi_caf_12
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
./hi_caf_12: error while loading shared libraries: libicaf.so: cannot open shared object file: No such file or directory
=== mpirun WORKS FOR DEFAULT CONFIG: WITH 12 IMAGES IT FILLS THE FIRST NODE
tadrian@eiger201:~/codes/CAF/tests_caf> mpirun ./hi_caf_12
Running with 12 num images
I am image 1 on host eiger201
I am image 10 on host eiger201
I am image 7 on host eiger201
I am image 6 on host eiger201
I am image 5 on host eiger201
I am image 3 on host eiger201
I am image 9 on host eiger201
I am image 11 on host eiger201
I am image 8 on host eiger201
I am image 12 on host eiger201
I am image 2 on host eiger201
I am image 4 on host eiger201
=== mpirun WORKS FOR DEFAULT CONFIG: WITH 24 IMAGES IT FILLS THE TWO NODES
tadrian@eiger201:~/codes/CAF/tests_caf> mpirun ./hi_caf_24
I am image 3 on host eiger201
I am image 4 on host eiger201
I am image 7 on host eiger201
I am image 5 on host eiger201
I am image 8 on host eiger201
I am image 6 on host eiger201
I am image 2 on host eiger201
I am image 9 on host eiger201
I am image 10 on host eiger201
I am image 11 on host eiger201
Running with 24 num images
I am image 1 on host eiger201
I am image 12 on host eiger201
I am image 24 on host eiger202
I am image 20 on host eiger202
I am image 23 on host eiger202
I am image 15 on host eiger202
I am image 22 on host eiger202
I am image 18 on host eiger202
I am image 13 on host eiger202
I am image 19 on host eiger202
I am image 21 on host eiger202
I am image 17 on host eiger202
I am image 16 on host eiger202
I am image 14 on host eiger202
==== PLACEMENT FLAG -ppn IS IGNORED THOUGH. HERE ALL IMAGES ARE BOUND TO THE FIRST NODE'S CORES
tadrian@eiger201:~/codes/CAF/tests_caf> mpirun -ppn 6 ./hi_caf_12
Running with 12 num images
I am image 1 on host eiger201
I am image 8 on host eiger201
I am image 4 on host eiger201
I am image 10 on host eiger201
I am image 5 on host eiger201
I am image 12 on host eiger201
I am image 7 on host eiger201
I am image 6 on host eiger201
I am image 2 on host eiger201
I am image 11 on host eiger201
I am image 9 on host eiger201
I am image 3 on host eiger201
pbkenned1
Employee

See this article for a method to compile and run a distributed memory coarray program with process pinning to specific nodes/node processors.
Patrick Kennedy
Intel Developer Support
