I've successfully used a coarray library with
shared memory and ifort 12. Now I've got a
licence for distributed memory and ifort 14,
but my coarray library doesn't work anymore.
I can compile and run standalone coarray code
with distributed memory. However, I'm not clear
how to use the coarray-config-file with the library.
A typical scenario: I have coarray library code
under ~/project/lib and coarray code using the
library under ~/project/tests.
I compile under ~/project/lib with
ifort -c -coarray=distributed -debug full -free -fPIC -warn all
I put the resulting module files under ~/project/modules.
I put the resulting object files into a Unix archive under
~/project/libs.
Then under ~/project/tests I build and link the
coarray code using the library code like this:
ifort -c z.f90 -coarray=distributed -I~/project/modules -coarray-config-file=ca.conf -debug full -warn all
ifort -o z.x z.o -coarray=distribued -L~/project/libs -l
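For reference, the build steps described above can be collected into one script. This is only a sketch under stated assumptions: the archive name `mycaf` (hence `libmycaf.a` and `-lmycaf`) and the source file names passed to `build_lib` are hypothetical placeholders that the post does not give, and `$HOME`-style absolute paths are used instead of `~` because the shell does not expand a tilde that is glued to `-I` or `-L`. With `DRY_RUN=1` the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of the library build and the test link.
# Names marked "hypothetical" are illustrative and not taken from the post.
PROJECT="${PROJECT:-$HOME/project}"
LIBNAME="${LIBNAME:-mycaf}"          # hypothetical archive name -> libmycaf.a
CAF="-coarray=distributed"

# Print the command; also execute it unless DRY_RUN=1.
run() {
    echo "$*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# build_lib SRC.f90 ...  (sources listed in module-dependency order)
build_lib() {
    # -module sends the generated .mod files to the modules directory.
    run ifort -c $CAF -debug full -free -fPIC -warn all \
        -module "$PROJECT/modules" "$@"
    objs=""
    for src in "$@"; do
        objs="$objs $(basename "${src%.f90}").o"
    done
    # Collect the object files into a static archive under libs.
    run ar rcs "$PROJECT/libs/lib$LIBNAME.a" $objs
}

# Compile and link the test program against the library.
build_test() {
    run ifort -c z.f90 $CAF -I"$PROJECT/modules" \
        -coarray-config-file=ca.conf -debug full -warn all
    run ifort -o z.x z.o $CAF -L"$PROJECT/libs" -l"$LIBNAME"
}
```

Usage would be something like `build_lib first.f90 second.f90` from ~/project/lib (file names hypothetical), then `build_test` from ~/project/tests.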
I get an executable, but when I run it with this ca.conf:
-envall -n 64 ./z.x 4 4
and with an appropriate PBS script, I get various runtime
errors, e.g.:
rank 0 in job 1 node32-034_45144 caused collective abort of all ranks
exit status of rank 0: killed by signal 9
I will investigate the code, of course, but just wanted
to check that I'm using the logic of -coarray-config-file correctly
for building/linking coarray library code.
Thanks
Anton
In your description, the "link" step has "distributed" spelled incorrectly; I assume that's a transcription error.
That aside - your basic syntax looks fine.
The configuration file is an MPI configuration file; it's not a secret that we use MPI as the underlying transport mechanism. If you suspect the problem is in the underlying transport, you can use the I_MPI_DEBUG environment variable to help get more info.
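A minimal sketch of setting I_MPI_DEBUG for a single run follows. The helper name `with_mpi_debug` and the level 5 in the usage line are illustrative choices, not something from this thread; higher levels generally print more detail. It also assumes the variable reaches the MPI ranks, which the `-envall` in the posted ca.conf should take care of.

```shell
#!/bin/sh
# Run a command with I_MPI_DEBUG set only for that command.
# Usage: with_mpi_debug LEVEL command [args...]
with_mpi_debug() {
    level="$1"
    shift
    # env sets the variable for the child process only,
    # leaving the calling shell's environment untouched.
    env I_MPI_DEBUG="$level" "$@"
}
```

For the program in the question this would be invoked as `with_mpi_debug 5 ./z.x 4 4` inside the PBS script.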
This configuration file does not look like it "distributes". That is, I might have expected -hostX options.
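To illustrate the point about hosts, here is a sketch that turns a PBS-style node file (one hostname per line, repeated once per core) into configuration-file lines with explicit `-host` options. The function name `make_ca_conf` is hypothetical, and the assumption that the launcher accepts `-host NAME -n COUNT` on each line should be checked against the Intel MPI documentation for the installed version. Depending on the scheduler integration, the launcher may already pick up the allocated hosts without this.

```shell
#!/bin/sh
# make_ca_conf NODEFILE command [args...]
# Emit one line per distinct host:
#   -envall -host NAME -n COUNT command args
make_ca_conf() {
    nodefile="$1"
    shift
    # uniq -c prefixes each distinct hostname with its repeat count,
    # which is used here as the number of images on that host.
    sort "$nodefile" | uniq -c | while read -r count host; do
        echo "-envall -host $host -n $count $*"
    done
}
```

Inside a PBS job this could be used as `make_ca_conf "$PBS_NODEFILE" ./z.x 4 4 > ca.conf`.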
What happens if you use the configuration file, but shared memory? Does your program still fail?
--Lorri
Thank you for the confirmation.
I think this is the same problem as in my PR:
http://software.intel.com/en-us/forums/topic/489052
What happens is that because remote reads take
so long, the program exceeds the queue allocation.
The error message then reflects that.
I have no problems with shared memory runs.
In fact I only got access to the distributed memory
licence this week, so these are my first attempts
to move from shared to distributed memory, and
from ifort 12 to 14.
I'll try I_MPI_DEBUG though.
Thanks
Anton
