I would like to better understand the underlying model for coarrays, from a performance standpoint, so that I can make informed architectural decisions for a large analytical code. This code requires a very large amount of data to be accessed across coarray images: on the order of 10s to 100s of GB per coarray image. This data is organized within a derived type on each coarray image.
If the derived type is declared as a coarray itself, does it mean that a 'copy' of it is created in a 'buffer' (accessible by all other images), and that synchronization of the entire 'buffer copy' with the local data occurs after each execution segment, if any of the derived type data is changed? Say, if only a scalar component of the derived type is modified by a remote image... is the entire coarray flagged as needing updating at the next synchronization point? This would be very inefficient and time consuming - I hope my understanding here is very naive and does not reflect what actually happens.
Another strategy would be to use much smaller coarrays that serve as temporary communication buffers for the data. When instructed, each image would copy the data needed from the large derived type variable into these smaller coarrays. This is of course more cumbersome to handle. What is the recommended practice?
If anybody knows of documentation that could help here, it would be tremendously appreciated!
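To make the second strategy concrete, here is a minimal sketch of the "small communication coarray" idea. All names (`outbox`, `inbox`, `big`, `nbuf`) and the sizes are hypothetical, and the ring-neighbour pattern is only an assumption chosen to keep the example short; it is not meant as a recommended design.

```fortran
program staging_buffer
  implicit none
  integer, parameter :: nbuf = 8      ! hypothetical size of the exchange buffer
  real :: outbox(nbuf)[*]             ! small coarray used only for communication
  real, allocatable :: big(:)         ! large data, purely local (not a coarray)
  real :: inbox(nbuf)
  integer :: me, right

  me = this_image()
  ! Neighbour to the right, wrapping around to image 1.
  right = merge(1, me + 1, me == num_images())

  allocate(big(100000))               ! stand-in for the large derived type data
  big = real(me)

  outbox = big(1:nbuf)                ! stage only the data other images need
  sync all                            ! all outboxes are filled before anyone reads

  inbox = outbox[right]               ! remote read of nbuf elements only
  sync all                            ! do not refill outbox until all reads are done

  print *, 'image', me, 'received', inbox(1), 'from image', right
end program staging_buffer
```

The cost of this approach is the extra local copy into `outbox` and the second `sync all` that protects the buffer from being overwritten too early.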
There isn't a copy of the whole coarray on each image, just that image's part. But when you access a part from another image, that part gets copied across. If your data structures are that large, that's a lot of copying. You want to minimize the amount of data traveling between images. Instead of a single derived type, consider multiple coarrays, each with a smaller part of the data.
Thanks Steve - yes, I was probably not clear enough; the 'whole' coarray is not present on each image, just the local part of it (which is unique to the image it belongs to). So, your answer is what I suspected - I will need to make sure that I have small coarrays to prevent undesirable copying of the data.
The coarray itself can be large, but you want to keep as much access local to the image as possible. That's the case with any coarray program. The less cross-image transfer and synchronization, the better your performance will be.
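As an illustration of a large coarray with mostly local access, here is a small sketch. The type `big_t`, its components, and the sizes are hypothetical. The coindexed reference names only a four-element section of the remote data; how much is physically moved for that reference depends on the compiler and runtime, so treat this as a sketch of the pattern rather than a performance guarantee.

```fortran
program remote_component
  implicit none
  type :: big_t                       ! hypothetical stand-in for the large type
    integer :: n = 0
    real, allocatable :: field(:)
  end type big_t
  type(big_t) :: state[*]             ! one instance per image
  real :: halo(4)
  integer :: me, left

  me = this_image()
  ! Neighbour to the left, wrapping around to the last image.
  left = merge(num_images(), me - 1, me == 1)

  ! Local work: no square brackets, so no cross-image traffic.
  allocate(state%field(1000))
  state%n = me
  state%field = real(me)

  sync all                            ! every image has finished filling its part

  ! Remote access: square brackets, and only a small section is referenced.
  halo = state[left]%field(1:4)

  print *, 'image', me, 'read', halo(1), 'from image', left
end program remote_component
```

Everything without square brackets touches only the image's own data, so the size of `state` matters much less than how often, and how much, is referenced with `[...]`.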