Intel coarray shared vs distributed

DataScientist · 01-30-2020

Hi everyone, can someone who knows how the -coarray flag works comment on the differences between its "shared" and "distributed" keywords? From my own observations, for example, the "shared" keyword does not seem to create a shared memory pool for global variables (which is actually great for the problems we have encountered so far). But what do these two keywords do that makes them different? Is there any performance or usage difference, for example?

Steve_Lionel · 01-30-2020

"shared" uses multiple processes on a single computer for the various coarray images. "distributed" uses multiple computers connected in a "cluster", one image per computer. The advantage of "shared" is that it has minimal setup requirements and can work well on a multicore system as long as you don't "oversubscribe" the cores. The disadvantage of "shared" is that the number of images you can reasonably run has a low upper limit. With "distributed" you can have, potentially, thousands of images across thousands of nodes, but configuring this is not simple and the startup time is even longer than for "shared".

Coarrays do not use a shared memory pool. Each image has its own "piece" of a coarray and MPI (in Intel's implementation) is used to send data and synchronization back and forth.
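To make the "own piece" point concrete, here is a minimal sketch (the program name and values are made up for illustration). Each image stores into its own copy of x, and image 1 then reads the copy held by the last image through a coindexed reference. The source is the same for both keywords; only the compile option changes. The build line in the comment assumes the Linux spelling of the options and an arbitrary image count of 4.

```fortran
! Hypothetical example; one possible build line on Linux:
!   ifort -coarray=shared -coarray-num-images=4 pieces.f90 -o pieces
! For a cluster build, -coarray=distributed would be used instead;
! how the images are launched then depends on your MPI setup.
program pieces
  implicit none
  integer :: x[*]   ! scalar coarray: every image has its own x
  integer :: me, n

  me = this_image()
  n  = num_images()

  x = me * 10       ! writes only this image's local copy

  sync all          ! make sure every image has finished its store

  if (me == 1) then
     ! x[n] is a remote read from image n, not a read of a shared variable.
     ! With 4 images the value fetched is 40.
     print *, 'image 1 reads x from image', n, ':', x[n]
  end if
end program pieces
```

Without the sync all, image 1 could read x[n] before image n has assigned it, which is one way to see that the images are separate processes rather than threads sharing one variable.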