I have a question about how to perform ScaLAPACK operations in parallel. To be more specific, I want several MPI subcommunicators to call ScaLAPACK routines independently of each other, including the routine BLACS_GRIDMAP (which is used to create a context for parallel execution).
The problem I have encountered is that BLACS_GRIDMAP appears to be globally blocking with respect to MPI_COMM_WORLD, so the subcommunicators do not run independently: each one hangs until every process in MPI_COMM_WORLD has called BLACS_GRIDMAP, possibly forever. Is there a way to avoid this global blocking of BLACS_GRIDMAP?
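For context, here is a sketch of the approach I have been considering: obtaining a BLACS system handle tied to a subcommunicator via Csys2blacs_handle from the BLACS C interface, so that the subsequent Cblacs_gridmap call is (as I understand it) collective only over that subcommunicator rather than MPI_COMM_WORLD. The prototypes are declared by hand since there is no standard header for them, and the 50/50 split and the 1 x N grid shape are just placeholders for illustration; this needs to be compiled and linked against MPI and ScaLAPACK (e.g., mpicc ... -lscalapack).

```c
/* Sketch (assumptions labeled above): create a BLACS grid that is
   collective only over a subcommunicator, so different subcommunicators
   can call Cblacs_gridmap independently of each other. */
#include <mpi.h>
#include <stdlib.h>

/* Hand-declared prototypes from the BLACS C interface shipped with
   ScaLAPACK; there is no standard header providing them. */
extern int  Csys2blacs_handle(MPI_Comm comm);
extern void Cblacs_gridmap(int *ictxt, int *usermap, int ldumap,
                           int nprow, int npcol);
extern void Cblacs_gridexit(int ictxt);

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Example split: two halves of MPI_COMM_WORLD. */
    int color = rank < size / 2 ? 0 : 1;
    MPI_Comm subcomm;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &subcomm);

    int subrank, subsize;
    MPI_Comm_rank(subcomm, &subrank);
    MPI_Comm_size(subcomm, &subsize);

    /* Key step: a BLACS system handle tied to the SUBcommunicator.
       Cblacs_gridmap on this handle should be collective only over
       subcomm, not over MPI_COMM_WORLD. */
    int ictxt = Csys2blacs_handle(subcomm);

    /* Placeholder grid shape: 1 x subsize, ranks mapped in order. */
    int *usermap = malloc(subsize * sizeof(int));
    for (int i = 0; i < subsize; i++)
        usermap[i] = i;
    Cblacs_gridmap(&ictxt, usermap, 1, 1, subsize);
    free(usermap);

    /* ... ScaLAPACK calls on this context ... */

    Cblacs_gridexit(ictxt);
    MPI_Comm_free(&subcomm);
    MPI_Finalize();
    return 0;
}
```
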
Thank you for your help in advance!
Edit: I should clarify. I want to call blacs_gridmap independently because it is not clear a priori how big the matrices are that the subcommunicators will work on (e.g., diagonalize) independently, and my understanding is that the optimal process grid Nr x Nc differs depending on the matrix size. Alternatively, I could estimate the maximum matrix size at the beginning and let blacs_gridmap produce one context that I would reuse for all ScaLAPACK calls; then I would not need to call blacs_gridmap independently. The disadvantage is that Nr and Nc might not always be optimal for a given problem...