I am a PhD student, and our lab is working to parallelize our finite element code using OpenMP. In certain loops the calculations are not very computationally expensive, so parallelizing them only saves time if the number of iterations is very large; with fewer iterations, the parallel code is actually much slower than the sequential version.
We are wondering if there is a way to circumvent some of the overhead associated with the parallel DO loops. For example, are there any statements that can be made only once at the beginning of the simulation, rather than every time the parallel loop is encountered? It is my understanding that this is not the case, but we would like to explore all possibilities to improve the run-time of the code.
Thank you for any information you can provide.
I am not sure this is the best forum section for this question.
However, even though I use C/C++ much more than Fortran, I am almost sure that the OpenMP runtime creates the threads only once, the first time you use them, and then reuses them. Regarding what you are asking for, you can give each thread its own persistent copy of a variable with "threadprivate".
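To illustrate "threadprivate", here is a minimal C sketch (I can't give you the Fortran equivalent with confidence). The names `work` and `element_norms` and the data layout of four values per element are made up for the example and are not from your code. The idea is that each thread owns its own copy of the workspace, so nothing has to be declared private on every loop.

```c
/* Per-thread workspace (hypothetical name). With threadprivate, every
   thread gets its own copy of this file-scope array. */
static double work[4];
#pragma omp threadprivate(work)

/* Hypothetical kernel: for each element, sum the squares of its
   4 values, using the thread-local workspace as scratch space. */
void element_norms(const double *vals, double *out, int n_elem)
{
    #pragma omp parallel for
    for (int e = 0; e < n_elem; ++e) {
        for (int k = 0; k < 4; ++k)
            work[k] = vals[4 * e + k] * vals[4 * e + k];
        out[e] = work[0] + work[1] + work[2] + work[3];
    }
}
```

If it is compiled without OpenMP support, the pragmas are ignored and the function simply runs sequentially.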
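More generally, you can split the combined construct: in C, open one "parallel" region and then put several "for" loops ("do" in Fortran) inside it, so that the region is entered once rather than once per loop. Have a look at "master" and "single" too, for code inside the region that only one thread should execute.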
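As a rough sketch of that split in C (the function `scale_then_add` and its arguments are invented for illustration): two cheap loops share a single "parallel" region, and a "single" block does a small piece of serial work in between.

```c
/* Hypothetical kernel: two cheap loops inside ONE parallel region. */
void scale_then_add(double *x, double *y, int n, double a, int *n_calls)
{
    #pragma omp parallel            /* region is entered once */
    {
        #pragma omp for             /* iterations are shared among the threads */
        for (int i = 0; i < n; ++i)
            x[i] *= a;
        /* implicit barrier: x is fully scaled before anyone continues */

        #pragma omp single          /* executed by exactly one thread */
        *n_calls += 1;
        /* implicit barrier at the end of single */

        #pragma omp for             /* second loop reuses the same threads */
        for (int i = 0; i < n; ++i)
            y[i] += x[i];
    }
}
```

Whether this actually reduces the overhead in your case depends on the runtime and on how cheap the loops are, so I would measure it on your real loops.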
If you can do other work afterwards without waiting for the loop's result, you may be interested in "nowait", which removes the implicit barrier at the end of the loop.
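A small C sketch of "nowait", assuming the second loop does not read anything the first loop writes (the function name and the arrays are hypothetical):

```c
/* Hypothetical example: two independent loops in one parallel region. */
void update_both(double *a, int n, double *b, int m)
{
    #pragma omp parallel
    {
        #pragma omp for nowait      /* no barrier after this loop */
        for (int i = 0; i < n; ++i)
            a[i] = 2.0 * a[i];

        /* Threads that finish early start here right away. This is only
           safe because b does not depend on a. */
        #pragma omp for
        for (int j = 0; j < m; ++j)
            b[j] = b[j] + 1.0;
    }   /* implicit barrier at the end of the parallel region */
}
```

If the second loop did depend on the first, dropping the barrier would be a race condition, so use it only where the loops are truly independent.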
More information is available in the OpenMP documentation. However, you can start with the summary cards: