Intel® Fortran Compiler

Auto-parallelized code decreasing model performance on a dual-core machine?

dennistrolle
Beginner
Hi all,

I am running a computationally intensive coupled hydrodynamic–ecological model and am trying to set up simulations that can utilize a multi-core machine. I use IVF v11.1 to compile the model on a Windows XP machine.

Using the IVF code optimization (compiler options: /O3 /QxHost /Qprec-div-) I can greatly improve performance (about a 50% decrease in simulation time relative to pre-optimization runs). To improve performance further, I have tried IVF's auto-parallelization (compiler options: /O3 /QxHost /Qparallel /Qprec-div-, and also /O2 /QxHost /Qparallel /Qprec-div-).

When I ran test simulations with the parallelized code on a dual-core machine (Intel Core2 Duo CPU, P8400, 2.26 GHz, 3 GB RAM), model performance unexpectedly worsened: computation time with the parallelized code increased by about 40% relative to the non-parallelized code. Eventually I would like to run simulations on a multi-core supercomputer, but if the model cannot utilize several cores there is no point.

I have looked at the diagnostic information reported by the auto-parallelizer (option /Qpar-report3), and at least part of the code is reported as successfully parallelized, so I would expect parallelization to speed up the simulations at least to some degree. This is not the case, however.
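For reference (not from my original build script, just a sketch of the option set I have been experimenting with): ifort on Windows also accepts /Qpar-threshold:n (n from 0 to 100, default 100), which controls how aggressively the auto-parallelizer parallelizes loops; at 100 it only parallelizes loops where it predicts a performance gain, so small loops that get parallelized anyway can still add threading overhead.

```shell
# Hypothetical compile line combining the optimization and
# auto-parallelization options discussed above; model.f90 is a
# placeholder for the actual source files.
ifort /O3 /QxHost /Qparallel /Qpar-threshold:100 /Qpar-report3 /Qprec-div- model.f90
```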

Has anyone got experience with the IVF auto-parallelizer? Maybe I need to revise my compiler options further to enhance model performance on a multi-core machine?

Any help or hints will be much appreciated.

Many thanks

Dennis


PS: I have been searching through the forum pages (and documentation) for help, and can see that I could possibly use OpenMP (which I haven't got any experience with, e.g., http://software.intel.com/en-us/forums/showthread.php?t=59590) to enhance model performance, but the code I'm using is very long and rather messy, and I'm afraid I could waste months trying to implement OpenMP directives.