We're experiencing ~10 second startup overhead with `mpiexec -localonly -np 8` on Windows for local (single-machine) execution.
With `mpiexec -localonly -np 1`, startup takes about 1 second.
Current config: `set I_MPI_OFI=0 && mpiexec -localonly -np 8 MODL.exe`
We tested various optimizations (ASYNC_LAUNCH, PIN_DOMAIN=off, etc.) but they all made performance worse.
Questions:
- Are there Windows-specific optimizations for local-only execution?
- Is ~10 second MPI overhead normal on Windows?
Details: Migrated from MS-MPI to Intel MPI. No cluster, no network, no CUDA needed.
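For anyone who wants to reproduce the 1-process versus 8-process comparison, here is a minimal Python timing sketch. It assumes `mpiexec` is on `PATH` and reuses the launch line from this post; the helper names `time_command` and `mpiexec_command` are made up for illustration. It measures the wall-clock time of the whole run, so it only approximates startup cost if the program exits quickly.

```python
import subprocess
import sys
import time

def time_command(cmd, repeats=3, env=None):
    """Return the best wall-clock time in seconds over `repeats` runs of cmd."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, env=env,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best

def mpiexec_command(nprocs, exe="MODL.exe"):
    """Build the launch line quoted in the post for a given process count."""
    return ["mpiexec", "-localonly", "-np", str(nprocs), exe]

# Sanity check that runs anywhere: time a trivial child process (no MPI involved).
baseline = time_command([sys.executable, "-c", "pass"])
print(f"trivial child process: {baseline:.3f} s")

# On the affected machine (hypothetical usage, needs Intel MPI and MODL.exe):
# for n in (1, 8):
#     print(n, time_command(mpiexec_command(n)))
```

Taking the best of several runs reduces noise from disk caching, which matters when the difference being measured is only a few seconds.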
What version of Intel MPI are you using?
Can you try setting `I_MPI_FABRICS=shm` and possibly passing the `-bootstrap fork` option to `mpiexec`?
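As a rough sketch of what that suggestion looks like in a launch script, the Python snippet below builds the environment and the command line without running anything. The helper name `shm_launch` is hypothetical, and the `-bootstrap fork` option is copied verbatim from the suggestion above; whether the Windows `mpiexec` accepts it is not confirmed in this thread, so it is off by default.

```python
import os

def shm_launch(nprocs, exe="MODL.exe", use_fork_bootstrap=False):
    """Return (cmd, env) for a local run with the shared-memory fabric selected."""
    # Copy the current environment and add the fabric setting from the suggestion.
    env = dict(os.environ, I_MPI_FABRICS="shm")
    cmd = ["mpiexec"]
    if use_fork_bootstrap:
        # Assumption: option taken as-is from the reply, not verified on Windows.
        cmd += ["-bootstrap", "fork"]
    cmd += ["-localonly", "-np", str(nprocs), exe]
    return cmd, env

cmd, env = shm_launch(8)
print(" ".join(cmd), "with I_MPI_FABRICS =", env["I_MPI_FABRICS"])
```

The pair can then be passed to `subprocess.run(cmd, env=env)` to compare startup time against the original configuration.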
Thanks for your feedback. We investigated this further and found the root cause on our side: our company antivirus was scanning each MPI process at startup.
This explains the behavior very well:
- With 1 process, startup is around 1 second.
- With 8 processes, startup increases to around 10 seconds.
- The overhead scales with the number of launched processes.
So in our case, the delay is not caused by Intel MPI itself, but by security software overhead on Windows during process creation.
We also found that signing the binaries with a trusted certificate completely resolved the issue.
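To put a number on the scaling, here is a back-of-the-envelope estimate in Python. It assumes the added cost grows linearly with the process count and uses only the two rough timings from this thread (about 1 s at `-np 1`, about 10 s at `-np 8`); the function name is made up for illustration.

```python
def per_process_overhead(t_small, n_small, t_large, n_large):
    """Slope of a straight line through two (process count, startup seconds) points."""
    return (t_large - t_small) / (n_large - n_small)

# Rough figures from this thread: ~1 s at -np 1, ~10 s at -np 8.
extra = per_process_overhead(1.0, 1, 10.0, 8)
print(f"~{extra:.2f} s of added startup per extra process")  # ~1.29 s
```

Roughly 1.3 seconds per additional process is consistent with a per-process scan at creation time rather than a fixed one-off MPI initialization cost.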