Dear support team,
I have a question about a performance difference between Windows 7 SP 1 and RHEL 6.5.
The situation is as follows:
The hardware is a Dell Precision Rack 7910; see the link for the exact specification (click on components):
We installed Linux RHEL 6.5 on this machine and ran our product (compiled with Intel C/C++/Fortran 13.1.3 (gcc version 4.4.7 compatibility) and Intel MPI 5.0.2.044 on Linux).
After that, we installed Windows 7 SP 1 on this machine and ran our product (compiled with Intel C/C++/Fortran 18.104.22.168 and Intel MPI 5.0.2.044 on Windows) again.
We observed a large performance drop on Windows compared to Linux when running on 1 or 2 CPUs. With 8, 16, or 32 CPUs we get nearly the same performance on Windows as on Linux, but only on Windows do we see heavy oscillation in computation time (sometimes 16 CPUs are faster than 32 CPUs). With Intel MPI 4.1.3.045 we did not see this oscillation.
The question is:
Is there a problem with Intel MPI 5.0.2.044 on Windows 7 SP 1 on the above described hardware?
This machine has two 18-core CPUs. Is there a problem with the CPU scheduler on Windows not respecting the pinning, or is it something else?
Many thanks in advance
Thanks for getting in touch. We switched the default process manager on Windows with version 5.0.x, so I'm wondering if some of the issues stem from that. Starting with Intel MPI 5.0 (vs. 4.1.3), mpiexec.exe now calls Hydra instead of the old SMPD. Can you verify you're starting the job correctly on your Windows machine with 5.0.2?
Hydra has been default on Linux for a long time.
All the best,
Thank you for your reply. We use Intel MPI 5.0.2.044 on Windows 7 Enterprise SP 1 and start our product with:
mpiexec.hydra.exe -np #cpu -delegate -localroot -envall ./path_to_product.exe
The hydra and smpd process managers run as services.
The call stack is then: mpiexec.hydra.exe -> pmi_proxy.exe -> #cpu/product.exe
The same holds without -localroot, but then the call stack sits below hydra_service.exe.
So we do use the Hydra process manager. (Running hydra_service.exe -version still reports version 3.1.2 instead of 5.0.2.)
Do you know of any Microsoft patches related to this behavior, or of additional Intel MPI command-line arguments that would fix it?
To recap: Intel MPI 4.1.3.045 has no problems with 1, 2, or 4 CPUs, but Intel MPI 5.0.2.044 is dramatically slow in these cases. With 8, 16, or 32 cores, Intel MPI 5.0.2.044 is slightly faster, but the computation times fluctuate strangely from run to run.
One more remark: when running with 1 CPU, we observed in the Windows process manager that the processor pinning changes unexpectedly, i.e. the job jumps between certain cores.
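To put numbers on the oscillation, one option is to repeat each run several times per CPU count and compare the spread. The following is only a rough sketch in Python; the function name `summarize_timings` and the sample values are hypothetical placeholders, not measurements from our runs.

```python
from statistics import mean, stdev


def summarize_timings(timings_by_ranks):
    """Return mean, sample standard deviation and coefficient of
    variation (stdev / mean) of wall-clock times per rank count."""
    summary = {}
    for ranks, times in sorted(timings_by_ranks.items()):
        avg = mean(times)
        # stdev needs at least two samples; report 0.0 for a single run
        spread = stdev(times) if len(times) > 1 else 0.0
        summary[ranks] = {
            "mean": avg,
            "stdev": spread,
            "cv": spread / avg if avg else 0.0,
        }
    return summary


# Placeholder wall-clock times in seconds (made up for illustration only)
sample = {16: [100.0, 100.0, 100.0], 32: [90.0, 110.0, 100.0]}
for ranks, stats in summarize_timings(sample).items():
    print(ranks, "ranks: mean", stats["mean"], "s, cv", round(stats["cv"], 3))
```

A clearly higher coefficient of variation on Windows than on Linux for the same CPU count would back up the impression that the timings are unstable rather than uniformly slower.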
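To check what pinning the library itself applies, it may help to raise the debug level and set pinning explicitly on the command line. Below is a small Python sketch that assembles such a command. It assumes that `I_MPI_DEBUG` at level 4 prints process pinning information and that `I_MPI_PIN` switches pinning on or off, as described in the Intel MPI reference manual; please verify both against the documentation of the installed version. The helper name `build_mpiexec_command` is hypothetical.

```python
def build_mpiexec_command(num_ranks, executable, debug_level=4, pin=True):
    """Build an mpiexec.hydra.exe argument list that passes I_MPI_DEBUG
    and I_MPI_PIN to all ranks via -genv (assumed settings, see text)."""
    return [
        "mpiexec.hydra.exe",
        "-np", str(num_ranks),
        # Assumption: level 4 reports the pinning chosen for each rank
        "-genv", "I_MPI_DEBUG", str(debug_level),
        # Assumption: 1 enables and 0 disables process pinning
        "-genv", "I_MPI_PIN", "1" if pin else "0",
        # Remaining options are the ones from the original command line
        "-delegate", "-localroot", "-envall",
        executable,
    ]


cmd = build_mpiexec_command(1, "./path_to_product.exe")
print(" ".join(cmd))
# The list could be handed to subprocess.run(cmd) to launch the job.
```

Comparing the pinning reported in the debug output with what the Windows process manager shows should indicate whether the library sets an affinity that the scheduler then ignores, or whether no pinning is applied in the first place.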