Zhanghong_T_
Beginner
852 Views

problem when multiple MPI versions installed

Dear all,

I have a problem launching processes when multiple MPI versions are installed. The processes worked before I installed the latest MPI 5.0.3.048:

C:\Program Files (x86)\Intel\MPI\4.1.3.047>mpiexec -wdir "Z:\test" -mapall -hosts 10 n01 6 n02 6 n03 6 n04 6 n05 6 n06 6 n07 6 n08 6 n09 6 n10 6 Z:\test

However, after I installed MPI 5.0.3.048, the following error is displayed when I launch mpiexec in the 4.1.3.047 environment:

Aborting: unable to connect to N01, smpd version mismatch

 

I have already run the following command in the environment of 5.0.3.048 before launching mpiexec:

hydra_service -stop

 

Could anyone help me take a look at it? Is it possible to have both versions working in the cluster?

 

Thanks,

Zhanghong Tang

 

9 Replies
Zhanghong_T_
Beginner

Dear all,

I tested the latest MPI 5.0.3.048 on two compute nodes and the following errors were displayed. Everything worked with MPI 4.1.3.047 before I installed the latest version. Could you please help me take a look at the problem?

[mpiexec@N01] ..\hydra\pm\pmiserv\pmiserv_cb.c (773): connection to proxy 0 at host n01 failed
[mpiexec@N01] ..\hydra\tools\demux\demux_select.c (100): callback returned error status
[mpiexec@N01] ..\hydra\pm\pmiserv\pmiserv_pmci.c (501): error waiting for event
[mpiexec@N01] ..\hydra\ui\mpich\mpiexec.c (1059): process manager error waiting for completion

Thanks,

Zhanghong Tang

Artem_R_Intel1
Employee

Hi,

There are some changes to the process managers in recent Intel MPI Library versions.

2 process managers (PM) are available:
Hydra
SMPD

Intel MPI Library 4.1.3.047 (Hydra is experimental):
  SMPD:
    smpd.exe (service)
    mpiexec.exe (launcher)
  Hydra:
    hydra_service.exe (service)
    mpiexec.hydra.exe (launcher)

Intel MPI Library 5.0.3.048 (SMPD is deprecated):
  SMPD:
    smpd.exe (service)
    mpiexec.smpd.exe (launcher)
  Hydra:
    hydra_service.exe (service)
    mpiexec.exe (launcher)

When you tried to run the MPI application in the IMPI 4.x environment with mpiexec.exe, you actually ran it via SMPD, but the corresponding service was running from IMPI 5.x.
The MPI services (SMPD, Hydra) are installed automatically during the IMPI 5.x installation (they replace the existing ones, in your case the IMPI 4.x services).
SMPD performs a version check to avoid runs with mixed MPI versions.
As far as I know, running the IMPI 4.1.3.047 and 5.0.3.048 services simultaneously isn't supported.
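
As a rough illustration of the lists above, the launcher-to-PM mapping can be written as a small lookup table. This is only a Python sketch for reasoning about which service a given command line depends on; it is not part of Intel MPI, and the names LAUNCHER_PM, PM_SERVICE, and required_service are made up for this example.

```python
# Launcher-to-process-manager mapping as listed in this thread.
# Keys are (Intel MPI major version, launcher executable name).
LAUNCHER_PM = {
    (4, "mpiexec.exe"): "SMPD",
    (4, "mpiexec.hydra.exe"): "Hydra",
    (5, "mpiexec.smpd.exe"): "SMPD",
    (5, "mpiexec.exe"): "Hydra",
}

# Service executable that each process manager relies on.
PM_SERVICE = {"SMPD": "smpd.exe", "Hydra": "hydra_service.exe"}


def required_service(major_version, launcher):
    """Return (process manager, service executable) for a launcher name."""
    name = launcher.lower()
    if not name.endswith(".exe"):
        name += ".exe"  # typing "mpiexec" in cmd.exe runs mpiexec.exe
    pm = LAUNCHER_PM[(major_version, name)]
    return pm, PM_SERVICE[pm]


if __name__ == "__main__":
    print(required_service(4, "mpiexec"))  # ('SMPD', 'smpd.exe')
    print(required_service(5, "mpiexec"))  # ('Hydra', 'hydra_service.exe')
```

The same command name, mpiexec, therefore needs the SMPD service in a 4.x environment and the Hydra service in a 5.x environment.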

You wrote that you ran 'hydra_service -stop'. It looks like this caused your next error:
[mpiexec@N01] ..\hydra\pm\pmiserv\pmiserv_cb.c (773): connection to proxy 0 at host n01 failed
[mpiexec@N01] ..\hydra\tools\demux\demux_select.c (100): callback returned error status
[mpiexec@N01] ..\hydra\pm\pmiserv\pmiserv_pmci.c (501): error waiting for event
[mpiexec@N01] ..\hydra\ui\mpich\mpiexec.c (1059): process manager error waiting for completion

When you run mpiexec.exe in the IMPI 5.x environment, it runs via the Hydra PM, so it needs the Hydra service to be running.

 

Zhanghong_T_
Beginner

Dear Dr. Artem R.,

Thank you very much for your kind reply. How can I get the program to run successfully, then? Should I uninstall MPI 4.1.3.047 from every compute node? Is it possible to stop some MPI 4.x services so the program runs under MPI 5.x, or to stop some MPI 5.x services so the program runs under MPI 4.x? I tried this on the command line:

smpd -stop

but it doesn't work.

Thanks,

Zhanghong Tang

Zhanghong_T_
Beginner

Dear Dr. Artem R.,

Just now I tried to stop the smpd and hydra_service services from an MPI 4.x command window, but I noticed that it was in fact the MPI 5.x smpd and hydra_service services that were stopped, because when I ran

smpd -start

hydra_service -start

in the MPI 5.x environment, it did not report that the services were already started.

I tested the program on two compute nodes, and the errors are still the same as in my second post.

Thanks,

Zhanghong Tang

Zhanghong_T_
Beginner

Dear Dr. Artem R.,

Another problem/bug: when I run

smpd -version

in MPI 4.x, it shows version 4.1.3; however, when I run it in MPI 5.x, it shows version 3.1.2. Is there a problem?

Thanks

Zhanghong_T_
Beginner

Dear Dr. Artem R.,

I am sorry to bother you again.

I removed all MPI versions, installed MPI 5.0.3.048 on two compute nodes again, and tested the program. The errors are still the same:

C:\Program Files (x86)\Intel\MPI\5.0.3.048>Z:\directional\for_debug\killfem.bat
[mpiexec@N01] ..\hydra\pm\pmiserv\pmiserv_cb.c (773): connection to proxy 0 at host n01 failed
[mpiexec@N01] ..\hydra\tools\demux\demux_select.c (100): callback returned error status
[mpiexec@N01] ..\hydra\pm\pmiserv\pmiserv_pmci.c (501): error waiting for event
[mpiexec@N01] ..\hydra\ui\mpich\mpiexec.c (1059): process manager error waiting for completion

 

Could you please help me take a look at it?

Thanks

Zhanghong_T_
Beginner

Dear Dr. Artem R.,

After I removed MPI 5.0.3.048 and reinstalled MPI 4.1.3.047 on these two compute nodes, I tested the program again and it runs OK. When compiling the program, I linked against the libraries of the version under test (i.e., impimt.lib and impicxx.lib from the respective MPI folder), and then copied the matching dynamic link library impimt.dll to the executable's folder.

Is there anything I missed?

Thanks.

Artem_R_Intel1
Employee

Hi Tang,

Regarding:
"Then how to let the program run successfully? Should I uninstall the MPI 4.1.3.047 from every calculate node? Is it possible to stop some services of MPI 4.x to let the program run in MPI 5.x or stop some services of MPI 5.x to let the program run in MPI 4.x? I tried to do in command line by
smpd -stop
but it doesn't work."

To switch between Intel MPI 4.x and 5.x versions you should:
1. Set the environment for the desired version of Intel MPI (for example with the mpivars.bat script).

2. Install the Hydra/SMPD services from this Intel MPI version:
> hydra_service.exe -install
> smpd.exe -install
This removes the previous versions of the Intel MPI services from the system services.
You can check that the services come from the correct Intel MPI version in Windows Task Manager / Processes (they run under the SYSTEM user; enable the Image Path Name column in the View menu to see the full path of running processes).
'<service_name> -stop/-start' isn't enough to switch between different Intel MPI versions.

3. Use the appropriate Intel MPI launcher:
Intel MPI 4.x:
mpiexec.exe - SMPD
mpiexec.hydra.exe - Hydra

Intel MPI 5.x:
mpiexec.smpd.exe - SMPD
mpiexec.exe - Hydra
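
Steps 1 and 2, plus the Task Manager check, could be sketched in Python roughly as follows. This is a minimal sketch under assumptions, not an Intel tool: switch_commands and service_from_version are hypothetical helper names, the location of mpivars.bat inside the install folder is not stated in this thread (so it is passed in), and the commands are only returned as strings, not executed.

```python
from pathlib import PureWindowsPath


def switch_commands(mpivars_bat):
    """Return the cmd.exe commands for steps 1 and 2, in order.

    mpivars_bat: full path to the mpivars.bat of the desired version
    (assumption: the caller knows where it lives in the install tree).
    """
    return [
        f'call "{mpivars_bat}"',       # step 1: set the environment
        "hydra_service.exe -install",  # step 2: install the Hydra service
        "smpd.exe -install",           # step 2: install the SMPD service
    ]


def service_from_version(image_path, version):
    """True if a service's image path lies under the given version folder.

    image_path is the full path shown in Task Manager's Image Path Name
    column; version is a folder name such as "5.0.3.048".
    """
    parts = [p.lower() for p in PureWindowsPath(image_path).parts]
    return version.lower() in parts
```

For example, a service whose image path contains the folder 4.1.3.047 while you intend to run 5.0.3.048 would make service_from_version return False, which suggests the -install step still has to be run from the 5.x environment.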

Regarding the SMPD/Hydra versions: there were some changes in their versioning.
This version number is actually for internal purposes and is aligned with the MPICH one.

With regard to the error:
[mpiexec@N01] ..\hydra\pm\pmiserv\pmiserv_cb.c (773): connection to proxy 0 at host n01 failed
[mpiexec@N01] ..\hydra\tools\demux\demux_select.c (100): callback returned error status
[mpiexec@N01] ..\hydra\pm\pmiserv\pmiserv_pmci.c (501): error waiting for event
[mpiexec@N01] ..\hydra\ui\mpich\mpiexec.c (1059): process manager error waiting for completion

Could you please provide the full MPI command line and any specific environment variables you set?

Zhanghong_T_
Beginner

Dear all,

Are there any changes in how to use the latest version of MPI? Recently I tried the latest version of MPI and found that the errors are the same as the ones I posted at the beginning. The program runs OK with MPI 4.1.3.047 using the following command line:

mpiexec -delegate -wdir "Z:\tang\mytest" -mapall -hosts 10 n01 12 n02 12 n03 12 n04 12 n05 12 n06 12 n07 12 n08 12 n09 12 n10 12 "Z:\tang\fem"

 

Is there any detailed information on how to upgrade from MPI 4.1.3.047 to Version 2017 Update 3 Build 20170405?

Thanks
