Intel® MPI Library

Question on using INTEL MPI with hosts

SrikanthYaddanapudi

Hi,

 

We recently upgraded from Intel MPI 2018 to Intel MPI 2021.11.0 and are trying to run our program on multiple hosts using MPI. We have a few questions and would appreciate your feedback so that we can move forward.

 

1) Our hosts might not have an explicit installation of MPI. Instead, the required MPI binaries, DLLs, and license are stored in specific folders on each host, and those folders are not necessarily at the same location on every host. Given this, how do we set up the environment on each host to specify the location of mpiexec and the related files, so that the job can run on these hosts? Should we use -env, -genv, or a separate config/hosts file that specifies the location of mpiexec and the MPI binaries per host?

 

2) For remote cluster runs, we normally store our executable in a shared location that is accessible to all the hosts. This worked with Intel MPI 2018, but after the upgrade to Intel MPI 2021.11.0 we get the error below: access to the executable was denied from a host, and process creation failed.

 

[Screenshot of the error: SrikanthYaddanapudi_0-1702338207165.png]

 

This was the approach we recommended to our clients: the program was stored in a shared folder, so they did not need to install it on every host, and mpiexec still worked. This no longer works, at least by default. Could you please let us know how we can use a shared folder with multiple hosts?

 

3) Along similar lines, the executable itself could be at different locations on these hosts. We noticed that if the executable is at the same location on all the hosts, no additional information is needed, but if it is at different locations, it looks like we need to provide more information. Could you please suggest how to pass this per-host information?
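For illustration only, here is a minimal Python sketch of the kind of per-host command line question 3 is about. It assumes mpiexec's colon-separated argument sets, with -host and -n given once per set. The function name build_mpiexec_args, the host names, and the paths are hypothetical, and this is not confirmed as the recommended approach for 2021.11.

```python
from typing import Dict, List


def build_mpiexec_args(exe_by_host: Dict[str, str], procs_per_host: int = 1) -> List[str]:
    """Build an mpiexec argument list with one argument set per host.

    exe_by_host maps a host name to the executable path on that host.
    Argument sets are separated by ":" (an assumption about how the
    per-host paths would be passed; host names and paths are placeholders).
    """
    if not exe_by_host:
        raise ValueError("at least one host is required")

    args: List[str] = ["mpiexec"]
    for index, (host, exe) in enumerate(exe_by_host.items()):
        if index > 0:
            args.append(":")  # separator between argument sets
        args += ["-host", host, "-n", str(procs_per_host), exe]
    return args
```

Joining the returned list with spaces gives a command of the form `mpiexec -host node1 -n 1 <path on node1> : -host node2 -n 1 <path on node2>`.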

 

Regards,

Srikanth

 

 

RabiyaSK_Intel
Moderator

Hi,


Thanks for posting in Intel Communities.


We have forwarded your questions to the relevant development team and will get back to you soon.


Thanks & Regards,

Shaik Rabiya


SrikanthYaddanapudi

Thank you, Shaik Rabiya. We look forward to the feedback.

 

Regards,

Srikanth

RabiyaSK_Intel
Moderator

Hi,


Thank you for your patience. 


Could you please confirm if you have changed your user authentication?


Please go through the documentation below, which covers user authorization:

https://www.intel.com/content/www/us/en/docs/mpi-library/developer-guide-windows/2021-11/user-authorization.html


After checking that your user authentication matches one of the methods described there, could you please follow these steps and share your findings:


1. Install and enable the WinRM service on your nodes if it is not already running (you can check its status with "get-service winrm" in PowerShell).


2. Add all relevant nodes to the TrustedHosts list, e.g. in PowerShell with Administrator permissions (i.e. run as administrator):


          Set-Item WSMan:\localhost\Client\TrustedHosts -Value "<comma-separated list of hosts>"


3. Test the visibility of the hosts in PowerShell with:

          Invoke-Command -ComputerName <comma-separated list of hosts> -ScriptBlock {hostname}


If the command above fails, the error message will probably tell you what is missing; follow its instructions. Once the command works, proceed to the next step.


4. Call setvars.bat from Intel MPI in cmd.exe (not in PowerShell), then test from the same cmd.exe session with:

           

mpiexec -ppn 1 -hosts <comma-separated list of hosts> hostname
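The check in step 4 can also be scripted. The following is a minimal Python sketch, not an Intel-provided tool: it builds the same mpiexec command and compares the host names printed by `hostname` against the expected list. The helper names hostname_check_command and missing_hosts are hypothetical, and the comparison assumes hosts are given by name rather than by IP address.

```python
from typing import List


def hostname_check_command(hosts: List[str]) -> List[str]:
    """Return the step-4 command as an argument list (one process per host)."""
    return ["mpiexec", "-ppn", "1", "-hosts", ",".join(hosts), "hostname"]


def _short_name(name: str) -> str:
    # Compare case-insensitively and ignore any domain suffix.
    # Assumption: entries are host names, not IP addresses.
    return name.strip().split(".")[0].lower()


def missing_hosts(hosts: List[str], mpiexec_output: str) -> List[str]:
    """Return the expected hosts that did not appear in the command output."""
    reported = {
        _short_name(line)
        for line in mpiexec_output.splitlines()
        if line.strip()
    }
    return [host for host in hosts if _short_name(host) not in reported]
```

The argument list can be passed to `subprocess.run(..., capture_output=True, text=True)` from a cmd.exe session in which setvars.bat has already been called; an empty result from missing_hosts means every listed host reported back.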


Thanks & Regards,

Shaik Rabiya


SrikanthYaddanapudi

Hello Shaik Rabiya,

 

I tried the steps listed above, adding trust between the master and the worker nodes in both directions through PowerShell. Our program is stored on the master node and shared so that the worker nodes can access it. Even after this step, when I run the program in cluster mode with remote nodes, I still get the "Access Denied" error, as shown below.

 

[Screenshot of the error: SrikanthYaddanapudi_0-1702652521325.png]

 

We then installed the program at the same path on all the nodes, including the master, and were able to start it in cluster mode. However, when we tried to open a program file located in a shared folder that is accessible to all the nodes, we again ran into the same "Access Denied" issue.

 

To answer your question, we did not change our user authentication behavior; the only change is the upgrade from Intel MPI 2018.x to 2021.11. Access to shared folders and files worked by default with the prior version, and we did not run into any of these issues. It has been a few years since we last upgraded, so we may be seeing the effect of significant changes made over that time.

 

Regards,

Srikanth

RabiyaSK_Intel
Moderator

Hi,


We strongly recommend that you change the user authentication method and let us know your findings afterwards.


Thanks & Regards,

Shaik Rabiya

