<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Topic "Using -iface eth0 sets the" in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974709#M3358</link>
    <description>&lt;P&gt;Using -iface eth0 sets the interface to be used for launching the ranks, not the communication fabric to be used by MPI.&lt;/P&gt;
&lt;P&gt;I_MPI_FABRICS needs to be set wherever you are launching the job.&amp;nbsp; Hydra will read this before launching and use it when launching the ranks.&lt;/P&gt;</description>
    <pubDate>Thu, 15 Aug 2013 17:07:56 GMT</pubDate>
    <dc:creator>James_T_Intel</dc:creator>
    <dc:date>2013-08-15T17:07:56Z</dc:date>
    <item>
      <title>Running parallel job on compute nodes with IB HCA from a master node "NOT" having IB HCA</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974704#M3353</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;Is it possible to run tasks on compute nodes having InfiniBand HCAs from a master node that lacks an IB HCA, using Torque/Grid Engine?&lt;/P&gt;
&lt;P&gt;Please advise whether this is possible.&lt;/P&gt;
&lt;P&gt;Intel MPI 4.1.1.036 is installed on all cluster machines.&lt;/P&gt;
&lt;P&gt;The network configuration is as follows:&lt;/P&gt;
&lt;P&gt;Master Node: (2xXeon E5-2450/96GB/CentOS 6.2/NFS Services over Ethernet) - 1 No.&lt;/P&gt;
&lt;P&gt;Compute Nodes: (2xXeon E5-2450/96GB/TrueScale QDR Dual-port QLE7342/CentOS 6.2/NFS Client over GbE) - 4 Nos.&lt;/P&gt;
&lt;P&gt;IP Addresses (GbE) : Master Node: 10.0.0.221 - Hostname: mnode; Compute Nodes: 10.0.0.222 .. 225 - Hostnames: c00 .. c03&lt;/P&gt;
&lt;P&gt;IP Address (ib0): Master Node: N/A; Compute Nodes: 192.168.10.222 .. 225 - /etc/hosts -&amp;gt; c00-ib; c01-ib; c02-ib; c03-ib&lt;/P&gt;
&lt;P&gt;Additionally, if mpiexec.hydra can be used, what is the command line to run directly from the master node without Torque or Grid Engine?&lt;/P&gt;
&lt;P&gt;Regards&lt;/P&gt;
&lt;P&gt;Girish Nair &amp;lt;girishnairisonline at gmail dot com&amp;gt;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Aug 2013 05:18:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974704#M3353</guid>
      <dc:creator>Girish_Nair</dc:creator>
      <dc:date>2013-08-13T05:18:24Z</dc:date>
    </item>
    <item>
      <title>Hi Girish,</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974705#M3354</link>
      <description>&lt;P&gt;Hi Girish,&lt;/P&gt;
&lt;P&gt;Running outside of the job scheduler will depend on your system's policy.&amp;nbsp; If it is set up to allow it, then there should be no problem using mpiexec.hydra (or mpirun, which will default to mpiexec.hydra) to run.&amp;nbsp; Simply specify your hosts and ranks as you normally would.&amp;nbsp; For some additional information, see the article &lt;A href="http://software.intel.com/en-us/articles/controlling-process-placement-with-the-intel-mpi-library"&gt;Controlling Process Placement with the Intel® MPI Library&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;If you are going to use InfiniBand* for your job nodes, but are launching from a system without IB, you will need to specify the network&amp;nbsp;interface using either the -iface command line option or the I_MPI_HYDRA_IFACE environment variable.&amp;nbsp; You'll likely want to use eth0, but this can vary depending on your system configuration.&lt;/P&gt;
&lt;P&gt;Also, do not use the IB host names to start your job.&amp;nbsp; Hydra will attempt to connect via ssh first, which needs to happen through the standard IP channel.&amp;nbsp; It will handle switching to the IB fabric for your job.&amp;nbsp; If you want to verify that it correctly launched using IB, run with I_MPI_DEBUG=2 to get fabric selection information.&lt;/P&gt;
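&lt;P&gt;As a sketch (host names, the machinefile path, and the binary name are placeholders taken from this thread), a launch from the non-IB master could look like:&lt;/P&gt;
&lt;P&gt;[plain]mpiexec.hydra -iface eth0 -genv I_MPI_DEBUG 2 -n 16 -machinefile ./machine.cluster ./main.out[/plain]&lt;/P&gt;
&lt;P&gt;With I_MPI_DEBUG set to 2, the startup output will report which fabric each rank selected, so you can confirm IB was used.&lt;/P&gt;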
&lt;P&gt;Sincerely,&lt;BR /&gt; James Tullos&lt;BR /&gt; Technical Consulting Engineer&lt;BR /&gt; Intel® Cluster Tools&lt;/P&gt;</description>
      <pubDate>Wed, 14 Aug 2013 20:08:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974705#M3354</guid>
      <dc:creator>James_T_Intel</dc:creator>
      <dc:date>2013-08-14T20:08:39Z</dc:date>
    </item>
    <item>
      <title>Hi James,</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974706#M3355</link>
      <description>Hi James,
Thanks for your response.

Please correct me if I've misunderstood your statement:

    The machinefile would have the entries like:

    n01:16 # hostname resolving to eth0 address
    n02:16 # hostname resolving to eth0 address
    n03:16 # hostname resolving to eth0 address
    n04:16 # hostname resolving to eth0 address

    while running from the master node, which has no IB hardware:

    mpiexec.hydra -np 16 -machinefile ./machine.cluster -iface ib0 ./main.out

I apologize if this is too much to ask. It would be great if you could provide an example.

Additionally, does the same command line accept the following along with the above:
    mpiexec.hydra ...    -genv I_MPI_FABRICS shm:tmi  ...

My understanding is that shm:dapl is the default, and I've found that shm:tmi gives me the best performance over IB. The master node obviously will not have an /etc/dat.conf file, since it lacks an IB HCA.

Thanks in advance for your expert advice.

Regards
Girish Nair &lt;girishnairisonline at gmail dot com&gt;</description>
      <pubDate>Thu, 15 Aug 2013 13:41:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974706#M3355</guid>
      <dc:creator>Girish_Nair</dc:creator>
      <dc:date>2013-08-15T13:41:05Z</dc:date>
    </item>
    <item>
      <title>If tmi is always better than</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974707#M3356</link>
      <description>&lt;P&gt;If tmi is always better than DAPL for you, you can set I_MPI_FABRICS=shm:tmi in your environment rather than having to pass it every time.&amp;nbsp; As for launching, unless you have an interface named ib0 on your master node, you'll want to use:&lt;/P&gt;
&lt;P&gt;[plain]mpirun -n 16 -machinefile ./machine.cluster -iface eth0 ./main.out[/plain]&lt;/P&gt;
&lt;P&gt;The machinefile you have is correct.&amp;nbsp; Now, keep in mind, if you run this job with the machinefile you have, all 16 ranks will run on n01.&amp;nbsp; For more flexibility, I would use a hostfile instead.&lt;/P&gt;
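&lt;P&gt;As a sketch, a machinefile can also spread the ranks itself by listing a per-node count after each host, e.g. 16 ranks spread evenly:&lt;/P&gt;
&lt;P&gt;[plain]$cat machine.cluster&lt;/P&gt;
&lt;P&gt;n01:4&lt;/P&gt;
&lt;P&gt;n02:4&lt;/P&gt;
&lt;P&gt;n03:4&lt;/P&gt;
&lt;P&gt;n04:4[/plain]&lt;/P&gt;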
&lt;P&gt;[plain]$cat hostfile&lt;/P&gt;
&lt;P&gt;n01&lt;/P&gt;
&lt;P&gt;n02&lt;/P&gt;
&lt;P&gt;n03&lt;/P&gt;
&lt;P&gt;n04[/plain]&lt;/P&gt;
&lt;P&gt;And run with&lt;/P&gt;
&lt;P&gt;[plain]mpirun -n &amp;lt;nranks&amp;gt; -ppn &amp;lt;ranks per node&amp;gt; -f hostfile ./main.out[/plain]&lt;/P&gt;
&lt;P&gt;This will run a total of &amp;lt;nranks&amp;gt; ranks, with &amp;lt;ranks per node&amp;gt; ranks placed on each of the nodes.&amp;nbsp; So, if I wanted to run 16 ranks, with 4 per node, that would be&lt;/P&gt;
&lt;P&gt;[plain]mpirun -n 16 -ppn 4 -f hostfile ./main.out[/plain]&lt;/P&gt;
&lt;P&gt;This gives more flexibility in process placement.&amp;nbsp; The article I linked shows several other options, and I'll add more information about the hostfile capability.&lt;/P&gt;</description>
      <pubDate>Thu, 15 Aug 2013 13:52:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974707#M3356</guid>
      <dc:creator>James_T_Intel</dc:creator>
      <dc:date>2013-08-15T13:52:54Z</dc:date>
    </item>
    <item>
      <title>Hi James,</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974708#M3357</link>
      <description>Hi James,
Ah, that was a quick response from you. Thanks.

Please read my mpiexec.hydra command as -np 64 and not -np 16.

2 quick queries:
a) If -iface eth0 is used, would the job be run on IB on Compute Nodes?
b) Can the environment variable I_MPI_FABRICS be set on Master Node that lacks IB HCA hardware? If no, then should it be set on all Compute Nodes with IB HCA?

Thanks
Girish Nair &lt;girishnairisonline at gmail dot com&gt;</description>
      <pubDate>Thu, 15 Aug 2013 14:21:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974708#M3357</guid>
      <dc:creator>Girish_Nair</dc:creator>
      <dc:date>2013-08-15T14:21:26Z</dc:date>
    </item>
    <item>
      <title>Using -iface eth0 sets the</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974709#M3358</link>
      <description>&lt;P&gt;Using -iface eth0 sets the interface to be used for launching the ranks, not the communication fabric to be used by MPI.&lt;/P&gt;
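&lt;P&gt;As a sketch (the machinefile path and binary name are the ones used earlier in this thread), the two settings can be combined at launch time:&lt;/P&gt;
&lt;P&gt;[plain]export I_MPI_FABRICS=shm:tmi&lt;/P&gt;
&lt;P&gt;mpiexec.hydra -iface eth0 -n 64 -machinefile ./machine.cluster ./main.out[/plain]&lt;/P&gt;
&lt;P&gt;Here -iface eth0 covers the launch path from the non-IB master, while I_MPI_FABRICS selects the fabric the ranks use for MPI communication.&lt;/P&gt;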
&lt;P&gt;I_MPI_FABRICS needs to be set wherever you are launching the job.&amp;nbsp; Hydra will read this before launching and use it when launching the ranks.&lt;/P&gt;</description>
      <pubDate>Thu, 15 Aug 2013 17:07:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974709#M3358</guid>
      <dc:creator>James_T_Intel</dc:creator>
      <dc:date>2013-08-15T17:07:56Z</dc:date>
    </item>
    <item>
      <title>Thanks a ton James.</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974710#M3359</link>
      <description>Thanks a ton James.

You've effectively cleared all my doubts on this. I'll wait for your notes on hostfile capability whenever you publish it.

Thanks once again.
~Girish Nair</description>
      <pubDate>Fri, 16 Aug 2013 01:48:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974710#M3359</guid>
      <dc:creator>Girish_Nair</dc:creator>
      <dc:date>2013-08-16T01:48:12Z</dc:date>
    </item>
    <item>
      <title>The article was updated</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974711#M3360</link>
      <description>&lt;P&gt;The article was updated yesterday, if you can't see the updates, please let me know.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Aug 2013 13:13:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974711#M3360</guid>
      <dc:creator>James_T_Intel</dc:creator>
      <dc:date>2013-08-16T13:13:45Z</dc:date>
    </item>
    <item>
      <title>Thank you very much James.</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974712#M3361</link>
      <description>Thank you very much James.</description>
      <pubDate>Sat, 17 Aug 2013 05:16:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974712#M3361</guid>
      <dc:creator>Girish_Nair</dc:creator>
      <dc:date>2013-08-17T05:16:19Z</dc:date>
    </item>
    <item>
      <title>Hi James,</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974713#M3362</link>
      <description>&lt;P&gt;Hi James,&lt;/P&gt;

&lt;P&gt;Following up on this thread, referencing your article&amp;nbsp;http://software.intel.com/en-us/articles/controlling-process-placement-with-the-intel-mpi-library and the Intel MPI 5.0 Linux Reference Manual: there are three configuration options for launching an MPMD job: -hostfile, -machinefile, and -configfile.&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;What are the difference between -hostfile and -machinefile ? &lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Can we use per rank HCA binding in -machinefile or -hostfile (as it can be done with single HCA for MVAPICH mpihydra hostfile) ?&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;We want to control HCA per rank, possibly multiple HCAs per rank.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;I believe machinefile/hostfile and configfile can be used in a single launch cmd. I would very much appreciate some references with detailed examples and explanations on how are all these three used interchangeably?&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Thanks,&lt;BR /&gt;
	Mous.&lt;/SPAN&gt;&lt;/P&gt;
</description>
      <pubDate>Tue, 10 Mar 2015 19:14:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Running-parallel-job-on-compute-nodes-with-IB-HCA-from-a-master/m-p/974713#M3362</guid>
      <dc:creator>Mous_Tatarkhanov</dc:creator>
      <dc:date>2015-03-10T19:14:33Z</dc:date>
    </item>
  </channel>
</rss>

