<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Infiniband-Intel MPI Performance MM5 in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896941#M2119</link>
    <description>&lt;DIV style="margin:0px;"&gt;Hi jriocaton,&lt;BR /&gt;&lt;BR /&gt;I'd suggest simplifying your mpd.hosts and command lines.&lt;BR /&gt;1) Could you try to remove ifhn=192.168.10.xxx from mpd.hosts file?&lt;/DIV&gt;
2) remove '--ifhn=192.168.10.1' from the mpdboot command.&lt;BR /&gt;3) remove '-genv I_MPI_PIN_PROCS 0-7' from the mpiexec command line.&lt;BR /&gt;4) try to use rdssm instead of rdma in '-env I_MPI_DEVICE rdma'.&lt;BR /&gt;&lt;BR /&gt;Of course, you need to kill all mpd processes first ('mpdallexit').&lt;BR /&gt;&lt;BR /&gt;Intel MPI has a so-called fallback technique; to disable it, set I_MPI_FALLBACK_DEVICE to 0.&lt;BR /&gt;&lt;BR /&gt;Compare performance with 16 and 32 processes.&lt;BR /&gt;If performance with 32 processes is worse, please run with I_MPI_DEBUG=5 and let me know the output. This debug level will print the pinning table, which can give you a clue.&lt;BR /&gt;&lt;BR /&gt;Best wishes,&lt;BR /&gt; Dmitry&lt;BR /&gt;</description>
    <pubDate>Wed, 27 May 2009 06:34:09 GMT</pubDate>
    <dc:creator>Dmitry_K_Intel2</dc:creator>
    <dc:date>2009-05-27T06:34:09Z</dc:date>
    <item>
      <title>Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896937#M2115</link>
      <description>Dear colleagues,&lt;BR /&gt;&lt;BR /&gt;we are working on an Infiniband DDR cluster with MM5. We are using the latest Intel MPI and Fortran, and our mm5.mpp has been compiled with the configuration suggested on this website.&lt;BR /&gt;&lt;BR /&gt;This is the way we launch:&lt;BR /&gt;[c2@ Run]$ time /intel/impi/3.2.1.009/bin64/mpiexec -genv I_MPI_PIN_PROCS 0-7 -np 32 -env I_MPI_DEVICE rdma ./mm5.mpp&lt;BR /&gt;&lt;BR /&gt;Everything seems to be OK: when we launch 16 processes with mpiexec, performance is 75% better than gigabit, but when we use more than 16 processes, the scaling gets worse. We have noticed the main difference between the way gigabit and infiniband behave is:&lt;BR /&gt;&lt;BR /&gt;- Infiniband only uses all the cores when np is 16 or lower; when it grows, it only uses 3 cores on a machine.&lt;BR /&gt;- Gigabit always uses all the cores on all machines.&lt;BR /&gt;&lt;BR /&gt;We have tried a lot of Intel MPI variables in the execution, for example I_MPI_PIN, but there is no way to manage the situation. The MPI universe is working OK over the infiniband network, and we use the MPI_DEVICE rdma. The infiniband network itself is working OK (performance and so on) because we have run some benchmarks and the results are fine.&lt;BR /&gt;&lt;BR /&gt;What do you think about it? Could it be a consequence of the model we are using to compare performance?&lt;BR /&gt;&lt;BR /&gt;Thanks a lot and best regards</description>
      <pubDate>Tue, 26 May 2009 07:58:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896937#M2115</guid>
      <dc:creator>jriocaton_es</dc:creator>
      <dc:date>2009-05-26T07:58:06Z</dc:date>
    </item>
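The comparison described above boils down to timing the same model at 16 and 32 ranks. A minimal sketch of that comparison, using the Intel MPI 3.2 install path quoted in the post (this assumes an mpd ring is already running on the cluster hosts, so it is not runnable elsewhere):

```shell
# Sketch of the 16- vs 32-rank timing comparison from the post.
# The mpiexec path and the mm5.mpp binary are the ones quoted there.
MPIEXEC=/intel/impi/3.2.1.009/bin64/mpiexec
time $MPIEXEC -np 16 -env I_MPI_DEVICE rdma ./mm5.mpp
time $MPIEXEC -np 32 -env I_MPI_DEVICE rdma ./mm5.mpp
```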
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896938#M2116</link>
      <description>Sorry, I forgot this :&lt;BR /&gt;&lt;BR /&gt;[c2@Run]$ more ../../../mpd.hosts&lt;BR /&gt;infi1:8 ifhn=192.168.10.1&lt;BR /&gt;infi6:8 ifhn=192.168.10.250 &lt;BR /&gt;infi7:8 ifhn=192.168.10.249&lt;BR /&gt;infi8:8 ifhn=192.168.10.248 &lt;BR /&gt;infi9:8 ifhn=192.168.10.247 &lt;BR /&gt;infi10:8 ifhn=192.168.10.246 &lt;BR /&gt;infi4:8 ifhn=192.168.10.252 &lt;BR /&gt;infi5:8 ifhn=192.168.10.251 &lt;BR /&gt;&lt;BR /&gt;[c2@Run]$ /intel/impi/3.2.1.009/bin64/mpdboot -n 8 --ifhn=192.168.10.1 -f mpd.hosts -r ssh --verbose&lt;BR /&gt;&lt;BR /&gt;[c2@Run]$ /intel/impi/3.2.1.009/bin64/mpdtrace -l&lt;BR /&gt;infi1_33980 (192.168.10.1)&lt;BR /&gt;infi8_50855 (192.168.10.248)&lt;BR /&gt;infi9_39762 (192.168.10.247)&lt;BR /&gt;infi7_44185 (192.168.10.249)&lt;BR /&gt;infi6_37134 (192.168.10.250)&lt;BR /&gt;infi4_55533 (192.168.10.252)&lt;BR /&gt;infi5_42161 (192.168.10.251)&lt;BR /&gt;infi10_33666 (192.168.10.246)&lt;BR /&gt;&lt;BR /&gt;[c2@Run]$ /intel/impi/3.2.1.009/bin64/mpiexec -genv I_MPI_PIN_PROCS 0-7 -np 32 -env I_MPI_DEVICE rdma ./mm5.mpp&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 26 May 2009 08:06:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896938#M2116</guid>
      <dc:creator>jriocaton_es</dc:creator>
      <dc:date>2009-05-26T08:06:12Z</dc:date>
    </item>
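The mpd.hosts quoted above ties every host to an ifhn= address. For reference, the plain host:ncpus form of the same file (hostnames from the listing, ifhn= fields dropped) would look like this; it is a config fragment, not a command:

```shell
# mpd.hosts without ifhn= fields (hostnames taken from the post):
infi1:8
infi6:8
infi7:8
infi8:8
infi9:8
infi10:8
infi4:8
infi5:8
```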
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896939#M2117</link>
      <description>
You would be more likely to get the attention of cluster computing experts if you discussed this on the HPC forum.&lt;BR /&gt;The I_MPI_PIN_PROCS setting probably isn't useful, although it would have been required by earlier MPI versions. In fact, it overrides the built-in optimizations which Intel MPI has for platforms such as Harpertown, where the cores aren't numbered in sequence. That might make a difference if you enabled shared memory message passing.&lt;BR /&gt;Usually, a combined Infiniband/shared memory option (rdssm should be the default) scales to larger numbers of nodes and processes. I wonder if you allowed shared memory in your gigabit choice.&lt;BR /&gt;You didn't tell us enough about your hardware (CPU type, how much RAM) for much assistance to be given, not that anyone such as myself who isn't familiar with mm5 would know its memory requirement.&lt;BR /&gt;</description>
      <pubDate>Tue, 26 May 2009 12:56:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896939#M2117</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-05-26T12:56:01Z</dc:date>
    </item>
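The point about I_MPI_PIN_PROCS overriding the library's built-in pinning can be checked from outside MPI entirely. On Linux, a process's effective CPU affinity is visible in /proc; this is a generic check, not something from the thread:

```shell
# Print the CPU affinity of the current shell. To check where Intel
# MPI actually pinned a rank, read the same field from that rank's
# /proc/PID/status instead of /proc/self/status.
grep Cpus_allowed_list /proc/self/status
```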
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896940#M2118</link>
      <description>
&lt;BR /&gt;Hi Tim, I've just changed the options and rdssm hasn't solved the problem. The behaviour is the same. As for memory, it is not a limitation, and the procs are Xeon 54XX. The comparison gigabit vs infiniband is the following:&lt;BR /&gt;16 proc gigabit -&amp;gt; 2'&lt;BR /&gt;16 proc infi -&amp;gt; 20"'&lt;BR /&gt;32 proc gigabit -&amp;gt; 4'&lt;BR /&gt;32 proc infi -&amp;gt; 5'20"&lt;BR /&gt;Do you think I should change the forum? Is there any var I could use or anything I could test to improve the performance?&lt;BR /&gt;Thanks&lt;BR /&gt;&lt;BR /&gt;PS. This is the top on a node when we launch 32 proc:&lt;BR /&gt;&lt;BR /&gt;Cpu0  : 20.7%us, 79.3%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st&lt;BR /&gt;Cpu1  :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st&lt;BR /&gt;Cpu2  :  0.3%us, 99.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st&lt;BR /&gt;Cpu3  :  0.3%us, 99.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st&lt;BR /&gt;Cpu4  :  0.3%us, 99.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st&lt;BR /&gt;Cpu5  :  3.0%us, 97.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st&lt;BR /&gt;Cpu6  : 49.5%us, 50.5%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st&lt;BR /&gt;Cpu7  : 17.0%us, 83.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st&lt;BR /&gt;Mem:   8170628k total,  1038224k used,  7132404k free,   246276k buffers&lt;BR /&gt;Swap:  1020116k total,        0k used,  1020116k free,   352364k cached&lt;BR /&gt;&lt;BR /&gt; PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND&lt;BR /&gt; 6853 caton2 25 0 77456 33m 5624 R 100 0.4 17:00.01 mm5.mpp&lt;BR /&gt; 6854 caton2 25 0 78664 34m 5880 S 100 0.4 16:59.99 mm5.mpp
&lt;BR /&gt; 6855 caton2 25 0 78212 34m 5764 S 100 0.4 16:59.23 mm5.mpp&lt;BR /&gt; 6852 caton2 25 0 77388 33m 5224 S 100 0.4 16:59.95 mm5.mpp&lt;BR /&gt; 6856 caton2 25 0 79232 34m 6068 R 100 0.4 16:58.75 mm5.mpp&lt;BR /&gt; 6857 caton2 25 0 78152 34m 5800 R 100 0.4 17:00.00 mm5.mpp&lt;BR /&gt; 6858 caton2 25 0 77620 33m 5684 S 100 0.4 16:59.99 mm5.mpp&lt;BR /&gt; 6859 caton2 25 0 77268 33m 5136 R 100 0.4 16:59.71 mm5.mpp&lt;BR /&gt;&lt;BR /&gt;As you can see, CPU0, 6 and 7 are running user processes, while CPU1,2,3,4,5 are always busy with system ones. It is always the same situation. Is there any way to reduce the system load on CPU1,2,3,4,5?&lt;BR /&gt;</description>
      <pubDate>Tue, 26 May 2009 13:28:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896940#M2118</guid>
      <dc:creator>jriocaton_es</dc:creator>
      <dc:date>2009-05-26T13:28:44Z</dc:date>
    </item>
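The top snapshot above shows the telling symptom: almost all CPU time is %sy (kernel) rather than %us (user). A small awk helper, an illustration rather than anything from the thread, makes that split explicit for per-CPU lines in top's format (a subset of the quoted lines is used as sample input):

```shell
# Average the %us and %sy columns of top's per-CPU lines.
# Sample lines are taken from the post; the script is an assumption.
printf '%s\n' \
  'Cpu0  : 20.7%us, 79.3%sy,  0.0%ni,  0.0%id' \
  'Cpu1  :  0.0%us,100.0%sy,  0.0%ni,  0.0%id' \
  'Cpu6  : 49.5%us, 50.5%sy,  0.0%ni,  0.0%id' |
awk -F'[:,]' '{
  tu += $2 + 0   # " 20.7%us" converts to 20.7 via numeric coercion
  ts += $3 + 0   # " 79.3%sy" converts to 79.3
  n  += 1
}
END { printf "avg user %.1f%% / avg system %.1f%%\n", tu/n, ts/n }'
# prints: avg user 23.4% / avg system 76.6%
```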
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896941#M2119</link>
      <description>&lt;DIV style="margin:0px;"&gt;Hi jriocaton,&lt;BR /&gt;&lt;BR /&gt;I'd suggest simplifying your mpd.hosts and command lines.&lt;BR /&gt;1) Could you try to remove ifhn=192.168.10.xxx from mpd.hosts file?&lt;/DIV&gt;
2) remove '--ifhn=192.168.10.1' from the mpdboot command.&lt;BR /&gt;3) remove '-genv I_MPI_PIN_PROCS 0-7' from the mpiexec command line.&lt;BR /&gt;4) try to use rdssm instead of rdma in '-env I_MPI_DEVICE rdma'.&lt;BR /&gt;&lt;BR /&gt;Of course, you need to kill all mpd processes first ('mpdallexit').&lt;BR /&gt;&lt;BR /&gt;Intel MPI has a so-called fallback technique; to disable it, set I_MPI_FALLBACK_DEVICE to 0.&lt;BR /&gt;&lt;BR /&gt;Compare performance with 16 and 32 processes.&lt;BR /&gt;If performance with 32 processes is worse, please run with I_MPI_DEBUG=5 and let me know the output. This debug level will print the pinning table, which can give you a clue.&lt;BR /&gt;&lt;BR /&gt;Best wishes,&lt;BR /&gt; Dmitry&lt;BR /&gt;</description>
      <pubDate>Wed, 27 May 2009 06:34:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896941#M2119</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-05-27T06:34:09Z</dc:date>
    </item>
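The four steps above amount to a clean restart with the library defaults. A sketch of the resulting sequence, assuming the bin64 install path quoted earlier in the thread and a simplified mpd.hosts (cluster-only, not runnable elsewhere):

```shell
# Clean restart per the advice above (paths from the thread).
IMPI=/intel/impi/3.2.1.009/bin64
$IMPI/mpdallexit                                   # step 0: kill the old mpd ring
$IMPI/mpdboot -n 8 -f mpd.hosts -r ssh --verbose   # steps 1-2: no --ifhn, plain mpd.hosts
$IMPI/mpiexec -np 32 -env I_MPI_DEVICE rdssm -env I_MPI_DEBUG 5 ./mm5.mpp  # steps 3-4
```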
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896942#M2120</link>
      <description>Dear Dmitry,&lt;BR /&gt;&lt;BR /&gt;thanks a lot for your interest.&lt;BR /&gt;&lt;BR /&gt;I've done the changes you told me and the results are the same. This is the debug output:&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;[c2@quijote Run]$ time /intel/impi/3.2.1.009/bin64/mpiexec -genv I_MPI_FALLBACK_DEVICE 1 -np 32 -env I_MPI_DEBUG 5 -env I_MPI_DEVICE rdssm ./mm5.mpp&lt;BR /&gt;
[1] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 1:infi1&lt;BR /&gt;
[2] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 2:infi1&lt;BR /&gt;
[4] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 4:infi1&lt;BR /&gt;
[5] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 5:infi1&lt;BR /&gt;
[6] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 6:infi1&lt;BR /&gt;
[3] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 3:infi1&lt;BR /&gt;
[9] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 9:infi9&lt;BR /&gt;
[10] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 10:infi9&lt;BR /&gt;
[7] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 7:infi1&lt;BR /&gt;
[8] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 8:infi9&lt;BR /&gt;
[12] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 12:infi9&lt;BR /&gt;
[15] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 15:infi9&lt;BR /&gt;
[13] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 13:infi9&lt;BR /&gt;
[14] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 14:infi9&lt;BR /&gt;
[17] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 17:infi8&lt;BR /&gt;
[19] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 19:infi8&lt;BR /&gt;
[18] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 18:infi8&lt;BR /&gt;
[20] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 20:infi8&lt;BR /&gt;
[21] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 21:infi8&lt;BR /&gt;
[22] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 22:infi8&lt;BR /&gt;
[11] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 11:infi9&lt;BR /&gt;
[24] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 24:infi7&lt;BR /&gt;
[16] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 16:infi8&lt;BR /&gt;
[27] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 27:infi7&lt;BR /&gt;
[29] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 29:infi7&lt;BR /&gt;
[30] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 30:infi7&lt;BR /&gt;
[31] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 31:infi7&lt;BR /&gt;
[23] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 23:infi8&lt;BR /&gt;
[25] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 25:infi7&lt;BR /&gt;
[26] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 26:infi7&lt;BR /&gt;
[28] MPI startup(): DAPL provider &lt;NULL string&gt; on rank 0:infi1 differs from &lt;NULL string&gt; on rank 28:infi7&lt;BR /&gt;
[0] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[1] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[2] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[3] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[4] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[5] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[6] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[8] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[7] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[9] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[10] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[11] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[13] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[12] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[14] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[15] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[17] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[19] MPI startup(): shared
memory and socket data transfer modes&lt;BR /&gt;[16] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[18] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[20] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[22] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[21] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[23] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[24] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[25] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[27] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[28] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[29] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[26] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[30] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[31] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;[5] MPI Startup(): process is pinned to CPU05 on node quijote.cluster&lt;BR /&gt;[2] MPI Startup(): process is pinned to CPU02 on node quijote.cluster&lt;BR /&gt;[6] MPI Startup(): process is pinned to CPU03 on node quijote.cluster&lt;BR /&gt;[1] MPI Startup(): process is pinned to CPU04 on node quijote.cluster&lt;BR /&gt;[4] MPI Startup(): process is pinned to CPU01 on node quijote.cluster&lt;BR /&gt;[0] MPI Startup(): process is pinned to CPU00 on node quijote.cluster&lt;BR /&gt;[9] MPI Startup(): process is pinned to CPU04 on node compute-0-7.local&lt;BR /&gt;[10] MPI Startup(): [8] MPI Startup(): process is pinned to CPU00 on node compute-0-7.local&lt;BR /&gt;quijote.cluster -- rsl_nproc_all 32, rsl_myproc 1&lt;BR /&gt;[3] MPI Startup(): process is pinned to CPU02 on node compute-0-7.localprocess is pinned to CPU06 on node quijote.cluster[12] MPI Startup(): process is 
pinned to CPU01 on node compute-0-7.local&lt;BR /&gt;[14] MPI Startup(): process is pinned to CPU03 on node compute-0-7.local&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;compute-0-7.local -- rsl_nproc_all 32, rsl_myproc 9&lt;BR /&gt;[11] MPI Startup(): process is pinned to CPU06 on node compute-0-7.local[13] MPI Startup(): process is pinned to CPU05 on node compute-0-7.local&lt;BR /&gt;[15] MPI Startup(): quijote.cluster -- rsl_nproc_all 32, rsl_myproc 5&lt;BR /&gt;compute-0-7.local -- rsl_nproc_all 32, rsl_myproc 13&lt;BR /&gt;compute-0-7.local -- rsl_nproc_all 32, rsl_myproc 14&lt;BR /&gt;quijote.cluster -- rsl_nproc_all 32, rsl_myproc 3&lt;BR /&gt;compute-0-7.local -- rsl_nproc_all 32, rsl_myproc 10&lt;BR /&gt;compute-0-7.local -- rsl_nproc_all 32, rsl_myproc 11&lt;BR /&gt;compute-0-7.local -- rsl_nproc_all 32, rsl_myproc 12&lt;BR /&gt;&lt;BR /&gt;process is pinned to CPU07 on node compute-0-7.local&lt;BR /&gt;compute-0-7.local -- rsl_nproc_all 32, rsl_myproc 8&lt;BR /&gt;compute-0-7.local -- rsl_nproc_all 32, rsl_myproc 15&lt;BR /&gt;quijote.cluster -- rsl_nproc_all 32, rsl_myproc 2&lt;BR /&gt;[7] MPI Startup(): process is pinned to CPU07 on node quijote.cluster&lt;BR /&gt;quijote.cluster -- rsl_nproc_all 32, rsl_myproc 6&lt;BR /&gt;quijote.cluster -- rsl_nproc_all 32, rsl_myproc 7&lt;BR /&gt;quijote.cluster -- rsl_nproc_all 32, rsl_myproc 4&lt;BR /&gt;[19] MPI Startup(): process is pinned to CPU06 on node compute-0-6.local&lt;BR /&gt;compute-0-6.local -- rsl_nproc_all 32, rsl_myproc 19&lt;BR /&gt;[17] MPI Startup(): process is pinned to CPU04 on node compute-0-6.local&lt;BR /&gt;[20] MPI Startup(): process is pinned to CPU01 on node compute-0-6.local&lt;BR /&gt;[23] MPI Startup(): process is pinned to CPU07 on node compute-0-6.local&lt;BR /&gt;compute-0-6.local -- rsl_nproc_all 32, rsl_myproc 17&lt;BR /&gt;[25] MPI Startup(): process is pinned to CPU04 on node compute-0-5.local&lt;BR /&gt;[29] MPI Startup(): process is pinned to CPU05 on node compute-0-5.local&lt;BR 
/&gt;compute-0-5.local -- rsl_nproc_all 32, rsl_myproc 25&lt;BR /&gt;[27] MPI Startup(): process is pinned to CPU06 on node compute-0-5.local&lt;BR /&gt;[24] MPI Startup(): process is pinned to CPU00 on node compute-0-5.local&lt;BR /&gt;[30] MPI Startup(): process is pinned to CPU03 on node compute-0-5.local&lt;BR /&gt;compute-0-5.local -- rsl_nproc_all 32, rsl_myproc 27&lt;BR /&gt;compute-0-5.local -- rsl_nproc_all 32, rsl_myproc 29&lt;BR /&gt;[28] MPI Startup(): process is pinned to CPU01 on node compute-0-5.local&lt;BR /&gt;[26] MPI Startup(): process is pinned to CPU02 on node compute-0-5.local&lt;BR /&gt;compute-0-5.local -- rsl_nproc_all 32, rsl_myproc 28&lt;BR /&gt;compute-0-5.local -- rsl_nproc_all 32, rsl_myproc 30&lt;BR /&gt;[31] MPI Startup(): process is pinned to CPU07 on node compute-0-5.local&lt;BR /&gt;compute-0-5.local -- rsl_nproc_all 32, rsl_myproc 31&lt;BR /&gt;compute-0-5.local -- rsl_nproc_all 32, rsl_myproc 24&lt;BR /&gt;compute-0-5.local -- rsl_nproc_all 32, rsl_myproc 26&lt;BR /&gt;compute-0-6.local -- rsl_nproc_all 32, rsl_myproc 23&lt;BR /&gt;[16] MPI Startup(): process is pinned to CPU00 on node compute-0-6.local&lt;BR /&gt;compute-0-6.local -- rsl_nproc_all 32, rsl_myproc 20&lt;BR /&gt;[0] Rank    Pid      Node name          Pin cpu&lt;BR /&gt;[0] 0       31791    quijote.cluster    0&lt;BR /&gt;[0] 1       31784    quijote.cluster    4&lt;BR /&gt;[0] 2       31785    quijote.cluster    2&lt;BR /&gt;[0] 3       31786    quijote.cluster    6&lt;BR /&gt;[0] 4       31787    quijote.cluster    1&lt;BR /&gt;[0] 5       31788    quijote.cluster    5&lt;BR /&gt;[0] 6       31789    quijote.cluster    3&lt;BR /&gt;[0] 7       31790    quijote.cluster    7&lt;BR /&gt;[0] 8       23649    compute-0-7.local  0&lt;BR /&gt;[0] 9       23656    compute-0-7.local  4&lt;BR /&gt;[0] 10      23650    compute-0-7.local  2&lt;BR /&gt;[0] 11      23652    compute-0-7.local  6&lt;BR /&gt;[0] 12      23651    compute-0-7.local  1&lt;BR /&gt;[0] 13      23654  
  compute-0-7.local  5&lt;BR /&gt;[0] 14      23653    compute-0-7.local  3&lt;BR /&gt;[0] 15      23655    compute-0-7.local  7&lt;BR /&gt;[0] 16      10775    compute-0-6.local  0&lt;BR /&gt;[0] 17      10776    compute-0-6.local  4&lt;BR /&gt;[0] 18      10777    compute-0-6.local  2&lt;BR /&gt;[0] 19      10778    compute-0-6.local  6&lt;BR /&gt;[0] 20      10779    compute-0-6.local  1&lt;BR /&gt;[0] 21      10780    compute-0-6.local  5&lt;BR /&gt;[0] 22      10781    compute-0-6.local  3&lt;BR /&gt;[0] 23      10782    compute-0-6.local  7&lt;BR /&gt;[0] 24      20680    compute-0-5.local  0&lt;BR /&gt;[0] 25      20681    compute-0-5.local  4&lt;BR /&gt;[0] 26      20682    compute-0-5.local  2&lt;BR /&gt;[0] 27      20683    compute-0-5.local  6&lt;BR /&gt;[0] 28      20684    compute-0-5.local  1&lt;BR /&gt;[0] 29      20685    compute-0-5.local  5&lt;BR /&gt;[0] 30      20686    compute-0-5.local  3&lt;BR /&gt;[0] 31      20687    compute-0-5.local  7&lt;BR /&gt;[0] Init(): I_MPI_DEBUG=5&lt;BR /&gt;[0] Init(): I_MPI_DEVICE=rdssm&lt;BR /&gt;[0] Init(): I_MPI_FALLBACK_DEVICE=1&lt;BR /&gt;[0] Init(): MPICH_INTERFACE_HOSTNAME=192.168.10.1&lt;BR /&gt;[22] MPI Startup(): process is pinned to CPU03 on node compute-0-6.local&lt;BR /&gt;[21] MPI Startup(): process is pinned to CPU05 on node compute-0-6.local&lt;BR /&gt;compute-0-6.local -- rsl_nproc_all 32, rsl_myproc 16&lt;BR /&gt;compute-0-6.local -- rsl_nproc_all 32, rsl_myproc 22&lt;BR /&gt;compute-0-6.local -- rsl_nproc_all 32, rsl_myproc 21&lt;BR /&gt;compute-0-6.local -- rsl_nproc_all 32, rsl_myproc 18&lt;BR /&gt;[18] MPI Startup(): process is pinned to CPU02 on node compute-0-6.local&lt;BR /&gt;quijote.cluster -- rsl_nproc_all 32, rsl_myproc 
0&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;When I try to set "I_MPI_FALLBACK_DEVICE 0", I get these errors:&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;[c2@Run]$ /intel/impi/3.2.1.009/bin64/mpiexec -genv I_MPI_FALLBACK_DEVICE 0  -np 32 -env I_MPI_DEVICE rdssm ./mm5.mpp&lt;BR /&gt;[1] DAPL provider is not found and fallback device is not enabled&lt;BR /&gt;[cli_1]: aborting job:&lt;BR /&gt;Fatal error in MPI_Init: Other MPI error, error stack:&lt;BR /&gt;MPIR_Init_thread(283): Initialization failed&lt;BR /&gt;MPIDD_Init(98).......: channel initialization failed&lt;BR /&gt;MPIDI_CH3_Init(163)..: generic failure with errno = -1&lt;BR /&gt;(unknown)(): &lt;NULL&gt;&lt;BR /&gt;[3] DAPL provider is not found and fallback device is not enabled&lt;BR /&gt;[0] DAPL provider is not found and fallback device is not enabled&lt;BR /&gt;[cli_0]: aborting job:&lt;BR /&gt;Fatal error in MPI_Init: Other MPI error, error stack:&lt;BR /&gt;MPIR_Init_thread(283): Initialization failed&lt;BR /&gt;MPIDD_Init(98).......: channel initialization failed&lt;BR /&gt;MPIDI_CH3_Init(163)..: generic failure with errno = -1&lt;BR /&gt;(unknown)(): &lt;NULL&gt;&lt;BR /&gt;[2] DAPL provider is not found and fallback
device is not enabled&lt;BR /&gt;[cli_2]: aborting job:&lt;BR /&gt;Fatal error in MPI_Init: Other MPI error, error stack:&lt;BR /&gt;MPIR_Init_thread(283): Initialization failed&lt;BR /&gt;MPIDD_Init(98).......: channel initialization failed&lt;BR /&gt;MPIDI_CH3_Init(163)..: generic failure with errno = -1&lt;BR /&gt;(unknown)(): &lt;NULL&gt;&lt;BR /&gt;[the same "DAPL provider is not found and fallback device is not enabled" error stack repeats for the remaining ranks]&lt;BR /&gt;rank 27 in job 2  infi1_60665   caused collective abort of all ranks&lt;BR /&gt; exit status of rank 27: return code 13 &lt;BR /&gt;[equivalent collective-abort lines follow for ranks 25, 20, 19, 14, 13, 12, 11, 10, 1 and 0 (return code 13) and for rank 4 (killed by signal 9)]&lt;BR /&gt;&lt;/EM&gt;&lt;BR /&gt;Thanks a lot and best regards&lt;BR /&gt;</description>
      <pubDate>Wed, 27 May 2009 08:07:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896942#M2120</guid>
      <dc:creator>jriocaton_es</dc:creator>
      <dc:date>2009-05-27T08:07:40Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896943#M2121</link>
      <description>&lt;DIV style="margin:0px;"&gt;jriocation,&lt;BR /&gt;&lt;BR /&gt;could you provide your /etc/dat.conf file as well?&lt;BR /&gt;&lt;BR /&gt;Cheers&lt;BR /&gt; Dmitry&lt;/DIV&gt;
&lt;BR /&gt;</description>
      <pubDate>Wed, 27 May 2009 10:51:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896943#M2121</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-05-27T10:51:35Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896944#M2122</link>
      <description>&lt;BR /&gt;Thanks again Dmitry,&lt;BR /&gt;&lt;BR /&gt;we have never worked with /etc/dat.conf. I have read something about it, but we haven't worked with it yet ...&lt;BR /&gt;&lt;BR /&gt;This is our /usr/bin/lib64:&lt;BR /&gt;&lt;BR /&gt;[root@quijote lib64]# ls -la | grep cma&lt;BR /&gt;lrwxrwxrwx  1 root root       19 mar 31 19:11 libdaplcma.so.1 -&amp;gt; libdaplcma.so.1.0.2&lt;BR /&gt;-rwxr-xr-x  1 root root    98560 may 25  2008 libdaplcma.so.1.0.2&lt;BR /&gt;&lt;BR /&gt;Thanks.&lt;BR /&gt;&lt;BR /&gt;Julio.&lt;BR /&gt;&lt;BR /&gt;</description>
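A quick way to check for the missing DAPL pieces on each node can be sketched as follows. This is not from the thread: the search paths are typical OFED locations and are assumptions that may differ on your system.

```shell
# Hedged sketch: look on a node for the DAPL registry file (dat.conf) and
# the uDAPL library that Intel MPI's rdma/rdssm devices rely on.
check_dapl() {
  echo "dat.conf candidates:"
  # dat.conf normally lives in /etc; other prefixes are guesses
  find /etc /usr/local/etc -maxdepth 3 -name 'dat.conf' 2>/dev/null
  echo "uDAPL libraries:"
  ls -l /usr/lib64/libdapl* /usr/lib/libdapl* 2>/dev/null
  echo "check finished"
}
check_dapl
```

Running this on every node (e.g. via ssh in a loop) shows at a glance which hosts are missing dat.conf or the library.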
      <pubDate>Wed, 27 May 2009 11:09:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896944#M2122</guid>
      <dc:creator>jriocaton_es</dc:creator>
      <dc:date>2009-05-27T11:09:33Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896945#M2123</link>
      <description>&lt;DIV style="margin:0px;"&gt;Hi Julio,&lt;BR /&gt;&lt;BR /&gt;The output:&lt;BR /&gt;[1] DAPL provider is not found and fallback device is not enabled&lt;BR /&gt;when I_MPI_FALLBACK_DEVICE=0 shows that something wrong with your DAPL settings.&lt;BR /&gt;&lt;/DIV&gt;
&lt;BR /&gt;How did you switch between Infiniband and Gigabit Ethernet?&lt;BR /&gt;&lt;BR /&gt;Set I_MPI_FALLBACK_DEVICE=1 and try to run 'mpiexec' for both devices with I_MPI_DEBUG=2.&lt;BR /&gt;Message like:&lt;BR /&gt;[0] MPI startup(): shared memory and socket data transfer modes&lt;BR /&gt;Shows you real transfer mode.&lt;BR /&gt;&lt;BR /&gt;Iwould recommend to use OFED version 1.4.1.&lt;BR /&gt;&lt;BR /&gt;Best wishes,&lt;BR /&gt; Dmitry&lt;BR /&gt;</description>
      <pubDate>Wed, 27 May 2009 13:02:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896945#M2123</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-05-27T13:02:41Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896946#M2124</link>
      <description>&lt;DIV style="margin:0px;"&gt;I meant a command line like:&lt;BR /&gt; mpiexec -genv I_MPI_FALLBACK_DEVICE 1 -np16 -env I_MPI_DEBUG2 -env I_MPI_DEVICE rdma ./mm5.mpp&lt;/DIV&gt;
&lt;BR /&gt;Both for IB and GigaEth&lt;BR /&gt;</description>
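The A/B comparison suggested above can be gathered into one small script. This is only a sketch: the mpiexec flags are the ones from the thread, but the process count and working directory are assumptions, and a DRY_RUN guard (not part of the original advice) lets you review the commands before touching the cluster.

```shell
# Dry-run sketch of the suggested comparison: run the same binary once over
# the DAPL/RDMA path and once over sockets, with I_MPI_DEBUG=2 so Intel MPI
# reports which data transfer mode it actually picked.
# With DRY_RUN=1 (the default here) commands are only printed, not executed.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

for dev in rdma sock; do
  run mpiexec -genv I_MPI_FALLBACK_DEVICE 1 -np 16 \
      -env I_MPI_DEBUG 2 -env I_MPI_DEVICE "$dev" ./mm5.mpp
done
```

Set DRY_RUN=0 to actually launch, and compare the "MPI startup(): ... data transfer mode" lines between the two runs.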
      <pubDate>Wed, 27 May 2009 13:09:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896946#M2124</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-05-27T13:09:06Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896947#M2125</link>
      <description>Thanks again Dmitry,&lt;BR /&gt;&lt;BR /&gt;we have downloaded the OFED from the QLogic website (we are using CentOS 5.1 as the OS). What do you recommend installing? There are a lot of packages.&lt;BR /&gt;&lt;BR /&gt;Thanks a lot and best regards&lt;BR /&gt;</description>
      <pubDate>Wed, 27 May 2009 14:07:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896947#M2125</guid>
      <dc:creator>jriocaton_es</dc:creator>
      <dc:date>2009-05-27T14:07:47Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896948#M2126</link>
      <description>&lt;DIV style="margin:0px;"&gt;Hi Julio,&lt;BR /&gt;&lt;BR /&gt;Seems that something wrong with cluster settings. I'm afraid that rdma didn't work at all.&lt;BR /&gt;It's very strange that there is no /etc/dat.conf file. I'm not familiar with Qlogic devices but they should provide drives for their cards. Please try to find dat.conf file - it has to exist. Move it (or create a link) to /etc directory on all nodes.&lt;BR /&gt;dat.conf file should contain lines like:&lt;BR /&gt;
&lt;PRE&gt;OpenIB-cma-1 u1.2 nonthreadsafe default /usr/lib/libdaplcma.so dapl.1.2 "ib1 0" ""&lt;/PRE&gt;
&lt;BR /&gt;QLogic should provide a utility which can verify that the InfiniBand card works correctly. Try to check that the devices work as expected on all nodes.&lt;BR /&gt;&lt;BR /&gt;mpd.hosts should look like:&lt;BR /&gt; infi1:8&lt;BR /&gt;&lt;BR /&gt;Be sure that there are no I_MPI env variables set from previous attempts.&lt;BR /&gt;&lt;BR /&gt;Start the mpd ring:&lt;BR /&gt; mpdboot -n 8 -f mpd.hosts -r ssh --verbose&lt;BR /&gt;&lt;BR /&gt;Start your application:&lt;BR /&gt; mpiexec -np 16 -env I_MPI_DEBUG 2 -env I_MPI_DEVICE rdma ./mm5.mpp&lt;BR /&gt;&lt;BR /&gt;And please attach the output both for InfiniBand and for Gigabit Ethernet. This debug level (2) will show which device has been chosen.&lt;BR /&gt;&lt;BR /&gt;Best wishes,&lt;BR /&gt; Dmitry&lt;BR /&gt;&lt;/DIV&gt;</description>
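The clean-restart procedure above can be sketched as one script. Hedged assumptions: the host file name, node count and process count are the placeholders from the thread, and the DRY_RUN guard is added here so the sequence can be reviewed before it is run for real.

```shell
# Sketch of the clean-restart sequence from the post: kill any old mpd ring,
# check for stale I_MPI_* settings left over from earlier attempts, boot a
# fresh ring, then launch with debug output enabled.
# With DRY_RUN=1 (the default here) each command is only echoed.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run mpdallexit                      # tear down any previous mpd ring
env | grep '^I_MPI_' || true        # list stale variables, if any
run mpdboot -n 8 -f mpd.hosts -r ssh --verbose
run mpiexec -np 16 -env I_MPI_DEBUG 2 -env I_MPI_DEVICE rdma ./mm5.mpp
```

With DRY_RUN=0 this executes the same four steps Dmitry describes, in order.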
      <pubDate>Thu, 28 May 2009 06:52:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896948#M2126</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-05-28T06:52:58Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896949#M2127</link>
      <description>Dear Dmitry,&lt;BR /&gt;&lt;BR /&gt;I'm sorry for the delay, but I was travelling out of the office.&lt;BR /&gt;&lt;BR /&gt;We were able to solve the problem by updating the drivers and libraries. Thanks a lot for your help&lt;BR /&gt;</description>
      <pubDate>Wed, 03 Jun 2009 13:27:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896949#M2127</guid>
      <dc:creator>jriocaton_es</dc:creator>
      <dc:date>2009-06-03T13:27:01Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896950#M2128</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/429644"&gt;jriocaton.es&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;Dear Dmitry,&lt;BR /&gt;&lt;BR /&gt;im sorry for the delay, but I was travelling out of the office.&lt;BR /&gt;&lt;BR /&gt;We could solve the problem updating the drivers and libraries. Thanks a lot for your help&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;Hi Julio,&lt;BR /&gt;&lt;BR /&gt;Nice to hear that the problem was resolved!&lt;BR /&gt;Could you provide details about the drivers and libraries you updated? This information could be useful for others.&lt;BR /&gt;&lt;BR /&gt;Best wishes,&lt;BR /&gt; Dmitry&lt;BR /&gt;</description>
      <pubDate>Thu, 04 Jun 2009 06:38:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896950#M2128</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-06-04T06:38:35Z</dc:date>
    </item>
    <item>
      <title>Re: Infiniband-Intel MPI Performance MM5</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896951#M2129</link>
      <description>- QLogic OFED+ 1.4.0.1.30&lt;BR /&gt;- QLogic SRP v1.4.0.1.5&lt;BR /&gt;- QLogic VNIC v1.4.0.1.6&lt;BR /&gt;- QLogic IB Tools v4.4.1.0.11&lt;BR /&gt;</description>
      <pubDate>Thu, 04 Jun 2009 14:28:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Infiniband-Intel-MPI-Performance-MM5/m-p/896951#M2129</guid>
      <dc:creator>jriocaton_es</dc:creator>
      <dc:date>2009-06-04T14:28:35Z</dc:date>
    </item>
  </channel>
</rss>

