<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic MPI [Errno 13] Permission denied  in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874765#M1837</link>
    <description>&amp;gt; I'll also try to link dat.conf from /etc/ofed to /etc
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;nope, linking won't work :/&lt;/SPAN&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/DIV&gt;</description>
    <pubDate>Fri, 19 Mar 2010 10:45:01 GMT</pubDate>
    <dc:creator>Rafał_Błaszczyk</dc:creator>
    <dc:date>2010-03-19T10:45:01Z</dc:date>
    <item>
      <title>[solved] random problems with MPI + DAPL initialization in RedHat 5.4</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874757#M1829</link>
      <description>&lt;PRE&gt;Hi, I sometimes have problems running a program with Intel MPI.&lt;/PRE&gt;
&lt;PRE&gt;It fails with this error on stderr (or stdout):&lt;/PRE&gt;
&lt;BLOCKQUOTE&gt;
&lt;PRE&gt;problem with execution of   on  wn20:  [Errno 13] Permission denied&lt;/PRE&gt;
&lt;/BLOCKQUOTE&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;What could be the problem?&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;BLOCKQUOTE&gt;
&lt;DIV&gt;here is my ulimit -a:&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV&gt;core file size     (blocks, -c) 0&lt;/DIV&gt;
&lt;DIV&gt;data seg size      (kbytes, -d) unlimited&lt;/DIV&gt;
&lt;DIV&gt;scheduling priority       (-e) 0&lt;/DIV&gt;
&lt;DIV&gt;file size        (blocks, -f) unlimited&lt;/DIV&gt;
&lt;DIV&gt;pending signals         (-i) 135167&lt;/DIV&gt;
&lt;DIV&gt;max locked memory    (kbytes, -l) unlimited&lt;/DIV&gt;
&lt;DIV&gt;max memory size     (kbytes, -m) unlimited&lt;/DIV&gt;
&lt;DIV&gt;open files           (-n) 1024&lt;/DIV&gt;
&lt;DIV&gt;pipe size      (512 bytes, -p) 8&lt;/DIV&gt;
&lt;DIV&gt;POSIX message queues   (bytes, -q) 819200&lt;/DIV&gt;
&lt;DIV&gt;real-time priority       (-r) 0&lt;/DIV&gt;
&lt;DIV&gt;stack size       (kbytes, -s) unlimited&lt;/DIV&gt;
&lt;DIV&gt;cpu time        (seconds, -t) unlimited&lt;/DIV&gt;
&lt;DIV&gt;max user processes       (-u) 135167&lt;/DIV&gt;
&lt;DIV&gt;virtual memory     (kbytes, -v) unlimited&lt;/DIV&gt;
&lt;DIV&gt;file locks           (-x) unlimited&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/BLOCKQUOTE&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;BLOCKQUOTE&gt;
&lt;DIV&gt;I've checked logs on this node (wn20):&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Mar 8 16:11:52 wn20 mpd: mpd starting; no mpdid yet&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:52 wn20 mpd: mpd has mpdid=wn20_45723 (port=45723)&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:53 wn20 mpd: wn20_45723 (run 1485): Warning: the directory pointed by TMPDIR (/tmp/pbs.2045.mgmt1) does not exist! /tmp will be used.&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:53 wn20 mpd: wn20_45723 (__init__ 1045): Warning: the directory pointed by TMPDIR (/tmp/pbs.2045.mgmt1) does not exist! /tmp will be used.&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:53 wn20 sshd[11867]: pam_unix(sshd:session): session closed for user routnwp&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:57 wn20 mpdman: mpdman starting new log; wn20_mpdman_120&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:57 wn20 mpdman: mpdman starting new log; wn20_mpdman_121&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:57 wn20 mpdman: mpdman starting new log; wn20_mpdman_122&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:57 wn20 mpdman: mpdman starting new log; wn20_mpdman_123&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:57 wn20 mpdman: mpdman starting new log; wn20_mpdman_124&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:57 wn20 mpdman: mpdman starting new log; wn20_mpdman_125&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:57 wn20 mpdman: mpdman starting new log; wn20_mpdman_126&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:11:57 wn20 mpdman: mpdman starting new log; wn20_mpdman_127&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Mar 8 16:12:07 wn20 mpd: mpd ending mpdid=wn20_45723 (inside cleanup)&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/BLOCKQUOTE&gt;</description>
      <pubDate>Tue, 09 Mar 2010 10:34:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874757#M1829</guid>
      <dc:creator>Rafał_Błaszczyk</dc:creator>
      <dc:date>2010-03-09T10:34:32Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874758#M1830</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;Could you provide your command line and, if possible, the output in verbose mode?&lt;/P&gt;
&lt;P&gt;Regards!&lt;/P&gt;
&lt;P&gt;Dmitry&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 09 Mar 2010 12:14:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874758#M1830</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2010-03-09T12:14:06Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874759#M1831</link>
      <description>&lt;P&gt;&amp;gt;Presumably meaning with environment variable I_MPI_DEBUG=9&lt;/P&gt;
&lt;P&gt;It seems to me that the issue is related to mpdboot (or mpirun), so I meant the '--verbose' option of that command.&lt;/P&gt;
&lt;P&gt;Regards!&lt;/P&gt;
&lt;P&gt;Dmitry&lt;/P&gt;</description>
      <pubDate>Tue, 09 Mar 2010 13:57:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874759#M1831</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2010-03-09T13:57:56Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874760#M1832</link>
      <description>Hi Dmitry,
&lt;DIV&gt;thanks for the tip. Unfortunately I cannot reproduce the problem.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I've raised I_MPI_DEBUG to 5 as the documentation says; will 9 give more verbosity?&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;This is what we get now (with I_MPI_DEBUG=5):&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV id="_mcePaste"&gt;[56] MPI startup(): DAPL provider OpenIB-cma specified in DAPL configuration file /etc/dat.conf&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_56]: got unexpected response to get :cmd=get kvsname=kvs_wn3_49596_0_0 key=DAPL_MISMATCH&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_56]: got unexpected response to put :cmd=put kvsname=kvs_wn3_49596_0_0 key=P56-businesscard value=rdma_port#21114$rdma_host#2:0:0:192:168:20:10:0:0:0:0:0:0:0:0$&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_56]: aborting job:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Fatal error in MPI_Init: Other MPI error, error stack:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIR_Init_thread(283)...: Initialization failed&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIDD_Init(98)..........: channel initialization failed&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIDI_CH3_Init(261).....:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIDI_CH3U_Init_rdma(64): PMI_KVS_Put returned -1&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;What could the problem be?&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Here is also the output of env from one of the MPI processes (I'm running a bash script under mpirun to debug it more closely at the MPI process level):&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV&gt;I_MPI_INFO_LCPU=16&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_INFO_SIGN=67237&lt;/DIV&gt;
&lt;DIV&gt;VT_MPI=impi3&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_INFO_PACK=1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_PIN_MAP=56 1,57 5,58 3,59 7,60 0,61 4,62 2,63 6&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_PIN_INFO=6&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_INFO_CACHE_SHARE=2,2,16&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_PIN_UNIT=6&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_INFO_THREAD=0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_INFO_CACHES=3&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_INFO_CORE=0,0,2,2,1,1,3,3,0,0,2,2,1,1,3,3&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_DEVICE=rdma&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_RDMA_EAGER_THRESHOLD=25972&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_INFO_CACHE_SIZE=32768,262144,8388608&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_DEBUG=5&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_INFO_CACHE1=8,0,10,2,9,1,11,3,8,0,10,2,9,1,11,3&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_PIN_MAP_SIZE=8&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_INFO_CACHE2=8,0,10,2,9,1,11,3,8,0,10,2,9,1,11,3&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_INFO_CACHE3=1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_PERHOST=allcores&lt;/DIV&gt;
&lt;DIV&gt;MPICH_INTERFACE_HOSTNAME=192.168.0.10&lt;/DIV&gt;
&lt;DIV&gt;I_MPI_ROOT=/opt/intel/impi/3.2.1.009&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;One more thing: we run it through a batch scheduler. After running the task I saw that an mpd process still exists - could it be somehow connected with the problem?&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;python /opt/intel/impi/3.2.1.009/bin64/mpd.py -h wn9 -p 34585 --ifhn=192.168.0.13 --ncpus=1 --myhost=wn13 --myip=192.168.0.13 -e -d -s 5&lt;/DIV&gt;</description>
      <pubDate>Wed, 17 Mar 2010 10:18:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874760#M1832</guid>
      <dc:creator>Rafał_Błaszczyk</dc:creator>
      <dc:date>2010-03-17T10:18:45Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874761#M1833</link>
      <description>This problem is most likely related to configuration of OFED or IP addresses for IPoIB.&lt;BR /&gt;&lt;BR /&gt;Again, I don't see your command line - it might be useful in some cases.&lt;BR /&gt;What is your DAPL version (run the 'ofed_info' command)?&lt;BR /&gt;Could you provide /etc/dat.conf?&lt;BR /&gt;What interconnect cards do you use?&lt;BR /&gt;&lt;BR /&gt;The higher the I_MPI_DEBUG value you set, the more information you get.&lt;BR /&gt;&lt;BR /&gt;Please try to run your application with I_MPI_DEVICE set to 'sock'.&lt;BR /&gt;&lt;BR /&gt;Regards!&lt;BR /&gt; Dmitry</description>
      <pubDate>Wed, 17 Mar 2010 11:17:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874761#M1833</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2010-03-17T11:17:21Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874762#M1834</link>
      <description>&lt;DIV&gt;&amp;gt;This problem is most likely related to configuration of OFED or IP addresses for IPoIB.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I'll check that, thanks. The problem is that it happens randomly only in particular jobs and the configuration is static...&lt;/DIV&gt;
&lt;BR /&gt;
&lt;DIV&gt;My command line is&lt;/DIV&gt;
&lt;DIV&gt;mpirun -r ssh -env I_MPI_DEBUG 5 -env I_MPI_DEVICE rdssm -np 196 /full/path/bin/cm_w_00.0.0.2.sh&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;where  cm_w_00.0.0.2.sh contains&lt;/DIV&gt;
&lt;DIV&gt;/full/path/bin/cm &amp;gt; $logbin 2&amp;gt;&amp;amp;1&lt;/DIV&gt;
&lt;DIV&gt;and a few other commands whose output is redirected to log files (like $logbin) with names unique to each MPI process, to gather debugging data such as the output of ps, ulimit etc., but there is nothing interesting in those logs&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I'm using stock RHEL5.4 OFED&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;dapl is dapl-2.0.19-2.el5 from the repos; I do not have the ofed_info command&lt;/DIV&gt;
&lt;DIV&gt;/etc/dat.conf is /etc/ofed/dat.conf in RHEL:&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV id="_mcePaste"&gt;ofa-v2-ib0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib0 0" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;ofa-v2-ib1 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib1 0" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;ofa-v2-mthca0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mthca0 1" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;ofa-v2-mthca0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mthca0 2" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;ofa-v2-mlx4_0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 1" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;ofa-v2-mlx4_0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 2" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;ofa-v2-ipath0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "ipath0 1" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;ofa-v2-ipath0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "ipath0 2" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;ofa-v2-ehca0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "ehca0 1" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;ofa-v2-iwarp u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "eth2 0" ""&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I'm thinking about trying the sock device, but that would only sidestep a problem that happens randomly - do you think that's a good idea?&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I'm using Mellanox ConnectX:&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV&gt;ibv_devinfo&lt;/DIV&gt;
&lt;DIV&gt;hca_id: mlx4_0&lt;/DIV&gt;
&lt;DIV&gt;fw_ver:             2.6.100&lt;/DIV&gt;
&lt;DIV&gt;node_guid:           0023:7dff:ff94:4518&lt;/DIV&gt;
&lt;DIV&gt;sys_image_guid:         0023:7dff:ff94:451b&lt;/DIV&gt;
&lt;DIV&gt;vendor_id:           0x02c9&lt;/DIV&gt;
&lt;DIV&gt;vendor_part_id:         26428&lt;/DIV&gt;
&lt;DIV&gt;hw_ver:             0xA0&lt;/DIV&gt;
&lt;DIV&gt;board_id:            HP_0120000009&lt;/DIV&gt;
&lt;DIV&gt;phys_port_cnt:         2&lt;/DIV&gt;
&lt;DIV&gt;port:  1&lt;/DIV&gt;
&lt;DIV&gt;state:         active (4)&lt;/DIV&gt;
&lt;DIV&gt;max_mtu:        2048 (4)&lt;/DIV&gt;
&lt;DIV&gt;active_mtu:       2048 (4)&lt;/DIV&gt;
&lt;DIV&gt;sm_lid:         6&lt;/DIV&gt;
&lt;DIV&gt;port_lid:        4&lt;/DIV&gt;
&lt;DIV&gt;port_lmc:        0x00&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;port:  2&lt;/DIV&gt;
&lt;DIV&gt;state:         active (4)&lt;/DIV&gt;
&lt;DIV&gt;max_mtu:        2048 (4)&lt;/DIV&gt;
&lt;DIV&gt;active_mtu:       2048 (4)&lt;/DIV&gt;
&lt;DIV&gt;sm_lid:         6&lt;/DIV&gt;
&lt;DIV&gt;port_lid:        5&lt;/DIV&gt;
&lt;DIV&gt;port_lmc:        0x00&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Wed, 17 Mar 2010 14:50:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874762#M1834</guid>
      <dc:creator>Rafał_Błaszczyk</dc:creator>
      <dc:date>2010-03-17T14:50:11Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874763#M1835</link>
      <description>Your first output refers to the OpenIB-cma provider, but there is no such provider in the dat.conf you sent me.&lt;BR /&gt;So you probably need to use the DAT_OVERRIDE=/etc/ofed/dat.conf variable to point to the correct dat.conf file.&lt;BR /&gt;Could you also change the DEVICE env variable to:&lt;BR /&gt; -env I_MPI_DEVICE rdssm:ofa-v2-mlx4_0-1&lt;BR /&gt;In this case mlx4_0 will be used explicitly.&lt;BR /&gt;&lt;BR /&gt;&amp;gt;The problem is that it happens randomly only in particular jobs&lt;BR /&gt;This is very strange. There might be something wrong with the cluster configuration, or some nodes might be unstable.&lt;BR /&gt;Could you also add:&lt;BR /&gt; -env I_MPI_FALLBACK_DEVICE off&lt;BR /&gt;to your command line.&lt;BR /&gt;&lt;BR /&gt;Let me know the result.&lt;BR /&gt;&lt;BR /&gt;Regards!&lt;BR /&gt; Dmitry&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 17 Mar 2010 15:04:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874763#M1835</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2010-03-17T15:04:58Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874764#M1836</link>
      <description>&lt;DIV id="_mcePaste"&gt;I've tried your suggestion, it gave me:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;
&lt;DIV id="_mcePaste"&gt;
&lt;DIV id="_mcePaste"&gt;[0] DAPL provider is not found and fallback device is not enabled&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_0]: aborting job:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Fatal error in MPI_Init: Other MPI error, error stack:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIR_Init_thread(283): Initialization failed&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIDD_Init(98).......: channel initialization failed&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIDI_CH3_Init(163)..: generic failure with errno = -1&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;(unknown)(): &lt;NULL&gt;&lt;/NULL&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[0] MPI startup(): Intel MPI Library, Version 3.2.1 Build 20090312&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[0] MPI startup(): Copyright (C) 2003-2009 Intel Corporation. All rights reserved.&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;rank 0 in job 1 wn1_33304  caused collective abort of all ranks&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;exit status of rank 0: return code 13&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;I've checked what is dapl library default dat.conf:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;wn3 ~]$ dapltest&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Dapltest: Service Point Ready - ofa-v2-ib0&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I've tried to use the same name with Intel mpirun with the same result -&lt;/DIV&gt;
&lt;DIV&gt;[0] DAPL provider is not found and fallback device is not enabled&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV&gt;The weird thing is when running with just I_MPI_DEVICE=rdssm:&lt;/DIV&gt;
&lt;DIV&gt;[0] MPI startup(): DAPL provider OpenIB-cma specified in DAPL configuration file /etc/dat.conf&lt;/DIV&gt;
&lt;DIV&gt;[0] MPI startup(): RDMA, shared memory, and socket data transfer modes&lt;/DIV&gt;
&lt;DIV&gt;[0] MPI startup(): Intel MPI Library, Version 3.2.1 Build 20090312&lt;/DIV&gt;
&lt;DIV&gt;[0] MPI startup(): Copyright (C) 2003-2009 Intel Corporation. All rights reserved.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;So it's trying to use OpenIB-cma, which is not defined anywhere. The weird thing is that it works, but not always...&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;So Intel MPI is not using the same dat.conf that dat itself uses?&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I'll also try to link dat.conf from /etc/ofed to /etc&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Fri, 19 Mar 2010 10:05:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874764#M1836</guid>
      <dc:creator>Rafał_Błaszczyk</dc:creator>
      <dc:date>2010-03-19T10:05:18Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874765#M1837</link>
      <description>&amp;gt; I'll also try to link dat.conf from /etc/ofed to /etc
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;nope, linking won't work :/&lt;/SPAN&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Fri, 19 Mar 2010 10:45:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874765#M1837</guid>
      <dc:creator>Rafał_Błaszczyk</dc:creator>
      <dc:date>2010-03-19T10:45:01Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874766#M1838</link>
      <description>&lt;DIV&gt;I realized that Intel MPI in RHEL 5.4 is not using dapl - it's using compat-dapl, which has a different dat.conf (don't ask me why):&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV id="_mcePaste"&gt;# cat /etc/ofed/compat-dapl/dat.conf&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib0 0" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;OpenIB-cma-1 u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib1 0" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;OpenIB-mthca0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 1" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;OpenIB-mthca0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 2" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;OpenIB-mlx4_0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mlx4_0 1" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;OpenIB-mlx4_0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mlx4_0 2" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;OpenIB-ipath0-1 u1.2 nonthreadsafe default libdaplscm.so.2 dapl.1.2 "ipath0 1" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;OpenIB-ipath0-2 u1.2 nonthreadsafe default libdaplscm.so.2 dapl.1.2 "ipath0 2" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;OpenIB-ehca0-2 u1.2 nonthreadsafe default libdaplscm.so.2 dapl.1.2 "ehca0 1" ""&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;OpenIB-iwarp u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "eth2 0" ""&lt;/DIV&gt;
&lt;/DIV&gt;
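&lt;DIV&gt;A minimal Python sketch for checking which provider names a given dat.conf actually defines. It assumes the eight-field, whitespace-separated layout seen in the two listings in this thread; the function name parse_dat_conf and the shortened sample texts are illustrative only and are not part of Intel MPI or DAPL.&lt;/DIV&gt;

```python
import shlex


def parse_dat_conf(text):
    """Map each provider name to a few of its fields (hypothetical helper).

    Assumes every entry has eight whitespace-separated fields, with the
    last two in double quotes, as in the listings quoted in this thread.
    """
    providers = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = shlex.split(line)  # keeps "ib0 0" together, "" becomes ''
        if len(fields) == 8:
            providers[fields[0]] = {
                "api": fields[1],
                "library": fields[4],
                "device": fields[6],
            }
    return providers


# Two entries from each listing in this thread, shortened for the example.
DAPL2_CONF = '''
ofa-v2-ib0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib0 0" ""
ofa-v2-mlx4_0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 1" ""
'''
COMPAT_CONF = '''
OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib0 0" ""
OpenIB-mlx4_0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mlx4_0 1" ""
'''

if __name__ == "__main__":
    for label, text in (("dapl", DAPL2_CONF), ("compat-dapl", COMPAT_CONF)):
        names = sorted(parse_dat_conf(text))
        print(label, "defines:", ", ".join(names))
```

&lt;DIV&gt;Run against the real files, a check like this would show that OpenIB-cma appears only in the compat-dapl file, which matches the provider name in the startup message.&lt;/DIV&gt;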
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I've used &lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;OpenIB-mlx4_0-1 in I_MPI_DEVICE and it runs OK for now - I'm waiting to see whether this error appears again.&lt;/SPAN&gt;&lt;/DIV&gt;
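&lt;DIV&gt;For reference, a small Python sketch of how the resulting command line could be assembled. The mpirun options and the rdssm:provider form are the ones quoted earlier in this thread; the helper name build_mpirun_args is made up for illustration, and keeping I_MPI_FALLBACK_DEVICE off is an assumption based on the earlier suggestion.&lt;/DIV&gt;

```python
def build_mpirun_args(provider, nprocs, program, debug_level=5):
    """Return an argv list for mpirun (hypothetical helper, not an Intel tool)."""
    env = [
        ("I_MPI_DEBUG", str(debug_level)),
        # Name the DAPL provider explicitly instead of relying on the default.
        ("I_MPI_DEVICE", "rdssm:" + provider),
        # Assumption: fail loudly rather than silently fall back to sockets.
        ("I_MPI_FALLBACK_DEVICE", "off"),
    ]
    args = ["mpirun", "-r", "ssh"]
    for name, value in env:
        args += ["-env", name, value]
    args += ["-np", str(nprocs), program]
    return args


if __name__ == "__main__":
    argv = build_mpirun_args("OpenIB-mlx4_0-1", 196, "/full/path/bin/cm_w_00.0.0.2.sh")
    print(" ".join(argv))
```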
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;Do you think this is what you wanted me to do?&lt;/SPAN&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 19 Mar 2010 12:09:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874766#M1838</guid>
      <dc:creator>Rafał_Błaszczyk</dc:creator>
      <dc:date>2010-03-19T12:09:33Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874767#M1839</link>
      <description>The following link may help in understanding DAPL providers.&lt;BR /&gt;&lt;BR /&gt;I would recommend not keeping invalid DAPL entries in your dat.conf file; you may want to offer your cluster users only those that are fully functional.&lt;BR /&gt;&lt;BR /&gt;
&lt;P&gt;&lt;A href="http://software.intel.com/en-us/articles/intel-mpi-library-for-linux-experience-with-various-interconnects-and-dapl-providers/"&gt;http://software.intel.com/en-us/articles/intel-mpi-library-for-linux-experience-with-various-interconnects-and-dapl-providers/&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Mar 2010 13:36:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874767#M1839</guid>
      <dc:creator>Andres_M_Intel4</dc:creator>
      <dc:date>2010-03-19T13:36:26Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874768#M1840</link>
      <description>&lt;DIV id="_mcePaste"&gt;Hi, thanks. I've already read that.&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;The problem was the entries were not completely bad - I was using just wrong names (from other dat.conf) but what you wanted is to use fixed name for DAPL provider, right?&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Could that help in solving this issue?:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[94] MPI startup(): DAPL provider OpenIB-cma specified in DAPL configuration file&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_94]: got unexpected response to get :cmd=get kvsname=kvs_wn3_49596_0_0 key=DAPL_MISMATCH&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_94]: got unexpected response to put :cmd=put kvsname=kvs_wn3_49596_0_0 key=P94-businesscard&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;value=rdma_port#18839$rdma_host#2:0:0:192:168:20:14:0:0:0:0:0:0:0:0$&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;It's an Intel MPI message; could you explain what it means? I cannot find any docs about it.&lt;/DIV&gt;
&lt;DIV&gt;It looks like DAPL provider has been chosen (it was the same when it was running fine).&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Fri, 19 Mar 2010 13:57:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874768#M1840</guid>
      <dc:creator>Rafał_Błaszczyk</dc:creator>
      <dc:date>2010-03-19T13:57:21Z</dc:date>
    </item>
    <item>
      <title>MPI [Errno 13] Permission denied</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874769#M1841</link>
      <description>&lt;DIV id="_mcePaste"&gt;after changing I_MPI_DEVICE toOpenIB-mlx4_0-1, I've got&lt;/DIV&gt;
&lt;META content="text/html; charset=utf-8" http-equiv="content-type" /&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;B&gt;from wn3 from RANK0 process:&lt;/B&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;
&lt;DIV id="_mcePaste"&gt;[0] MPI startup(): DAPL provider OpenIB-mlx4_0-1&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_0]: got unexpected response to get :cmd=get kvsname=kvs_wn3_37604_0_0 key=DAPL_MISMATCH&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_0]: got unexpected response to put :cmd=put kvsname=kvs_wn3_37604_0_0 key=shm_name value=2D1921C52957AD9B5645EBCD4BA371D0&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[0] MPI startup(): Intel MPI Library, Version 3.2.1 Build 20090312&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[0] MPI startup(): Copyright (C) 2003-2009 Intel Corporation. All rights reserved.&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_0]: aborting job:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Fatal error in MPI_Init: Other MPI error, error stack:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIR_Init_thread(283)....: Initialization failed&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIDD_Init(98)...........: channel initialization failed&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIDI_CH3_Init(319)......:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIDI_CH3U_Init_sshm(239): PMI_KVS_Put returned -1&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;(unknown)(): &amp;lt;NULL&amp;gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;B&gt;from other processes:&lt;/B&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;
&lt;DIV id="_mcePaste"&gt;[14] MPI startup(): DAPL provider OpenIB-mlx4_0-1&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_14]: got unexpected response to get :cmd=get kvsname=kvs_wn3_37604_0_0 key=DAPL_MISMATCH&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_14]: PMIU_parse_keyvals: unexpected key delimiter at character 1 in !&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_14]: expecting cmd=barrier_out, got !&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;[cli_14]: aborting job:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;Fatal error in MPI_Init: Other MPI error, error stack:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIR_Init_thread(283)....: Initialization failed&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIDD_Init(98)...........: channel initialization failed&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIDI_CH3_Init(319)......:&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;MPIDI_CH3U_Init_sshm(257): PMI_Barrier returned -1&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;(unknown)(): &amp;lt;NULL&amp;gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 19 Mar 2010 14:53:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874769#M1841</guid>
      <dc:creator>Rafał_Błaszczyk</dc:creator>
      <dc:date>2010-03-19T14:53:42Z</dc:date>
    </item>
    <item>
      <title>[solved] random issues with MPI + DAPL initialization</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874770#M1842</link>
      <description>&lt;DIV&gt;&lt;B&gt;I've found a solution to my problem.&lt;/B&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;B&gt;&lt;BR /&gt;&lt;/B&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;I believe it was the same problem as described here: &lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;A href="http://software.intel.com/en-us/articles/random-fabric-errors-on-rhel5U4/" target="_blank"&gt;http://software.intel.com/en-us/articles/random-fabric-errors-on-rhel5U4/&lt;/A&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;(the workaround of setting I_MPI_RDMA_CREATE_CONN_QUAL to 0 seemed to work too)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;After upgrading to OFED 1.5 with the new DAPL, the problem was finally solved.&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;The DAPL version shipped with RedHat 5.4 seems buggy.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: Verdana, Arial, Helvetica, sans-serif;"&gt;BTW: If anyone knows why RedHat decided to have separate dat.conf files, one for each DAPL version (1 and 2), please let me know.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;I have successfully used the new UCM interface (v2) with ConnectX (ofa-v2-mlx4_0-1u in dat.conf), which seems to be much faster with many-core jobs than the old CMA provider.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;I believe that when sticking to the RH-provided OFED, it is good to have one common dat.conf (set via DAT_OVERRIDE) with providers from both DAPL1 and DAPL2.&lt;/DIV&gt;</description>
      <pubDate>Tue, 06 Apr 2010 22:03:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874770#M1842</guid>
      <dc:creator>Rafał_Błaszczyk</dc:creator>
      <dc:date>2010-04-06T22:03:13Z</dc:date>
    </item>
    <item>
      <title>[solved] random issues with MPI + DAPL initialization</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874771#M1843</link>
      <description>Rafal, thanks for sharing this information.&lt;BR /&gt;</description>
      <pubDate>Wed, 07 Apr 2010 10:43:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/solved-random-problems-with-MPI-DAPL-initialization-in-RedHat-5/m-p/874771#M1843</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2010-04-07T10:43:49Z</dc:date>
    </item>
  </channel>
</rss>

