<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Any good tools/methods to debug MPI based program? in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010753#M3903</link>
    <description>&lt;P&gt;Dear all,&lt;/P&gt;

&lt;P&gt;I have an MPI-based Fortran code that runs with one or two processes. However, when I launch the program with more processes, for example 4, it crashes with the following message:&lt;/P&gt;

&lt;P&gt;forrtl: severe (157): Program Exception - access violation&lt;BR /&gt;
	forrtl: severe (157): Program Exception - access violation&lt;/P&gt;

&lt;P&gt;job aborted:&lt;BR /&gt;
	rank: node: exit code[: error message]&lt;BR /&gt;
	0: N01: 123&lt;BR /&gt;
	1: N01: 123&lt;BR /&gt;
	2: n02: 157: process 2 exited without calling finalize&lt;BR /&gt;
	3: n02: 157: process 3 exited without calling finalize&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;I tried adding print statements and mpi_barrier calls to trace the problem, but could not locate it. Are there any tools or methods for debugging an MPI-based program? The command line I use to run the program is as follows:&lt;/P&gt;

&lt;P&gt;mpiexec -wdir "\\N02\Debug\directional\for_debug\mytest" -mapall -hosts 10 n01 2 n02 2 n03 2 n04 2 n05 2 n06 2 n07 2 n08 2 n09 2 n10 2 \\N02\Debug\directional\for_debug\test&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Thanks,&lt;/P&gt;

&lt;P&gt;Zhanghong Tang&lt;/P&gt;</description>
    <pubDate>Fri, 09 Oct 2015 01:58:34 GMT</pubDate>
    <dc:creator>Zhanghong_T_</dc:creator>
    <dc:date>2015-10-09T01:58:34Z</dc:date>
    <item>
      <title>Any good tools/methods to debug MPI based program?</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010753#M3903</link>
      <description>&lt;P&gt;Dear all,&lt;/P&gt;

&lt;P&gt;I have an MPI-based Fortran code that runs with one or two processes. However, when I launch the program with more processes, for example 4, it crashes with the following message:&lt;/P&gt;

&lt;P&gt;forrtl: severe (157): Program Exception - access violation&lt;BR /&gt;
	forrtl: severe (157): Program Exception - access violation&lt;/P&gt;

&lt;P&gt;job aborted:&lt;BR /&gt;
	rank: node: exit code[: error message]&lt;BR /&gt;
	0: N01: 123&lt;BR /&gt;
	1: N01: 123&lt;BR /&gt;
	2: n02: 157: process 2 exited without calling finalize&lt;BR /&gt;
	3: n02: 157: process 3 exited without calling finalize&lt;/P&gt;
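
&lt;P&gt;As a first triage step, the "job aborted" table above can be grouped by exit code to see which ranks hit the access violation (157, matching the forrtl message) and which exited with a different code. The following is a minimal Python sketch, not part of the original program; the function name group_by_exit_code and the regular expression are illustrative assumptions about the report layout shown above.&lt;/P&gt;

```python
import re
from collections import defaultdict

# Matches rank lines such as
# "2: n02: 157: process 2 exited without calling finalize".
# The layout is assumed from the report quoted in this post.
LINE = re.compile(r"^\s*(\d+):\s*(\S+):\s*(\d+)(?::\s*(.*))?$")


def group_by_exit_code(report):
    """Return {exit_code: [(rank, node), ...]} for every rank line in the report."""
    groups = defaultdict(list)
    for line in report.splitlines():
        match = LINE.match(line)
        if match:
            rank = int(match.group(1))
            node = match.group(2).lower()  # N01 and n01 name the same host
            code = int(match.group(3))
            groups[code].append((rank, node))
    return dict(groups)


REPORT = """job aborted:
rank: node: exit code[: error message]
0: N01: 123
1: N01: 123
2: n02: 157: process 2 exited without calling finalize
3: n02: 157: process 3 exited without calling finalize
"""

if __name__ == "__main__":
    for code, ranks in sorted(group_by_exit_code(REPORT).items()):
        print(code, ranks)
```

&lt;P&gt;For the report above, this groups ranks 0 and 1 (on n01) under exit code 123 and ranks 2 and 3 (on n02) under exit code 157.&lt;/P&gt;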

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;I tried adding print statements and mpi_barrier calls to trace the problem, but could not locate it. Are there any tools or methods for debugging an MPI-based program? The command line I use to run the program is as follows:&lt;/P&gt;

&lt;P&gt;mpiexec -wdir "\\N02\Debug\directional\for_debug\mytest" -mapall -hosts 10 n01 2 n02 2 n03 2 n04 2 n05 2 n06 2 n07 2 n08 2 n09 2 n10 2 \\N02\Debug\directional\for_debug\test&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Thanks,&lt;/P&gt;

&lt;P&gt;Zhanghong Tang&lt;/P&gt;</description>
      <pubDate>Fri, 09 Oct 2015 01:58:34 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010753#M3903</guid>
      <dc:creator>Zhanghong_T_</dc:creator>
      <dc:date>2015-10-09T01:58:34Z</dc:date>
    </item>
    <item>
      <title>Further check I found that</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010754#M3904</link>
      <description>&lt;P&gt;Further check I found that when I run the mpiexec on another host instead of n01, for example, n10, the program works, or if I run the mpiexec on n01, but the command line is as follows:&lt;/P&gt;

&lt;P&gt;mpiexec -wdir "\\N02\Debug\directional\for_debug\mytest" -mapall -hosts 10 n02 2 n01 2 n03 2 n04 2 n05 2 n06 2 n07 2 n08 2 n09 2 n10 2 \\N02\Debug\directional\for_debug\test&lt;/P&gt;

&lt;P&gt;The program also works. So it seems that the problem is related to myid=0, even though all hosts use the same working folder. Could anyone help me take a look at it?&lt;/P&gt;
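
&lt;P&gt;To repeat this experiment for every host, the -hosts arguments can be generated with a different host placed first each time. The following is a minimal Python sketch; hosts_args is a hypothetical helper, and it assumes (as the job report in the first post suggests) that the first host listed receives rank 0. It only builds the argument list and does not run mpiexec.&lt;/P&gt;

```python
def hosts_args(hosts, procs_per_host, first):
    """Build the "-hosts" arguments with `first` moved to the front of the list."""
    if first not in hosts:
        raise ValueError(f"unknown host: {first}")
    # Keep the remaining hosts in their original order.
    ordered = [first] + [h for h in hosts if h != first]
    args = ["-hosts", str(len(ordered))]
    for host in ordered:
        args += [host, str(procs_per_host)]
    return args


# n01 .. n10, two processes each, as in the command lines in this thread.
HOSTS = [f"n{i:02d}" for i in range(1, 11)]

if __name__ == "__main__":
    for first in HOSTS:
        print(" ".join(hosts_args(HOSTS, 2, first)))
```

&lt;P&gt;With n02 as the first host, the generated arguments match the command line above; comparing which orderings crash would help confirm whether the failure follows rank 0 or follows host n01.&lt;/P&gt;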

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Fri, 09 Oct 2015 03:16:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010754#M3904</guid>
      <dc:creator>Zhanghong_T_</dc:creator>
      <dc:date>2015-10-09T03:16:27Z</dc:date>
    </item>
    <item>
      <title>Hi Zhanghong,</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010755#M3905</link>
      <description>&lt;P&gt;Hi Zhanghong,&lt;/P&gt;

&lt;P&gt;In your case, you should be able to use a core dump to find out what the problem is.&lt;/P&gt;

&lt;P&gt;More generally, besides the commercial debuggers for parallel applications, there are some free tools that I regularly use to debug MPI programs:&lt;/P&gt;

&lt;OL&gt;
	&lt;LI&gt;strace (from version 4.9): you can get a stack trace of your program at a specific system call with the option -k. Enable it for the system call 'exit_group':&lt;BR /&gt;
		&lt;BR /&gt;
		strace -k -eexit_group -ostrace.out [my_application]&lt;BR /&gt;
		&lt;BR /&gt;
		and it should give you a backtrace at the moment that an MPI-application stops. This is useful if your application stops gracefully (so no core dump), but doesn't tell you where or why it stopped.&lt;BR /&gt;
		&amp;nbsp;&lt;/LI&gt;
	&lt;LI&gt;padb: &lt;A href="http://padb.pittman.org.uk/" target="_blank"&gt;http://padb.pittman.org.uk/&lt;/A&gt;. It gives you a 'unified' backtrace of all running MPI-processes. This is especially useful if your MPI-application hangs.&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Fri, 09 Oct 2015 13:38:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010755#M3905</guid>
      <dc:creator>John_D_6</dc:creator>
      <dc:date>2015-10-09T13:38:16Z</dc:date>
    </item>
    <item>
      <title>Dear Dr John,</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010756#M3906</link>
      <description>&lt;P&gt;Dear Dr John,&lt;/P&gt;

&lt;P&gt;Thank you very much for your kind reply. I work on a Windows 7 system, and I don't know whether the two tools you recommended work on Windows.&lt;/P&gt;

&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Sat, 10 Oct 2015 01:49:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010756#M3906</guid>
      <dc:creator>Zhanghong_T_</dc:creator>
      <dc:date>2015-10-10T01:49:25Z</dc:date>
    </item>
    <item>
      <title>Hi</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010757#M3907</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;

&lt;P&gt;It's unclear whether the cause is your code or the way you have implemented MPI. You should be open-minded about either.&lt;/P&gt;

&lt;P&gt;If your preferred (serial) debugger is Visual Studio (VS), then you can use this to help with debugging.&amp;nbsp;&lt;BR /&gt;
	Presuming you have already integrated your MPI implementation with VS, then launching&lt;/P&gt;

&lt;P&gt;&amp;nbsp;mpiexec -n 4 full_VS_Executable_name full_MPI_Executable_name&lt;/P&gt;

&lt;P&gt;should start 4 instances of VS each running one MPI process. Start each process, one by one in each VS instance, and then off you go.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Yours, Michael&lt;/P&gt;

&lt;P&gt;&lt;A href="http://highendcompute.co.uk"&gt;&lt;SPAN style="font-size: 13.008px; line-height: 19.512px;"&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;A href="http://highendcompute.co.uk" target="_blank"&gt;http://highendcompute.co.uk&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;@highendcompute&lt;/P&gt;</description>
      <pubDate>Sun, 11 Oct 2015 14:15:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010757#M3907</guid>
      <dc:creator>high_end_c_</dc:creator>
      <dc:date>2015-10-11T14:15:41Z</dc:date>
    </item>
    <item>
      <title>ah, I see. Indeed these tools</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010758#M3908</link>
      <description>&lt;P&gt;ah, I see. Indeed these tools are available on linux- and unix-based systems, so I'm afraid these will not help you. Unless you'd migrate OS, of course.&lt;/P&gt;</description>
      <pubDate>Sun, 11 Oct 2015 14:31:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010758#M3908</guid>
      <dc:creator>John_D_6</dc:creator>
      <dc:date>2015-10-11T14:31:36Z</dc:date>
    </item>
    <item>
      <title>Hi Zhanghong,</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010759#M3909</link>
      <description>&lt;P&gt;Hi Zhanghong,&lt;/P&gt;

&lt;P&gt;You could try attaching WinDbg to the failing MPI process.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Oct 2015 07:07:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Any-good-tools-methods-to-debug-MPI-based-program/m-p/1010759#M3909</guid>
      <dc:creator>Artem_R_Intel1</dc:creator>
      <dc:date>2015-10-12T07:07:07Z</dc:date>
    </item>
  </channel>
</rss>

