<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hi, in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/Issue-while-spawning-processes-across-multiple-nodes-EXIT-CODE-9/m-p/1124769#M5538</link>
    <description>&lt;P&gt;Hi all,&lt;BR /&gt;
	I tried running this benchmark on compute nodes via PBS and again ended up with a similar error.&lt;BR /&gt;
	Here is my job submission script:&lt;BR /&gt;
	&amp;nbsp;&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;#!/bin/bash
#PBS -N NPB_N4_TPP24
#PBS -l select=2:ncpus=24:mpiprocs=2
#PBS -q test
#PBS -o output1.txt
#PBS -e error1.txt
#PBS -P cc
cd $PBS_O_WORKDIR


export OMP_NUM_THREADS=12
module load suite/intel/parallelStudio
mpirun -np 4 -hostfile $PBS_NODEFILE -genv I_MPI_HYDRA_DEBUG=1   -genv OMP_NUM_THREADS=12 -genv I_MPI_DEBUG=5 -ppn 2 ./bt.E.4.mpi_io_full
&lt;/PRE&gt;


&lt;P&gt;This seems to be an issue with NPB's class E problems.&lt;BR /&gt;
	I recompiled NPB for class D, and I was able to run the benchmark on multiple nodes.&lt;BR /&gt;
	&lt;BR /&gt;
	Do let me know if you are able to identify the bug with the class E problem (each compute node in my setup has 64 GB of RAM).&lt;/P&gt;</description>
    <pubDate>Wed, 04 Jan 2017 13:29:00 GMT</pubDate>
    <dc:creator>psing51</dc:creator>
    <dc:date>2017-01-04T13:29:00Z</dc:date>
    <item>
      <title>Issue while spawning processes across multiple nodes (EXIT CODE: 9)</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Issue-while-spawning-processes-across-multiple-nodes-EXIT-CODE-9/m-p/1124768#M5537</link>
      <description>&lt;P&gt;Hi all,&lt;BR /&gt;
	I am using Intel Parallel Studio 2015 on an Intel(R) Xeon(R) CPU E5-2680 v3 (RHEL 6.5) and am currently facing issues with an MPI-based application (NAS Parallel Benchmarks, BT). Though the issue seems application-specific, I would like your opinions on a methodology to debug and fix issues like this.&lt;/P&gt;

&lt;P&gt;I successfully tested the MPI setup as follows:&lt;BR /&gt;
	&amp;nbsp;&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;[puneets@host01 bin]$ cat hosts.txt 
host02
host03

[puneets@host01 bin]$ mpirun -np 4 -ppn 2 -hostfile hosts.txt ./hello 
host02
host02
host03
host03&lt;/PRE&gt;

&lt;P&gt;But when I try to run the application, I end up with:&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;[puneets@host01 bin]$ mpirun -np 4 -ppn 2 -hostfile hosts.txt ./bt.E.4.mpi_io_full 

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 25799 RUNNING AT host03
=   EXIT CODE: 9
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Killed (signal 9)&lt;/PRE&gt;
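
&lt;P&gt;For reference, exit code 9 in the banner above is signal 9, i.e. SIGKILL, which a process cannot catch. It is usually sent from outside the application, for example by the kernel's out-of-memory killer or by a resource limit, though this output alone does not say which. A minimal sketch of mapping the number to its name with the shell's kill -l (the variable names are only illustrative):&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;#!/bin/bash
# Exit code copied by hand from the mpirun banner above.
exit_code=9
# kill -l with a number prints the signal name without the SIG prefix.
signal_name=$(kill -l "$exit_code")
echo "signal $exit_code is SIG$signal_name"
&lt;/PRE&gt;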

&lt;P&gt;I have attached a verbose log of the error (VERBOSE.txt).&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;[puneets@host01 bin]$ mpirun -genv I_MPI_HYDRA_DEBUG=1 -hostfile hosts.txt -genv I_MPI_DEBUG=5 -np 4 -ppn 2 ./bt.E.4.mpi_io_full 

&lt;/PRE&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em;"&gt;Whereas on a single node, I am able to run the application as follows:&lt;/SPAN&gt;&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;[puneets@host01 bin]$ mpirun -np 4  ./bt.E.4.mpi_io_full 


 NAS Parallel Benchmarks 3.3 -- BT Benchmark 

 No input file inputbt.data. Using compiled defaults
 Size: 1020x1020x1020
 Iterations:  250    dt:   0.0000040
 Number of active processes:     4

 BTIO -- FULL MPI-IO write interval:   5
&lt;/PRE&gt;
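
&lt;P&gt;As a rough size check, the 1020x1020x1020 grid printed above can be turned into a memory estimate. The sketch below assumes (this is an illustration, not a figure taken from the NPB source) three 5-component double-precision arrays per grid point, split evenly over the two hosts in hosts.txt. The real footprint is likely higher because of extra work arrays, ghost cells and MPI-IO buffers.&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;#!/bin/bash
n=1020                          # grid points per dimension, from the banner above
points=$(( n * n * n ))         # total grid points
# ASSUMPTION: 3 arrays x 5 components x 8 bytes per grid point (illustrative only).
bytes_per_point=$(( 3 * 5 * 8 ))
nodes=2                         # hosts listed in hosts.txt
total_bytes=$(( points * bytes_per_point ))
# Integer division, so the result is rounded down to whole GiB.
per_node_gib=$(( total_bytes / nodes / 1024 / 1024 / 1024 ))
echo "grid points: $points"
echo "estimated GiB per node: $per_node_gib"
&lt;/PRE&gt;

&lt;P&gt;Under that assumption the estimate comes to roughly 59 GiB per node, which is worth comparing against the RAM available on each host.&lt;/P&gt;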

&lt;P&gt;I am attaching the make.def and compilation log for your reference.&lt;BR /&gt;
	Any help or hint would be very useful. I look forward to your replies.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Jan 2017 11:12:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Issue-while-spawning-processes-across-multiple-nodes-EXIT-CODE-9/m-p/1124768#M5537</guid>
      <dc:creator>psing51</dc:creator>
      <dc:date>2017-01-04T11:12:25Z</dc:date>
    </item>
    <item>
      <title>Hi,</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Issue-while-spawning-processes-across-multiple-nodes-EXIT-CODE-9/m-p/1124769#M5538</link>
      <description>&lt;P&gt;Hi all,&lt;BR /&gt;
	I tried running this benchmark on compute nodes via PBS and again ended up with a similar error.&lt;BR /&gt;
	Here is my job submission script:&lt;BR /&gt;
	&amp;nbsp;&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;#!/bin/bash
#PBS -N NPB_N4_TPP24
#PBS -l select=2:ncpus=24:mpiprocs=2
#PBS -q test
#PBS -o output1.txt
#PBS -e error1.txt
#PBS -P cc
cd $PBS_O_WORKDIR


export OMP_NUM_THREADS=12
module load suite/intel/parallelStudio
mpirun -np 4 -hostfile $PBS_NODEFILE -genv I_MPI_HYDRA_DEBUG=1   -genv OMP_NUM_THREADS=12 -genv I_MPI_DEBUG=5 -ppn 2 ./bt.E.4.mpi_io_full
&lt;/PRE&gt;
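
&lt;P&gt;As a sanity check on the resource request above, the sketch below multiplies out the values from the script: select=2 chunks with mpiprocs=2 should match -np 4, and 2 ranks per chunk times OMP_NUM_THREADS=12 should fit within ncpus=24. The variable names are only illustrative, and the numbers are copied by hand from the script.&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;#!/bin/bash
# Values copied from the PBS directives and the mpirun line above.
chunks=2          # select=2
mpiprocs=2        # mpiprocs=2 (also -ppn 2)
ncpus=24          # ncpus=24
omp_threads=12    # OMP_NUM_THREADS=12
np=4              # mpirun -np 4

total_ranks=$(( chunks * mpiprocs ))
cores_per_chunk=$(( mpiprocs * omp_threads ))

# The rank count requested from PBS should equal the one passed to mpirun.
if [ "$total_ranks" -eq "$np" ]; then ranks_ok=yes; else ranks_ok=no; fi
# Ranks times threads per chunk should not exceed the cores requested per chunk.
if [ "$cores_per_chunk" -le "$ncpus" ]; then cores_ok=yes; else cores_ok=no; fi

echo "ranks: $total_ranks (matches -np: $ranks_ok)"
echo "cores per chunk: $cores_per_chunk of $ncpus (fits: $cores_ok)"
&lt;/PRE&gt;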


&lt;P&gt;This seems to be an issue with NPB's class E problems.&lt;BR /&gt;
	I recompiled NPB for class D, and I was able to run the benchmark on multiple nodes.&lt;BR /&gt;
	&lt;BR /&gt;
	Do let me know if you are able to identify the bug with the class E problem (each compute node in my setup has 64 GB of RAM).&lt;/P&gt;</description>
      <pubDate>Wed, 04 Jan 2017 13:29:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Issue-while-spawning-processes-across-multiple-nodes-EXIT-CODE-9/m-p/1124769#M5538</guid>
      <dc:creator>psing51</dc:creator>
      <dc:date>2017-01-04T13:29:00Z</dc:date>
    </item>
  </channel>
</rss>

