
mpitune question


I am trying to use mpitune to tune runs on my new cluster. The cluster runs under Torque and has Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz CPUs with 16 cores. I run mpitune using 96 CPUs under Torque (#PBS -l nodes=96). It then generates the following tuning files:

mpiexec_shm-ofa_nn_1_np_16_ppn_16.conf  mpiexec_shm:ofa_nn_1_np_16_ppn_16.conf
mpiexec_shm-ofa_nn_1_np_2_ppn_2.conf    mpiexec_shm:ofa_nn_1_np_2_ppn_2.conf
mpiexec_shm-ofa_nn_1_np_4_ppn_4.conf    mpiexec_shm:ofa_nn_1_np_4_ppn_4.conf
mpiexec_shm-ofa_nn_1_np_8_ppn_8.conf    mpiexec_shm:ofa_nn_1_np_8_ppn_8.conf
mpiexec_shm-ofa_nn_2_np_16_ppn_8.conf   mpiexec_shm:ofa_nn_2_np_16_ppn_8.conf
mpiexec_shm-ofa_nn_2_np_2_ppn_1.conf    mpiexec_shm:ofa_nn_2_np_2_ppn_1.conf
mpiexec_shm-ofa_nn_2_np_32_ppn_16.conf  mpiexec_shm:ofa_nn_2_np_32_ppn_16.conf
mpiexec_shm-ofa_nn_2_np_4_ppn_2.conf    mpiexec_shm:ofa_nn_2_np_4_ppn_2.conf
mpiexec_shm-ofa_nn_2_np_8_ppn_4.conf    mpiexec_shm:ofa_nn_2_np_8_ppn_4.conf
mpiexec_shm-ofa_nn_4_np_16_ppn_4.conf   mpiexec_shm:ofa_nn_4_np_16_ppn_4.conf
mpiexec_shm-ofa_nn_4_np_32_ppn_8.conf   mpiexec_shm:ofa_nn_4_np_32_ppn_8.conf
mpiexec_shm-ofa_nn_4_np_4_ppn_1.conf    mpiexec_shm:ofa_nn_4_np_4_ppn_1.conf
mpiexec_shm-ofa_nn_4_np_64_ppn_16.conf  mpiexec_shm:ofa_nn_4_np_64_ppn_16.conf
mpiexec_shm-ofa_nn_4_np_8_ppn_2.conf    mpiexec_shm:ofa_nn_4_np_8_ppn_2.conf

When I run mpitune, I add the options -hf $PBS_NODEFILE -fl shm:ofa

When I then try to run a program with the -tune option on 96 CPUs, I get:

Value I_MPI_PERHOST="allcores" for -tune is not supported. Please specify a numerical one. I_MPI_PERHOST=1 will be used for data choosing.
WARNING: there are no tuned data files appropriate for your configuration: device = shm:ofa, np = 96, ppn = 1

Any hints as to how to make this work would be appreciated.


1 Reply

Hi Bernie,

There are two things to consider here. First, -tune tries to find a tuning file that matches your current run. I_MPI_PERHOST=allcores cannot match a tuning file, because each tuning file is for a specific numeric ppn value (the ppn_<n> portion of the filename). As the message says, -tune needs a numerical I_MPI_PERHOST; without one it falls back to I_MPI_PERHOST=1 when choosing the data.
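
To make the matching idea concrete, here is a minimal Python sketch. It is not how the Intel MPI Library implements the lookup. The filename pattern is inferred only from the files listed above, and the exact-match rule on device, np, and ppn is an assumption based on the three values the warning reports. The names `parse_tuning_file` and `has_match` are hypothetical helpers.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Pattern inferred from the filenames in this thread, not from Intel documentation.
TUNING_FILE = re.compile(
    r"^mpiexec_(?P<device>.+)_nn_(?P<nn>\d+)_np_(?P<np>\d+)_ppn_(?P<ppn>\d+)\.conf$"
)


class TuningKey(NamedTuple):
    device: str  # fabric list, e.g. "shm:ofa"
    nn: int      # number of nodes
    np: int      # total number of processes
    ppn: int     # processes per node


def parse_tuning_file(name: str) -> Optional[TuningKey]:
    """Split a tuning filename into its fields, or return None if it does not fit."""
    m = TUNING_FILE.match(name)
    if m is None:
        return None
    return TuningKey(m["device"], int(m["nn"]), int(m["np"]), int(m["ppn"]))


def has_match(names: Iterable[str], device: str, nprocs: int, ppn: int) -> bool:
    """Assumed rule: a file matches only if device, np, and ppn are all equal."""
    for name in names:
        key = parse_tuning_file(name)
        if key is not None and (key.device, key.np, key.ppn) == (device, nprocs, ppn):
            return True
    return False


files = [
    "mpiexec_shm:ofa_nn_2_np_32_ppn_16.conf",
    "mpiexec_shm:ofa_nn_4_np_64_ppn_16.conf",
]

# The configuration from the warning: np = 96, ppn = 1 (the fallback value).
print(has_match(files, "shm:ofa", 96, 1))
# A configuration that one of the generated files does cover.
print(has_match(files, "shm:ofa", 64, 16))
```

Under these assumptions, the job from the warning finds no file, while a 64-process run with 16 processes per node does.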

Second, you do not have a tuning file for 96 processes. The largest one you have is for 64 processes. Since none of your tuning files match your job, none will be used. My immediate guess is that for some reason the tuning run didn't complete (or something prevented it from getting the correct input). The log file (mpituner_1358943728.log) will have more information on this. The host file used (mpituner_1358943728.hosts) could also be helpful. You can look at these files yourself, or send them to me and I can take a look.
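
One quick check when inspecting a host list is how many distinct nodes it names and how many slots each one has. Torque writes $PBS_NODEFILE with one host name per line, repeated once per allocated slot. The sketch below counts hosts in text of that shape. Whether the mpituner .hosts file uses the same layout is an assumption, and the host names in the sample are made up for illustration.

```python
from collections import Counter


def slots_per_host(nodefile_text: str) -> Counter:
    """Count how often each host name appears (one non-empty line per slot)."""
    hosts = (line.strip() for line in nodefile_text.splitlines())
    return Counter(h for h in hosts if h)


# Hypothetical host names, for illustration only.
sample = "node01\nnode01\nnode02\nnode02\nnode02\n"

counts = slots_per_host(sample)
print(len(counts), "hosts,", sum(counts.values()), "slots")
for host, slots in sorted(counts.items()):
    print(host, slots)
```

To run it on a real file, read the file contents into a string first and pass that to `slots_per_host`. Comparing the number of distinct hosts with the largest nn value in the generated filenames may help narrow down what input the tuning run actually received.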

James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
