
HPCC benchmark failing with Intel MPI

Hello,

I've compiled the HPCC benchmark suite (http://icl.cs.utk.edu/hpcc/) with Intel MPI, but I am hitting the following run-time problem:

[bart@head 2x8]$ /share/intel/impi/3.2.1.009/bin64/mpirun -f 1.nodelist -n 16 -r ssh ./hpcc
node002:27686: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27684: reg_mr Cannot allocate memory
node002:27679: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27686: reg_mr Cannot allocate memory
node001:2834: reg_mr Cannot allocate memory
node002:27679: reg_mr Cannot allocate memory
node001:2834: reg_mr Cannot allocate memory
node002:27686: reg_mr Cannot allocate memory
node001:2839: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27679: reg_mr Cannot allocate memory
node002:27681: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27681: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node002:27685: reg_mr Cannot allocate memory
node002:27685: reg_mr Cannot allocate memory
node002:27680: reg_mr Cannot allocate memory
node001:2833: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node002:27680: reg_mr Cannot allocate memory
node001:2833: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27684: reg_mr Cannot allocate memory
node001:2839: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27684: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
register failed 196608 [10] error(0x30000): OpenIB-cma: DAT_INSUFFICIENT_RESOURCES:

node001:2835: reg_mr Cannot allocate memory
[4:node002][rdma_iba.c:220] Intel MPI fatal error: DTO operation posted for [10:node001] completed with error. status=0x1. cookie=0x4000a
rank 10 in job 1 head_46465 caused collective abort of all ranks
exit status of rank 10: return code 1


The run fails at the start of the HPL part of the benchmark. Any suggestions for a fix would be most appreciated.

Thanks,
Bart

Correction: it fails during the PTRANS part of the benchmark.

Bart

Quoting - bwillems
Correction: it fails during the PTRANS part of the benchmark.

Bart

Hi Bart,

It is very strange that you cannot run HPCC, because it is a fairly standard test. Have you read this article: http://software.intel.com/en-us/articles/performance-tools-for-software-developers-use-of-intel-mkl-... I hope it will be useful.

It may be that your system does not have enough memory for 16 processes. Have you tried fewer?
Could you provide details about your cluster?

Best wishes,
Dmitry
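
As a rough way to check the memory question above, here is a minimal sketch that estimates how much of the HPL matrix each MPI rank has to hold. It assumes the matrix is N x N in double precision (8 bytes per element) and is spread evenly over the ranks. The function names, the value of N, the node RAM figures, and the 80% headroom factor are all illustrative assumptions, not values from this thread.

```python
BYTES_PER_DOUBLE = 8  # assumption: HPL matrix stored in double precision


def hpl_matrix_gib_per_rank(n, ranks):
    """Approximate share of the N x N matrix held by each rank, in GiB."""
    return n * n * BYTES_PER_DOUBLE / ranks / 2**30


def fits_in_node(n, ranks, ranks_per_node, node_ram_gib, headroom=0.8):
    """True if the ranks on one node stay within `headroom` of that node's RAM.

    `headroom` is a hypothetical safety margin for the OS and MPI buffers.
    """
    per_node_gib = hpl_matrix_gib_per_rank(n, ranks) * ranks_per_node
    return per_node_gib <= headroom * node_ram_gib


if __name__ == "__main__":
    # Hypothetical inputs: N = 32768, 16 ranks, 8 ranks per node, 8 GiB per node.
    print(hpl_matrix_gib_per_rank(32768, 16))
    print(fits_in_node(32768, 16, 8, 8))
```

This only counts the matrix itself. Memory that the interconnect layer registers for communication comes on top of it, so treat the result as a lower bound.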

Bart,
Please try the mpirun -nolocal option, because Intel MPI starts processes on the local host by default.

Sergey

FYI
The following link has a web form that builds your parameter file (the HPL.dat file). It tries to build a parameter file that maximizes node usage, to obtain the best possible FLOP score. It is also handy because it builds a parameter file for the number of nodes/cores you want, without your having to understand the format of the file.


http://lab.advancedclustering.com/hpl.html
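
For readers who want to see the kind of arithmetic such a form might do, here is a minimal sketch. It is not the code behind the linked page. It assumes the usual rule of thumb of filling a fraction of total memory with an N x N double-precision matrix, rounding N down to a multiple of the block size NB, and picking a process grid P x Q that is as close to square as possible with P <= Q. The function names and the default values (80% of memory, NB = 192) are assumptions for illustration.

```python
import math


def process_grid(nprocs):
    """Return (P, Q) with P * Q == nprocs and P <= Q, as close to square as possible."""
    p = math.isqrt(nprocs)
    while nprocs % p:  # step down until p divides nprocs (p = 1 always does)
        p -= 1
    return p, nprocs // p


def problem_size(total_mem_gib, nb=192, fraction=0.8):
    """Largest N, a multiple of nb, whose N x N matrix of 8-byte doubles
    fits in `fraction` of the total memory."""
    elements = int(fraction * total_mem_gib * 2**30) // 8
    n = math.isqrt(elements)
    return n - n % nb


if __name__ == "__main__":
    # Hypothetical cluster: 16 processes and 16 GiB of memory in total.
    print(process_grid(16))
    print(problem_size(16))
```

The resulting N, NB, P and Q are the values that would go into the corresponding lines of HPL.dat. Lowering `fraction` is the simplest knob to turn if a run dies with out-of-memory errors.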

