Hello,
I've compiled the HPCC benchmark suite (http://icl.cs.utk.edu/hpcc/) with Intel MPI, but I am facing the following run-time problem:
[bart@head 2x8]$ /share/intel/impi/3.2.1.009/bin64/mpirun -f 1.nodelist -n 16 -r ssh ./hpcc
node002:27686: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27684: reg_mr Cannot allocate memory
node002:27679: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27686: reg_mr Cannot allocate memory
node001:2834: reg_mr Cannot allocate memory
node002:27679: reg_mr Cannot allocate memory
node001:2834: reg_mr Cannot allocate memory
node002:27686: reg_mr Cannot allocate memory
node001:2839: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27679: reg_mr Cannot allocate memory
node002:27681: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27681: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node002:27685: reg_mr Cannot allocate memory
node002:27685: reg_mr Cannot allocate memory
node002:27680: reg_mr Cannot allocate memory
node001:2833: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node002:27680: reg_mr Cannot allocate memory
node001:2833: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27684: reg_mr Cannot allocate memory
node001:2839: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27684: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
register failed 196608 [10] error(0x30000): OpenIB-cma: DAT_INSUFFICIENT_RESOURCES:
node001:2835: reg_mr Cannot allocate memory
[4:node002][rdma_iba.c:220] Intel MPI fatal error: DTO operation posted for [10:node001] completed with error. status=0x1. cookie=0x4000a
rank 10 in job 1 head_46465 caused collective abort of all ranks
exit status of rank 10: return code 1
The run fails at the start of the HPL part of the benchmark. Any suggestions for fixes would be most appreciated.
Thanks,
Bart
4 Replies
Correction: it fails during the PTRANS part of the benchmark.
Bart
Quoting - bwillems
Correction: it fails during the PTRANS part of the benchmark.
Bart
Hi Bart,
It is strange that HPCC fails to run, since it is a fairly standard test. Have you read this article: http://software.intel.com/en-us/articles/performance-tools-for-software-developers-use-of-intel-mkl-in-hpcc-benchmark/ ? I hope it will be useful.
Your system might not have enough memory for 16 processes. Have you tried fewer?
Could you provide details about your cluster?
Best wishes,
Dmitry
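A side note on the "reg_mr Cannot allocate memory" lines: one commonly reported cause of failed memory registration on InfiniBand is a low locked-memory limit (`ulimit -l`) for the MPI processes, rather than a shortage of RAM. Nothing in this thread confirms that this is the cause here, so treat the following as a diagnostic sketch only. It uses Python's standard `resource` module; the helper names `describe_limit` and `memlock_report` are made up for illustration.

```python
import resource


def describe_limit(value):
    """Render an rlimit value (given in bytes) in a readable form."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


def memlock_report():
    """Report the soft and hard locked-memory limits of this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return (
        "max locked memory: "
        f"soft={describe_limit(soft)}, hard={describe_limit(hard)}"
    )


if __name__ == "__main__":
    print(memlock_report())
```

The limit seen by processes started over ssh can differ from the one in an interactive shell, so it is worth running the check on each compute node through the same path the job uses.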
Bart
Please try the mpirun -nolocal option, because Intel MPI starts processes on the local host by default.
Sergey
FYI
The following link has a web form that builds your parameter file (the HPL.dat file). It tries to build a file that maximizes node usage to get the best possible FLOP score. It is also handy because it builds a file for the number of nodes/cores you want, without your having to understand the format of the file.
http://lab.advancedclustering.com/hpl.html
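For anyone who wants to sanity-check the generated values by hand, here is a rough sketch of the usual sizing rule of thumb: pick the problem size N so that the N x N matrix of 8-byte doubles fills a fraction of total memory, round N down to a multiple of the block size NB, and choose a process grid P x Q that is as close to square as possible. This is not the form's actual algorithm. The 80% fraction, NB=192, and the 16 GiB per node are assumptions for illustration, since the cluster's memory size is never stated in this thread.

```python
import math


def process_grid(nprocs):
    """Return (P, Q) with P * Q == nprocs, P <= Q, as close to square as possible."""
    p = math.isqrt(nprocs)
    while nprocs % p:
        p -= 1
    return p, nprocs // p


def hpl_problem_size(nodes, mem_per_node_gib, nb=192, fraction=0.8):
    """Largest N (a multiple of nb) whose N x N double matrix fits in
    `fraction` of the total memory across all nodes."""
    total_bytes = nodes * mem_per_node_gib * 2**30
    max_elements = int(fraction * total_bytes / 8)  # 8 bytes per double
    n = math.isqrt(max_elements)
    return (n // nb) * nb


if __name__ == "__main__":
    # 2 nodes x 8 ranks as in the original mpirun line; 16 GiB/node is hypothetical.
    n = hpl_problem_size(nodes=2, mem_per_node_gib=16)
    p, q = process_grid(16)
    print(f"N={n} NB=192 P={p} Q={q}")
```

If the run exhausts memory, lowering `fraction` (and therefore N) is the first thing to try.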