Intel® MPI Library

Intel MPI with IMB-MPI1 on all the nodes produces reg_mr Cannot allocate memory

Guillaume_De_Nayer
Hi,

We have a small cluster with 8 nodes (12 cores each), split across 2 blades with 4 nodes per blade. All nodes are connected via InfiniBand.

Intel MPI is installed and configured with shm:ofa.

I'm starting the following test on all the cores of the cluster:
mpirun -np 96 IMB-MPI1

It generates "normal" results for all the sub-tests. But there is a problem with:
#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 96
#----------------------------------------------------------------

it gives:
#bytes   #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
0                1000         0.11         0.15         0.12
1                1000        42.13        42.15        42.14
2                1000        43.61        43.62        43.62
4                1000        52.55        52.57        52.56
8                1000        62.75        62.78        62.77
16               1000        68.49        68.52        68.50
32               1000        80.11        80.13        80.12
64               1000       111.07       111.10       111.09
128              1000       181.19       181.25       181.23
256              1000       368.36       368.52       368.44
512              1000       328.78       328.83       328.80
1024             1000       602.03       603.65       602.17
2048             1000      5873.23      5873.65      5873.45
4096             1000      6000.28      6000.59      6000.43
8192             1000      6965.62      6965.84      6965.75
16384             943     10429.38     10429.66     10429.52
32768             400     25244.62     25245.83     25245.13
65536             223     44969.48     44972.04     44970.70
131072            118     84991.07     84997.68     84994.67
262144             60    167439.02    167466.40    167451.96
524288             31    330707.68    330769.06    330739.70
1048576            16    658785.06    659147.81    658966.23
2097152             8   1314571.62   1315755.52   1315313.50
n08:3914: reg_mr Cannot allocate memory
n08:3914: reg_mr Cannot allocate memory
n08:3915: reg_mr Cannot allocate memory
...

I'm seeing these "reg_mr Cannot allocate memory" messages on all the nodes...

What exactly is this problem, and how can I solve it?

Thx a lot!
Best regards
Dmitry_K_Intel2
Employee
Hi Guillaume,

You are probably using Mellanox HCAs. This message usually means that there is not enough memory for buffers; whether you hit it depends on how much memory each node has. Alltoall requires a lot of memory for internal buffers, so the simplest fix is to limit the maximum message size for IMB.
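
To get a rough feel for why Alltoall is the test that runs into this, here is a back-of-the-envelope sketch. It assumes the IMB message size is the payload sent to each peer, so every rank holds one send and one receive buffer of (message size × number of processes) bytes, and it ignores whatever the MPI library allocates internally. The function name is made up for illustration; 96 processes and 12 ranks per node come from the original post, and 4194304 is assumed to be the size that was running when the errors appeared (the next power of two after the last line printed).

```python
def alltoall_buffer_bytes(msg_bytes, nprocs, procs_per_node):
    """Estimate user-buffer memory for an Alltoall (hypothetical helper).

    Assumption: each rank sends msg_bytes to every rank, so it needs one
    send buffer and one receive buffer of msg_bytes * nprocs each.
    MPI-internal buffers are NOT counted, so this is only a lower bound.
    """
    per_buffer = msg_bytes * nprocs
    per_process = 2 * per_buffer              # send + receive
    per_node = per_process * procs_per_node   # all ranks on one node
    return per_buffer, per_process, per_node


MIB = 1024 ** 2
for msg in (2097152, 4194304):
    per_buffer, per_process, per_node = alltoall_buffer_bytes(msg, 96, 12)
    print(f"{msg} bytes: {per_buffer / MIB:.0f} MiB per buffer, "
          f"{per_process / MIB:.0f} MiB per rank, "
          f"{per_node / MIB:.0f} MiB per node")
```

Under these assumptions the 4194304-byte step needs about 768 MiB of user buffers per rank, or roughly 9 GiB per node, all of which has to be registered with the HCA when it goes over InfiniBand.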

You can also try the following trick: add this line to /etc/modprobe.conf:
options mlx4_core log_mtts_per_seg=5

It should reduce memory consumed by communication functions.
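
For context, a small sketch of how this parameter is commonly related to the amount of memory the mlx4 driver can register. This is an assumption taken from the formula usually quoted for mlx4 (registrable memory = 2^log_num_mtt × 2^log_mtts_per_seg × page size), not something stated in this thread. The function name is hypothetical, and log_num_mtt=20 and the 4096-byte page size are purely illustrative values; check the actual module parameters on your nodes.

```python
def max_registrable_bytes(log_num_mtt, log_mtts_per_seg, page_size=4096):
    """Registrable-memory estimate for mlx4 (hypothetical helper).

    Assumption: limit = 2**log_num_mtt * 2**log_mtts_per_seg * page_size.
    """
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


GIB = 1024 ** 3
# log_num_mtt=20 is an illustrative value, not read from a real system.
for seg in (3, 5):
    limit = max_registrable_bytes(20, seg)
    print(f"log_mtts_per_seg={seg}: {limit / GIB:.0f} GiB registrable")
```

If that formula applies to your driver version, each step of log_mtts_per_seg doubles the registrable memory, which would explain why raising it can make the reg_mr errors go away. The module has to be reloaded (or the node rebooted) before the new value takes effect.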

Regards!
Dmitry
Guillaume_De_Nayer
Hi!

Thanks for your useful answer. I will try your ideas! But where can I find out how to limit the message size for IMB? I had the idea, but I couldn't find how... I'm too stupid to google correctly...

Best regards!
Guillaume
Andres_M_Intel4
Employee
You need to provide a file with the explicit list of message lengths to include. I think the default behavior is to include all of them if no file is provided.
$ ./IMB-MPI1 -h
...
- msglen
the argument after -msglen is a lengths_file, an ASCII file, containing any set of nonnegative
message lengths, 1 per line
...
For instance, Intel Cluster Checker uses the following list of msglen values to get a quick but still representative sample of results.
$ cat IMB_msglen
0
1
2
4
4194304
Note that you usually get the best latency with a zero payload, and the best bandwidth with a really big payload.
As usual, I would recommend some experimentation to optimize those values.
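
If you would rather keep the usual power-of-two sweep and just cap it, a lengths file can be generated instead of typed by hand. The following is only a sketch: the helper name, the output file name, and the 1048576-byte cap are assumptions chosen for illustration, so pick a cap that fits the memory on your nodes.

```python
from pathlib import Path


def write_msglen_file(path, max_bytes):
    """Write 0 and all powers of two up to max_bytes, one per line.

    Hypothetical helper: produces the "nonnegative message lengths,
    1 per line" format described in the IMB help text quoted above.
    Returns the list of lengths that were written.
    """
    lengths = [0]
    size = 1
    while size <= max_bytes:
        lengths.append(size)
        size *= 2
    Path(path).write_text("".join(f"{n}\n" for n in lengths))
    return lengths


# Cap the sweep at 1 MiB (illustrative value).
lengths = write_msglen_file("IMB_msglen", 1048576)
print(f"wrote {len(lengths)} message lengths, largest = {lengths[-1]}")
```

The file is then passed with the -msglen option from the help text, for example: mpirun -np 96 IMB-MPI1 -msglen IMB_msglen Alltoall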
Guillaume_De_Nayer
Hi!

Great! Thanks a lot!