Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Core count limitations for the shared memory transport

drmiket7777
New Contributor I

I was wondering if the shared memory transport in the latest Intel MPI Library has any core count limitation. I was trying to run on an Azure HBv5 (MI300C0) node with 368 cores, but it crashes during MPI_Init.


Thanks

Michael  

Sergey_K_Intel3
Employee

Please check your open-files ulimit setting.

Run "ulimit -a" to report all current settings or "ulimit -n" just to report the number of open files limit.

Run "ulimit -Sn <number>" to set a new limit.

Intel MPI uses around 3 file descriptors for each rank.
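A quick way to sanity-check this before launching: the ~3-descriptors-per-rank figure is from the note above, while the rank count and the doubling margin below are illustrative assumptions, not Intel MPI constants.

```shell
# Compare the descriptor budget for a full node against the current soft limit.
# RANKS_PER_NODE matches the 368-core node in question; FDS_PER_RANK is the
# rough per-rank figure quoted above.
RANKS_PER_NODE=368
FDS_PER_RANK=3
NEEDED=$((RANKS_PER_NODE * FDS_PER_RANK))
SOFT_LIMIT=$(ulimit -Sn)
echo "need roughly ${NEEDED} open files; soft limit is ${SOFT_LIMIT}"

# Raise the soft limit (capped at the hard limit) if it looks too low,
# with some headroom for the process's own files and sockets:
if [ "${SOFT_LIMIT}" != "unlimited" ] && [ "${SOFT_LIMIT}" -lt "$((NEEDED * 2))" ]; then
    ulimit -Sn "$((NEEDED * 2))"
fi
```

Run this in the same shell (or job script) that launches `mpirun`, since ulimit changes apply only to the current shell and its children.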

drmiket7777
New Contributor I

I see, thanks.

There is no other built-in per-core resource limitation in the code besides the number of file descriptors, right?

Thanks

drmiket7777
New Contributor I

Our ulimit settings (the soft limits are the same) are:


$ ulimit -Ha
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 442544
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 32768
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Can you think of any other resource that could become scarce when we use Intel MPI on very high core-count nodes? We are using nodes with 368 cores each.
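One candidate worth checking is the tmpfs that backs /dev/shm, since POSIX shared-memory segments live there and its capacity is shared by all ranks on the node. This is a hedged sketch of such a check; the 368-way split is purely illustrative and says nothing about Intel MPI's actual per-rank segment sizes.

```shell
# Report /dev/shm capacity and a rough per-rank share on a 368-rank node.
# The per-rank number is only a back-of-the-envelope figure for spotting
# an undersized tmpfs mount, not a measured Intel MPI requirement.
RANKS=368
AVAIL_KB=$(df --output=avail /dev/shm | tail -1)
echo "/dev/shm available: ${AVAIL_KB} KB (~$((AVAIL_KB / RANKS)) KB per rank)"
```

If /dev/shm is mounted small (some images cap it at half of RAM or less), remounting it larger is a one-line `mount -o remount,size=...` change.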
