Intel® oneAPI HPC Toolkit
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.
2019 Discussions

IMPI oversubscribing CPUs to ranks

Tim_Pook
Beginner
409 Views

 

Context:

Running job via PBS Pro batch scheduler on compute node with 128 cores.

When requesting 64 cores for the job, only 32 cores are used (found via htop).

When requesting 128 cores for the same job, it uses all 128 cores.

No hyperthreading.

64 core job:

CPU 0 is pinned to ranks 0 and 32, CPU 1 to ranks 1 and 33, and so on. As a result, CPUs 32-63 are never used.
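
The doubling-up described above is what you would expect if 64 ranks are wrapped around a list of only 32 usable CPUs. A minimal sketch of that arithmetic follows. The function name `round_robin_pinning` is hypothetical, and this only illustrates the observed pattern; it is not Intel MPI's actual pinning algorithm.

```python
from collections import defaultdict


def round_robin_pinning(num_ranks, allowed_cpus):
    """Assign ranks to CPUs by wrapping around the list of allowed CPUs.

    Illustrative only: mimics the pattern seen in the debug output,
    not the real Intel MPI pinning logic.
    """
    cpus = sorted(allowed_cpus)
    mapping = defaultdict(list)
    for rank in range(num_ranks):
        # Once every allowed CPU has one rank, assignment wraps to the start.
        mapping[cpus[rank % len(cpus)]].append(rank)
    return dict(mapping)


# 64 ranks, but only CPUs 0-31 are visible to the launcher.
pinning = round_robin_pinning(64, range(32))
print(pinning[0], pinning[1])  # [0, 32] [1, 33]
```

With only 32 CPUs in the list, CPUs 32-63 never appear in the mapping, which matches what htop shows.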

[Image: 64 cores - MPI debug]

These are the other enabled environment variables:

[Image: Other envvars]

 

So far I've worked around this with I_MPI_HYDRA_TOPOLIB=ipl, but that causes other issues when running jobs over InfiniBand, so it isn't ideal. The resulting pinning behaviour also isn't desirable, as shown in the screenshot below.

[Image: Screenshot 2021-09-15 at 12.51.33 PM.png]

 

Any advice on how to enforce proper process pinning would be very helpful.

 

2 Replies
Tim_Pook
Beginner
384 Views

Managed to fix this problem.

We thought it was an issue with MPI, since it only occurred after we updated from Intel MPI 2018 to Intel MPI 2019, but it was actually caused by PBS / cgroups: the scheduler was restricting the job to only 32 cores.
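
For anyone hitting the same symptom, one way to check whether the scheduler or a cgroup is limiting the job is to compare the CPUs the process is allowed to run on with the number of ranks requested. A minimal sketch, assuming a Linux compute node: `parse_cpu_list` and `is_oversubscribed` are hypothetical helper names, while `os.sched_getaffinity` is the standard Python call and reflects cpuset restrictions.

```python
import os


def parse_cpu_list(text):
    """Parse a Linux CPU list such as '0-31' or '0-3,8,10-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def is_oversubscribed(num_ranks, allowed_cpus):
    """True when more ranks are requested than CPUs are available to the job."""
    return num_ranks > len(allowed_cpus)


if __name__ == "__main__":
    requested_ranks = 64  # assumption: the rank count asked for in the job script
    if hasattr(os, "sched_getaffinity"):  # available on Linux only
        allowed = os.sched_getaffinity(0)
        print(f"allowed CPUs: {len(allowed)}")
        print(f"oversubscribed: {is_oversubscribed(requested_ranks, allowed)}")
```

Run inside the batch job, this reports how many CPUs the job can actually use. The same list is also visible as `Cpus_allowed_list` in `/proc/self/status`, and `parse_cpu_list` can turn that string into a set for the same comparison.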

 

 

ShivaniK_Intel
Moderator
371 Views

Hi,

 

Thanks for reaching out to us.

 

Glad to know that your issue is resolved. Thanks for sharing the solution with us. If you need any additional information, please post a new question, as this thread will no longer be monitored by Intel.

 

Thanks & Regards

Shivani

 
