Intel® MPI Library

scalapack memory loss

mp_def
Beginner
5,787 Views

I have a code that solves large systems of equations in parallel using ScaLAPACK's PZGETRS routine. The code fails in some cases, apparently because it runs out of memory. I traced the memory loss to PZGETRS by running "free -m" after every function call to monitor available memory on the compute nodes I'm using. For a matrix of size ~6000x6000 on a 7x7 process grid, I lose ~7 GB of available memory on every PZGETRS call.
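The per-call "free -m" check described above can be scripted. The following is a minimal sketch, assuming a Linux node where /proc/meminfo exposes MemAvailable (the value "free -m" shows in its "available" column). The helper names mem_available_mb and sample_mem are hypothetical and are not part of the attached test code.

```shell
# Print MemAvailable in MiB, the figure `free -m` reports as "available".
# The optional argument (a meminfo-format file) is an assumption added to
# make the helper easy to test; by default it reads /proc/meminfo.
mem_available_mb() {
    src="${1:-/proc/meminfo}"
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "$src"
}

# Sample available memory `count` times, `interval` seconds apart, printing
# "HH:MM:SS <MiB>" so that drops can be lined up with the program's own log.
sample_mem() {
    interval="${1:-1}"
    count="${2:-10}"
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$(date +%T) $(mem_available_mb)"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Running `sample_mem 1 600` in a second shell on a compute node while the job is active gives a time series to compare against the PZGETRS calls. Because MemAvailable is a node-wide figure, it also moves when other processes or shared-memory segments on the node allocate or release memory.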

 

To fix this, I tried setting the environment variable MKL_DISABLE_FAST_MM=1, calling mkl_disable_fast_mm() in the code, and compiling with different versions of the Intel toolchain (I have access to 2020.1.217, 2019.5.281, and 2018.5.274). None of these changed the behavior. I also used the mkl_service module to call mkl_free_buffers() and tried to measure peak memory with mkl_peak_mem_usage. Calling mkl_free_buffers() had no effect, and the reported peak memory was ~11 MB. I'm having trouble reconciling that figure with the apparent loss I see through the "free" command. I'm hoping this is an error in either my use of ScaLAPACK or my compilation, but if it is, I cannot find it.

 

I recreated the issue with a small test script (attached). I tested it on my desktop, where I use OpenMPI and a local build of ScaLAPACK. For a matrix of size 6200 with 16 tasks (4x4 grid), my local build appears to lose 9 MB. On the cluster I'm using, where I compiled with Intel MPI (impi) and Intel MKL, I lose 3648 MB with 16 tasks and 7297 MB with 49 tasks.
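For scale, it can help to compare these losses with the size of the matrix itself. The sketch below assumes 16 bytes per element (PZGETRS operates on double-precision complex data) and counts only the 6200x6200 coefficient matrix; right-hand sides, pivot arrays, and workspace are not included. The helper name matrix_mib is hypothetical.

```shell
# MiB needed for an n x n matrix with the given bytes per element,
# optionally divided evenly over nprocs ranks (integer division, rounds down).
matrix_mib() {
    n="$1"
    bytes_per_elem="$2"
    nprocs="${3:-1}"
    echo $(( n * n * bytes_per_elem / nprocs / 1024 / 1024 ))
}

matrix_mib 6200 16        # whole matrix
matrix_mib 6200 16 16     # per-rank share on a 4x4 grid
matrix_mib 6200 16 49     # per-rank share on a 7x7 grid
```

Under these assumptions the whole matrix is on the order of 600 MB, so the 3648 MB and 7297 MB losses reported on the cluster are several times larger than the data being solved.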

 

I renamed the attached Makefiles for my working example to .txt so that I could upload them. Remove the .txt extension from both to make them run properly. The Makefile shows both ways I compile the code. I switch between them using the compile_link variable in Makefile.inc. Option 2 is my local install; option 3 is for the cluster using Intel.

 

10 Replies
SantoshY_Intel
Moderator
5,754 Views

Hi,

 

Thanks for posting in the Intel forums.

 

>>"I tested the script on my desktop, where I use openmpi and a local version of scalapack. "

Could you please confirm whether you are using an open-source version of ScaLAPACK that is not part of Intel MKL?

 

Could you please try the combination of Intel MKL and OpenMPI and let us know your observations? That is, how much memory is lost with this combination?

 

Thanks & Regards,

Santosh

 

mp_def
Beginner
5,742 Views

On my desktop, ScaLAPACK is the open-source version 2.0.0.

 

On the cluster I tried OpenMPI + MKL:

  • 16 tasks --> 141 MB lost in solve
  • 49 tasks --> 211 MB lost in solve

 

OpenMPI is version 3.0.0 in both cases.

SantoshY_Intel
Moderator
5,514 Views

Hi,


Thanks for providing the details.


Could you please try the combination of open-source ScaLAPACK and Intel MPI and let us know how much memory is lost?


Thanks & Regards,

Santosh


mp_def
Beginner
5,501 Views

I compiled with Intel MPI and open-source ScaLAPACK. Specifically, I rebuilt ScaLAPACK using the impi compilers on the cluster, and I linked the ScaLAPACK build against the BLAS and LAPACK contained in MKL.

  • 16 tasks --> 3624 MB lost
  • 49 tasks --> 7341 MB lost

To eliminate MKL entirely, I rebuilt ScaLAPACK against the BLAS/LAPACK in OpenBLAS (an older version, 0.2.20). To be clear, I compiled OpenBLAS with gcc/gfortran because a note in one of the OpenBLAS files says not to use the Intel compilers. I compiled ScaLAPACK with the Intel compilers.

  • 16 tasks --> 2118 MB lost
  • 49 tasks --> 3765 MB lost.

I then upgraded OpenBLAS to v0.3.21.dev and compiled it with the Intel compilers, despite the note saying not to.

  • 16 tasks --> 2116 MB lost in solve
  • 49 tasks --> 3745 MB lost in solve.
0 Kudos
SantoshY_Intel
Moderator
5,435 Views

Hi,


Thanks for providing all the details.


We are working on your issue & we will get back to you soon.


Thanks & Regards,

Santosh


SantoshY_Intel
Moderator
5,369 Views

Hi,

 

I tried to reproduce your issue on my end. My observations are below:

OpenMPI + MKL :

Steps:  

  1. mpif90 -c working_ex.f90 -o bin/working_ex.o
  2. mpif90 -g -fopenmp -Oo -debug bin/working_ex.o -L${MKLROOT}/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_ilp64 -liomp5 -lpthread -lm -ldl -o main
  3. mpirun -n 1 ./main

Observations: I get an error, shown in the attachment (OpenMPI.debug).
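One thing that may be worth checking in the OpenMPI + MKL steps above: the link line uses -lmkl_blacs_intelmpi_ilp64, which is the BLACS layer built for Intel MPI, while MKL also ships a separate BLACS layer for Open MPI (-lmkl_blacs_openmpi_lp64 and -lmkl_blacs_openmpi_ilp64). The sketch below only illustrates picking the flag that matches the MPI in use. The helper name mkl_blacs_flag is hypothetical, and whether the mismatch explains the attached error is an assumption, not something the thread confirms.

```shell
# Print the MKL BLACS link flag for a given MPI flavor and integer interface.
# flavor: intelmpi | openmpi ; interface: lp64 | ilp64 (default lp64)
mkl_blacs_flag() {
    flavor="$1"
    interface="${2:-lp64}"
    case "$flavor" in
        intelmpi|openmpi)
            echo "-lmkl_blacs_${flavor}_${interface}"
            ;;
        *)
            echo "unknown MPI flavor: $flavor" >&2
            return 1
            ;;
    esac
}

mkl_blacs_flag openmpi ilp64
```

Separately, the ilp64 libraries expect 8-byte integers from the calling code. The compile step shown does not pass a flag that selects 8-byte default integers, so it may also be worth confirming which integer interface the test program is written for.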

 

Intel MPI + MKL :

Steps:

  1.  source /opt/intel/oneapi/mpi/latest/env/vars.sh
  2.  source /opt/intel/oneapi/compiler/latest/env/vars.sh
  3.  mpiifort -c working_ex.f90 -o bin/working_ex.o
  4.  mpiifort -O0 -g -debug -qmkl=cluster -recursive bin/working_ex.o -L${MKLROOT}/lib/intel64 -lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_ilp64 -liomp5 -lpthread -lm -ldl -o main
  5. mpirun -n 16 ./main

Observations:

[Attached screenshot: MicrosoftTeams-image (11).png]

 

Could you please let me know whether I missed anything or did something wrong while trying the OpenMPI + MKL combination?

 

Thanks,

Santosh

 

SantoshY_Intel
Moderator
5,279 Views

Hi,


Could you please let me know whether I missed anything or did something wrong while trying the OpenMPI + MKL combination?


Thanks,

Santosh



mp_def
Beginner
5,213 Views

Hi. I’m looking at this and will get back to you soon. 

SantoshY_Intel
Moderator
5,135 Views

Hi,

 

>>>"I’m looking at this and will get back to you soon."

Could you please provide us with an update on this issue?

 

Thanks & Regards,

Santosh

 

SantoshY_Intel
Moderator
5,054 Views

Hi,


We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.


Thanks & Regards,

Santosh

