Intel® MPI Library

openmp memory leak when using g++ and libiomp5

may_ka
Beginner

Hi,

I found this code:

#include <vector>

// Innermost level: a single-thread parallel-for that zeroes the vector.
template<typename T>
void worknested1(std::vector<T> &x){
#if defined(_OPENMP)
#pragma omp parallel for num_threads(1)
#endif
  for(int j=0;j<(int)x.size();++j){
    x[j]=(T)0;
  }
}

// Middle level: a nested single-thread parallel region with a work-shared loop.
// Changing num_threads(1) to num_threads(2) here works around the leak.
template<typename T>
void worknested0(){
#if defined(_OPENMP)
#pragma omp parallel num_threads(1)
#endif
  {
    std::vector<T> a;
    a.resize(100);
#if defined(_OPENMP)
#pragma omp for
#endif
    for(int i=0;i<10000;++i){
      worknested1(a);
    }
  }
}

// Outer level: 18 threads, each repeatedly entering the nested regions above.
void work(){
#if defined(_OPENMP)
#pragma omp parallel for num_threads(18)
#endif
  for(int i=0;i<1000000;++i){
    worknested0<double>();
  }
}

int main(){
  work();
  return 0;
}

 

to produce a nice memory leak when compiled with

g++ -Ofast -fopenmp -c test.cpp
g++ -L /opt/intel/oneapi/compiler/2022.0.2/linux/compiler/lib/intel64_lin -o exe test.o -Wl,--start-group -l iomp5 -l pthread -lm -ldl -Wl,--end-group

Depending on the number of iterations in work() it will eat the whole of my 256 GB of RAM.

The g++ version is 11.2.

The problem does not occur with icpc 2021.5.0 or clang++ 13.0.1. Further, a workaround is to change num_threads(1) in worknested0 to num_threads(2).

Is there anything wrong in the code, or is this a bug/incompatibility between g++ and Intel's libiomp5?

OS is Arch Linux, kernel 5.16.

OMP environment is:

OMP_PLACES=cores
OMP_PROC_BIND=true
OMP_DYNAMIC=FALSE
OMP_MAX_ACTIVE_LEVELS=2147483647
OMP_NUM_THREADS=18
OMP_STACKSIZE=2000M
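
For completeness, reproducing this just means exporting those variables and launching the binary, e.g. (binary name taken from the link command above):

export OMP_PLACES=cores
export OMP_PROC_BIND=true
export OMP_DYNAMIC=FALSE
export OMP_MAX_ACTIVE_LEVELS=2147483647
export OMP_NUM_THREADS=18
export OMP_STACKSIZE=2000M
./exe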

 

VidyalathaB_Intel
Moderator

Hi Karl,


Thanks for reaching out to us.


>>incompatibility between g++ and Intel?

They are indeed compatible: you can use the g++ compiler to link the application against the Intel C++ Compiler OpenMP compatibility library, just the way you are doing. This is described in the documentation:

https://www.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/optimization-and-programming-guide/openmp-support/openmp-library-support/using-the-openmp-libraries.html
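
For reference, a dynamically linked variant of your command line might look like this (a sketch using the library path from your post; at run time the loader must also be able to find libiomp5.so, e.g. via LD_LIBRARY_PATH or an rpath as below):

g++ -Ofast -fopenmp -c test.cpp
g++ -L /opt/intel/oneapi/compiler/2022.0.2/linux/compiler/lib/intel64_lin -Wl,-rpath,/opt/intel/oneapi/compiler/2022.0.2/linux/compiler/lib/intel64_lin -o exe test.o -liomp5 -lpthread -lm -ldl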


>>to produce a nice memory leak ... it will eat the whole of my 256GB of RAM

Could you please let us know how you checked for the memory leak (the steps you followed) so that we can check it from our end as well?


Regards,

Vidya.



may_ka
Beginner

Dear Vidya,

 

thanks a lot for your response.

Note that the actual compile command was

g++ -Ofast -fopenmp -c test.cpp
g++ -static -L /opt/intel/oneapi/compiler/2022.0.2/linux/compiler/lib/intel64_lin -o exe test.o -Wl,--start-group -l iomp5 -l pthread -lm -ldl -Wl,--end-group

Without "static" the problem won't show up.

 

To confirm the leak I just watched the memory build-up in Linux "top".

With 256 GB of RAM, the working versions (e.g. compiled with clang++, or with g++ without the "-static" flag) sit at 0% RAM usage throughout the runtime, whereas the g++ version with "-static" uses all RAM plus 100 GB of swap space until it gets killed by the OS.
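
If it helps, the growth is easy to watch from a second terminal with a simple polling loop (assuming the binary is called "exe"); in the broken build the resident set size climbs monotonically:

while pgrep -x exe > /dev/null; do
  ps -o rss=,vsz= -C exe
  sleep 5
done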

 

Thanks for looking into this.

 

VidyalathaB_Intel
Moderator

Hi Karl,

 

>> the actual compile command was

Thanks for letting us know about it.

 

>>Without "static" the problem won't show up

I tried to reproduce the issue with the "-static" option and the g++ compiler, and I observed the same behavior you mentioned (rapid growth in memory until the process gets killed).

As per the documentation, it is strongly recommended to use dynamic linking of the OpenMP libraries.

https://www.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/compiler-reference/compiler-options/compiler-option-details/openmp-options-and-parallel-processing-options/qopenmp-link.html

Here are some of my observations with the given code:

If you still wish to continue with static linking of the Intel OpenMP libraries (which is not recommended), you can use the -qopenmp-link=static option with the Intel C++ Compiler. I tested with the command below and there are no memory leaks:

icpc -qopenmp-link=static -Ofast test.cpp -o exe

 

The g++ compiler also works fine, without any memory leaks, when linked dynamically against the Intel OpenMP library, as you observed previously.

 

Please do let us know if it helps in resolving the issue.

 

Regards,

Vidya.

 

VidyalathaB_Intel
Moderator

Hi Karl,

 

A gentle reminder:

 

Could you please let us know if there is any update on your issue? Please let us know if there is anything more we can help with.

 

Regards,

Vidya.

 

may_ka
Beginner

Hi VidyalathaB_Intel,

 

thanks a lot for looking into this.

For several reasons I have to stick to static linking. The DPC++ compiler produces code that is about 10% slower than g++'s, so fixing the issue would be the best alternative. That, of course, depends on where the problem actually is: in gcc or in libiomp5?

 

Best

VidyalathaB_Intel
Moderator

Hi Karl,


Thanks for getting back to us.

>>The DPC++ compiler produces code which is about 10% slower than that of g++

Could you please confirm the same with the icpc and icpx compilers instead of the dpc++ compiler? Also, please let us know how you determined that it is 10% slower, so that we can check it from our end and proceed further with this case.


Regards,

Vidya.


may_ka
Beginner

Hi,

 

My code, which encompasses 25,000 lines and involves a lot of DIY classes, is used for iterative algorithms (e.g. solvers); the 10% is the difference in seconds per iteration. I have no small-scale code example which reproduces that observation.
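
(For what it is worth, a crude way to see the gap is to time identical runs of the two builds, e.g.

time ./solver_gcc      # g++ build, hypothetical binary name
time ./solver_dpcpp    # dpcpp build, hypothetical binary name

and divide the wall time by the iteration count; that is roughly how I arrive at the seconds-per-iteration figure.)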

 

With regard to the compilers I can give it a go, but what is the difference here between icpx and dpcpp?

 

 

root > pwd
/opt/intel/oneapi/compiler/2022.0.2/linux/bin
root > ./dpcpp --version
Intel(R) oneAPI DPC++/C++ Compiler 2022.0.0 (2022.0.0.20211123)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/intel/oneapi/compiler/2022.0.2/linux/bin-llvm
root > ./icpx --version
Intel(R) oneAPI DPC++/C++ Compiler 2022.0.0 (2022.0.0.20211123)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/intel/oneapi/compiler/2022.0.2/linux/bin-llvm
root > ./../bin-llvm/clang++ --version
Intel(R) oneAPI DPC++/C++ Compiler 2022.0.0 (2022.0.0.20211123)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/intel/oneapi/compiler/2022.0.2/linux/bin/./../bin-llvm
root > ./intel64/icpc --version
icpc (ICC) 2021.5.0 20211109
Copyright (C) 1985-2021 Intel Corporation.  All rights reserved.

 

With regard to icpc, I have given up because:

In file included from /usr/include/c++/11.2.0/cwchar(44),
                 from /usr/include/c++/11.2.0/bits/postypes.h(40),
                 from /usr/include/c++/11.2.0/bits/char_traits.h(40),
                 from /usr/include/c++/11.2.0/string(40),
                 from src/../incl/jarray.hpp(3),
                 from src/jarray.cpp(1):
/usr/include/wchar.h(155): error: attribute "__malloc__" does not take arguments
    __attribute_malloc__ __attr_dealloc_free;

with libc version

 

root > ldd --version
ldd (GNU libc) 2.35
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

 

It appears this is a problem with other compilers as well, and it seems to come from using a later version of libc. But I also understand that icpc is no longer being developed, as dpcpp is the new target platform.

VidyalathaB_Intel
Moderator

Hi Karl,


We have forwarded the issue (memory leaks with static linking of libiomp5 with the g++ compiler) to the concerned development team and we will keep this thread updated.

>>The DPC++ compiler produces code which is about 10% slower than that of g++

Meanwhile, you can share your test code with us if you are comfortable with it. Please let us know if you are interested in providing the details so that we can contact you privately.


Regards,

Vidya.


VidyalathaB_Intel
Moderator

Hi @may_ka ,

 

Thanks for your patience.

The issue you raised is fixed in the latest release of oneAPI, 2023.0.0. Please update to the latest version, try running the code again, and let us know if it resolves the issue.
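
A quick way to verify the fix, assuming a default oneAPI 2023.0.0 installation under /opt/intel/oneapi (adjust the paths if your layout differs), would be:

source /opt/intel/oneapi/setvars.sh
g++ -Ofast -fopenmp -c test.cpp
g++ -static -L ${CMPLR_ROOT}/linux/compiler/lib/intel64_lin -o exe test.o -Wl,--start-group -liomp5 -lpthread -lm -ldl -Wl,--end-group
./exe    # memory usage should now stay flat in top

(CMPLR_ROOT is set by setvars.sh.)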

 

Regards,

Vidya.

 

VidyalathaB_Intel
Moderator

Hi @may_ka ,

 

As the issue has been addressed and a fix provided, we are going ahead and closing this thread. Please post a new question if you need any additional assistance from Intel, as this thread will no longer be monitored.

 

Regards,

Vidya.

 
