Intel® oneAPI Threading Building Blocks

TBB memory leaks even when entire program is commented out

Navin_I_
Beginner
1,192 Views
Hi. I encountered TBB memory leaks in my program, so I decided to try a smaller program to check. I tried this:

#include "tbb/parallel_for_each.h"
#include "tbb/task_scheduler_init.h"
#include <iostream>
#include <vector>

struct mytask {
    mytask(size_t n) :_n(n) {}
    void operator()() {
        for (int i=0;i<1000000;++i) {} // Deliberately run slow
        std::cerr << "[" << _n << "]";
    }
    size_t _n;
};

template <typename T> struct invoker {
    void operator()(T& it) const {it();}
};

int main(int,char**) {
    tbb::task_scheduler_init init; // Automatic number of threads
    // tbb::task_scheduler_init init(4); // Explicit number of threads
    std::vector<mytask> tasks;
    for (int i=0;i<1000;++i) tasks.push_back(mytask(i));
    tbb::parallel_for_each(tasks.begin(),tasks.end(),invoker<mytask>());
    std::cerr << std::endl;
    return 0;
}

When run with valgrind --leak-check="full", I got:

==3925==
==3925== HEAP SUMMARY:
==3925==     in use at exit: 2,112 bytes in 7 blocks
==3925==   total heap usage: 27 allocs, 20 frees, 19,721 bytes allocated
==3925==
==3925== 264 bytes in 1 blocks are possibly lost in loss record 5 of 7
==3925==    at 0x4A07982: operator new[](unsigned long) (vg_replace_malloc.c:389)
==3925==    by 0x4C46BC0: tbb::internal::arena::arena(tbb::internal::market&, unsigned int) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x4C46F2B: tbb::internal::arena::allocate_arena(tbb::internal::market&, unsigned int) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x4C462FE: tbb::internal::market::create_arena(unsigned int, unsigned long) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x4C45E82: tbb::internal::governor::init_scheduler(unsigned int, unsigned long, bool) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x4C45F35: tbb::task_scheduler_init::initialize(int, unsigned long) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x401D24: tbb::task_scheduler_init::task_scheduler_init(int, unsigned long) (task_scheduler_init.h:81)
==3925==    by 0x40161B: main (main.cpp:23)
==3925==
==3925== 288 bytes in 1 blocks are possibly lost in loss record 6 of 7
==3925==    at 0x4A05F6F: calloc (vg_replace_malloc.c:623)
==3925==    by 0x302BA11952: _dl_allocate_tls (in /lib64/ld-2.12.so)
==3925==    by 0x302C6071E8: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.12.so)
==3925==    by 0x4C44B80: tbb::internal::rml::private_server::private_server(tbb::internal::rml::tbb_client&) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x4C44D48: tbb::internal::rml::make_private_server(tbb::internal::rml::tbb_client&) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x4C45C66: tbb::internal::governor::create_rml_server(tbb::internal::rml::tbb_client&) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x4C45FFB: tbb::internal::market::market(unsigned int, unsigned long) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x4C4652D: tbb::internal::market::global_market(unsigned int, unsigned long) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x4C462EA: tbb::internal::market::create_arena(unsigned int, unsigned long) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x4C45E82: tbb::internal::governor::init_scheduler(unsigned int, unsigned long, bool) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x4C45F35: tbb::task_scheduler_init::initialize(int, unsigned long) (in /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so.2)
==3925==    by 0x401D24: tbb::task_scheduler_init::task_scheduler_init(int, unsigned long) (task_scheduler_init.h:81)
==3925==
==3925== LEAK SUMMARY:
==3925==    definitely lost: 0 bytes in 0 blocks
==3925==    indirectly lost: 0 bytes in 0 blocks
==3925==      possibly lost: 552 bytes in 2 blocks
==3925==    still reachable: 1,560 bytes in 5 blocks
==3925==         suppressed: 0 bytes in 0 blocks
==3925== Reachable blocks (those to which a pointer was found) are not shown.
==3925== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==3925==
==3925== For counts of detected and suppressed errors, rerun with: -v
==3925== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 4 from 4)

Then I commented out all the code in main() and the mytask struct, built the program, and ran it with valgrind. This is what I got:

[nkipe@localhost GNU-Linux-x86]$ clear;valgrind --leak-check="full" ./trial
==3993== Memcheck, a memory error detector
==3993== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==3993== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==3993== Command: ./trial
==3993==
==3993==
==3993== HEAP SUMMARY:
==3993==     in use at exit: 76 bytes in 2 blocks
==3993==   total heap usage: 3 allocs, 1 frees, 120 bytes allocated
==3993==
==3993== LEAK SUMMARY:
==3993==    definitely lost: 0 bytes in 0 blocks
==3993==    indirectly lost: 0 bytes in 0 blocks
==3993==      possibly lost: 0 bytes in 0 blocks
==3993==    still reachable: 76 bytes in 2 blocks
==3993==         suppressed: 0 bytes in 0 blocks
==3993== Reachable blocks (those to which a pointer was found) are not shown.
==3993== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==3993==
==3993== For counts of detected and suppressed errors, rerun with: -v
==3993== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

Why is there a leak even when the code is commented out?

This is my gcc version:

[nkipe@localhost GNU-Linux-x86]$ gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC)

OS:

[nkipe@localhost GNU-Linux-x86]$ uname -a
Linux localhost.localdomain 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

The versions of TBB I have in /usr/local/:

drwxr-xr-x. 7 16315 2222 4.0K Mar 16 17:46 tbb43_20150316oss
drwxr-xr-x. 7 nkipe nkipe 4.0K Apr 3 19:21 tbb30_20100406oss

Project Settings:

C++ include: /usr/local/tbb30_20100406oss/include
Linker: /usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/libtbb.so
Additional options (without which the linker says it can't find libtbb.so.2): -L/usr/local/tbb30_20100406oss/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -I/usr/local/tbb30_20100406oss/include
0 Kudos
10 Replies
Vladimir_P_1234567890

Hello,

These objects are created for detached threads and live until the end of the program; they are destroyed when the program ends. You can check this by calling parallel_for_each 1000 times on the same data. If the reported leaks grow with the number of calls, there is a real memory leak; if they stay the same, there is not.

Consider the sample below: it produces similar leak reports.

#include <thread>

int main() {
    std::thread t([](){});
    // Detach the thread: destroying a joinable std::thread calls std::terminate.
    t.detach();
    return 0;
}

--Vladimir
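
The repeat-and-compare check suggested above can be sketched without TBB. This is only an illustration under stated assumptions: call_with_one_time_state, call_that_accumulates, blocks_in_use and growth_after are hypothetical names, and the counter stands in for the "in use at exit" figure that a leak checker prints. State that is created once and reused does not grow with the number of calls, while a real leak grows with every call.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Number of blocks currently held by the stand-in "library" below.
static std::size_t blocks_in_use = 0;

// Hypothetical stand-in for one-time state (for example a scheduler arena):
// created on first use, then reused by every later call.
void call_with_one_time_state() {
    static std::unique_ptr<int[]> state;
    if (!state) {
        state.reset(new int[64]());
        ++blocks_in_use;
    }
    state[0] += 1;
}

// Hypothetical stand-in for a real leak: every call holds on to one more block.
void call_that_accumulates() {
    static std::vector<std::unique_ptr<int[]>> hoard;
    hoard.emplace_back(new int[64]());
    ++blocks_in_use;
}

// Run `call` n times and report how many additional blocks are held afterwards.
std::size_t growth_after(void (*call)(), int n) {
    assert(n >= 0);
    const std::size_t before = blocks_in_use;
    for (int i = 0; i < n; ++i) call();
    return blocks_in_use - before;
}
```

With the real program, the same comparison would be made by running the parallel_for_each call once and then 1000 times under valgrind and comparing the "in use at exit" totals.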

Navin_I_
Beginner
Thank you Vladimir. So I assume I shouldn't be bothered about the leaks? But there's also the other issue I pointed out, where all the code in main() and the struct is commented out, and there is still a leak.
Vladimir_P_1234567890

Hello,

I suppose you have commented everything out but still link with TBB. In this case there will still be static initialization of TLS storage, with the auto_terminate key, which stays in place until program exit.
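
As a rough illustration of why an empty main() can still show heap activity: objects with static storage duration are constructed before main() runs, so startup code in a linked library allocates regardless of what main() does. The sketch below uses a hypothetical LibraryStartup type (not TBB's actual code) in place of the library's startup object.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a library's startup object (not TBB's code).
struct LibraryStartup {
    LibraryStartup() : table(new char[64]()), size(64) {}
    ~LibraryStartup() { delete[] table; }
    LibraryStartup(const LibraryStartup&) = delete;
    LibraryStartup& operator=(const LibraryStartup&) = delete;
    char* table;
    std::size_t size;
};

// Static storage duration: the constructor, and its allocation, run before
// main() is entered, even if main() itself is empty.
static LibraryStartup library_startup;

// Bytes the startup object already holds by the time main() can call this.
std::size_t bytes_held_at_startup() {
    assert(library_startup.table != nullptr);
    return library_startup.size;
}
```

A heap checker run against such a program reports the allocation even though user code never asked for it.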

Regarding "be bothered about the leaks": you are not the first to report this false positive, so it would be good to understand how we can hide these diagnostics for these particular hits.

--Vladimir

jimdempseyatthecove
Honored Contributor III

Vladimir,

These are not false positives. While it is true that the application program (user code) does not have a leak, heap space was nonetheless allocated (e.g. static initialization of TLS storage with auto_terminate key on program exit, etc.) and never returned. Hiding the information is not an appropriate action, even though it would cut down on the complaints about memory leaks. A better route, IMHO, is to set aside a static or stack space and, during initialization (where these persistent objects are allocated), perform a heap context switch to use that space, then switch the heap context back just prior to entry into main. Seeing that you know how to overload new, delete, malloc and free, this should be straightforward. The heap context switch may have to take into consideration any additional context introduced by valgrind (though that may be hidden by the overload).

Jim Dempsey

Vladimir_P_1234567890

This looks like an idea to make yet another TBB allocator:)

But anyway, putting the static structures in the .data section sounds reasonable...

thanks,
--Vladimir

jimdempseyatthecove
Honored Contributor III

Vladimir,

Since it may be that you only have a few such dynamically allocated objects (or at least a few different types), the easiest route might be to use "placement new" on a piece of a static data block. IOW, you do not malloc: use alloca if on the pre-main call stack, or just grab the next n bytes of the static data block (aligned if you wish).

Jim Dempsey
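
The "placement new on a piece of a static data block" idea might look like the sketch below. It is not TBB's implementation: static_block, grab, construct_in_static_block and the Registry type are hypothetical names, the block size is arbitrary, and the allocator is not thread-safe (it assumes single-threaded startup).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Fixed block with static storage duration: it is part of the program image,
// so a heap checker never sees it.
alignas(std::max_align_t) static unsigned char static_block[1024];
static std::size_t static_block_used = 0;

// Grab the next n bytes of the block, rounded up to `align`
// (a power of two no larger than alignof(std::max_align_t)).
// Returns nullptr when the block is exhausted.
void* grab(std::size_t n, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    const std::size_t start = (static_block_used + align - 1) & ~(align - 1);
    if (start > sizeof(static_block) || n > sizeof(static_block) - start) {
        return nullptr;
    }
    static_block_used = start + n;
    return static_block + start;
}

// Construct a T inside the static block with placement new instead of malloc/new.
template <typename T, typename... Args>
T* construct_in_static_block(Args&&... args) {
    void* p = grab(sizeof(T), alignof(T));
    return p ? new (p) T(std::forward<Args>(args)...) : nullptr;
}

// Hypothetical persistent object that would otherwise be heap-allocated.
struct Registry {
    explicit Registry(int n) : slots(n) {}
    int slots;
};
```

An object built this way, for example with construct_in_static_block<Registry>(8), never touches the heap, so it cannot appear in a leak report. The trade-off is that the block has to be sized in advance.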

rnickb
Beginner

I'd also like to see some solution. Clang's address sanitizer complains about this same issue:

rnburn@localhost ~/bugs/tbb_leak $ ./a.out 

=================================================================
==19570==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 3096 byte(s) in 3 object(s) allocated from:
    #0 0x4cb18b in operator new[](unsigned long) /root/clang-march30_15/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:64:37
    #1 0x7f7666e1b21e in tbb::internal::task_stream<3>::initialize(unsigned int) /root/tbb/tbb43_20141023oss/build/linux_intel64_gcc_cc4.7.3_libc2.17_kernel3.10.25_debug/../../src/tbb/task_stream.h:94
    #2 0x7f7666e17658 in tbb::internal::arena::arena(tbb::internal::market&, unsigned int) /root/tbb/tbb43_20141023oss/build/linux_intel64_gcc_cc4.7.3_libc2.17_kernel3.10.25_debug/../../src/tbb/arena.cpp:167
    #3 0x7f7666e17812 in tbb::internal::arena::allocate_arena(tbb::internal::market&, unsigned int) /root/tbb/tbb43_20141023oss/build/linux_intel64_gcc_cc4.7.3_libc2.17_kernel3.10.25_debug/../../src/tbb/arena.cpp:191
    #4 0x7f7666e14504 in tbb::internal::market::create_arena(unsigned int, unsigned long) /root/tbb/tbb43_20141023oss/build/linux_intel64_gcc_cc4.7.3_libc2.17_kernel3.10.25_debug/../../src/tbb/market.cpp:164
    #5 0x7f7666e133a1 in tbb::internal::governor::init_scheduler(unsigned int, unsigned long, bool) /root/tbb/tbb43_20141023oss/build/linux_intel64_gcc_cc4.7.3_libc2.17_kernel3.10.25_debug/../../src/tbb/governor.cpp:163
    #6 0x7f7666e10446 in tbb::internal::governor::local_scheduler() /root/tbb/tbb43_20141023oss/build/linux_intel64_gcc_cc4.7.3_libc2.17_kernel3.10.25_debug/../../src/tbb/governor.h:119
    #7 0x7f7666e0f979 in tbb::internal::allocate_root_with_context_proxy::allocate(unsigned long) const /root/tbb/tbb43_20141023oss/build/linux_intel64_gcc_cc4.7.3_libc2.17_kernel3.10.25_debug/../../src/tbb/task.cpp:67
    #8 0x4cd8e1 in operator new(unsigned long, tbb::internal::allocate_root_with_context_proxy const&) /usr/local/include/tbb/task.h:975:13
    #9 0x4cc45a in tbb::interface7::internal::start_for<tbb::blocked_range<int>, tbb::internal::parallel_for_body<main::$_0, int>, tbb::auto_partitioner const>::run(tbb::blocked_range<int> const&, tbb::internal::parallel_for_body<main::$_0, int> const&, tbb::auto_partitioner const&) /usr/local/include/tbb/parallel_for.h:88:33
    #10 0x4cc1d3 in void tbb::strict_ppl::parallel_for_impl<int, main::$_0, tbb::auto_partitioner const>(int, int, int, main::$_0 const&, tbb::auto_partitioner const&) /usr/local/include/tbb/parallel_for.h:254:9
    #11 0x4cc078 in void tbb::strict_ppl::parallel_for<int, main::$_0>(int, int, main::$_0 const&) /usr/local/include/tbb/parallel_for.h:282:5
    #12 0x4cbfad in main /home/rnburn/bugs/tbb_leak/main.cpp:15:3
    #13 0x7f7665f1fb94 in __libc_start_main (/lib64/libc.so.6+0x24b94)

SUMMARY: AddressSanitizer: 3096 byte(s) leaked in 3 allocation(s).
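
For LeakSanitizer reports like the one above, one stopgap, assuming the allocations have been judged benign, is LeakSanitizer's default-suppressions hook. The sketch below is an assumption-laden example rather than a recommended fix: the pattern is taken from the function names in the trace and should match any leak whose stack contains it, and is_suppressed_pattern is a hypothetical helper added only to inspect the string.

```cpp
#include <cassert>
#include <cstring>

// LeakSanitizer calls this hook, if the program defines it, to obtain
// suppression rules. Each line has the form "leak:<pattern>"; the pattern is
// matched against function, file and module names in the leak's stack trace.
extern "C" const char* __lsan_default_suppressions() {
    return "leak:tbb::internal::arena\n";
}

// Hypothetical helper: check whether a given rule is part of the list above.
bool is_suppressed_pattern(const char* rule) {
    assert(rule != nullptr);
    return std::strstr(__lsan_default_suppressions(), rule) != nullptr;
}
```

This only hides the report; it does not change what the library allocates.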
Vladimir_P_1234567890

This is not the same issue. As far as I can see, that program makes a parallel_for call; it is not an empty program.

--Vladimir

Vladimir_P_1234567890

jimdempseyatthecove wrote:

Since it may be that you only have a few such dynamically allocated objects (or at least a few different types), the easiest route might be to use "placement new" on a piece of a static data block. IOW, you do not malloc: use alloca if on the pre-main call stack, or just grab the next n bytes of the static data block (aligned if you wish).

We have checked again: all the needed objects are static and located in the .data section. The memory leak is reported because of the pthread_key_create() call, and we can't control the pthread library.
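
For readers unfamiliar with the call, here is a minimal sketch of how a TLS key with a destructor is set up through pthread_key_create(). It is not TBB's code: tls_key, release_value, worker and run_worker are hypothetical names. The caller controls the per-thread values and the destructor; whatever bookkeeping the pthread implementation does for the key itself is outside the caller's control.

```cpp
#include <pthread.h>
#include <atomic>
#include <cassert>

static pthread_key_t tls_key;
static std::atomic<int> cleanups(0);

// Destructor registered with the key: called at thread exit with that
// thread's value, if the value is non-null.
extern "C" void release_value(void* value) {
    delete static_cast<int*>(value);
    cleanups.fetch_add(1);
}

// Create the key once for the whole process.
bool create_tls_key() {
    return pthread_key_create(&tls_key, release_value) == 0;
}

// Thread body: attach a heap-allocated value to the key for this thread.
extern "C" void* worker(void*) {
    int* value = new int(42);
    const int rc = pthread_setspecific(tls_key, value);
    assert(rc == 0);
    (void)rc;
    return nullptr;
}

// Run one worker to completion; return the number of destructor calls so far.
int run_worker() {
    pthread_t thread;
    if (pthread_create(&thread, nullptr, worker, nullptr) != 0) return -1;
    pthread_join(thread, nullptr);
    return cleanups.load();
}
```

If the key is kept for the life of the process, pthread_key_delete() is never called, and anything the pthread implementation holds for that key goes away only at exit.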

jimdempseyatthecove
Honored Contributor III

Then it might be advisable to note this in a FAQ or known-issues section of the documentation (I didn't look to see if it is in there already).

Jim Dempsey
