
Memory leak in service_topo.cpp on Windows x64 with more than 32 cores

  1. On a Windows x64 system, service_topo.h defines the type LNX_PTR2INT as follows:
    #ifdef _x86_64
           #define LNX_PTR2INT __int64
           #define LNX_MY1CON 1LL
    #else
           #define LNX_PTR2INT unsigned int
           #define LNX_MY1CON 1
    #endif
    On Windows x64 the macro _M_X64 should be tested instead (this allows more than 32 cores, but not more than 64). Some other solution is needed to work on machines with more than 64 cores.
    Because LNX_PTR2INT ends up defined as unsigned int, which is 32 bits on Windows,
    the function static int __internal_daal_countBits(DWORD_PTR x) returns at most 32.
    Then the function static void __internal_daal_setChkProcessAffinityConsistency( unsigned int lcl_OSProcessorCount ),
    in the statement on line 328,
    if( sum != lcl_OSProcessorCount ) // check cumulative bit counts matches processor count
    detects an inconsistency.
    Then in the function static int __internal_daal_queryParseSubIDs(void), in the statement on line 1346,
    if( glbl_obj.error )
    return -1;
    the error is detected and returned to the function static void __internal_daal_buildSystemTopologyTables(),
    which exits without completing initialization.
  2. In function static void __internal_daal_buildSystemTopologyTables()
    tables are allocated on line 1683 and should be deallocated in case of an error
    before returns on lines: 1687, 1689, 1694
    function __internal_daal_buildSystemTopologyTables is called from
    __internal_daal_buildSystemTopologyTables which is called from
    __internal_daal_initCpuTopology which is called from
    __internal_daal_GetSysProcessorCoreCount which is called from
    GetL1CacheSize which is called in every call to compute
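
The effect described in item 1 can be illustrated with a minimal sketch. It is not the DAAL code: countBits and fullMask are hypothetical helpers, and the 40-core mask is an assumed example. It only shows how a bit count taken over a 32-bit type can never reach a processor count above 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the DAAL function): counts the set bits in a mask
// whose width is decided by the template parameter MaskT.
template <typename MaskT>
int countBits(MaskT x) {
    int n = 0;
    while (x) {
        n += static_cast<int>(x & 1);
        x >>= 1;
    }
    return n;
}

// Hypothetical helper: builds an affinity-style mask with the low `cores`
// bits set. A single 64-bit mask cannot describe more than 64 processors.
inline std::uint64_t fullMask(unsigned cores) {
    assert(cores <= 64);
    return cores >= 64 ? ~std::uint64_t{0}
                       : ((std::uint64_t{1} << cores) - 1);
}
```

With an assumed 40-core mask, countBits<std::uint32_t>(static_cast<std::uint32_t>(fullMask(40))) gives 32, while countBits<std::uint64_t>(fullMask(40)) gives 40, so only the 64-bit variant can match the processor count in a consistency check of this kind.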

Hello Alexander!
1. We are currently considering disabling the ThreadPinning functionality on Windows. Do you use ThreadPinning?
We will fix this if we decide to keep ThreadPinning.
2. glktsn has a destructor (glktsn::FreeArrays() in the same source file). All allocated tables are freed in it.
Andrey

dr_pain

Hello, Andrey!

  1. No, I do not use ThreadPinning. I discovered this effect while wondering why the same program works on my laptop with 8 cores and 16 GB of RAM but runs out of memory on a 40-core workstation with 128 GB of RAM.
  2. I saw this destructor. The problem is that it is never called. After the inconsistency is detected in setChkProcessAffinityConsistency, the init flag is not set, so the next call to compute calls getL1CacheSize and getLLCacheSize again. Both call buildSystemTopologyTables, which allocates additional memory but still does not set the init flag in glktsn, so the following call allocates more memory, and so on. The destructor should be called after each error check, before any return that happens after allocation but without setting the init flag.
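
The leak pattern described in item 2 can be sketched as follows. This is a simplified model under stated assumptions, not the DAAL implementation: TopoTables, buildTables, freeTables and g_liveAllocs are hypothetical names, and the consistent flag stands in for the affinity consistency check.

```cpp
#include <cstdlib>

// Hypothetical stand-in for the global topology object discussed above.
struct TopoTables {
    int* table = nullptr;
    bool init  = false;  // set only after a successful build
    bool error = false;
};

// Counts outstanding allocations; exists only to make the leak visible.
static int g_liveAllocs = 0;

static void freeTables(TopoTables& t) {
    if (t.table) {
        std::free(t.table);
        t.table = nullptr;
        --g_liveAllocs;
    }
}

// Returns 0 on success and -1 on failure. `consistent` models the result of
// the affinity check; `cleanupOnError` switches between the leaking behaviour
// and the proposed fix.
static int buildTables(TopoTables& t, bool consistent, bool cleanupOnError) {
    if (t.init) return 0;  // already built, nothing to allocate

    // If an earlier failed call left a table behind, this overwrites the
    // pointer and the old block becomes unreachable.
    t.table = static_cast<int*>(std::calloc(64, sizeof(int)));
    if (!t.table) return -1;
    ++g_liveAllocs;

    if (!consistent) {
        t.error = true;
        if (cleanupOnError) freeTables(t);  // release before the early return
        return -1;  // init stays false, so the next call allocates again
    }
    t.init = true;
    return 0;
}
```

In this model, calling buildTables(t, false, false) repeatedly raises g_liveAllocs by one per call, while buildTables(t, false, true) leaves it unchanged, which matches the suggestion to free the tables before each early return.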

Hello!
OK. Now I understand the use case and see where the problem is.
If you use the open source DAAL, as a quick workaround I suggest building the libraries with DAAL_CPU_TOPO_DISABLED defined.
Andrey

dr_pain

Hello, Andrey

Yes, there are two problems:

1. Using the macro _x86_64 instead of _M_X64 in service_topo.h (at least on Windows with the MSVC compiler)

2. Incorrect cleanup after an error.

I have corrected issue #1 and everything is working (at least for fewer than 64 cores)

Thanks,

Alexander


Hello Alexander!
We will fix item 1 in an upcoming DAAL version.
But I prefer to use the _WIN64 macro because it is already used in DAAL to detect 64-bit Windows.
Andrey
PS:  I have read about _M_X64/_M_AMD64/_WIN64 already :-)
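
For illustration, a width selection that tests _WIN64 alongside the other 64-bit macros mentioned in this thread could look like the sketch below. It is not the planned DAAL change: TopoMask, TOPO_ONE and TOPO_MASK_BITS are hypothetical names, and the extra __x86_64__ test (the spelling GCC and Clang predefine) is an assumption added so the sketch also selects 64 bits outside MSVC.

```cpp
// _WIN64 is predefined by MSVC for 64-bit Windows targets, _M_X64 for x64
// targets, and __x86_64__ by GCC and Clang on x86-64.
#if defined(_WIN64) || defined(_M_X64) || defined(__x86_64__)
typedef unsigned long long TopoMask;  // hypothetical name, 64-bit mask
#define TOPO_ONE 1ULL
#else
typedef unsigned int TopoMask;        // hypothetical name, 32-bit mask
#define TOPO_ONE 1U
#endif

// Number of processors a single mask of this type can describe.
#define TOPO_MASK_BITS (sizeof(TopoMask) * 8)
```

Either branch keeps the constant and the type the same width, so shifting TOPO_ONE up to bit TOPO_MASK_BITS - 1 stays defined. Machines with more than 64 cores still need a different approach, as noted earlier in the thread.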

dr_pain