Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Multiple ippInit() calls and memory leak with _IPP_PARALLEL_STATIC at ippcore_t

Yaroslav_Korchevsky
3,609 Views
I use statically linked IPP in a multithreaded application that consists of many modules.
Modules may be loaded and unloaded.
Each module linked with IPP requires IPP to be initialized.
However, as far as I know, ippInit() should be called once per process.
Multiple calls to ippInit() produce memory leaks.

Calling ippInit() in the main module is not allowed by design.
So each module needs to decide at load time whether ippInit() is required or not.
I didn't find any way to make my application call ippInit() only once per process.

Does anybody know whether there is a way to find out if IPP is already initialized?

PS. Environment: VisualStudio C++ 2008
0 Kudos
26 Replies
SergeyKostrov
Valued Contributor II
3,038 Views
Quoting jslav
...Does anybody know is there any method to find if IPP initialized already or not?..

You can use classic programming methods not related to IPP:

- a global variable, or a static member of a class, which is visible to all your loadable modules
- a synchronization object, like a global mutex
- a hidden UI window (then use Win32 API functions, like FindWindow or FindWindowEx)
- a file-based "cookie" in some folder

So, after the 1st call to ippInit some "global object" has to be initialized, or set to some value, as well. All the rest of the
modules, as soon as they are loaded, should verify if the "global object" is initialized, or if it has some value,
indicating that IPP is initialized.
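
The "global object" idea can be sketched in portable C++. This is only an illustration under assumptions, not code from the thread: `real_init_stub` is a hypothetical stand-in for ippInit, and the guard relies on the C++11 guarantee that function-local statics are initialized exactly once, which Visual Studio 2008 does not provide (there a mutex or a similar primitive would be needed). The guard function would also have to be compiled into exactly one shared module, otherwise every statically linked module gets its own copy of the flag.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the real initializer (ippInit in this thread).
// It only counts how many times it actually runs.
static std::atomic<int> g_real_init_calls(0);

static bool real_init_stub() {
    ++g_real_init_calls;
    return true;
}

// Assumption: this function lives in ONE module shared by all the others.
// C++11 guarantees that the local static is initialized exactly once,
// even when several threads call the function concurrently.
void ensure_initialized() {
    static const bool done = real_init_stub();
    (void)done;  // silence "unused variable" warnings
}

int real_init_call_count() {
    return g_real_init_calls.load();
}

// Usage: three "modules" each request initialization at load time,
// but the real initializer runs only for the first request.
void demo_three_modules() {
    ensure_initialized();  // module A loads
    ensure_initialized();  // module B loads
    ensure_initialized();  // module C loads
    assert(real_init_call_count() == 1);
}
```

Each module would call `ensure_initialized()` from its load path instead of calling the initializer directly.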

Best regards,
Sergey
0 Kudos
Sergey_K_Intel
Employee
3,038 Views
Hi,
You can call "ippInit" as many times as you need. All internal tables are initialized only once; all subsequent calls to "ippInit" do nothing.
How did you detect memory leaks in "ippInit"? It doesn't work with dynamic memory.
Regards,
Sergey
0 Kudos
SergeyKostrov
Valued Contributor II
3,038 Views
...How did you detect memory leaks in "ippInit"? It doesn't work with dynamic memory...


It is very easy to verify, and here is a test case:

...
for( int t = 0; t < 65536; t++ )    // 2^16 iterations
{
    // Sub-Test 1 - Verification for Memory Leaks
    ippInit();
    Sleep( 10 );

    // Sub-Test 2 - Forced Memory Leaks ( 65536 x 16KB = 1GB of memory in total )
    // int *piData = ( int * )malloc( 4096 * sizeof( int ) );
}
...

and then monitor Page File Usage History in the Performance property page of the Windows Task Manager.

Best regards,
Sergey

0 Kudos
Sergey_K_Intel
Employee
3,038 Views
Hi Sergey,
I checked this.
I found nothing wrong. The task's memory footprint (working set, shareable memory, private memory) does not change during the run.
I went through "ippInit" with the debugger.
It does nothing except get the CPUID, store it in a static variable, and set a couple of other static variables.
It doesn't allocate anything.
Thus, "ippInit" is safe to call multiple times from any part of the application.
Regards,
Sergey
0 Kudos
Yaroslav_Korchevsky
3,038 Views
Sergey,

I made an example which demonstrates the memory leak.
ippInit-leak.zip

There are 2 projects.
1 - the main module, which does the following in a loop:
1) Loads the DLL
2) Calls the only method, CallMe(), from the DLL
3) Unloads the DLL

2 - the DLL, which only calls ippInit()
Nothing else.

As I see it, the issue is that IPP memory is not released when the DLL is unloaded.
0 Kudos
Sergey_K_Intel
Employee
3,038 Views
Hi Yaroslav,
With your example everything is still OK. Task Manager statistics don't show an increase in process memory.
Let's add a process memory query to the loop:
[cpp]
unsigned int calls = 0;
void* ptr;
PROCESS_MEMORY_COUNTERS_EX pmc;

while (1) {
    if (!TheTest()) {
        printf("Test failed\n");
        break;
    }
    // Report process memory every 10000 calls
    if ((++calls % 10000) == 0) {
        if (!GetProcessMemoryInfo(GetCurrentProcess(), (PPROCESS_MEMORY_COUNTERS)&pmc, sizeof(pmc))) {
            printf("GetProcessMemoryInfo failed!\n");
            break;
        }
        printf("calls %u\n", calls);
        // Casts keep the %lu format correct where SIZE_T is wider than unsigned long
        printf("Mem= private mem: %luK, working set: %luK, peak working set: %luK, page file usage: %luK\n",
               (unsigned long)(pmc.PrivateUsage/1024), (unsigned long)(pmc.WorkingSetSize/1024),
               (unsigned long)(pmc.PeakWorkingSetSize/1024), (unsigned long)(pmc.PagefileUsage/1024));
    }
    // Deliberate leak for comparison: 1 KB every 100 calls
    if ((calls % 100) == 0)
        ptr = malloc(1024);
}
[/cpp]
With a real memory leak (the malloc call at the end of the loop is not commented out) we see:
calls 10000 Mem= private mem: 1784K, working set: 3124K, peak working set: 3948K, page file usage: 1784K
calls 20000 Mem= private mem: 1932K, working set: 3312K, peak working set: 4140K, page file usage: 1932K
calls 30000 Mem= private mem: 2028K, working set: 3408K, peak working set: 4236K, page file usage: 2028K
calls 40000 Mem= private mem: 2156K, working set: 3536K, peak working set: 4364K, page file usage: 2156K
calls 50000 Mem= private mem: 2252K, working set: 3632K, peak working set: 4460K, page file usage: 2252K
calls 60000 Mem= private mem: 2348K, working set: 3728K, peak working set: 4556K, page file usage: 2348K
calls 70000 Mem= private mem: 2476K, working set: 3856K, peak working set: 4684K, page file usage: 2476K
calls 80000 Mem= private mem: 2608K, working set: 3984K, peak working set: 4812K, page file usage: 2608K
calls 90000 Mem= private mem: 2744K, working set: 4116K, peak working set: 4944K, page file usage: 2744K
calls 100000 Mem= private mem: 2808K, working set: 4180K, peak working set: 5008K, page file usage: 2808K
calls 110000 Mem= private mem: 2936K, working set: 4308K, peak working set: 5136K, page file usage: 2936K
calls 120000 Mem= private mem: 3064K, working set: 4436K, peak working set: 5264K, page file usage: 3064K
Without the leak (i.e. malloc is commented out):
calls 10000 Mem= private mem: 1528K, working set: 2860K, peak working set: 3688K, page file usage: 1528K
calls 20000 Mem= private mem: 1540K, working set: 2912K, peak working set: 3744K, page file usage: 1540K
calls 30000 Mem= private mem: 1540K, working set: 2912K, peak working set: 3744K, page file usage: 1540K
calls 40000 Mem= private mem: 1540K, working set: 2912K, peak working set: 3744K, page file usage: 1540K
calls 50000 Mem= private mem: 1540K, working set: 2912K, peak working set: 3744K, page file usage: 1540K
calls 60000 Mem= private mem: 1540K, working set: 2912K, peak working set: 3744K, page file usage: 1540K
calls 70000 Mem= private mem: 1540K, working set: 2912K, peak working set: 3744K, page file usage: 1540K
calls 80000 Mem= private mem: 1540K, working set: 2912K, peak working set: 3744K, page file usage: 1540K
calls 90000 Mem= private mem: 1540K, working set: 2912K, peak working set: 3744K, page file usage: 1540K
calls 100000 Mem= private mem: 1540K, working set: 2912K, peak working set: 3744K, page file usage: 1540K
calls 110000 Mem= private mem: 1540K, working set: 2912K, peak working set: 3744K, page file usage: 1540K
calls 120000 Mem= private mem: 1540K, working set: 2912K, peak working set: 3744K, page file usage: 1540K
By the way, do you use IPP 7.0.7?
Regards,
Sergey
0 Kudos
Yaroslav_Korchevsky
3,038 Views
I have different output:
calls 100 :: Mem= private mem: 218436K, working set: 20780K, peak working set: 21020K, page file usage: 218436K
calls 200 :: Mem= private mem: 436464K, working set: 40020K, peak working set: 40260K, page file usage: 436464K
calls 300 :: Mem= private mem: 654488K, working set: 59244K, peak working set: 59484K, page file usage: 654488K
calls 400 :: Mem= private mem: 872512K, working set: 78468K, peak working set: 78708K, page file usage: 872512K
calls 500 :: Mem= private mem: 1090536K, working set: 97692K, peak working set: 97932K, page file usage: 1090536K
calls 600 :: Mem= private mem: 1308564K, working set: 116920K, peak working set: 117160K, page file usage: 1308564K
calls 700 :: Mem= private mem: 1526588K, working set: 136144K, peak working set: 136384K, page file usage: 1526588K
calls 800 :: Mem= private mem: 1744612K, working set: 155368K, peak working set: 155608K, page file usage: 1744612K
calls 900 :: Mem= private mem: 1962664K, working set: 174620K, peak working set: 174856K, page file usage: 1962664K

The memory manager cannot access sufficient memory to initialize; exiting

Yes, I use IPP 7.0.7

I attach a new version of the project, which produces this result.
ippInit-leak.zip (UPDATED, second version)

When I comment out the ippInit() call, the memory leak disappears.
Log values are stable.
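
As a rough cross-check of the logs in this thread, the growth per call can be estimated from the first and last samples of each run. The sketch below is illustrative arithmetic only: the helper name `growth_kb_per_call` is an assumption of this example, and the figures are copied from the outputs posted in the thread.

```cpp
#include <cassert>

// Average growth of private memory, in KB per call, between two log samples.
// Hypothetical helper for illustration; not part of IPP or the test project.
double growth_kb_per_call(unsigned long calls_first, unsigned long kb_first,
                          unsigned long calls_last, unsigned long kb_last) {
    assert(calls_last > calls_first);
    return (static_cast<double>(kb_last) - static_cast<double>(kb_first)) /
           static_cast<double>(calls_last - calls_first);
}

// Deliberate malloc(1024) every 100 calls: samples at 10000 and 120000 calls.
double forced_leak_rate() {
    return growth_kb_per_call(10000, 1784, 120000, 3064);
}

// DLL load/unload loop with ippInit(): samples at 100 and 900 calls.
double dll_loop_rate() {
    return growth_kb_per_call(100, 218436, 900, 1962664);
}
```

With these inputs the forced leak grows by about 0.012 KB (roughly 12 bytes) per call, close to the 10.24 bytes per call expected from 1 KB every 100 calls, while the DLL loop grows by about 2180 KB, a little over 2 MB, per call of the test loop.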
0 Kudos
Yaroslav_Korchevsky
3,038 Views
By mistake I attached the first version of the example to my last message.
It is now updated to the second version.
I duplicate the link here:
ippInit-leak.zip
0 Kudos
Yaroslav_Korchevsky
3,038 Views
The question still remains:
why does ippInit() leak for me but not for you?

I decided to provide binaries. Maybe it will work differently in your environment.
ippInit-leak-with-binaries.zip
The archive contains both debug and release builds.
0 Kudos
Sergey_K_Intel
Employee
3,038 Views
Yaroslav,
I get very different results.
calls 100 :: Mem= private mem: 616K, working set: 2152K, peak working set: 2552K, page file usage: 616K
...
calls 1100 :: Mem= private mem: 632K, working set: 2168K, peak working set: 2572K, page file usage: 632K
...
calls 1400 :: Mem= private mem: 896K, working set: 2300K, peak working set: 2704K, page file usage: 896K
...
calls 2100 :: Mem= private mem: 1164K, working set: 2564K, peak working set: 2968K, page file usage: 1164K
...
calls 4100 :: Mem= private mem: 1184K, working set: 2596K, peak working set: 3000K, page file usage: 1184K
calls 633200 :: Mem= private mem: 1184K, working set: 2596K, peak working set: 3000K, page file usage: 1184K
...
The same gradual memory increase (followed by a plateau) is seen if ippInit is not called (the numbers are different). This might be Microsoft-specific behaviour.
I don't have VS2005, though. I checked on 2008/2010. With the static single-threaded library (ippcore_l.lib) the picture is the same.
How could we move on with the investigation?
Regards,
Sergey
0 Kudos
Sergey_K_Intel
Employee
3,038 Views
I got the same test application behaviour as you. I will look into it.
Regards,
Sergey
0 Kudos
SergeyKostrov
Valued Contributor II
3,038 Views
...As I see, the issue is that IPP memory is not released at DLL unloading.


...
#pragma comment(lib, "libiomp5mt.lib")
...

What if that problem is related to some memory leaks in libiomp5mt.dll?

Best regards,
Sergey

0 Kudos
Sergey_K_Intel
Employee
3,038 Views
First of all, Yaroslav needs to simplify his sample: say, make it a single module (no DLL) and link to the single-threaded libraries. Then, if there is no problem, we can add complexity to detect when the problem arises.
Regards,
Sergey
0 Kudos
Yaroslav_Korchevsky
3,038 Views
ippcore_l.lib works fine for me as well
in this configuration.

So the issue is in ippcore_t.lib.
The leak is there.
0 Kudos
Yaroslav_Korchevsky
3,038 Views
Sergey,
can you help me with this issue?
0 Kudos
Sergey_K_Intel
Employee
3,038 Views
Yaroslav,
What I can see from the debugger (you can use Disassembly mode, ippInit is not big) plus Task Manager to check the memory size:
- with the ippcore_t library, ippStaticInit calls omp_get_num_procs() right after entry.
- the memory size jumps from ~480K to 712K. Initially there was ~380K, plus something added after LoadLibrary()
- then the memory size remains flat up to the end of ippInit
- after the FreeLibrary() call the memory size returns to its initial value (less than 400K).
Unfortunately, I cannot run your debug-mode binary file (some MS DLLs are missing on my computer).
My suspicion is that after FreeLibrary() call your application does not release memory.
Could you try internal debugging?
Regards,
Sergey
0 Kudos
Yaroslav_Korchevsky
3,038 Views
I tried to debug as you advised.
The issue is that omp_get_num_procs() allocates memory.
Deeper down I found that the memory is allocated in ___kmp_str_format.
Do you know what it is?

Here is the backtrace where the memory is allocated:

> Module+ipp.dll!___kmp_str_format() + 0x14 bytes
Module+ipp.dll!___kmp_serial_initialize() + 0xe5 bytes
Module+ipp.dll!___kmp_middle_initialize() + 0x1bf bytes
Module+ipp.dll!_omp_get_num_procs() + 0x1a bytes
Module+ipp.dll!_ippStaticInit@0() + 0xa bytes
ModularTest.exe!TheTest() Line 39 + 0x5 bytes C++
ModularTest.exe!main() Line 51 + 0x5 bytes C++
ModularTest.exe!__tmainCRTStartup() Line 597 + 0x19 bytes C
ModularTest.exe!mainCRTStartup() Line 414 C
kernel32.dll!760ced6c()
[Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]
ntdll.dll!77c737f5()
ntdll.dll!77c737c8()

0 Kudos
Sergey_K_Intel
Employee
3,609 Views
Yaroslav,
I see the _kmp_str_format call where you pointed. It allocates about 3K bytes for setting an environment variable. But in my runs, after FreeLibrary the allocated memory is returned to the operating system.
The problem is not that kmp_str_format allocates something. The question is why, in your case, this memory is not returned after the FreeLibrary call.
Could you ask this on the Intel compiler forum?
By the way, you can prepare a test sample without IPP, only with OpenMP. Create an empty DLL linked to libiomp5 (i.e. Intel's OpenMP). Call omp_get_num_procs, then unload your custom DLL. Will the behaviour be the same?
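
A minimal sketch of the source for such an OpenMP-only DLL might look like the following. It is an illustration under assumptions: the export name `CallMe` is borrowed from the sample described earlier in the thread, the `REPRO_EXPORT` macro is made up for this sketch, the compiler's OpenMP option is assumed to be on so that `_OPENMP` is defined, and the `-1` fallback exists only so the file still compiles when it is not.

```cpp
#include <cassert>

#ifdef _OPENMP
#include <omp.h>  // declares omp_get_num_procs()
#endif

// Hypothetical export macro so the same source builds as a Windows DLL
// or as an ordinary translation unit elsewhere.
#ifdef _WIN32
#define REPRO_EXPORT extern "C" __declspec(dllexport)
#else
#define REPRO_EXPORT extern "C"
#endif

// The only exported function: touches the OpenMP runtime once and returns.
// Returns the processor count reported by OpenMP, or -1 when the file
// was compiled without OpenMP support.
REPRO_EXPORT int CallMe() {
#ifdef _OPENMP
    const int n = omp_get_num_procs();
    assert(n >= 1);  // the runtime should report at least one processor
    return n;
#else
    return -1;
#endif
}
```

The host program can stay the same as in the IPP sample: load the DLL, call `CallMe`, unload it, and log process memory in a loop. If memory still grows, IPP is out of the picture.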
Regards,
Sergey
0 Kudos
SergeyKostrov
Valued Contributor II
2,928 Views
Hi Sergey,

Quoting Sergey Khlystov (Intel)
...The problem is not in that kmp_str_format allocates something. The question is why this memory is not returned back after FreeLibrary call?..


Could you try Intel Inspector XE? It would be interesting to see whether it reports any memory-leak-related problems.

Best regards,
Sergey

0 Kudos