The 'scalable_allocator' also fails to allocate memory for the last block ( see Tests 4 and 5 ).
[EDITED] Please see Post #3 for updated descriptions of these problems:
http://software.intel.com/en-us/forums/showpost.php?p=191121
Here are the results of a stress test-case ( 32-bit / Release configuration ) for a preliminary review:
>> Test 1 <<
Number of Memory Blocks: 4192
Size of Memory Block : 32768 bytes
Total Amount of Memory : 0.13 GB
[ CRT malloc ] All memory blocks are allocated - 31 ticks
[ CRT free ] All memory blocks are released - 16 ticks
Press ENTER to continue...
[ TBB scalable_allocator ] All memory blocks are allocated - 31 ticks
[ TBB deallocate ] All memory blocks are released - 0 ticks
Press ENTER to exit...
>> Test 2 <<
Number of Memory Blocks: 8192
Size of Memory Block : 32768 bytes
Total Amount of Memory : 0.25 GB
[ CRT malloc ] All memory blocks are allocated - 63 ticks
[ CRT free ] All memory blocks are released - 15 ticks
Press ENTER to continue...
[ TBB scalable_allocator ] All memory blocks are allocated - 46 ticks
[ TBB deallocate ] All memory blocks are released - 0 ticks
Press ENTER to exit...
>> Test 3 <<
Number of Memory Blocks: 16384
Size of Memory Block : 32768 bytes
Total Amount of Memory : 0.50 GB
[ CRT malloc ] All memory blocks are allocated - 141 ticks
[ CRT free ] All memory blocks are released - 31 ticks
Press ENTER to continue...
[ TBB scalable_allocator ] All memory blocks are allocated - 94 ticks
[ TBB deallocate ] All memory blocks are released - 16 ticks
Press ENTER to exit...
>> Test 4 <<
Number of Memory Blocks: 32768
Size of Memory Block : 32768 bytes
Total Amount of Memory : 1.00 GB
[ CRT malloc ] All memory blocks are allocated - 406 ticks
[ CRT free ] All memory blocks are released - 78 ticks
Press ENTER to continue...
[ TBB scalable_allocator ] All memory blocks are allocated - 94 ticks
[ TBB deallocate ] All memory blocks are released - 16 ticks
Error: [ TBB scalable_allocator ] Failed to allocate a memory block #32768 - SysError: 0
Press ENTER to exit...
>> Test 5 <<
Number of Memory Blocks: 49152
Size of Memory Block : 32768 bytes
Total Amount of Memory : 1.50 GB
[ CRT malloc ] All memory blocks are allocated - 609 ticks
[ CRT free ] All memory blocks are released - 94 ticks
Press ENTER to continue...
[ TBB scalable_allocator ] All memory blocks are allocated - 62 ticks
[ TBB deallocate ] All memory blocks are released - 16 ticks
Error: [ TBB scalable_allocator ] Failed to allocate a memory block #49152 - SysError: 0
Press ENTER to exit...
>> Test 6 <<
Number of Memory Blocks: 65536
Size of Memory Block : 32768 bytes
Total Amount of Memory : 2.00 GB
[ CRT malloc ] All memory blocks are allocated - 1328 ticks
[ CRT free ] All memory blocks are released - 234 ticks
Press ENTER to continue...
The memory manager cannot access sufficient memory to initialize; exiting
Could you please provide us with the reproducer?
A short answer: Yes.
( ...in a couple of minutes... )
The 'scalable_allocator' also fails to allocate memory for the last block ( see Tests 4 and 5 )...
I decided to change the definitions of these two problems:
>> Problem #1 <<
TBB 'scalable_allocator' doesn't outperform CRT 'malloc' when an application needs to
allocate more than ~1.54GB of memory in total ( not as one large block! )
>> Problem #2 <<
TBB 'scalable_allocator' fails completely after ~1.97GB of memory was allocated and
then released (!) by CRT 'malloc'. An application exits with a TBB error message:
The memory manager cannot access sufficient memory to initialize; exiting
I'd like to note that ~1.97GB of memory is currently needed for some algorithm on a 32-bit Windows platform.
Best regards,
Sergey
to complete re-testing. So, here is the source code:
[cpp]// Stress Tests for CRT 'malloc' and TBB 'scalable_allocator'
{
///*
#if ( defined ( _WIN32_MSC ) || defined ( _WIN32_ICC ) )

    #define _TEST_CRTMALLOC                 // Configuration Macros for Tests
    #define _TEST_TBBSCALABLEALLOCATOR

    /*
    Notes:
    // #define _TEST_CRTMALLOC              // Case 1
    // #define _TEST_TBBSCALABLEALLOCATOR

       #define _TEST_CRTMALLOC              // Case 2
    // #define _TEST_TBBSCALABLEALLOCATOR

    // #define _TEST_CRTMALLOC              // Case 3
       #define _TEST_TBBSCALABLEALLOCATOR

       #define _TEST_CRTMALLOC              // Case 4
       #define _TEST_TBBSCALABLEALLOCATOR
    */

    // Attention: Results are for the Case 4
    //            ( Win32 / Release configuration )

    #define _SIZE_OF_MEMBLOCK 8192
                                                                //          CRT      TBB
                                                                //          malloc   scalable_allocator
//  const RTint _NUM_OF_MEMORYBLOCKS = 4192;                    // 0.13GB - OK       OK
//  const RTint _NUM_OF_MEMORYBLOCKS = 8192;                    // 0.25GB - OK       OK
//  const RTint _NUM_OF_MEMORYBLOCKS = 16384;                   // 0.50GB - OK       OK
//  const RTint _NUM_OF_MEMORYBLOCKS = 32768;                   // 1.00GB - OK       OK
//  const RTint _NUM_OF_MEMORYBLOCKS = 65536 - 16384;           // 1.50GB - OK       OK
//  const RTint _NUM_OF_MEMORYBLOCKS = 65536 - 8192;            // 1.75GB - OK       Failed on 6996 mem blocks
    const RTint _NUM_OF_MEMORYBLOCKS = 65536;                   // 2.00GB - Failed on 929 mem blocks
                                                                //          The memory manager cannot access
                                                                //          sufficient memory to initialize;
                                                                //          exiting
    /*
    Notes:
    These results are for the Cases 3 and 4 ( Win32 / Release configuration )
    Total amount of memory that could be allocated with CRT 'malloc'             ~1.97GB
    Total amount of memory that could be allocated with TBB 'scalable_allocator' ~1.54GB ( ~0.43GB less )
    */

    #if ( defined ( _TEST_CRTMALLOC ) || defined ( _TEST_TBBSCALABLEALLOCATOR ) )
    CrtPrintf( RTU("Number of Memory Blocks: %ld\n"), ( RTint )_NUM_OF_MEMORYBLOCKS );
    CrtPrintf( RTU("Size of Memory Block : %ld bytes\n"), ( RTint )( _SIZE_OF_MEMBLOCK * sizeof( RTfloat ) ) );
    CrtPrintf( RTU("Total Amount of Memory : %.2f GB\n\n"), ( RTfloat )( _SIZE_OF_MEMBLOCK * sizeof( RTfloat ) * _NUM_OF_MEMORYBLOCKS ) / 1024 / 1024 / 1024 );
    #endif

    RTfloat *pfData[ _NUM_OF_MEMORYBLOCKS ] = { RTnull };

    RTbool bErrorM = RTfalse;
    RTbool bErrorS = RTfalse;
    RTuint uiNumOfMemBlocksNotAllocatedM = 0U;
    RTuint uiNumOfMemBlocksNotAllocatedS = 0U;
    RTuint uiSysErrorM = 0U;
    RTuint uiSysErrorS = 0U;
    RTint t;

    while( RTtrue )
    {
        #ifdef _TEST_CRTMALLOC              // Case 1 - CRT malloc
        g_uiTicksStart = SysGetTickCount();
        for( t = 0; t < _NUM_OF_MEMORYBLOCKS; t++ )
        {
            pfData
[/cpp]
Some small modifications to the source code of the test will be needed, like:
CrtPrintf -> _tprintf or printf
CrtMalloc -> malloc
CrtFree -> free
CrtGetChar -> _gettchar or getchar
SysGetTickCount -> GetTickCount
RTtrue -> true or TRUE
RTfalse -> false or FALSE
RTnull -> NULL
RTbool -> bool or BOOL
RTint -> int
RTuint -> unsigned int
RTfloat -> float
RTU -> _T
or use a set of macros, like:
...
#define CrtPrintf _tprintf
...
Two global variables 'g_uiTicksStart' and 'g_uiTicksEnd' are declared as follows:
...
RTuint g_uiTicksStart = 0U;
RTuint g_uiTicksEnd = 0U;
...
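The substitutions listed above can be collected in one small shim. The following is only a minimal sketch, assuming a plain C++11 toolchain rather than the original Windows-specific environment: 'RTU' is mapped to a no-op instead of '_T', and 'SysGetTickCount' is emulated with 'std::chrono::steady_clock' instead of 'GetTickCount'. Both choices are assumptions made for portability and are not part of the original test.

```cpp
// Hypothetical compatibility shim for the names used in the test-case
// ( an illustration, not the original header ).
#include <chrono>
#include <cstdio>
#include <cstdlib>

typedef bool         RTbool;
typedef int          RTint;
typedef unsigned int RTuint;
typedef float        RTfloat;

#define RTtrue      true
#define RTfalse     false
#define RTnull      NULL
#define RTU( s )    s            // assumption: narrow strings; the post maps RTU to _T
#define CrtPrintf   printf
#define CrtMalloc   malloc
#define CrtFree     free
#define CrtGetChar  getchar

// Assumption: milliseconds from a monotonic clock stand in for GetTickCount.
inline RTuint SysGetTickCount()
{
    using namespace std::chrono;
    return static_cast< RTuint >(
        duration_cast< milliseconds >( steady_clock::now().time_since_epoch() ).count() );
}

RTuint g_uiTicksStart = 0U;
RTuint g_uiTicksEnd   = 0U;
```

With such a shim in place, the test source should compile without renaming every call, although on Windows the '_T' / '_tprintf' variants from the list above may be preferable.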
Here are updated test results for the Case 4 when both configuration macros are defined:
...
#define _TEST_CRTMALLOC
#define _TEST_TBBSCALABLEALLOCATOR
...
>> Test 1 <<
[cpp]
Number of Memory Blocks: 4192
Size of Memory Block : 32768 bytes
Total Amount of Memory : 0.13 GB
[ CRT malloc ] All memory blocks are allocated - 32 ticks
[ CRT free ] All memory blocks are released - 15 ticks
Press ENTER to continue...
[ TBB scalable_allocator ] All memory blocks are allocated - 31 ticks
[ TBB deallocate ] All memory blocks are released - 0 ticks
Press ENTER to exit...
[/cpp]
>> Test 2 <<
>> Test 3 <<
>> Test 4 <<
>> Test 5 <<
>> Test 6 <<
>> Test 7 <<
My Development Environment:
OS : Windows XP 32-bit SP3
IDE: Visual Studio 2005 SP1
TBB: Version 4 Update 3
TBB: Version 4 Update 1
I decided to stress-test the CRT 'malloc' function again. I wanted to understand whether it would experience a problem
similar to TBB 'scalable_allocator'. Here is the output for 3 tests with the CRT 'malloc' & 'free' functions executed one after another:
[bash]
...
Number of Memory Blocks: 65536
Size of Memory Block : 32768 bytes
Total Amount of Memory : 2.00 GB
// Sub-Test #1
[ CRT malloc ] All memory blocks are allocated - 890 ticks
[ CRT free ] All memory blocks are released - 235 ticks
[ CRT malloc ] Failed to allocate 929 memory blocks
Press ENTER to continue...
// Sub-Test #2
[ CRT malloc ] All memory blocks are allocated - 687 ticks
[ CRT free ] All memory blocks are released - 235 ticks
[ CRT malloc ] Failed to allocate 931 memory blocks
Press ENTER to continue...
// Sub-Test #3
[ CRT malloc ] All memory blocks are allocated - 688 ticks
[ CRT free ] All memory blocks are released - 250 ticks
[ CRT malloc ] Failed to allocate 929 memory blocks
Press ENTER to continue...
...
[/bash]
As you can see, CRT 'malloc' worked well and allocated all available memory for a 32-bit test application
in 'Sub-Test #2' after it was released in 'Sub-Test #1',
and allocated all available memory again
in 'Sub-Test #3' after it was released in 'Sub-Test #2'.
A screenshot is enclosed:
'Sub-Test #2' and 'Sub-Test #3' allocated all available memory ~1.30 times faster than 'Sub-Test #1', which is expected.
Note: The bars in the graph differ because the Windows Task Manager was lagging when rendering graphics during the test.
[cpp]
...
Number of Memory Blocks: 65536
Size of Memory Block : 32768 bytes
Total Amount of Memory : 2.00 GB
// Sub-Test #1
[ CRT malloc ] All memory blocks are allocated - 1546 ticks
[ CRT free ] All memory blocks are released - 250 ticks
[ CRT malloc ] Failed to allocate 929 memory blocks
Press ENTER to continue...
// Sub-Test #2
[ CRT malloc ] All memory blocks are allocated - 672 ticks
[ CRT free ] All memory blocks are released - 234 ticks
[ CRT malloc ] Failed to allocate 931 memory blocks
Press ENTER to continue...
// Sub-Test #3
[ CRT malloc ] All memory blocks are allocated - 672 ticks
[ CRT free ] All memory blocks are released - 235 ticks
[ CRT malloc ] Failed to allocate 929 memory blocks
Press ENTER to continue...
// Sub-Test #4
[ CRT malloc ] All memory blocks are allocated - 672 ticks
[ CRT free ] All memory blocks are released - 234 ticks
[ CRT malloc ] Failed to allocate 931 memory blocks
Press ENTER to continue...
// Sub-Test #5 - TBB 'scalable_allocator'
The memory manager cannot access sufficient memory to initialize; exiting
...
[/cpp]
A screenshot is enclosed:

Sergey,
Thank you for the report! I was able to reproduce the issue locally, and I believe it's a 3rd-party problem. I.e., it seems the allocator from Microsoft Visual Studio failed to de-fragment memory when it got into an out-of-memory condition. As a result, after the system allocator failed, and even though it released all the memory, a subsequent allocation of 2MB via malloc or VirtualAlloc failed, and this is how the TBB allocator finds memory to work with.
We are thinking about possible workarounds.
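The scenario described above (allocate many blocks, release them all, then ask for one 2MB block) can be tried in isolation. The following is a rough sketch under stated assumptions: 'ExhaustReleaseProbe' and 'ProbeResult' are hypothetical names, plain 'malloc' / 'free' stand in for whatever the TBB allocator actually calls, and the 'maxBlocks' cap is added so the experiment does not have to exhaust the whole address space.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

struct ProbeResult
{
    std::size_t allocated;   // blocks obtained before the first failure or the cap
    bool        probeOk;     // true if the large block could be allocated afterwards
};

// Hypothetical experiment: allocate up to 'maxBlocks' blocks, release them all,
// then try one large block of 'probeSize' bytes ( 2MB in the discussion above ).
ProbeResult ExhaustReleaseProbe( std::size_t maxBlocks, std::size_t blockSize,
                                 std::size_t probeSize )
{
    std::vector< void * > blocks;
    blocks.reserve( maxBlocks );
    for( std::size_t i = 0; i < maxBlocks; i++ )
    {
        void *p = std::malloc( blockSize );
        if( p == nullptr )
            break;                       // out of memory: stop allocating
        blocks.push_back( p );
    }
    for( std::size_t i = 0; i < blocks.size(); i++ )
        std::free( blocks[ i ] );

    void *probe = std::malloc( probeSize );
    ProbeResult result = { blocks.size(), probe != nullptr };
    std::free( probe );                  // free( NULL ) is a no-op
    return result;
}
```

Calling it with 65536 blocks of 32768 bytes and a 2MB probe in a 32-bit build would correspond to the numbers used in this thread; whether the probe fails will depend on the CRT and OS version.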
[cpp]#include ...[/cpp]
Thank you, Alexandr! Did you do the investigation with the latest version of TBB, v4 Update 5? Please confirm.
I'll do my own investigation because I believe that there is a problem with TBB. I'll report my results as soon as
the investigation is completed.
I'd like to note that I'm still using TBB v4 Update 3.
Best regards,
Sergey
[SergeyK] Alexander, Did you debug tbbmalloc_debug.dll? Since my investigation is already in progress
I hold a neutral position and I don't blame any side until the investigation is completed.
Please take a look at my next posts.
I.e., it seems allocator from Microsoft Visual Studio failed to de-fragment memory when it got out of memory condition.
[SergeyK] The "allocator" from Microsoft Visual Studio is not responsible for defragmentation of memory on any Windows platform.
It is the responsibility of the Virtual Memory Manager ( VMM ). Please take a look at the MSDN topic:
'The Virtual-Memory Manager in Windows NT'
So, first of all, about your test-case. You modified / simplified the 2nd version that I posted ( see Post #4 ):
http://software.intel.com/en-us/forums/showpost.php?p=191122
At the beginning, processing ran only until the 1st error, and there were 'break' statements inside all 'for'-loops.
As you can see, I have now changed it and replaced all 'break' statements with 'continue' statements. This makes it possible to see
how many memory blocks 'malloc' or 'scalable_allocator' could not allocate.
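To illustrate the difference, here is a minimal sketch of a loop that counts failed allocations with 'continue' instead of stopping at the first one. It is not the original test code: 'CountFailedAllocations' and 'ReleaseAll' are hypothetical helpers, and the allocator is passed in as a parameter so that 'malloc', 'scalable_malloc' or a test stub could be used.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper: tries to allocate 'count' blocks of 'blockSize' bytes with
// 'alloc' and returns how many requests failed. A failed request is skipped
// ( 'continue' ) instead of ending the loop ( 'break' ).
template< typename Alloc >
std::size_t CountFailedAllocations( std::vector< void * > &blocks, std::size_t count,
                                    std::size_t blockSize, Alloc alloc )
{
    std::size_t failed = 0;
    blocks.assign( count, nullptr );
    for( std::size_t i = 0; i < count; i++ )
    {
        blocks[ i ] = alloc( blockSize );
        if( blocks[ i ] == nullptr )
        {
            failed++;
            continue;          // keep going so that every failure is counted
        }
    }
    return failed;
}

// Releases every block that was actually allocated and resets the slot.
template< typename Free >
void ReleaseAll( std::vector< void * > &blocks, Free release )
{
    for( std::size_t i = 0; i < blocks.size(); i++ )
    {
        if( blocks[ i ] == nullptr )
            continue;
        release( blocks[ i ] );
        blocks[ i ] = nullptr;
    }
}
```

With a 'break' the loop would report only the index of the first failure; with 'continue' it reports the total, which is where numbers like "Failed to allocate 929 memory blocks" come from.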
Your "isolated" test-case reproduces all my numbers but in a different way ( see Post #6 ):
http://software.intel.com/en-us/forums/showpost.php?p=191124
The source code of your modified test-case is provided:
http://software.intel.com/en-us/forums/showpost.php?p=191610
[cpp]...
// Sub-Test 10 - Stress Tests for CRT 'malloc'
const size_t _SIZE_OF_MEMBLOCK = ( 8192 * sizeof( RTfloat ) );
const size_t _NUM_OF_MEMORYBLOCKS = 65536;

RTfloat *pfData[ _NUM_OF_MEMORYBLOCKS ] = { RTnull };

// Sub-Test 1
CrtPrintf( RTU("Sub-Test 1\n") );
for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )
{
    pfData[ i ] = ( RTfloat * )CrtMalloc( _SIZE_OF_MEMBLOCK );
    if( pfData[ i ] == RTnull )
    {
        CrtPrintf( RTU("Allocated %ld Memory Blocks. Not Allocated %ld Memory Blocks\n"), i, ( _NUM_OF_MEMORYBLOCKS - i ) );
        break;                          // stop at the first failed allocation
    }
}
for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )
{
    if( pfData[ i ] == RTnull )         // first block that was not allocated
        break;
    CrtFree( pfData[ i ] );
}
CrtPrintf( RTU("All Memory Released - Press ENTER to continue...\n") );
CrtGetChar();

// Sub-Test 2
CrtPrintf( RTU("Sub-Test 2\n") );
for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )
{
    pfData[ i ] = ( RTfloat * )CrtMalloc( _SIZE_OF_MEMBLOCK );
    if( pfData[ i ] == RTnull )
    {
        CrtPrintf( RTU("Allocated %ld Memory Blocks. Not Allocated %ld Memory Blocks\n"), i, ( _NUM_OF_MEMORYBLOCKS - i ) );
        break;
    }
}
for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )
{
    if( pfData[ i ] == RTnull )
        break;
    CrtFree( pfData[ i ] );
}
CrtPrintf( RTU("All Memory Released - Press ENTER to continue...\n") );
CrtGetChar();

// Sub-Test 3
CrtPrintf( RTU("Sub-Test 3\n") );
for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )
{
    pfData[ i ] = ( RTfloat * )CrtMalloc( _SIZE_OF_MEMBLOCK );
    if( pfData[ i ] == RTnull )
    {
        CrtPrintf( RTU("Allocated %ld Memory Blocks. Not Allocated %ld Memory Blocks\n"), i, ( _NUM_OF_MEMORYBLOCKS - i ) );
        break;
    }
}
for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )
{
    if( pfData[ i ] == RTnull )
        break;
    CrtFree( pfData[ i ] );
}
CrtPrintf( RTU("All Memory Released - Press ENTER to exit...\n") );
CrtGetChar();
...
[/cpp]

Once again, 'malloc' was able to allocate ( 3 times ) all available memory ( ~1.97GB ) on a 32-bit Windows platform:
allocated -> released
allocated -> released
allocated -> released
without any errors.
Now, you mentioned some "workaround" in one of your posts. What did you mean?
[cpp]...
   const RTint _NUM_OF_MEMORYBLOCKS = 1;                // 32768 B
// const RTint _NUM_OF_MEMORYBLOCKS = 16;               // 0.5MB
// const RTint _NUM_OF_MEMORYBLOCKS = 32;               // 1MB
// const RTint _NUM_OF_MEMORYBLOCKS = 64;               // 2MB
// const RTint _NUM_OF_MEMORYBLOCKS = 128;              // 4MB
// const RTint _NUM_OF_MEMORYBLOCKS = 256;              // 8MB
// const RTint _NUM_OF_MEMORYBLOCKS = 512;              // 16MB
// const RTint _NUM_OF_MEMORYBLOCKS = 1024;             // 32MB
// const RTint _NUM_OF_MEMORYBLOCKS = 2048;             // 64MB
// const RTint _NUM_OF_MEMORYBLOCKS = 4192;             // 131MB
// const RTint _NUM_OF_MEMORYBLOCKS = 8192;             // 256MB
// const RTint _NUM_OF_MEMORYBLOCKS = 16384;            // 512MB
// const RTint _NUM_OF_MEMORYBLOCKS = 32768;            // 1.00GB
// const RTint _NUM_OF_MEMORYBLOCKS = 65536 - 16384;    // 1.50GB - OK
// const RTint _NUM_OF_MEMORYBLOCKS = 65536 - 8192;     // 1.75GB - OK
// const RTint _NUM_OF_MEMORYBLOCKS = 65536;            // 2.00GB
...
[/cpp]
...
const RTint _NUM_OF_MEMORYBLOCKS = 1; // 32768 B
If that case is selected, the total amount of allocated memory is displayed as '0.00 GB'. Please ignore this:
32768 bytes is about 0.00003 GB, which the '%.2f' format of the 'printf' CRT function used in the test-case rounds to 0.00.
This is how it looks:
...
Number of Memory Blocks: 1
Size of Memory Block : 32768 bytes
Total Amount of Memory : 0.00 GB
[ CRT malloc ] All memory blocks are allocated - 0 ticks
[ CRT free ] All memory blocks are released - 0 ticks
Press ENTER to continue...
...
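The '0.00 GB' output above follows directly from dividing a small byte count by 1024 three times and printing it with two decimals. Here is a small sketch of that calculation, together with one possible alternative that picks a unit automatically. 'FormatAsGB' and 'FormatSize' are hypothetical helpers written for this illustration and are not part of the test-case.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Formats a byte count in GB with two decimals, as the test-case output does.
std::string FormatAsGB( std::size_t bytes )
{
    char buf[ 32 ];
    std::snprintf( buf, sizeof( buf ), "%.2f GB", ( double )bytes / 1024 / 1024 / 1024 );
    return buf;
}

// Hypothetical alternative: divide by 1024 until the value drops below 1024
// ( or GB is reached ), then print it with the matching unit.
std::string FormatSize( std::size_t bytes )
{
    static const char *units[] = { "bytes", "KB", "MB", "GB" };
    double value = ( double )bytes;
    int    unit  = 0;
    while( value >= 1024.0 && unit < 3 )
    {
        value /= 1024.0;
        unit++;
    }
    char buf[ 32 ];
    std::snprintf( buf, sizeof( buf ), "%.2f %s", value, units[ unit ] );
    return buf;
}
```

For a single block of 32768 bytes, 'FormatAsGB' gives "0.00 GB" while 'FormatSize' gives "32.00 KB"; for 4192 blocks 'FormatAsGB' gives "0.13 GB", matching Test 1.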
Once again, 'malloc' was able to allocate ( 3 times ) all available memory ( ~1.97GB ) on a 32-bit Windows platform:
allocated -> released
allocated -> released
allocated -> released
without any errors.
Now, you mentioned some "workaround" in one of your posts. What did you mean?
I mean that when the system allocator's heap becomes fragmented and we can't allocate an object of the size we want ( say, 2MB ), we can still allocate a smaller object that might be enough for the TBB allocator to bootstrap and perform some limited operations. I don't know whether that is useful for real-world applications or not.
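The idea of falling back to a smaller request could be sketched as follows. This is only an illustration of the workaround as described in this post, not TBB's actual code: 'Region' and 'AllocateWithFallback' are hypothetical names, the halving strategy is an assumption, and the allocator is a parameter so that a fragmented heap can be simulated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct Region
{
    void       *ptr;    // start of the obtained memory, or nullptr
    std::size_t size;   // number of bytes actually obtained
};

// Hypothetical fallback: ask for 'preferred' bytes and halve the request on
// failure until it would drop below 'minimum'. Returns { nullptr, 0 } if even
// the smallest request fails.
template< typename Alloc >
Region AllocateWithFallback( std::size_t preferred, std::size_t minimum, Alloc alloc )
{
    assert( minimum > 0 && minimum <= preferred );
    for( std::size_t size = preferred; size >= minimum; size /= 2 )
    {
        void *p = alloc( size );
        if( p != nullptr )
            return Region{ p, size };
    }
    return Region{ nullptr, 0 };
}
```

For example, with a 2MB preferred size and a heap that can only satisfy requests up to 512KB, the sketch would try 2MB, then 1MB, and succeed at 512KB. Whether a region that small is enough for real work is exactly the open question raised above.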
limit of the memory showed? It looks strange...
[SergeyK] Absolutely agree.
I'm not 100% sure, but it looks like a rendering issue of the Windows Task Manager. Please take a look at
my primary test-case for the problem and you will see that there is no pause between a sub-test that allocates
memory blocks and a sub-test that releases memory blocks. I would add a call to the 'Sleep' Win32 API function
with a delay of at least 1 second.
In pseudo-code it would look like:
...
//Allocation of memory blocks
...
::Sleep( 1000 ); // Delay to allow the Windows Task Manager to display a graph properly
//Release of memory blocks
...
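The pseudo-code above could be written as a small self-contained function. This is a sketch under assumptions: 'AllocatePauseRelease' is a hypothetical name, 'std::this_thread::sleep_for' stands in for the Win32 'Sleep' call so that the example is portable, and the pause is a parameter rather than a fixed 1000 ms.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// Allocates 'count' blocks, pauses, then releases them. Returns the number of
// blocks that were actually allocated. The pause gives a monitoring tool such
// as the Windows Task Manager time to sample the peak memory usage.
std::size_t AllocatePauseRelease( std::size_t count, std::size_t blockSize,
                                  std::chrono::milliseconds pause )
{
    std::vector< void * > blocks;
    blocks.reserve( count );
    for( std::size_t i = 0; i < count; i++ )
    {
        void *p = std::malloc( blockSize );
        if( p != nullptr )
            blocks.push_back( p );
    }

    std::this_thread::sleep_for( pause );   // delay before the release phase

    for( std::size_t i = 0; i < blocks.size(); i++ )
        std::free( blocks[ i ] );
    return blocks.size();
}
```

A call such as 'AllocatePauseRelease( 65536, 32768, std::chrono::milliseconds( 1000 ) )' would correspond to the 2.00 GB case with the 1 second delay suggested above.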
Best regards,
Sergey
