It turns out that because my machine is quad-core with HT enabled, IPP was defaulting to 8 threads, and the call to ippiResizeGetBufSize returned a value 8 times the size needed for single-threaded mode. That caused a 32-bit integer wrap-around, so the returned value fell short of the memory needed by ippiResizeSqrPixel_8u_C1R.
(When a single thread is specified, the buffer size is 606766416. When the default number of threads is used, e.g. 8, the buffer size is 0x121542A80, which does not fit in 32 bits and wrapped around to 559164032 when returned by ippiResizeGetBufSize.)
[cpp]// ippiResizeCrash.cpp : Defines the entry point for the console application.

#include "stdafx.h"
#include "ipp.h"
#include <iostream>   // std::cout
#include <exception>  // std::exception
#include <windows.h>  // ::Sleep

#pragma comment(lib,"ippcore.lib")
#pragma comment(lib,"ipps.lib")
#pragma comment(lib,"ippi.lib")

int _tmain(int argc, _TCHAR* argv[])
{
    IppStatus st = ippStsNoErr;

    st = ippInit();
    if (st != ippStsNoErr) { std::cout << "ippInit" << std::endl; throw std::exception(); }

    // No crash if numThreads set to 1.
    st = ippSetNumThreads(3);
    if (st != ippStsNoErr) { std::cout << "ippSetNumThreads" << std::endl; throw std::exception(); }

    const int sourceWidth = 15440;
    const int sourceHeight = 9813;
    const int destWidth = sourceWidth / 2;
    const int destHeight = sourceHeight / 2;
    const int numChannels = 1;
    const int sourceStep = sourceWidth;
    const int destStep = destWidth;

    Ipp8u* srcData = (Ipp8u*)ippMalloc(sourceWidth * sourceHeight);
    Ipp8u* destData = (Ipp8u*)ippMalloc(destWidth * destHeight);
    if (!srcData || !destData) { std::cout << "ippMalloc (source, dest)" << std::endl; throw std::exception(); }

    IppiSize srcSize = {0};
    srcSize.width = sourceWidth;
    srcSize.height = sourceHeight;

    IppiRect srcRoi = {0};
    srcRoi.x = 0;
    srcRoi.y = 0;
    srcRoi.width = sourceWidth;
    srcRoi.height = sourceHeight;

    IppiRect destRoi = {0};
    destRoi.x = 0;
    destRoi.y = 0;
    destRoi.width = destWidth;
    destRoi.height = destHeight;

    double xFactor = (double)destWidth / (double)sourceWidth;
    double yFactor = (double)destHeight / (double)sourceHeight;
    double xShift = 0.0;
    double yShift = 0.0;
    int interpolation = IPPI_INTER_LINEAR | IPPI_ANTIALIASING;

    // The size comes back through a 32-bit int; this is where the wrap-around shows up.
    int bufferSize = 0;
    st = ippiResizeGetBufSize(srcRoi, destRoi, numChannels, interpolation, &bufferSize);
    if (st != ippStsNoErr) { std::cout << "ippiResizeGetBufSize" << std::endl; throw std::exception(); }

    Ipp8u* buffer = (Ipp8u*)ippMalloc(bufferSize);
    if (!buffer) { std::cout << "ippMalloc (buffer)" << std::endl; throw std::exception(); }

    for (int k = 0; k < 5; ++k)
    {
        st = ippiResizeSqrPixel_8u_C1R((const Ipp8u*)srcData, srcSize, sourceStep, srcRoi,
                                       (Ipp8u*)destData, destStep, destRoi,
                                       xFactor, yFactor, xShift, yShift, interpolation, buffer);
        if (st != ippStsNoErr) { std::cout << "ippiResizeSqrPixel_8u_C1R" << std::endl; throw std::exception(); }
    }

    ippFree(srcData);
    ippFree(destData);
    ippFree(buffer);

    ::Sleep(1000);
    return 0;
}
[/cpp]
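The wrap-around described above can be checked independently of IPP. The following is only a sketch of the arithmetic: the helper names `scaledSize` and `wrapTo32Bits` are made up for illustration, and the linear scaling with the thread count is an assumption taken from the observation that the 8-thread value is 8 times the single-thread one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: total size computed in 64 bits, assuming the
// requirement scales linearly with the number of threads.
std::uint64_t scaledSize(std::uint64_t singleThreadBytes, unsigned numThreads) {
    assert(numThreads > 0);
    return singleThreadBytes * numThreads;
}

// Hypothetical helper: keeps only the low 32 bits, which is what survives
// when a larger value is stored in a 32-bit integer.
std::uint32_t wrapTo32Bits(std::uint64_t value) {
    return static_cast<std::uint32_t>(value & 0xFFFFFFFFull);
}
```

With the numbers from this post, `scaledSize(606766416, 8)` is 4854131328 (0x121542A80), and `wrapTo32Bits` of that value is 559164032, which matches the size that came back from ippiResizeGetBufSize.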
>>When single-thread is specified, the buffer size is 606766416. When the default number of threads is
>>used, e.g. 8, the buffer size is 0x121542A80, which is bigger than 32-bit...
>>...
Hi Ryan,
On 32-bit Windows platforms an application cannot allocate more than 2GB of memory. 0x121542A80
(~4.5GB) is greater than 2GB.
Best regards,
Sergey
You could try again with Resize() or ResizeCenter() to see if either can handle your case.
Those functions are marked as deprecated, so this is only a test...
I have a similar use case -- I need to resize a large image (7000 x 7000) to a smaller scale (typically 50-65% of the original size). If I resize to below 5%, the code below doesn't crash. But if scaling is set to 0.5, it crashes every time on the call to ippiResizeSqrPixel_8u_C1R(). I've noticed that increasing the image size (anything above 8196x8196) results in ippiResizeGetBufSize() overflowing and returning a very large negative number. Even in 64-bit mode, I can't allocate that much memory ;)
One thing that occurred to me to ask: does anyone know whether ippiResizeSqrPixel_8u_C1R requires any padding? Many of the other image kernels (e.g. erosion/dilation) require padding of some size to work properly. The docs don't seem to indicate that any padding is necessary, but this wouldn't be the first time there was insufficient documentation for an IPP function...
Here's what I'm doing (very similar to the above snippet):
[cpp]#include <iostream>  // std::cerr
#include <ipp.h>

IppStatus ResizeSqrPixel( void )
{
    const int NUMBER_ROWS = 7000;
    const int NUMBER_COLS = 7000;
    double scale = 0.5;
    int scaledCols = (int)(scale*NUMBER_COLS + 0.5);
    int scaledRows = (int)(scale*NUMBER_ROWS + 0.5);

    // IppiSize is {width, height}; IppiRect is {x, y, width, height}.
    IppiSize srcSize = {NUMBER_COLS, NUMBER_ROWS};
    IppiSize destSize = {scaledCols, scaledRows};
    IppiRect srcRect = {0, 0, NUMBER_COLS, NUMBER_ROWS};
    IppiRect destRect = {0, 0, scaledCols, scaledRows};

    Ipp8u *scratchBuf = NULL;
    int bufsize = 0;
    IppStatus status = ippStsNoErr;

    int sourceStep = 0;
    Ipp8u* src = ippiMalloc_8u_C1(NUMBER_COLS, NUMBER_ROWS, &sourceStep);
    int destStep = 0;
    Ipp8u* dst = ippiMalloc_8u_C1(scaledCols, scaledRows, &destStep);

    // initialize source and dest
    status = ippiSet_8u_C1R( 1, src, sourceStep, srcSize );
    status = ippiSet_8u_C1R( 0, dst, destStep, destSize );

    static const int IPP_INPUT_IMAGE_RESIZE_MODE = IPPI_INTER_LINEAR | IPPI_ANTIALIASING;
    status = ippiResizeGetBufSize( srcRect, destRect, 1, IPP_INPUT_IMAGE_RESIZE_MODE, &bufsize );

    scratchBuf = ippsMalloc_8u( bufsize );
    if( NULL != scratchBuf )
    {
        // This is the call that crashes when scale is 0.5.
        status = ippiResizeSqrPixel_8u_C1R(src, srcSize, sourceStep, srcRect,
                                           dst, destStep, destRect,
                                           scale, scale, 0, 0,
                                           IPP_INPUT_IMAGE_RESIZE_MODE, scratchBuf );
        ippsFree( scratchBuf );
    }
    else
    {
        std::cerr << "Malloc failed, check bufsize of " << bufsize;
        status = ippStsMemAllocErr;
    }

    // Release the images on every path.
    ippiFree( src );
    ippiFree( dst );
    return status;
}

int main(int argc, char* argv[])
{
    ResizeSqrPixel();
    return 0;
}
[/cpp]
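Because the size is returned through an `int`, an overflowed result can at least be rejected when it comes back zero or negative, before it is handed to the allocator. A minimal sketch of such a check follows; `toAllocationSize` is a hypothetical helper, not an IPP function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of IPP: converts a size reported through an
// `int` into a std::size_t, rejecting values that cannot be a valid size.
// Returns false when the reported value is zero or negative.
bool toAllocationSize(int reportedBytes, std::size_t* outBytes) {
    assert(outBytes != nullptr);
    if (reportedBytes <= 0) {
        return false;  // most likely an overflowed result
    }
    *outBytes = static_cast<std::size_t>(reportedBytes);
    return true;
}
```

It would be called after ippiResizeGetBufSize() reports ippStsNoErr and before ippsMalloc_8u(). Note that it only catches the negative case: a wrapped value that happens to stay positive, such as the 559164032 from the first post, passes this check.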
I use 'ippiResize_8u_C1R' instead, on a computer with one CPU and with one IPP thread, and it works for
sizes up to 34207 x 34207.
Something goes wrong if I increase the size to 34208 x 34208, that is, an Access Violation happens.
Here is an example of an output:
Application - ScaLibTestApp - WIN32_MSC
Tests: Start
> Test1147 Start <
Intel IPP Library Support Enabled
Sub-Test 07
Source Image size : 32768 x 32768
Destination Image size: 16384 x 16384
Memory for a Source Image is allocated
Memory for a Destination Image is allocated
[ ippiSet_8u_C1R ] Completed
Resizing the Source Image...
[ ippiResize_8u_C1R ] Completed
[ ippiFree ] for the Source Image Completed
[ ippiFree ] for the Destination Image Completed
> Test1147 End <
Tests: Completed
Memory Blocks Allocated : 0
Memory Blocks Released : 0
Memory Blocks NOT Released: 0
Memory Tracer Integrity Verified - Memory Leaks NOT Detected
Deallocating Memory Tracer Data Table
Completed
Press any key to continue . . .
PS: I'll provide a screenshot of the Windows Task Manager later
Note: Error processing is very limited!
// Sub-Test 07 - Tests for 'ippiResize_8u_C1R' function
{
///*
#if defined ( _WIN32_MSC )
    CrtPrintf( RTU("Sub-Test 07\n") ); // Max Image size is 34207 x 34207

    IppStatus st = ippStsNoErr;

    RTint iSrcWidth = 32768 / 1;
    RTint iSrcHeight = 32768 / 1;
    RTint iDstWidth = iSrcWidth / 2;
    RTint iDstHeight = iSrcHeight / 2;

    CrtPrintf( RTU("Source Image size: %5ld x %5ld\n"), iSrcWidth, iSrcHeight );
    CrtPrintf( RTU("Destination Image size: %5ld x %5ld\n"), iDstWidth, iDstHeight );

    RTint iSrcStepBytes = -1;
    RTint iDstStepBytes = -1;

    Ipp8u *pSrcImage = ( Ipp8u * )::ippiMalloc_8u_C1( iSrcWidth, iSrcHeight, &iSrcStepBytes );
    if( pSrcImage == RTnull )
        CrtPrintf( RTU("Memory for a Source Image is NOT allocated\n") );
    else
        CrtPrintf( RTU("Memory for a Source Image is allocated\n") );

    Ipp8u *pDstImage = ( Ipp8u * )::ippiMalloc_8u_C1( iDstWidth + 0, iDstHeight + 0, &iDstStepBytes );
    if( pDstImage == RTnull )
        CrtPrintf( RTU("Memory for a Destination Image is NOT allocated\n") );
    else
        CrtPrintf( RTU("Memory for a Destination Image is allocated\n") );

    // Fill the source as one long row of width * height pixels
    IppiSize srcSize = { 0 };
    srcSize.width = iSrcWidth * iSrcHeight;
    srcSize.height = 1;

    st = ::ippiSet_8u_C1R( 1, pSrcImage, 1, srcSize );
    if( st != ippStsNoErr )
        CrtPrintf( RTU("[ ippiSet_8u_C1R ] Failed\n") );
    else
        CrtPrintf( RTU("[ ippiSet_8u_C1R ] Completed\n") );

    srcSize.width = iSrcWidth;
    srcSize.height = iSrcHeight;

    IppiRect srcRoi = { 0 };
    srcRoi.x = 0;
    srcRoi.y = 0;
    srcRoi.width = iSrcWidth;
    srcRoi.height = iSrcHeight;

    IppiSize dstSize = { 0 };
    dstSize.width = iDstWidth;
    dstSize.height = iDstHeight;

    RTdouble dFactorX = ( RTdouble )iDstWidth / ( RTdouble )iSrcWidth;
    RTdouble dFactorY = ( RTdouble )iDstHeight / ( RTdouble )iSrcHeight;

    RTint iInterpolation = IPPI_INTER_LINEAR;

    CrtPrintf( RTU("Resizing the Source Image...\n") );
    st = ::ippiResize_8u_C1R( ( const Ipp8u * )pSrcImage, srcSize, iSrcStepBytes, srcRoi,
                              ( Ipp8u * )pDstImage, iDstStepBytes, dstSize,
                              dFactorX, dFactorY, iInterpolation );
    if( st != ippStsNoErr )
        CrtPrintf( RTU("[ ippiResize_8u_C1R ] Failed\n") );
    else
        CrtPrintf( RTU("[ ippiResize_8u_C1R ] Completed\n") );

    if( pSrcImage != RTnull )
    {
        ::ippiFree( pSrcImage );
        CrtPrintf( RTU("[ ippiFree ] for the Source Image Completed\n") );
    }
    if( pDstImage != RTnull )
    {
        ::ippiFree( pDstImage );
        CrtPrintf( RTU("[ ippiFree ] for the Destination Image Completed\n") );
    }

    pSrcImage = RTnull;
    pDstImage = RTnull;
#endif
//*/
}
Thank you for the quick response. I was able to substitute ippiResize_8u_C1R for ippiResizeSqrPixel_8u_C1R and get the desired results without crashing. However, I'm greeted with a warning that ippiResize_8u_C1R is deprecated, and we should be using ippiResizeSqrPixel_8u_C1R.
I've disabled this warning with a pragma, but in the interest of keeping my code compatible with future versions of IPP, I'm uncomfortable substituting these functions for the long term.
Warning message:
1>.\IPPResizeTest.cpp(38) : warning C4996: 'ippiResize_8u_C1R': use ippiResizeSqrPixel_8u_C1R function instead of this one
1> c:\program files (x86)\intel\composerxe-2011\ipp\include\ippi.h(5131) : see declaration of 'ippiResize_8u_C1R'
Do you think Intel will fix ippiResizeSqrPixel_8u_C1R in a future IPP release?
When the anti-aliasing flag is added, the buffer size is 606766416 (with numThreads = 1).
When statically linked with the non-threaded LIB, numThreads defaults to 1.
When dynamically linked with the DLL, numThreads defaults to the number of virtual cores, which is 8 on my computer.
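To see at which thread count the size stops fitting in a 32-bit `int`, the single-thread figure can be scaled up by hand. This is only a sketch: it assumes the reported size grows linearly with the thread count (as the 8-thread observation earlier in the thread suggests), and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Single-thread buffer size quoted in this post.
const std::int64_t kSingleThreadBytes = 606766416;

// Assumption: the reported size grows linearly with the thread count.
std::int64_t estimatedBytes(int numThreads) {
    assert(numThreads > 0);
    return kSingleThreadBytes * numThreads;
}

// True when the value can be represented in a signed 32-bit integer.
bool fitsIn32BitInt(std::int64_t bytes) {
    return bytes >= 0 && bytes <= std::numeric_limits<std::int32_t>::max();
}

// Largest thread count in [1, upperBound] whose estimate still fits;
// returns 0 if even one thread does not fit.
int maxThreadsThatFit(int upperBound) {
    int best = 0;
    for (int n = 1; n <= upperBound; ++n) {
        if (!fitsIn32BitInt(estimatedBytes(n))) {
            break;
        }
        best = n;
    }
    return best;
}
```

Under that assumption, 3 threads give 1820299248 bytes, which still fits in a signed 32-bit integer, while 4 threads give 2427065664, which does not.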
Can't you just drop it?
Regarding how many cores IPP uses, there are a bunch of OMP settings that can control multithreading.
Those settings are related to getting max performance when you have different Intel CPU architectures, such as:
- Two cpu chips.
- One cpu chip with two dies inside.
- One cpu chip with one die inside.
- Etc.
Those OMP flags can be set programmatically and/or by setting an environment variable.
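As a rough sketch of both routes, the snippet below reads the standard OpenMP environment variable OMP_NUM_THREADS and, when the program is built with OpenMP enabled, applies the value through `omp_set_num_threads`. The helper names and the upper bound of 1024 are assumptions made for illustration, and whether a given IPP build honors the OpenMP setting (rather than only ippSetNumThreads) is not confirmed here.

```cpp
#include <cassert>
#include <cstdlib>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: parses the leading integer of an OMP_NUM_THREADS-style
// string (e.g. "4" or "4,2"). Returns `fallback` when the text is missing,
// not a number, or outside 1..1024 (the bound is an arbitrary sanity limit).
int parseThreadCount(const char* text, int fallback) {
    assert(fallback > 0);
    if (text == nullptr) {
        return fallback;
    }
    char* end = nullptr;
    const long value = std::strtol(text, &end, 10);
    if (end == text || value <= 0 || value > 1024) {
        return fallback;
    }
    return static_cast<int>(value);
}

// Environment route: read OMP_NUM_THREADS if it is set.
int threadCountFromEnvironment(int fallback) {
    return parseThreadCount(std::getenv("OMP_NUM_THREADS"), fallback);
}

// Programmatic route: only has an effect when compiled with OpenMP support.
void applyThreadCount(int numThreads) {
#ifdef _OPENMP
    omp_set_num_threads(numThreads);
#else
    (void)numThreads;  // no OpenMP runtime available in this build
#endif
}
```

For example, `applyThreadCount(threadCountFromEnvironment(1))` falls back to a single thread when the variable is unset.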
Warning message:
1>.\IPPResizeTest.cpp(38) : warning C4996: 'ippiResize_8u_C1R': use ippiResizeSqrPixel_8u_C1R function instead of this one
1> c:\program files (x86)\intel\composerxe-2011\ipp\include\ippi.h(5131) : see declaration of 'ippiResize_8u_C1R'
[SergeyK] If Intel decides to remove the 'ippiResize_8u_C1R' function, I will be forced to use
an older IPP version that supports it.
Do you think Intel will fix ippiResizeSqrPixel_8u_C1R in a future IPP release?
[SergeyK] I think yes. If you really need the 'ippiResizeSqrPixel_8u_C1R' function, I would contact
Intel Premium Support. It is clear that there are problems in a multi-CPU
environment and with the processing of large images. The IPP team should do stress testing
as thoroughly as possible. I believe they've done stress testing, but something was missed.
Unfortunately, we haven't heard anything from Intel regarding these problems.
Best regards,
Sergey
When single-thread is specified, the buffer size is 606766416. When the default number of threads is used, e.g. 8, the buffer size is 0x121542A80, which is bigger than 32-bit and was wrapped around
to 559164032 when returned by ippiResizeGetBufSize.
...
Hi Ryan,
Did you get that number, that is 0x121542A80, on a 32-bit or a 64-bit Windows platform?
0x121542A80 = 4,854,131,328 (greater than the maximum unsigned 32-bit value, 4,294,967,295)
...
int bufferSize = 0;
st = ippiResizeGetBufSize(srcRoi, destRoi, numChannels, interpolation, &bufferSize);
...
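I would assume that it happened on a 64-bit platform. Note, however, that 'int' is 32 bits wide on both 32-bit and 64-bit Windows, so the full value can only be seen in a wider type such as a 64-bit integer.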
Best regards,
Sergey