Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.
6737 Discussions

detecting integer overflow in ippiResizeGetBufSize when resizing large images

Ryan_Wong
Beginner
730 Views
Dear all,

I was trying to resize a large 8bpp grayscale image (15440 x 9813) to half size, using IPPI_INTER_LINEAR and IPPI_ANTIALIASING flags. The application runs in 32-bit Windows.

(In the application I'm developing, it is a very common task to resize large images, as long as the total image size (input size plus output size) doesn't exceed the available memory. In this case, the non-IPP version of my application typically requires 15440 x 9813 + 7720 x 4906 bytes, which is less than 200MB of memory.)

When the program was first run, it crashed with an access violation. The first version did not call ippSetNumThreads.
It turns out that because my machine is quad-core with HT enabled, IPP was defaulting to 8 threads, and the call to ippiResizeGetBufSize returned a value that is 8 times the value needed for single-threaded mode. This caused a 32-bit integer wrap-around, resulting in a value that falls short of the memory needed by ippiResizeSqrPixel_8u_C1R.

(When single-thread is specified, the buffer size is 606766416. When the default number of threads is used, e.g. 8, the buffer size is 0x121542A80, which is bigger than 32-bit and was wrapped around to 559164032 when returned by ippiResizeGetBufSize.)
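
The arithmetic behind those figures can be checked at compile time. The following is only a sketch: `wrapTo32Bits` and the `k...` constants are hypothetical names chosen for illustration, and the numbers are the ones reported above.

```cpp
#include <cstdint>

// Value that survives when a 64-bit byte count is stored in a 32-bit
// integer: the count modulo 2^32.
constexpr std::uint32_t wrapTo32Bits(std::uint64_t bytes)
{
    return static_cast<std::uint32_t>(bytes & 0xFFFFFFFFull);
}

// Figures reported in the post above.
constexpr std::uint64_t kSingleThreadBytes = 606766416ull;
constexpr std::uint64_t kDefaultThreads    = 8;
constexpr std::uint64_t kNeededBytes       = kSingleThreadBytes * kDefaultThreads;

// 8 x 606766416 needs 33 bits; only the low 32 bits come back.
static_assert(kNeededBytes == 0x121542A80ull, "8 x 606766416 = 0x121542A80");
static_assert(wrapTo32Bits(kNeededBytes) == 559164032u, "low 32 bits = 559164032");
```

The wrapped value is positive and smaller than even the single-thread size, so a simple sign check on the returned size would not catch this particular case.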

I'm using Visual Studio, which fixes "sizeof(int) == 4" for both 32-bit and 64-bit applications.

[cpp]// ippiResizeCrash.cpp : Defines the entry point for the console application.[/cpp]
[cpp]#include "stdafx.h"
#include "ipp.h"
#include <windows.h>   // ::Sleep
#include <iostream>    // std::cout
#include <exception>   // std::exception

#pragma comment(lib,"ippcore.lib")
#pragma comment(lib,"ipps.lib")
#pragma comment(lib,"ippi.lib")

int _tmain(int argc, _TCHAR* argv[])
{
	IppStatus st = ippStsNoErr;

	st = ippInit();
	if (st != ippStsNoErr)
	{
		std::cout << "ippInit" << std::endl;
		throw std::exception();
	}

	// No crash if numThreads set to 1.
	st = ippSetNumThreads(3);
	if (st != ippStsNoErr)
	{
		std::cout << "ippSetNumThreads" << std::endl;
		throw std::exception();
	}

	const int sourceWidth = 15440;
	const int sourceHeight = 9813;
	const int destWidth = sourceWidth / 2;
	const int destHeight = sourceHeight / 2;
	const int numChannels = 1;
	
	const int sourceStep = sourceWidth;
	const int destStep = destWidth;

	Ipp8u* srcData = (Ipp8u*)ippMalloc(sourceWidth * sourceHeight);
	Ipp8u* destData = (Ipp8u*)ippMalloc(destWidth * destHeight);

	if (!srcData || !destData)
	{
		std::cout << "ippMalloc (source, dest)" << std::endl;
		throw std::exception();
	}

	IppiSize srcSize = {0};
	srcSize.width = sourceWidth;
	srcSize.height = sourceHeight;

	IppiRect srcRoi = {0};
	srcRoi.x = 0;
	srcRoi.y = 0;
	srcRoi.width = sourceWidth;
	srcRoi.height = sourceHeight;

	IppiRect destRoi = {0};
	destRoi.x = 0;
	destRoi.y = 0;
	destRoi.width = destWidth;
	destRoi.height = destHeight;

	double xFactor = (double)destWidth / (double)sourceWidth;
	double yFactor = (double)destHeight / (double)sourceHeight;
	double xShift = 0.0;
	double yShift = 0.0;

	int interpolation = IPPI_INTER_LINEAR | IPPI_ANTIALIASING;

	int bufferSize = 0;
	st = ippiResizeGetBufSize(srcRoi, destRoi, numChannels, interpolation, &bufferSize);
	if (st != ippStsNoErr)
	{
		std::cout << "ippiResizeGetBufSize" << std::endl;
		throw std::exception();
	}
	
	Ipp8u* buffer = (Ipp8u*)ippMalloc(bufferSize);
	if (!buffer)
	{
		std::cout << "ippMalloc (buffer)" << std::endl;
		throw std::exception();
	}

	for (int k = 0; k < 5; ++k)
	{
		st = ippiResizeSqrPixel_8u_C1R((const Ipp8u*)srcData, srcSize, sourceStep, srcRoi,
			(Ipp8u*)destData, destStep, destRoi, 
			xFactor, yFactor, xShift, yShift, interpolation, buffer);
		if (st != ippStsNoErr)
		{
			std::cout << "ippiResizeSqrPixel_8u_C1R" << std::endl;
			throw std::exception();
		}
	}
	ippFree(srcData);
	ippFree(destData);
	ippFree(buffer);

	::Sleep(1000);
	return 0;
}
[/cpp]
0 Kudos
10 Replies
SergeyKostrov
Valued Contributor II
730 Views
>>...
>>When single-thread is specified, the buffer size is 606766416. When the default number of threads is
>>used, e.g. 8, the buffer size is 0x121542A80, which is bigger than 32-bit...
>>...

Hi Ryan,

On 32-bit Windows platforms an application cannot allocate more than 2GB of memory. The number
0x121542A80 ( ~4.5GB ) is greater than 2GB.

Best regards,
Sergey
0 Kudos
Thomas_Jensen1
Beginner
730 Views
I'm surprised IPP requires such a large buffer for this case.

You could try again with Resize() or ResizeCenter() to see if that can handle your case.
Those functions are marked as deprecated, so it is only a test...

0 Kudos
Frank_S
Beginner
730 Views
I'm actually having a very similar problem when compiling against the 64-bit builds (v7.0.4.196) and was about to post a thread here.

I have a similar use case -- need to resize a large image (7000 x 7000) to a smaller sized scale (typically 65-50% the original size). If I resize to below 5%, the attached code below doesn't crash. But if scaling is set to 0.5, it crashes every time on the call to ippiResizeSqrPixel_8u_C1R(). I've noticed increasing the image size (anything above 8196x8196) results in ippiResizeGetBufSize() overflowing and returning a very large negative number. Even in 64-bit mode, I can't allocate that much memory ;)

One thing that occurred to me to ask: does anyone know whether ippiResizeSqrPixel_8u_C1R requires any padding? Many of the other image kernels (e.g. erosion/dilation) require padding of some size to work properly. The docs don't seem to indicate that any padding is necessary, but this wouldn't be the first time there was insufficient documentation for an IPP function...

Here's what I'm doing (very similar to the above snippet):

[cpp]#include <iostream>   // std::cerr
#include <ipp.h>

IppStatus ResizeSqrPixel( void )
{
    const int NUMBER_ROWS = 7000;
    const int NUMBER_COLS = 7000;
    
    double scale = 0.5;
    int scaledCols = (int)(scale*NUMBER_COLS + 0.5);
    int scaledRows = (int)(scale*NUMBER_ROWS + 0.5);
    IppiSize srcSize = {NUMBER_ROWS, NUMBER_COLS};
    IppiSize destSize = {scaledRows,scaledCols};
    IppiRect srcRect = {0,0,NUMBER_ROWS, NUMBER_COLS};
    IppiRect destRect = {0,0,scaledRows,scaledCols};
    Ipp8u *scratchBuf = NULL;
    int bufsize = 0;
    IppStatus status = ippStsNoErr;

    int sourceStep = 0;
    Ipp8u* src = ippiMalloc_8u_C1(NUMBER_ROWS, NUMBER_COLS, &sourceStep);

    int destStep = 0;
    Ipp8u* dst = ippiMalloc_8u_C1(scaledRows, scaledCols, &destStep);

    // initialize source and dest
    status = ippiSet_8u_C1R( 1, src, sourceStep, srcSize );
    status = ippiSet_8u_C1R( 0, dst, destStep, destSize );

    static const int IPP_INPUT_IMAGE_RESIZE_MODE = IPPI_INTER_LINEAR | IPPI_ANTIALIASING;

    status = ippiResizeGetBufSize( srcRect,
                                   destRect, 
                                   1, 
                                   IPP_INPUT_IMAGE_RESIZE_MODE, 
                                   &bufsize );

    scratchBuf = ippsMalloc_8u( bufsize );
    if( NULL != scratchBuf )
    {
        status = ippiResizeSqrPixel_8u_C1R(src, srcSize, sourceStep, srcRect,
                                           dst, destStep, destRect,
                                           scale,
                                           scale,
                                           0,
                                           0,
                                           IPP_INPUT_IMAGE_RESIZE_MODE, 
                                           scratchBuf );
    }
    else
    {
        std::cerr << "Malloc failed, check bufsize of " << bufsize;
        return ippStsMemAllocErr;
    }

    if( NULL != scratchBuf )
    {
        ippsFree( scratchBuf );
    }
    return status;
}

int main(int argc, char* argv[])
{
    ResizeSqrPixel();
	return 0;
}[/cpp]
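
Since the overflow described here shows up as a negative size, one cheap guard is to validate the reported size before allocating. This is a minimal sketch, not part of IPP: `toAllocationSize` is a hypothetical helper, and it only inspects the `int` that a call such as ippiResizeGetBufSize has already written.

```cpp
#include <cstddef>

// Hypothetical helper: converts the int reported by a buffer-size query
// into a size_t, rejecting zero and negative values. A negative value is
// what a signed 32-bit overflow typically looks like.
// Returns false when the size cannot be trusted.
bool toAllocationSize(int reportedBytes, std::size_t& outBytes)
{
    if (reportedBytes <= 0) {
        outBytes = 0;
        return false;
    }
    outBytes = static_cast<std::size_t>(reportedBytes);
    return true;
}
```

In the function above, this check would sit between the ippiResizeGetBufSize call and ippsMalloc_8u. It cannot detect a wrap-around that lands on a positive value, such as the 559164032 reported in the first post.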

0 Kudos
SergeyKostrov
Valued Contributor II
730 Views
Hi Frank,

I use 'ippiResize_8u_C1R' instead, on a computer with one CPU and with one IPP thread, and it works for
sizes up to 34207 x 34207.

Something goes wrong if I increase the size to 34208 x 34208, that is, an Access Violation happens.

Here is an example of an output:

Application - ScaLibTestApp - WIN32_MSC
Tests: Start
> Test1147 Start <
Intel IPP Library Support Enabled
Sub-Test 07
Source Image size : 32768 x 32768
Destination Image size: 16384 x 16384
Memory for a Source Image is allocated
Memory for a Destination Image is allocated
[ ippiSet_8u_C1R ] Completed
Resizing the Source Image...
[ ippiResize_8u_C1R ] Completed
[ ippiFree ] for the Source Image Completed
[ ippiFree ] for the Destination Image Completed
> Test1147 End <
Tests: Completed
Memory Blocks Allocated : 0
Memory Blocks Released : 0
Memory Blocks NOT Released: 0
Memory Tracer Integrity Verified - Memory Leaks NOT Detected

Deallocating Memory Tracer Data Table
Completed
Press any key to continue . . .

PS: I'll provide a screenshot of the Windows Task Manager later

0 Kudos
SergeyKostrov
Valued Contributor II
730 Views
Here is the source code of my Test-Case:


Note: Error processing is very limited!

// Sub-Test 07 - Tests for 'ippiResize_8u_C1R' function
{
///*
#if defined ( _WIN32_MSC )

CrtPrintf( RTU("Sub-Test 07\n") ); // Max Image size is 34207 x 34207

IppStatus st = ippStsNoErr;
RTint iSrcWidth = 32768 / 1;
RTint iSrcHeight = 32768 / 1;
RTint iDstWidth = iSrcWidth / 2;
RTint iDstHeight = iSrcHeight / 2;

CrtPrintf( RTU("Source Image size: %5ld x %5ld\n"), iSrcWidth, iSrcHeight );
CrtPrintf( RTU("Destination Image size: %5ld x %5ld\n"), iDstWidth, iDstHeight );

RTint iSrcStepBytes = -1;
RTint iDstStepBytes = -1;

Ipp8u *pSrcImage = ( Ipp8u * )::ippiMalloc_8u_C1( iSrcWidth, iSrcHeight, &iSrcStepBytes );
if( pSrcImage == RTnull )
CrtPrintf( RTU("Memory for a Source Image is NOT allocated\n") );
else
CrtPrintf( RTU("Memory for a Source Image is allocated\n") );

Ipp8u *pDstImage = ( Ipp8u * )::ippiMalloc_8u_C1( iDstWidth + 0, iDstHeight + 0, &iDstStepBytes );
if( pDstImage == RTnull )
CrtPrintf( RTU("Memory for a Destination Image is NOT allocated\n") );
else
CrtPrintf( RTU("Memory for a Destination Image is allocated\n") );

IppiSize srcSize = { 0 };
srcSize.width = iSrcWidth * iSrcHeight;
srcSize.height = 1;

st = ::ippiSet_8u_C1R( 1, pSrcImage, 1, srcSize );
if( st != ippStsNoErr )
CrtPrintf( RTU("[ ippiSet_8u_C1R ] Failed\n") );
else
CrtPrintf( RTU("[ ippiSet_8u_C1R ] Completed\n") );

srcSize.width = iSrcWidth;
srcSize.height = iSrcHeight;

IppiRect srcRoi = { 0 };
srcRoi.x = 0;
srcRoi.y = 0;
srcRoi.width = iSrcWidth;
srcRoi.height = iSrcHeight;

IppiSize dstSize = { 0 };
dstSize.width = iDstWidth;
dstSize.height = iDstHeight;

RTdouble dFactorX = ( RTdouble )iDstWidth / ( RTdouble )iSrcWidth;
RTdouble dFactorY = ( RTdouble )iDstHeight / ( RTdouble )iSrcHeight;

RTint iInterpolation = IPPI_INTER_LINEAR;

CrtPrintf( RTU("Resizing the Source Image...\n") );

st = ::ippiResize_8u_C1R( ( const Ipp8u * )pSrcImage, srcSize, iSrcStepBytes, srcRoi,
( Ipp8u * )pDstImage, iDstStepBytes, dstSize,
dFactorX, dFactorY, iInterpolation );
if( st != ippStsNoErr )
CrtPrintf( RTU("[ ippiResize_8u_C1R ] Failed\n") );
else
CrtPrintf( RTU("[ ippiResize_8u_C1R ] Completed\n") );

if( pSrcImage != RTnull )
{
::ippiFree( pSrcImage );
CrtPrintf( RTU("[ ippiFree ] for the Source Image Completed\n") );
}

if( pDstImage != RTnull )
{
::ippiFree( pDstImage );
CrtPrintf( RTU("[ ippiFree ] for the Destination Image Completed\n") );
}

pSrcImage = RTnull;
pDstImage = RTnull;

#endif
//*/
}

0 Kudos
Frank_S
Beginner
730 Views
Sergey,

Thank you for the quick response. I was able to substitute ippiResize_8u_C1R for ippiResizeSqrPixel_8u_C1R and get the desired results without crashing. However, I'm greeted with a warning that ippiResize_8u_C1R is deprecated, and we should be using ippiResizeSqrPixel_8u_C1R.

I've disabled this warning with a pragma, but in the interest of making my code compatible with future versions of IPP, I'm uncomfortable substituting these functions for the long term.

Warning message:

1>.\IPPResizeTest.cpp(38) : warning C4996: 'ippiResize_8u_C1R': use ippiResizeSqrPixel_8u_C1R function instead of this one
1> c:\program files (x86)\intel\composerxe-2011\ipp\include\ippi.h(5131) : see declaration of 'ippiResize_8u_C1R'


Do you think Intel will fix ippiResizeSqrPixel_8u_C1R in a future IPP release?
0 Kudos
Ryan_Wong
Beginner
730 Views
It only occurs when interpolation is set to IPPI_INTER_LINEAR|IPPI_ANTIALIASING. It doesn't occur with IPPI_INTER_LINEAR alone.

When the anti-aliasing flag is added, the buffer size is 606766416. (with numThreads = 1)
Without the anti-aliasing flag, the buffer size is 163472. (with numThreads = 1)

When statically linked with the non-threaded LIB, numThreads defaults to 1.
When dynamically-linked with the DLL, numThreads defaults to the number of virtual cores, which is 8 in my computer.
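
Given a single-thread buffer size, a thread count that keeps the total within a 32-bit `int` can be estimated before the query is made. The sketch below assumes, based only on the observation earlier in this thread, that the buffer size scales linearly with the number of threads. `fitsInInt32` and `largestSafeThreadCount` are hypothetical helpers, not IPP functions.

```cpp
#include <cstdint>

// True when perThreadBytes * numThreads fits in a signed 32-bit int.
// The multiplication is done in 64 bits so it cannot wrap.
bool fitsInInt32(int perThreadBytes, int numThreads)
{
    if (perThreadBytes < 0 || numThreads < 1) {
        return false;
    }
    const std::int64_t total =
        static_cast<std::int64_t>(perThreadBytes) * numThreads;
    return total <= INT32_MAX;
}

// Largest thread count in [1, requestedThreads] whose total buffer size
// still fits in a signed 32-bit int. Returns 0 for invalid input.
// ASSUMPTION: total size = perThreadBytes * threads (linear scaling).
int largestSafeThreadCount(int perThreadBytes, int requestedThreads)
{
    if (perThreadBytes <= 0 || requestedThreads < 1) {
        return 0;
    }
    const int limit = INT32_MAX / perThreadBytes; // at least 1 for a positive int
    return limit < requestedThreads ? limit : requestedThreads;
}
```

With the sizes above, `largestSafeThreadCount(606766416, 8)` gives 3 for the anti-aliased case and `largestSafeThreadCount(163472, 8)` gives 8 for plain linear interpolation. The result could be passed to ippSetNumThreads before calling ippiResizeGetBufSize. Fitting in an `int` does not mean the allocation will succeed: three threads would still need about 1.8GB, which a 32-bit process may not be able to provide.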
0 Kudos
Thomas_Jensen1
Beginner
730 Views
If I'm not wrong, the antialiasing flag only deals with the borders of the result.
Can't you just drop it?

Regarding how many cores IPP uses, there are a bunch of OMP settings that can control multithreading.
Those settings are related to getting max performance when you have different Intel CPU architectures, such as:
- Two cpu chips.
- One cpu chip with two dies inside.
- One cpu chip with one die inside.
- Etc.

Those OMP flags can be set programmatically and/or by setting an environment variable.

0 Kudos
SergeyKostrov
Valued Contributor II
730 Views
Quoting Frank S
...
Warning message:

1>.\IPPResizeTest.cpp(38) : warning C4996: 'ippiResize_8u_C1R': use ippiResizeSqrPixel_8u_C1R function instead of this one
1> c:\program files (x86)\intel\composerxe-2011\ipp\include\ippi.h(5131) : see declaration of 'ippiResize_8u_C1R'

[SergeyK] If Intel decides to remove the 'ippiResize_8u_C1R' function, I will be forced to use
an older IPP version that supports it.

Do you think Intel will fix ippiResizeSqrPixel_8u_C1R in a future IPP release?

[SergeyK] I think so. If you really need the 'ippiResizeSqrPixel_8u_C1R' function, I would contact
Intel Premium Support. It is clear that there are problems in a multi-CPU
environment and with processing of large images. The IPP team should do the most
thorough stress testing possible. I believe they've done stress testing, but something was missed.


Unfortunately, we haven't heard anything from Intel regarding these problems.

Best regards,
Sergey

0 Kudos
SergeyKostrov
Valued Contributor II
730 Views
Quoting Ryan Wong
...
When single-thread is specified, the buffer size is 606766416. When the default number of threads is used, e.g. 8, the buffer size is 0x121542A80, which is bigger than 32-bit and was wrapped around
to 559164032 when returned by ippiResizeGetBufSize.
...


Hi Ryan,

Did you get that number, that is 0x121542A80, on a 32-bit or 64-bit Windows platform?

0x121542A80 = 4,854,131,328 ( greater than the maximum unsigned 32-bit value, 4,294,967,295 )

...
int bufferSize = 0;
st = ippiResizeGetBufSize(srcRoi, destRoi, numChannels, interpolation, &bufferSize);
...

I could assume that it happened on a 64-bit platform because only in that case 'int' is 64-bit based.

Best regards,
Sergey

0 Kudos