ippiConv memory access problem

l-eyal · ‎02-17-2016

Hi,

I have been trying for a very long time to convert code to use non-deprecated IPP functions (a long-term hobby for this forum's participants). I am having difficulty converting the old ippiConvValid_16s_C3R calls to the new ippiConv_16s_C3R: it often crashes with "Unhandled exception: Access violation reading location 0xffffffff".

I am using Visual Studio 2013; the code is compiled with Intel Parallel Studio XE 2015 Update 4 Composer Edition for C++ Windows, version 15.0.1386.12.XE, as a 32-bit build with support for SSE 4.1/4.2. The code runs on 32-bit Windows 7 on an i5 M520 CPU (and also crashes on 64-bit Windows 8.1 with an i7 4790 processor). IPP version: 8.2.2 (r46212); ippIP SSE4.1/4.2 (p8)+; from 1 Apr 2015.

<pre class="brush:cpp">
void CUnitTestsDlg::OnBnClickedConvolutionTestButton ()
{	// 15.2.16
	const IppLibraryVersion *pVer = ippiGetLibVersion ();
	BeginWaitCursor ();
	for (int i=0; i<1000; i++) {
		const int sourceWidth  = 1280;
		const int sourceHeight =  960;
		const int sourceChannels = 3;
		const int sourceDepthBits = 16 * sourceChannels;

		IppiSize imageSize;
		imageSize.width  = sourceWidth;
		imageSize.height = sourceHeight;
		int depthBytes   = sourceDepthBits / 8;
		int imageStep    = sourceWidth * depthBytes;

		IppiSize matrixSize;
		matrixSize.width  = 3;
		matrixSize.height = 3;
		int matrix1Step = matrixSize.width * depthBytes;

		// unique_ptr <short[]> so that the arrays are released with delete[]
		std::unique_ptr <short[]> pSourceImage (new short[sourceWidth * sourceHeight * sourceChannels]);
		std::unique_ptr <short[]> pDestImage   (new short[sourceWidth * sourceHeight * sourceChannels]);

		// fill reasonable data
		short convMatrix[3][3][3];
		convMatrix[0][0][0] = convMatrix[0][0][1] = convMatrix[0][0][2] = -1;
		convMatrix[0][1][0] = convMatrix[0][1][1] = convMatrix[0][1][2] = -2;
		convMatrix[0][2][0] = convMatrix[0][2][1] = convMatrix[0][2][2] = -1;
		convMatrix[1][0][0] = convMatrix[1][0][1] = convMatrix[1][0][2] = -2;
		convMatrix[1][1][0] = convMatrix[1][1][1] = convMatrix[1][1][2] = 13;
		convMatrix[1][2][0] = convMatrix[1][2][1] = convMatrix[1][2][2] = -2;
		convMatrix[2][0][0] = convMatrix[2][0][1] = convMatrix[2][0][2] = -1;
		convMatrix[2][1][0] = convMatrix[2][1][1] = convMatrix[2][1][2] = -2;
		convMatrix[2][2][0] = convMatrix[2][2][1] = convMatrix[2][2][2] = -1;
		for (int y = 0; y < sourceHeight; y++) {
			for (int x = 0; x < sourceWidth; x++) {
				int index = y*imageStep/2 + x * 3;
				pSourceImage.get ()[index    ] = x % 256;
				pSourceImage.get ()[index + 1] = y % 256;
				pSourceImage.get ()[index + 2] = (x + y) % 256;
			}
		}

		// old commands - always works
		IppStatus retIppiStatus = ippiConvValid_16s_C3R (
			pSourceImage.get (),  imageStep,   imageSize,		// original image: source
			&convMatrix[0][0][0], matrix1Step, matrixSize,		// convolution matrix
			pDestImage.get (),    imageStep,					// result image buffer: destination
			1													// convolution divider
		);

		// new commands: Often crashes
		std::unique_ptr <unsigned char[]> pAssistConvolution;	// array form, released with delete[]
		int buffSizeValid = 0;
		IppEnum algType = (IppEnum)(ippAlgAuto | ippiROIValid | ippiNormNone);
//		IppEnum algType = (IppEnum)(ippAlgAuto | ippiROIFull | ippiNormNone);
		ippiConvGetBufferSize (imageSize, matrixSize, ipp16s, sourceChannels, algType, &buffSizeValid);
		pAssistConvolution.reset (new unsigned char [buffSizeValid]);

		retIppiStatus = ippiConv_16s_C3R (
			pSourceImage.get (),  imageStep,   imageSize,		// original image: source
			&convMatrix[0][0][0], matrix1Step, matrixSize,		// convolution matrix
			pDestImage.get (),    imageStep,					// result image buffer: destination
			1,													// matrix divisor parameter
			algType,											// convolution type
			pAssistConvolution.get()							// assistance (work) buffer
		);
	}
	EndWaitCursor ();
}

</pre>

b_k_ · ‎02-23-2016

Hello Eyal.

Is the step size for the images correct? You have:

int imageStep = sourceWidth * depthBytes;

But it is a three-channel interleaved image, so I think the step size should be:

int imageStep = sourceWidth * depthBytes * sourceChannels;

By the way, why do you use C++ new and not ippiMalloc?

P.S. Calling it a hobby is very, very kind :).

l-eyal · ‎02-23-2016

Hi b.k.

Thanks for your reply.

The imageStep parameter is correct, as the sourceDepthBits parameter already incorporates the sourceChannels parameter and the depthBytes parameter is set correctly to 6.
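To spell the arithmetic out, here is a minimal sketch that assumes tightly packed rows with no padding. The helper name `packedStepBytes` and the `k...` constants are hypothetical and only illustrate the calculation; they are not part of IPP or of the test code above.

```cpp
// Hypothetical helper (not an IPP function): row step in bytes for a tightly
// packed, interleaved image with no padding between rows.
constexpr int packedStepBytes(int widthPixels, int channels, int bytesPerChannel)
{
    return widthPixels * channels * bytesPerChannel;
}

// Values from the test code: 1280 pixels wide, 3 channels, 16-bit samples.
constexpr int kDepthBytes = (16 * 3) / 8;       // bits per pixel -> 6 bytes per pixel
constexpr int kImageStep  = 1280 * kDepthBytes; // 7680 bytes per row

// The channel count is already folded into kDepthBytes, so it must not be
// multiplied in a second time.
static_assert(kDepthBytes == 6, "three 16-bit channels occupy 6 bytes");
static_assert(kImageStep == packedStepBytes(1280, 3, 2), "step counts the channels once");
```

Multiplying by sourceChannels again would give 23040 bytes per row, three times the real row length.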

Also, the old deprecated command uses the same imageStep parameter and the same buffers, and it does not fail.

Is there any benefit to using ippiMalloc over new (and std::unique_ptr)?

Thanks,

Eyal

b_k_ · ‎03-15-2016

Hi

Sorry for the long delay.

Using ippiMalloc spares you the need to calculate bytes and sizes.

For example: Ipp16s* ippiMalloc_16s_C3(int widthPixels, int heightPixels, int* pStepBytes);

This gives you an image buffer with the start of each row aligned to 32 or 64 bytes, and returns the row step in bytes through pStepBytes.

Since pointer arithmetic is very common with IPP, the stepBytes value returned by ippiMalloc saves a few bugs.
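As an illustration of the kind of pointer arithmetic meant here, below is a minimal sketch in plain C++ with no IPP calls. The names `pixelAt` and `demoPaddedAccess` are hypothetical, and the 32-byte step is just an assumed padded row length for a tiny 4x2 image.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an IPP function): pointer to the first channel of
// pixel (x, y) in an interleaved 3-channel, 16-bit image. stepBytes is the
// distance in bytes between row starts, so the row offset is applied to a
// byte pointer before converting back to the sample type.
inline std::int16_t* pixelAt(std::int16_t* pImage, int stepBytes, int x, int y)
{
    unsigned char* pRow = reinterpret_cast<unsigned char*>(pImage)
                        + static_cast<std::ptrdiff_t>(y) * stepBytes;
    return reinterpret_cast<std::int16_t*>(pRow) + x * 3;
}

// A 4x2 image whose rows hold 24 bytes of pixels plus 8 bytes of padding.
inline std::int16_t demoPaddedAccess()
{
    const int stepBytes = 32;
    std::vector<std::int16_t> buffer(2 * stepBytes / sizeof(std::int16_t), 0);

    // Write channel 2 of pixel (3, 1) through the step-based accessor.
    pixelAt(buffer.data(), stepBytes, 3, 1)[2] = 42;

    // Row 1 starts 16 samples in; pixel 3, channel 2 sits at 3 * 3 + 2 = 11.
    return buffer[16 + 11];
}
```

Indexing with `y * width * 3` instead would land inside the padding as soon as the step is larger than the packed row length, which is the sort of bug the returned step value helps avoid.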

Aligned rows should make memory reads and writes more efficient. I never noticed any difference, but maybe that is because I only used small images (640x480).

According to older posts, the ippiMalloc functions are wrappers around standard malloc, with some calculation of where the aligned start point is.

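You can use ippiMalloc and compare the stepBytes you get to the one you calculate yourself. For your image they should be the same, since 1280 * 6 = 7680 bytes is already a multiple of 64; for other widths ippiMalloc may pad each row, so the returned step can be larger.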
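Whether the two values agree depends on the row length. Here is a rough sketch of that comparison, assuming rows are padded up to a 64-byte multiple; the helper `alignedStepBytes` is hypothetical, and the padding ippiMalloc really applies is up to the library.

```cpp
// Hypothetical helper (not an IPP function): round a row size in bytes up to
// the next multiple of `alignment`, which must be a power of two.
constexpr int alignedStepBytes(int rowBytes, int alignment)
{
    return (rowBytes + alignment - 1) & ~(alignment - 1);
}

// 1280 pixels * 6 bytes = 7680 bytes, already a multiple of 64: no padding.
static_assert(alignedStepBytes(1280 * 6, 64) == 1280 * 6, "packed and aligned steps agree");

// 641 pixels * 6 bytes = 3846 bytes, rounded up to 3904: the steps differ.
static_assert(alignedStepBytes(641 * 6, 64) == 3904, "aligned step is larger than packed step");
```

So a step computed by hand only matches when the packed row length happens to be a multiple of the alignment; in general it is safer to use the step value the allocator returns.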