<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic ippiConv memory access problem in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippiConv-memory-access-problem/m-p/1077268#M24681</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I have been trying for a very long time to convert code to work with non-deprecated IPP functions (a long-term hobby for this forum's participants). I am having difficulty converting the old ippiConvValid_16s_C3R calls to the new ippiConv_16s_C3R: it often crashes with "Unhandled exception: Access violation reading location 0xffffffff".&lt;/P&gt;

&lt;P&gt;I am using Visual Studio 2013. The code is compiled with Intel Parallel Studio XE 2015 Update 4 Composer Edition for C++ Windows, version 15.0.1386.12.XE, as a 32-bit build with support for SSE 4.1/4.2. The code runs on 32-bit Windows 7 on an i5 M520 CPU (and also crashes on 64-bit Windows 8.1 with an i7 4790 processor). IPP version: 8.2.2 (r46212); ippIP SSE4.1/4.2 (p8)+; from 1 Apr 2015.&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;&amp;lt;pre brusk:cpp&amp;gt;
void CUnitTestsDlg::OnBnClickedConvolutionTestButton ()
{	// 15.2.16
	const IppLibraryVersion *pVer = ippiGetLibVersion ();
	BeginWaitCursor ();
	for (int i=0; i&amp;lt;1000; i++) {
		const int sourceWidth  = 1280;
		const int sourceHeight =  960;
		const int sourceChannels = 3;
		const int sourceDepthBits = 16 * sourceChannels;

		IppiSize imageSize;
		imageSize.width  = sourceWidth;
		imageSize.height = sourceHeight;
		int depthBytes   = sourceDepthBits / 8;
		int imageStep    = sourceWidth * depthBytes;

		IppiSize matrixSize;
		matrixSize.width  = 3;
		matrixSize.height = 3;
		int matrix1Step = matrixSize.width * depthBytes;

		std::unique_ptr &amp;lt;signed short&amp;gt; pSourceImage = std::unique_ptr &amp;lt;signed short&amp;gt; (new short[sourceWidth * sourceHeight * sourceChannels]);
		std::unique_ptr &amp;lt;signed short&amp;gt; pDestImage   = std::unique_ptr &amp;lt;signed short&amp;gt; (new short[sourceWidth * sourceHeight * sourceChannels]);

		// fill reasonable data
		short convMatrix[3][3][3];
		convMatrix[0][0][0] = convMatrix[0][0][1] = convMatrix[0][0][2] = -1;
		convMatrix[0][1][0] = convMatrix[0][1][1] = convMatrix[0][1][2] = -2;
		convMatrix[0][2][0] = convMatrix[0][2][1] = convMatrix[0][2][2] = -1;
		convMatrix[1][0][0] = convMatrix[1][0][1] = convMatrix[1][0][2] = -2;
		convMatrix[1][1][0] = convMatrix[1][1][1] = convMatrix[1][1][2] = 13;
		convMatrix[1][2][0] = convMatrix[1][2][1] = convMatrix[1][2][2] = -2;
		convMatrix[2][0][0] = convMatrix[2][0][1] = convMatrix[2][0][2] = -1;
		convMatrix[2][1][0] = convMatrix[2][1][1] = convMatrix[2][1][2] = -2;
		convMatrix[2][2][0] = convMatrix[2][2][1] = convMatrix[2][2][2] = -1;
		for (int y = 0; y &amp;lt; sourceHeight; y++) {
			for (int x = 0; x &amp;lt; sourceWidth; x++) {
				int index = y*imageStep/2 + x * 3;
				pSourceImage.get ()[index    ] = x % 256;
				pSourceImage.get ()[index + 1] = y % 256;
				pSourceImage.get ()[index + 2] = (x + y) % 256;
			}
		}

		// old commands - always works
		IppStatus retIppiStatus = ippiConvValid_16s_C3R (
			pSourceImage.get (),  imageStep,   imageSize,		// original image: source
			&amp;amp;convMatrix[0][0][0], matrix1Step, matrixSize,		// convolution matrix
			pDestImage.get (),    imageStep,					// result image buffer: destination
			1													// convolution divider
		);

		// new commands: Often crashes
		std::unique_ptr &amp;lt;unsigned char&amp;gt; pAssistConvolution {nullptr};
		int buffSizeValid = 0;
		IppEnum algType = (IppEnum)(ippAlgAuto | ippiROIValid | ippiNormNone);
//		IppEnum algType = (IppEnum)(ippAlgAuto | ippiROIFull | ippiNormNone);
		ippiConvGetBufferSize (imageSize, matrixSize, ipp16s, sourceChannels, algType, &amp;amp;buffSizeValid);
		pAssistConvolution = std::unique_ptr &amp;lt;unsigned char&amp;gt; (new unsigned char [buffSizeValid]);

		retIppiStatus = ippiConv_16s_C3R (
			pSourceImage.get (),  imageStep,   imageSize,		// original image: source
			&amp;amp;convMatrix[0][0][0], matrix1Step, matrixSize,		// convolution matrix
			pDestImage.get (),    imageStep,					// result image buffer: destination
			1,													// matrix divisor parameter
			algType,											// convolution type
			pAssistConvolution.get()							// work buffer
		);
	}
	EndWaitCursor ();
}
&lt;/PRE&gt;


</description>
    <pubDate>Thu, 18 Feb 2016 07:01:09 GMT</pubDate>
    <dc:creator>l-eyal</dc:creator>
    <dc:date>2016-02-18T07:01:09Z</dc:date>
    <item>
      <title>ippiConv memory access problem</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippiConv-memory-access-problem/m-p/1077268#M24681</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I have been trying for a very long time to convert code to work with non-deprecated IPP functions (a long-term hobby for this forum's participants). I am having difficulty converting the old ippiConvValid_16s_C3R calls to the new ippiConv_16s_C3R: it often crashes with "Unhandled exception: Access violation reading location 0xffffffff".&lt;/P&gt;

&lt;P&gt;I am using Visual Studio 2013. The code is compiled with Intel Parallel Studio XE 2015 Update 4 Composer Edition for C++ Windows, version 15.0.1386.12.XE, as a 32-bit build with support for SSE 4.1/4.2. The code runs on 32-bit Windows 7 on an i5 M520 CPU (and also crashes on 64-bit Windows 8.1 with an i7 4790 processor). IPP version: 8.2.2 (r46212); ippIP SSE4.1/4.2 (p8)+; from 1 Apr 2015.&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;&amp;lt;pre brusk:cpp&amp;gt;
void CUnitTestsDlg::OnBnClickedConvolutionTestButton ()
{	// 15.2.16
	const IppLibraryVersion *pVer = ippiGetLibVersion ();
	BeginWaitCursor ();
	for (int i=0; i&amp;lt;1000; i++) {
		const int sourceWidth  = 1280;
		const int sourceHeight =  960;
		const int sourceChannels = 3;
		const int sourceDepthBits = 16 * sourceChannels;

		IppiSize imageSize;
		imageSize.width  = sourceWidth;
		imageSize.height = sourceHeight;
		int depthBytes   = sourceDepthBits / 8;
		int imageStep    = sourceWidth * depthBytes;

		IppiSize matrixSize;
		matrixSize.width  = 3;
		matrixSize.height = 3;
		int matrix1Step = matrixSize.width * depthBytes;

		std::unique_ptr &amp;lt;signed short&amp;gt; pSourceImage = std::unique_ptr &amp;lt;signed short&amp;gt; (new short[sourceWidth * sourceHeight * sourceChannels]);
		std::unique_ptr &amp;lt;signed short&amp;gt; pDestImage   = std::unique_ptr &amp;lt;signed short&amp;gt; (new short[sourceWidth * sourceHeight * sourceChannels]);

		// fill reasonable data
		short convMatrix[3][3][3];
		convMatrix[0][0][0] = convMatrix[0][0][1] = convMatrix[0][0][2] = -1;
		convMatrix[0][1][0] = convMatrix[0][1][1] = convMatrix[0][1][2] = -2;
		convMatrix[0][2][0] = convMatrix[0][2][1] = convMatrix[0][2][2] = -1;
		convMatrix[1][0][0] = convMatrix[1][0][1] = convMatrix[1][0][2] = -2;
		convMatrix[1][1][0] = convMatrix[1][1][1] = convMatrix[1][1][2] = 13;
		convMatrix[1][2][0] = convMatrix[1][2][1] = convMatrix[1][2][2] = -2;
		convMatrix[2][0][0] = convMatrix[2][0][1] = convMatrix[2][0][2] = -1;
		convMatrix[2][1][0] = convMatrix[2][1][1] = convMatrix[2][1][2] = -2;
		convMatrix[2][2][0] = convMatrix[2][2][1] = convMatrix[2][2][2] = -1;
		for (int y = 0; y &amp;lt; sourceHeight; y++) {
			for (int x = 0; x &amp;lt; sourceWidth; x++) {
				int index = y*imageStep/2 + x * 3;
				pSourceImage.get ()[index    ] = x % 256;
				pSourceImage.get ()[index + 1] = y % 256;
				pSourceImage.get ()[index + 2] = (x + y) % 256;
			}
		}

		// old commands - always works
		IppStatus retIppiStatus = ippiConvValid_16s_C3R (
			pSourceImage.get (),  imageStep,   imageSize,		// original image: source
			&amp;amp;convMatrix[0][0][0], matrix1Step, matrixSize,		// convolution matrix
			pDestImage.get (),    imageStep,					// result image buffer: destination
			1													// convolution divider
		);

		// new commands: Often crashes
		std::unique_ptr &amp;lt;unsigned char&amp;gt; pAssistConvolution {nullptr};
		int buffSizeValid = 0;
		IppEnum algType = (IppEnum)(ippAlgAuto | ippiROIValid | ippiNormNone);
//		IppEnum algType = (IppEnum)(ippAlgAuto | ippiROIFull | ippiNormNone);
		ippiConvGetBufferSize (imageSize, matrixSize, ipp16s, sourceChannels, algType, &amp;amp;buffSizeValid);
		pAssistConvolution = std::unique_ptr &amp;lt;unsigned char&amp;gt; (new unsigned char [buffSizeValid]);

		retIppiStatus = ippiConv_16s_C3R (
			pSourceImage.get (),  imageStep,   imageSize,		// original image: source
			&amp;amp;convMatrix[0][0][0], matrix1Step, matrixSize,		// convolution matrix
			pDestImage.get (),    imageStep,					// result image buffer: destination
			1,													// matrix divisor parameter
			algType,											// convolution type
			pAssistConvolution.get()							// work buffer
		);
	}
	EndWaitCursor ();
}
&lt;/PRE&gt;


</description>
      <pubDate>Thu, 18 Feb 2016 07:01:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippiConv-memory-access-problem/m-p/1077268#M24681</guid>
      <dc:creator>l-eyal</dc:creator>
      <dc:date>2016-02-18T07:01:09Z</dc:date>
    </item>
    <item>
      <title>Is the step size for the</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippiConv-memory-access-problem/m-p/1077269#M24682</link>
      <description>Hello Eyal. 
Is the step size for the images correct?
you have:
int imageStep    = sourceWidth * depthBytes;
But it is a three-channel interleaved image. So I think the step size should be:
int imageStep    = sourceWidth * depthBytes * sourceChannels;
By the way, why do you use C++ new and not ippiMalloc? 
P.S. Calling it a hobby is very, very kind :).</description>
      <pubDate>Tue, 23 Feb 2016 14:06:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippiConv-memory-access-problem/m-p/1077269#M24682</guid>
      <dc:creator>b_k_</dc:creator>
      <dc:date>2016-02-23T14:06:48Z</dc:date>
    </item>
    <item>
      <title>Hi b.k.</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippiConv-memory-access-problem/m-p/1077270#M24683</link>
      <description>&lt;P&gt;Hi b.k.&lt;/P&gt;

&lt;P style="margin-top:0cm;margin-right:0cm;margin-bottom:18.0pt;margin-left:
0cm;line-height:14.65pt"&gt;&lt;SPAN style="font-size: 10pt; font-family: Arial, sans-serif;"&gt;Thanks for your reply.&lt;P&gt;&lt;/P&gt;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin: 0cm 0cm 18pt; line-height: 14.65pt;"&gt;&lt;SPAN style="font-size: 10pt; font-family: Arial, sans-serif;"&gt;The imageStep parameter is correct, as the&amp;nbsp;sourceDepthBits&amp;nbsp;parameter already incorporates&amp;nbsp;the sourceChannels parameter and the depthBytes parameter is set correctly to 6.&lt;P&gt;&lt;/P&gt;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-top:0cm;margin-right:0cm;margin-bottom:18.0pt;margin-left:
0cm;line-height:14.65pt"&gt;&lt;SPAN style="font-size: 10pt; font-family: Arial, sans-serif;"&gt;Also, the old deprecated command uses the same imageStep parameter and the same buffers, and it does not fail.&lt;P&gt;&lt;/P&gt;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-top:0cm;margin-right:0cm;margin-bottom:18.0pt;margin-left:
0cm;line-height:14.65pt"&gt;&lt;SPAN style="font-size: 10pt; font-family: Arial, sans-serif;"&gt;Is there any benefit by using the &lt;/SPAN&gt;&lt;SPAN style="font-size: 9pt; font-family: Arial, sans-serif; background-image: initial; background-attachment: initial; background-size: initial; background-origin: initial; background-clip: initial; background-position: initial; background-repeat: initial;"&gt;IppiMalloc&lt;/SPAN&gt; over new (and std::unique_ptr)?&lt;SPAN style="font-size: 10pt; font-family: Arial, sans-serif;"&gt;&lt;P&gt;&lt;/P&gt;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-top:0cm;margin-right:0cm;margin-bottom:18.0pt;margin-left:
0cm;line-height:14.65pt"&gt;&lt;SPAN style="font-size: 10pt; font-family: Arial, sans-serif;"&gt;Thanks,&lt;P&gt;&lt;/P&gt;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-top:0cm;margin-right:0cm;margin-bottom:18.0pt;margin-left:
0cm;line-height:14.65pt"&gt;&lt;SPAN style="font-size: 10pt; font-family: Arial, sans-serif;"&gt;Eyal&lt;P&gt;&lt;/P&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 23 Feb 2016 16:32:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippiConv-memory-access-problem/m-p/1077270#M24683</guid>
      <dc:creator>l-eyal</dc:creator>
      <dc:date>2016-02-23T16:32:13Z</dc:date>
    </item>
    <item>
      <title>Hi</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippiConv-memory-access-problem/m-p/1077271#M24684</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;

&lt;P&gt;Sorry for the long delay.&lt;/P&gt;

&lt;P&gt;Using ippiMalloc spares you the need to calculate bytes and sizes.&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;For example: Ipp16s* ippiMalloc_16s_C3(int widthPixels, int heightPixels, int* pStepBytes);&lt;/PRE&gt;

&lt;P&gt;&lt;SPAN class="delim"&gt;Gives you the image with each row start aligned to 32 \ 64 bytes, and the stepBytes value.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN class="delim"&gt;Since pointer arythmetic is very common with IPP, the stepBytes parameter in ippiMalloc saves a few bugs.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN class="delim"&gt;Aligned rows sould make memory read\write more effective. I never noticed any difference, but mabe I just used small images (640x480).&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN class="delim"&gt;From older posts the ippiMalloc functions are wrappers for standard malloc with some calculation of where the start point is. &lt;/SPAN&gt;&lt;/P&gt;

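&lt;P&gt;You can use ippiMalloc and compare the stepBytes you get to the one you calculate yourself. They should be the same as long as your row size is already a multiple of the alignment, which 1280 * 6 = 7680 bytes is for both 32 and 64.&lt;/P&gt;</description>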
      <pubDate>Tue, 15 Mar 2016 08:41:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippiConv-memory-access-problem/m-p/1077271#M24684</guid>
      <dc:creator>b_k_</dc:creator>
      <dc:date>2016-03-15T08:41:26Z</dc:date>
    </item>
  </channel>
</rss>

