<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Question on Gaussian convolution: 2d vs 1d 2-pass in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/Question-on-Gaussian-convolution-2d-vs-1d-2-pass/m-p/808260#M3739</link>
    <description>&lt;DIV&gt;Hi, I am experimenting with Gaussian blur. Per the documentation, the 3x3 kernel in ippiFilterGauss is:&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;1/16, 2/16, 1/16,&lt;/DIV&gt;&lt;DIV&gt;2/16, 4/16, 2/16,&lt;/DIV&gt;&lt;DIV&gt;1/16, 2/16, 1/16&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;which is separable, with the 1D equivalent:&lt;/DIV&gt;&lt;DIV&gt;[1/4, 2/4, 1/4]&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;By convolving twice (horizontally with ippiFilterRow32f, then vertically on the result of the first pass with ippiFilterColumn32f), I should get the same result as convolving once with the 2D kernel (ippiFilterGauss/ippiFilter32f); it should also be faster.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;But my results show differences between the two. I am unsure whether I am doing something wrong, especially at the second border extension.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;[cpp]// extend border via replication: ext1 is the returned border extended img; topExtend = 1, leftExtend = 1
IppStatus borderExtend(Ipp8u* img, IppiSize roi, int lineStep, Ipp8u** ext1, int&amp;amp; extStep1, 
	Ipp8u** startPt, int width, int height, int numChannels, int topExtend, int leftExtend)
{
	int extWidth = width + leftExtend * 2;
	int extHeight = height + topExtend * 2;

	IppStatus status;  
	IppiSize extRoi = { extWidth, extHeight };

	*ext1 = ippiMalloc_8u_C3(extWidth, extHeight, &amp;amp;extStep1);
	// shift i.e. for 3 channels interleaved, 1 extra pixel to left, 1 extra pixel down =&amp;gt; + (1 * step) + 3
	// + 3 as bmp is interleaved, need to shift by another 3 bytes over to get to next pixel
	*startPt = *ext1 + (leftExtend * extStep1) + numChannels;

	// copy over from buffer in image to ext1
	status = ippiCopy_8u_C3R(img, lineStep, *startPt, extStep1, roi);
	// extend by n pixel on each side
	status = ippiCopyReplicateBorder_8u_C3IR(*startPt, extStep1, roi, extRoi, topExtend, leftExtend);
	
	return status;
}[/cpp] [cpp]IppStatus blur(Ipp8u* img, IppiSize roi, int lineStep, int width, int height, int numChannels, int topExtend, int leftExtend)
{
	IppStatus status = ippStsNoErr;
	int ext1Line;
	Ipp8u* extend1 = NULL;
	Ipp8u* startPt1 = NULL;

	// extend border; output: extend1
	status = borderExtend(img, roi, lineStep, &amp;amp;extend1, ext1Line, &amp;amp;startPt1, width, height, numChannels, topExtend, leftExtend);

	// temp holding buffer for results after 1st pass
	int tempLine;
	Ipp8u* tempBuffer = ippiMalloc_8u_C3(width, height, &amp;amp;tempLine);

	Ipp32f kernel[] = {1/4.0f, 2/4.0f, 1/4.0f};

	// filter horiz with kernel; output: tempBuffer
	status = ippiFilterRow32f_8u_C3R(startPt1, ext1Line, tempBuffer, tempLine, roi, kernel, 3, 1);

	// extend again; output: extend2
	int ext2Line;
	Ipp8u* extend2 = NULL;
	Ipp8u* startPt2 = NULL;

	status = borderExtend(tempBuffer, roi, tempLine, &amp;amp;extend2, ext2Line, &amp;amp;startPt2, width, height, numChannels, 1, 1);

	// filter vert, output: img
	status = ippiFilterColumn32f_8u_C3R(startPt2, ext2Line, img, lineStep, roi, kernel, 3, 1);
	  
    return status;  
}[/cpp] My program needs to handle different Gaussian kernels, hence the need to do 2-pass 1D convolution to maximize execution speed.&lt;/DIV&gt;</description>
    <pubDate>Tue, 05 Jun 2012 10:43:13 GMT</pubDate>
    <dc:creator>cks2k2</dc:creator>
    <dc:date>2012-06-05T10:43:13Z</dc:date>
    <item>
      <title>Question on Gaussian convolution: 2d vs 1d 2-pass</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Question-on-Gaussian-convolution-2d-vs-1d-2-pass/m-p/808260#M3739</link>
      <description>&lt;DIV&gt;Hi, I am experimenting with Gaussian blur. Per the documentation, the 3x3 kernel in ippiFilterGauss is:&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;1/16, 2/16, 1/16,&lt;/DIV&gt;&lt;DIV&gt;2/16, 4/16, 2/16,&lt;/DIV&gt;&lt;DIV&gt;1/16, 2/16, 1/16&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;which is separable, with the 1D equivalent:&lt;/DIV&gt;&lt;DIV&gt;[1/4, 2/4, 1/4]&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;By convolving twice (horizontally with ippiFilterRow32f, then vertically on the result of the first pass with ippiFilterColumn32f), I should get the same result as convolving once with the 2D kernel (ippiFilterGauss/ippiFilter32f); it should also be faster.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;But my results show differences between the two. I am unsure whether I am doing something wrong, especially at the second border extension.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;[cpp]// extend border via replication: ext1 is the returned border extended img; topExtend = 1, leftExtend = 1
IppStatus borderExtend(Ipp8u* img, IppiSize roi, int lineStep, Ipp8u** ext1, int&amp;amp; extStep1, 
	Ipp8u** startPt, int width, int height, int numChannels, int topExtend, int leftExtend)
{
	int extWidth = width + leftExtend * 2;
	int extHeight = height + topExtend * 2;

	IppStatus status;  
	IppiSize extRoi = { extWidth, extHeight };

	*ext1 = ippiMalloc_8u_C3(extWidth, extHeight, &amp;amp;extStep1);
	// shift i.e. for 3 channels interleaved, 1 extra pixel to left, 1 extra pixel down =&amp;gt; + (1 * step) + 3
	// + 3 as bmp is interleaved, need to shift by another 3 bytes over to get to next pixel
	*startPt = *ext1 + (leftExtend * extStep1) + numChannels;

	// copy over from buffer in image to ext1
	status = ippiCopy_8u_C3R(img, lineStep, *startPt, extStep1, roi);
	// extend by n pixel on each side
	status = ippiCopyReplicateBorder_8u_C3IR(*startPt, extStep1, roi, extRoi, topExtend, leftExtend);
	
	return status;
}[/cpp] [cpp]IppStatus blur(Ipp8u* img, IppiSize roi, int lineStep, int width, int height, int numChannels, int topExtend, int leftExtend)
{
	IppStatus status = ippStsNoErr;
	int ext1Line;
	Ipp8u* extend1 = NULL;
	Ipp8u* startPt1 = NULL;

	// extend border; output: extend1
	status = borderExtend(img, roi, lineStep, &amp;amp;extend1, ext1Line, &amp;amp;startPt1, width, height, numChannels, topExtend, leftExtend);

	// temp holding buffer for results after 1st pass
	int tempLine;
	Ipp8u* tempBuffer = ippiMalloc_8u_C3(width, height, &amp;amp;tempLine);

	Ipp32f kernel[] = {1/4.0f, 2/4.0f, 1/4.0f};

	// filter horiz with kernel; output: tempBuffer
	status = ippiFilterRow32f_8u_C3R(startPt1, ext1Line, tempBuffer, tempLine, roi, kernel, 3, 1);

	// extend again; output: extend2
	int ext2Line;
	Ipp8u* extend2 = NULL;
	Ipp8u* startPt2 = NULL;

	status = borderExtend(tempBuffer, roi, tempLine, &amp;amp;extend2, ext2Line, &amp;amp;startPt2, width, height, numChannels, 1, 1);

	// filter vert, output: img
	status = ippiFilterColumn32f_8u_C3R(startPt2, ext2Line, img, lineStep, roi, kernel, 3, 1);
	  
    return status;  
}[/cpp] My program needs to handle different Gaussian kernels, hence the need to do 2-pass 1D convolution to maximize execution speed.&lt;/DIV&gt;</description>
      <pubDate>Tue, 05 Jun 2012 10:43:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Question-on-Gaussian-convolution-2d-vs-1d-2-pass/m-p/808260#M3739</guid>
      <dc:creator>cks2k2</dc:creator>
      <dc:date>2012-06-05T10:43:13Z</dc:date>
    </item>
    <item>
      <title>Question on Gaussian convolution: 2d vs 1d 2-pass</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Question-on-Gaussian-convolution-2d-vs-1d-2-pass/m-p/808261#M3740</link>
      <description>&lt;P&gt;Hello, &lt;BR /&gt;&lt;BR /&gt;For the following code:&lt;BR /&gt; &amp;gt;*startPt = *ext1 + (leftExtend * extStep1) + numChannels;&lt;BR /&gt;&lt;BR /&gt;should it be something like this?&lt;BR /&gt; &amp;gt;*startPt = *ext1 + (topExtend * extStep1) + numChannels * leftExtend;&lt;BR /&gt;&lt;BR /&gt;I do not see any other problems. If you have some runnable code, that would help to reproduce the problem easily.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Chao&lt;/P&gt;</description>
      <pubDate>Wed, 06 Jun 2012 06:26:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Question-on-Gaussian-convolution-2d-vs-1d-2-pass/m-p/808261#M3740</guid>
      <dc:creator>Chao_Y_Intel</dc:creator>
      <dc:date>2012-06-06T06:26:09Z</dc:date>
    </item>
    <item>
      <title>Question on Gaussian convolution: 2d vs 1d 2-pass</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Question-on-Gaussian-convolution-2d-vs-1d-2-pass/m-p/808262#M3741</link>
      <description>Thanks for looking through the code; my code would have failed for border extensions &amp;gt; 1 pixel.&lt;DIV&gt;But after making the corrections, my 1D result still differs from the 2D result (checked with an image diff program).&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;I have attached the input bitmap (testa.bmp), the resulting outputs (testoutg2d.bmp, using ippiFilterGauss; testoutg1d.bmp, using ippiFilterRow32f and ippiFilterColumn32f), and the source code.&lt;BR /&gt;&lt;BR /&gt;(Not sure if I have attached the files correctly.)&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 06 Jun 2012 07:19:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Question-on-Gaussian-convolution-2d-vs-1d-2-pass/m-p/808262#M3741</guid>
      <dc:creator>cks2k2</dc:creator>
      <dc:date>2012-06-06T07:19:48Z</dc:date>
    </item>
  </channel>
</rss>

