<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Andres, in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/problems-with-ippiMalloc-8u-C4/m-p/1100503#M25142</link>
    <description>&lt;P&gt;Andres,&lt;/P&gt;

&lt;P&gt;ippiMalloc just calls the system malloc function to allocate the memory.&amp;nbsp; If the address malloc returns is not aligned, IPP may ask malloc for a few more bytes and return an aligned address inside that block.&amp;nbsp; (1) Such addresses may not be expected, so please check whether there is an error in your code.&amp;nbsp; (2) If the width is a multiple of 64, the lines are contiguous in memory.&amp;nbsp; You can check the step size: it is expected to be the same as the width * pixel size.&amp;nbsp; (3) You can use ippiMalloc_8u_C4().&amp;nbsp; In IPP, the 8u_C4 suffix means the image has four channels (BGRA) and each channel is 8 bits.&lt;/P&gt;

&lt;P&gt;But if you define the image as “uint32_t”, the following code has a problem. nImageSize counts bytes, but each assignment through a uint32_t pointer copies 4 bytes, so the loop runs past the end of the image:&lt;/P&gt;

&lt;P&gt;int nImageSize = nWidth * nHeight * 4; &amp;nbsp; &amp;nbsp; // for BGRA, 4 bytes per pixel&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; for (int i = 0; i &amp;lt; nImageSize; ++i) &amp;nbsp; *pDst++ &amp;nbsp; = &amp;nbsp;*pSrc++;&lt;/P&gt;

&lt;P&gt;Change it to either of the following:&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; uint32_t *pImage ;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; for (int i = 0; i &amp;lt; nWidth * nHeight; ++i) &amp;nbsp; *pDst++ &amp;nbsp; = &amp;nbsp;*pSrc++;&lt;/P&gt;

&lt;P&gt;or:&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Ipp8u *pImage ;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;for (int i &amp;nbsp;= 0; i &amp;lt; nWidth * nHeight*4; ++i) &amp;nbsp; *pDst++ &amp;nbsp; = &amp;nbsp;*pSrc++;&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
	Chao&lt;/P&gt;</description>
    <pubDate>Fri, 18 Dec 2015 04:33:00 GMT</pubDate>
    <dc:creator>Chao_Y_Intel</dc:creator>
    <dc:date>2015-12-18T04:33:00Z</dc:date>
    <item>
      <title>problems with ippiMalloc_8u_C4()</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/problems-with-ippiMalloc-8u-C4/m-p/1100502#M25141</link>
      <description>&lt;P&gt;I am having all sorts of pain chasing down a bug that crashes my app when exiting, that is, when I call all of the destructors and exit. Using gdb, I have noticed something that seems strange to me. &amp;nbsp; When I first call ippiMalloc_8u_C4() I get a very large/high pointer, but subsequent calls return much lower memory. &amp;nbsp;For example, my log will look like this:&lt;/P&gt;

&lt;P&gt;Dec 08 16:04:29.038 &amp;nbsp;***: ippiMalloc() returning ptr: 0x7f6a09347040&amp;nbsp;&lt;BR /&gt;
	Dec 08 16:04:29.053 &amp;nbsp;***: ippiMalloc() returning ptr: 0x2e8f9c0&amp;nbsp;&lt;BR /&gt;
	Dec 08 16:04:29.057 &amp;nbsp;***: ippiMalloc() returning ptr: 0x269ed80&amp;nbsp;&lt;/P&gt;

&lt;P&gt;and all subsequent calls (about 20) will return pointers similar to the last 2 above -- only the very first call to ippiMalloc() returns a large/high pointer. &amp;nbsp;This seems to be related to my crash bug, because I get a segfault in my logging functions on either the 2nd or 3rd free during shutdown. That is, when I call ippiFree() on the first large pointer (and sometimes the second smaller pointer above) all is fine. However, when I call ippiFree() on the 3rd pointer (and sometimes on the 2nd pointer), I get a segfault. &amp;nbsp;When I run separate test code on the objects that allocate that memory, I never get the large pointer on the first allocation (that is, all of the allocations yield the small/lower pointers) and I never get a segfault during shutdown.&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Question #1: &amp;nbsp;What does ippiMalloc() do that would cause such variation in the size/location of the returned pointer? &amp;nbsp;Is this expected behavior?&lt;/P&gt;

&lt;P&gt;Question #2: &amp;nbsp;If both width and height are multiples of 64, will the memory returned by ippiMalloc() ever be discontinuous? &amp;nbsp;That is, will the beginning of a new line always immediately follow the end of the previous line? &amp;nbsp;Some posts on this forum indicate that both the W and H will be aligned. Will that alignment result in gaps in the memory? &amp;nbsp;That is, the following code assumes contiguous memory without any gaps due to alignment but would break if rows had to be aligned:&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;int nImageSize = nWidth * nHeight * 4; &amp;nbsp; &amp;nbsp; // for BGRA, 4 bytes per pixel&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; for (int i = 0; i &amp;lt; nImageSize; ++i) &amp;nbsp; *pDst++ &amp;nbsp; = &amp;nbsp;*pSrc++;&lt;/P&gt;

&lt;P&gt;Question #3: &amp;nbsp;If I am processing BGRA images, should I use ippiMalloc_8u_C4() &amp;nbsp;and just cast it to an int pointer? &amp;nbsp; For example like this:&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp;uint32_t *pImage &amp;nbsp;= (uint32_t *) ippiMalloc_8u_C4(width, height, &amp;amp;step);&lt;/P&gt;

&lt;P&gt;This is somewhat confusing because you do not provide an ippiMalloc_32u_C4(), only a signed version. &amp;nbsp;&lt;/P&gt;

&lt;P&gt;Thanks for any help,&lt;/P&gt;

&lt;P&gt;-Andres&lt;/P&gt;</description>
      <pubDate>Tue, 08 Dec 2015 22:32:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/problems-with-ippiMalloc-8u-C4/m-p/1100502#M25141</guid>
      <dc:creator>Andres_G_1</dc:creator>
      <dc:date>2015-12-08T22:32:25Z</dc:date>
    </item>
    <item>
      <title>Andres,</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/problems-with-ippiMalloc-8u-C4/m-p/1100503#M25142</link>
      <description>&lt;P&gt;Andres,&lt;/P&gt;

&lt;P&gt;ippiMalloc just calls the system malloc function to allocate the memory.&amp;nbsp; If the address malloc returns is not aligned, IPP may ask malloc for a few more bytes and return an aligned address inside that block.&amp;nbsp; (1) Such addresses may not be expected, so please check whether there is an error in your code.&amp;nbsp; (2) If the width is a multiple of 64, the lines are contiguous in memory.&amp;nbsp; You can check the step size: it is expected to be the same as the width * pixel size.&amp;nbsp; (3) You can use ippiMalloc_8u_C4().&amp;nbsp; In IPP, the 8u_C4 suffix means the image has four channels (BGRA) and each channel is 8 bits.&lt;/P&gt;

&lt;P&gt;But if you define the image as “uint32_t”, the following code has a problem. nImageSize counts bytes, but each assignment through a uint32_t pointer copies 4 bytes, so the loop runs past the end of the image:&lt;/P&gt;

&lt;P&gt;int nImageSize = nWidth * nHeight * 4; &amp;nbsp; &amp;nbsp; // for BGRA, 4 bytes per pixel&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; for (int i = 0; i &amp;lt; nImageSize; ++i) &amp;nbsp; *pDst++ &amp;nbsp; = &amp;nbsp;*pSrc++;&lt;/P&gt;

&lt;P&gt;Change it to either of the following:&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; uint32_t *pImage ;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; for (int i = 0; i &amp;lt; nWidth * nHeight; ++i) &amp;nbsp; *pDst++ &amp;nbsp; = &amp;nbsp;*pSrc++;&lt;/P&gt;

&lt;P&gt;or:&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Ipp8u *pImage ;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;for (int i &amp;nbsp;= 0; i &amp;lt; nWidth * nHeight*4; ++i) &amp;nbsp; *pDst++ &amp;nbsp; = &amp;nbsp;*pSrc++;&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
	Chao&lt;/P&gt;</description>
      <pubDate>Fri, 18 Dec 2015 04:33:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/problems-with-ippiMalloc-8u-C4/m-p/1100503#M25142</guid>
      <dc:creator>Chao_Y_Intel</dc:creator>
      <dc:date>2015-12-18T04:33:00Z</dc:date>
    </item>
  </channel>
</rss>

