<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic @Jon: thanks for your reply, in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/Images-Stride-and-Memory-Alignment-Question-IPP/m-p/1110521#M25399</link>
    <description>&lt;P&gt;@Jon: thanks for your reply; however, my question was slightly different. Your answer only guarantees that the first scanline in the buffer is properly aligned; I was asking about the other ones. I should probably tweak the stride in order to align *each* scanline.&lt;/P&gt;</description>
    <pubDate>Tue, 03 May 2016 06:05:02 GMT</pubDate>
    <dc:creator>Rietschin__Axel</dc:creator>
    <dc:date>2016-05-03T06:05:02Z</dc:date>
    <item>
      <title>Images, Stride and Memory Alignment Question (IPP)</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Images-Stride-and-Memory-Alignment-Question-IPP/m-p/1110518#M25396</link>
      <description>&lt;P&gt;Hello there,&lt;/P&gt;

&lt;P&gt;Is it worth aligning the scan lines of an image so that each row begins on a 16-byte-aligned address? That is, rounding the stride up to the next multiple of 16 bytes?&lt;/P&gt;

&lt;P&gt;I assume this might help a bit when processing the entire image, but the real question is: does IPP care?&lt;/P&gt;

&lt;P&gt;If so, along the same lines, is it worth 32-byte-aligning scan lines on CPUs that have a 256-bit vector unit, or 64-byte-aligning them for AVX-512 chips?&lt;/P&gt;
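
&lt;P&gt;A minimal sketch of what rounding up the stride means, written in plain C rather than with any IPP call. The helper names aligned_stride and row_start are hypothetical. The point is that if the buffer base is aligned and the stride is a multiple of the same alignment, every row start is aligned as well.&lt;/P&gt;

```c
/* Hypothetical helper: round a row width in bytes up to the next multiple
   of `alignment` (16, 32 or 64 in the cases discussed here). */
static int aligned_stride(int row_bytes, int alignment)
{
    return (row_bytes + alignment - 1) / alignment * alignment;
}

/* Hypothetical helper: address of row `y` in a buffer whose rows are
   `stride` bytes apart. */
static unsigned char *row_start(unsigned char *base, int y, int stride)
{
    return base + (long long)y * stride;
}
```

&lt;P&gt;For example, a 100-byte row gets a stride of 112 bytes for 16-byte alignment, and 128 bytes for 32-byte or 64-byte alignment.&lt;/P&gt;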

&lt;P&gt;Thanks,&lt;BR /&gt;
	Axel&lt;/P&gt;</description>
      <pubDate>Fri, 29 Apr 2016 16:40:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Images-Stride-and-Memory-Alignment-Question-IPP/m-p/1110518#M25396</guid>
      <dc:creator>Rietschin__Axel</dc:creator>
      <dc:date>2016-04-29T16:40:29Z</dc:date>
    </item>
    <item>
      <title>Hi Alex,</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Images-Stride-and-Memory-Alignment-Question-IPP/m-p/1110519#M25397</link>
      <description>&lt;P&gt;Hi Alex,&lt;/P&gt;

&lt;P&gt;When IPP allocates memory buffers for image processing, it ensures that the data is aligned appropriately, so the source image itself does not need to be in aligned memory for the best performance.&lt;/P&gt;

&lt;P&gt;Please see the documentation excerpt below.&lt;/P&gt;

&lt;H2 class="sectiontitle" style="font: 200 27.99px/35px intel-clear, &amp;quot;Helvetica Neue&amp;quot;, Helvetica, Arial; margin: 0px; color: rgb(83, 86, 90); text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; white-space: normal; box-sizing: border-box; widows: 1; font-size-adjust: none; font-stretch: normal; background-color: rgb(255, 255, 255); -webkit-text-stroke-width: 0px;"&gt;&lt;SPAN class="option" style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; box-sizing: border-box;"&gt;Malloc&lt;/SPAN&gt;/&lt;SPAN class="option" style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; box-sizing: border-box;"&gt;Free&lt;/SPAN&gt;&lt;/H2&gt;

&lt;P style="font: 13px/1.4 Arial, Tahoma, Helvetica, sans-serif; margin: 0px 0px 1em; width: auto; color: rgb(102, 102, 102); text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; display: block; white-space: normal; -ms-word-wrap: break-word; max-width: 100%; box-sizing: border-box; widows: 1; font-size-adjust: none; font-stretch: normal; background-color: rgb(255, 255, 255); -webkit-text-stroke-width: 0px;"&gt;Intel IPP functions provide better performance if they process data with aligned pointers. Intel IPP provides the following functions to ensure that data is aligned appropriately - 16-byte for CPU that does not support Intel® Advanced Vector Extensions (Intel® AVX) instruction set, 32-byte for Intel AVX and Intel® Advanced Vector Extensions 2 (Intel® AVX2), and 64-byte for Intel® Many Integrated Core instructions.&lt;/P&gt;

&lt;P style="font: 13px/1.4 Arial, Tahoma, Helvetica, sans-serif; margin: 0px 0px 1em; width: auto; color: rgb(102, 102, 102); text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; display: block; white-space: normal; -ms-word-wrap: break-word; max-width: 100%; box-sizing: border-box; widows: 1; font-size-adjust: none; font-stretch: normal; background-color: rgb(255, 255, 255); -webkit-text-stroke-width: 0px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;PRE style="font: 13px/1.6em &amp;quot;Courier New&amp;quot;, Courier, monospace; margin: 1.6em 0px; color: rgb(102, 102, 102); text-transform: none; text-indent: 0px; letter-spacing: normal; clear: both; word-spacing: 0px; white-space: pre-wrap; box-sizing: border-box; widows: 1; font-size-adjust: none; font-stretch: normal; background-color: rgb(245, 245, 245); -webkit-text-stroke-width: 0px;"&gt;void* ippMalloc(int length)
void ippFree(void* ptr)
&lt;/PRE&gt;

&lt;P style="font: 13px/1.4 Arial, Tahoma, Helvetica, sans-serif; margin: 0px 0px 1em; width: auto; color: rgb(102, 102, 102); text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; display: block; white-space: normal; -ms-word-wrap: break-word; max-width: 100%; box-sizing: border-box; widows: 1; font-size-adjust: none; font-stretch: normal; background-color: rgb(255, 255, 255); -webkit-text-stroke-width: 0px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="font: 13px/1.4 Arial, Tahoma, Helvetica, sans-serif; margin: 0px 0px 1em; width: auto; color: rgb(102, 102, 102); text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; display: block; white-space: normal; -ms-word-wrap: break-word; max-width: 100%; box-sizing: border-box; widows: 1; font-size-adjust: none; font-stretch: normal; background-color: rgb(255, 255, 255); -webkit-text-stroke-width: 0px;"&gt;The &lt;SPAN class="option" style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; box-sizing: border-box;"&gt;ippMalloc&lt;/SPAN&gt; function provides appropriately aligned buffer, and the &lt;SPAN class="option" style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; box-sizing: border-box;"&gt;ippFree&lt;/SPAN&gt; function frees it.&lt;/P&gt;

&lt;P style="font: 13px/1.4 Arial, Tahoma, Helvetica, sans-serif; margin: 0px 0px 1em; width: auto; color: rgb(102, 102, 102); text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; display: block; white-space: normal; -ms-word-wrap: break-word; max-width: 100%; box-sizing: border-box; widows: 1; font-size-adjust: none; font-stretch: normal; background-color: rgb(255, 255, 255); -webkit-text-stroke-width: 0px;"&gt;The signal and image processing libraries provide &lt;SPAN class="option" style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; box-sizing: border-box;"&gt;ippsMalloc&lt;/SPAN&gt; and &lt;SPAN class="option" style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; box-sizing: border-box;"&gt;ippiMalloc&lt;/SPAN&gt; functions, respectively, to allocate appropriately aligned buffer that can be freed by the &lt;SPAN class="option" style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; box-sizing: border-box;"&gt;ippsFree&lt;/SPAN&gt; and &lt;SPAN class="option" style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; box-sizing: border-box;"&gt;ippiFree&lt;/SPAN&gt; functions.&lt;/P&gt;

&lt;P style="font: 13px/1.4 Arial, Tahoma, Helvetica, sans-serif; margin: 0px 0px 1em; width: auto; color: rgb(102, 102, 102); text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; display: block; white-space: normal; -ms-word-wrap: break-word; max-width: 100%; box-sizing: border-box; widows: 1; font-size-adjust: none; font-stretch: normal; background-color: rgb(255, 255, 255); -webkit-text-stroke-width: 0px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="font: 13px/1.4 Arial, Tahoma, Helvetica, sans-serif; margin: 0px 0px 1em; width: auto; color: rgb(102, 102, 102); text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; display: block; white-space: normal; -ms-word-wrap: break-word; max-width: 100%; box-sizing: border-box; widows: 1; font-size-adjust: none; font-stretch: normal; background-color: rgb(255, 255, 255); -webkit-text-stroke-width: 0px;"&gt;As one of&amp;nbsp;the example of&amp;nbsp;image processing applications, please refer : &lt;A href="https://software.intel.com/en-us/node/504353"&gt;https://software.intel.com/en-us/node/504353&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 02 May 2016 07:21:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Images-Stride-and-Memory-Alignment-Question-IPP/m-p/1110519#M25397</guid>
      <dc:creator>Jonghak_K_Intel</dc:creator>
      <dc:date>2016-05-02T07:21:52Z</dc:date>
    </item>
    <item>
      <title>Apropos ippMalloc, I looked</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Images-Stride-and-Memory-Alignment-Question-IPP/m-p/1110520#M25398</link>
      <description>Apropos ippMalloc, I looked with a debugger at its internal implementation on Mac OS X. I may be wrong, but it seems that ippMalloc calls malloc to allocate the requested size plus 0x44 bytes of memory and then returns an aligned pointer inside this memory area.

But Mac OS X and Linux have posix_memalign built-in. This call was specifically created to return aligned memory blocks.

    int posix_memalign(void **memptr, size_t alignment, size_t size);

Using posix_memalign has the following advantages:

1. free can be used for all memory blocks, which simplifies memory management (currently the software has to know whether a memory block was allocated with malloc or ippMalloc)

2. Mac OS X has fantastic memory debugging facilities built in; they work with posix_memalign but not with the current ippMalloc implementation

3. In a quick test, posix_memalign was twice as fast as ippMalloc

4. No extra bytes are required for the alignment (as posix_memalign is part of the OS)
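
A minimal sketch of the posix_memalign approach, assuming a POSIX system. The helper names aligned_alloc_bytes and is_aligned are hypothetical and are not part of IPP or the OS. The sketch only illustrates that the returned block is aligned and can be released with plain free.

<![CDATA[
```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `size` bytes aligned to `alignment`.
   Returns NULL on failure. The result is released with plain free(). */
static void *aligned_alloc_bytes(size_t alignment, size_t size)
{
    void *ptr = NULL;

    /* posix_memalign requires a power of two that is also a multiple
       of sizeof(void *). */
    assert(alignment >= sizeof(void *) && (alignment & (alignment - 1)) == 0);

    /* Skip zero-size requests: their result is implementation-defined. */
    if (size == 0)
        return NULL;

    if (posix_memalign(&ptr, alignment, size) != 0)
        return NULL;
    return ptr;
}

/* Hypothetical helper: nonzero if `ptr` is a multiple of `alignment`. */
static int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}
```
]]>

A caller would request, say, aligned_alloc_bytes(64, 1000), use the block, and then pass it to free like any malloc result.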

Regards,

Adriaan van Os</description>
      <pubDate>Mon, 02 May 2016 13:11:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Images-Stride-and-Memory-Alignment-Question-IPP/m-p/1110520#M25398</guid>
      <dc:creator>Adriaan_van_Os</dc:creator>
      <dc:date>2016-05-02T13:11:11Z</dc:date>
    </item>
    <item>
      <title>@Jon: thanks for your reply,</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Images-Stride-and-Memory-Alignment-Question-IPP/m-p/1110521#M25399</link>
      <description>&lt;P&gt;@Jon: thanks for your reply; however, my question was slightly different. Your answer only guarantees that the first scanline in the buffer is properly aligned; I was asking about the other ones. I should probably tweak the stride in order to align *each* scanline.&lt;/P&gt;</description>
      <pubDate>Tue, 03 May 2016 06:05:02 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Images-Stride-and-Memory-Alignment-Question-IPP/m-p/1110521#M25399</guid>
      <dc:creator>Rietschin__Axel</dc:creator>
      <dc:date>2016-05-03T06:05:02Z</dc:date>
    </item>
    <item>
      <title>@Alex</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/Images-Stride-and-Memory-Alignment-Question-IPP/m-p/1110522#M25400</link>
      <description>@Alex
My interpretation of Jon's answer is that it doesn't matter, because data is copied to (aligned) buffers anyway.

@myself
With regard to posix_memalign, I have to add the following:

1. On Mac OS X, posix_memalign is available only since OS X 10.6

2. There is a bug in posix_memalign where calling it with a size of 0 and an alignment smaller than 512 triggers internal corruption of malloc's data structures; the symptom is "malloc: *** error for object 0xc3fa24: incorrect checksum for freed object - object was probably modified after being freed" warnings and a later (seemingly random) crash somewhere in the program.</description>
      <pubDate>Wed, 04 May 2016 08:31:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/Images-Stride-and-Memory-Alignment-Question-IPP/m-p/1110522#M25400</guid>
      <dc:creator>Adriaan_van_Os</dc:creator>
      <dc:date>2016-05-04T08:31:37Z</dc:date>
    </item>
  </channel>
</rss>

