<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Sorry to necrobump, but I ran in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippiDilateBorder-16u-C1R-performance-regression-in-2018/m-p/1136103#M25952</link>
    <description>&lt;P&gt;Sorry to necrobump, but I ran into this today, and the performance regression exists in versions as late as 2018 (I haven't checked anything newer). Profiling this in a loop with perf, it seems that the older max filter routines optimized for SSE variants of the architecture (l9_ownFilterMaxRowVH_16u_C1R and l9_ownFilterMaxColumnVH_16u_C1R) are orders of magnitude faster when the kernel is large enough (a 1706x1706 kernel with a 3709x5527 input).&lt;/P&gt;&lt;P&gt;Any idea what's going on? Is there maybe a way I can use a newer IPP but force it to use these older routines to get around this regression?&lt;/P&gt;</description>
    <pubDate>Mon, 12 Aug 2019 14:52:01 GMT</pubDate>
    <dc:creator>adam_s_</dc:creator>
    <dc:date>2019-08-12T14:52:01Z</dc:date>
    <item>
      <title>ippiDilateBorder_16u_C1R performance regression in 2018</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippiDilateBorder-16u-C1R-performance-regression-in-2018/m-p/1136100#M25949</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;The latest community version of IPP has a 4x performance regression in the ippiDilateBorder_16u_C1R function for largish neighborhoods. A 221x221 neighborhood in our use case seems to be affected (with an image size of 7002x8998), though I'm sure it's measurable for smaller neighborhoods as well. I've seen this regression on Windows; I haven't tested it on Linux yet. This is on a Haswell CPU. I'm not sure how much it matters, but the neighborhood is defined as 1 for all values.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Feb 2018 19:03:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippiDilateBorder-16u-C1R-performance-regression-in-2018/m-p/1136100#M25949</guid>
      <dc:creator>adam_s_</dc:creator>
      <dc:date>2018-02-08T19:03:41Z</dc:date>
    </item>
    <item>
      <title>Hi Adam.</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippiDilateBorder-16u-C1R-performance-regression-in-2018/m-p/1136101#M25950</link>
      <description>&lt;P&gt;Hi Adam.&lt;/P&gt;

&lt;P&gt;Could you please send the ippcvGetLibVersion output for both versions?&lt;/P&gt;

&lt;P&gt;Thanks.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 22 Feb 2018 11:26:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippiDilateBorder-16u-C1R-performance-regression-in-2018/m-p/1136101#M25950</guid>
      <dc:creator>Andrey_B_Intel</dc:creator>
      <dc:date>2018-02-22T11:26:08Z</dc:date>
    </item>
    <item>
      <title>I currently don't have the</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippiDilateBorder-16u-C1R-performance-regression-in-2018/m-p/1136102#M25951</link>
      <description>&lt;P&gt;I don't currently have the original version (my binary was statically linked against it), but I believe the version that didn't have the regression was 2017 (with the latest update). The version that does is 2018 (both the initial release and the update). This was specifically on Windows, though I imagine the regression may exist on other platforms.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Feb 2018 23:05:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippiDilateBorder-16u-C1R-performance-regression-in-2018/m-p/1136102#M25951</guid>
      <dc:creator>adam_s_</dc:creator>
      <dc:date>2018-02-22T23:05:00Z</dc:date>
    </item>
    <item>
      <title>Sorry to necrobump, but I ran</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippiDilateBorder-16u-C1R-performance-regression-in-2018/m-p/1136103#M25952</link>
      <description>&lt;P&gt;Sorry to necrobump, but I ran into this today, and the performance regression exists in versions as late as 2018 (I haven't checked anything newer). Profiling this in a loop with perf, it seems that the older max filter routines optimized for SSE variants of the architecture (l9_ownFilterMaxRowVH_16u_C1R and l9_ownFilterMaxColumnVH_16u_C1R) are orders of magnitude faster when the kernel is large enough (a 1706x1706 kernel with a 3709x5527 input).&lt;/P&gt;&lt;P&gt;Any idea what's going on? Is there maybe a way I can use a newer IPP but force it to use these older routines to get around this regression?&lt;/P&gt;</description>
      <pubDate>Mon, 12 Aug 2019 14:52:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippiDilateBorder-16u-C1R-performance-regression-in-2018/m-p/1136103#M25952</guid>
      <dc:creator>adam_s_</dc:creator>
      <dc:date>2019-08-12T14:52:01Z</dc:date>
    </item>
    <item>
      <title>So doing a tiny bit of</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippiDilateBorder-16u-C1R-performance-regression-in-2018/m-p/1136104#M25953</link>
      <description>&lt;P&gt;So, after a tiny bit of research and speculation on my part, I'm assuming the "VH" in those function names signifies that the function implements the Van Herk algorithm (as in Van Herk/Gil-Werman). Also, somewhat surprisingly, the straight MaxFilterBorder calls take a pretty naive approach to computing the max filter instead of using the fast l9_ownFilterMaxRowVH_16u_C1R routines called by dilation in the IPP 9.0.3 of yore. Why did you guys rip out these functions, and why weren't they called in the MaxFilterBorder functions to begin with? Are they patent encumbered? I have half a mind to implement these myself with SIMD intrinsics, but earlier versions of IPP already seem to have them, so it feels like I'd be needlessly reinventing the wheel.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Aug 2019 17:54:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippiDilateBorder-16u-C1R-performance-regression-in-2018/m-p/1136104#M25953</guid>
      <dc:creator>adam_s_</dc:creator>
      <dc:date>2019-08-13T17:54:06Z</dc:date>
    </item>
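    <!--
    Editorial sketch for the Van Herk/Gil-Werman idea raised in the last post.
    This is NOT IPP's implementation and uses no IPP calls; it is a minimal
    scalar 1-D running maximum in plain C, written under the assumption that
    "VH" does refer to that algorithm. The names u16, max_u16 and
    running_max_vhgw are hypothetical and exist only for illustration. The
    point it shows is why the cost per output sample stays roughly constant
    (about three comparisons) no matter how large the window is, which is the
    property that matters for kernels as large as the ones in this thread.

```c
#include <stddef.h>
#include <stdlib.h>

typedef unsigned short u16; /* 16-bit unsigned samples, as in the 16u_C1R variants */

static u16 max_u16(u16 a, u16 b) { return a > b ? a : b; }

/*
 * Running maximum over windows of k samples (valid region only):
 *   dst[i] = max(src[i], ..., src[i + k - 1])   for 0 <= i <= n - k
 * dst must have room for n - k + 1 values.
 * Returns 0 on success, 1 on bad arguments, 2 on allocation failure.
 */
int running_max_vhgw(const u16 *src, u16 *dst, size_t n, size_t k)
{
    if (src == NULL || dst == NULL || k == 0 || k > n)
        return 1;

    u16 *g = malloc(n * sizeof *g); /* prefix maxima inside each block of k */
    u16 *h = malloc(n * sizeof *h); /* suffix maxima inside each block of k */
    if (g == NULL || h == NULL) {
        free(g);
        free(h);
        return 2;
    }

    /* Forward pass: restart the running maximum at every block start. */
    for (size_t i = 0; i < n; i++)
        g[i] = (i % k == 0) ? src[i] : max_u16(g[i - 1], src[i]);

    /* Backward pass: restart at every block end (and at the last sample). */
    for (size_t j = 0; j < n; j++) {
        size_t i = n - 1 - j;
        int block_end = (i == n - 1) || ((i + 1) % k == 0);
        h[i] = block_end ? src[i] : max_u16(h[i + 1], src[i]);
    }

    /* A window covers at most two blocks: the tail of one, the head of the next. */
    for (size_t i = 0; i + k <= n; i++)
        dst[i] = max_u16(h[i], g[i + k - 1]);

    free(g);
    free(h);
    return 0;
}
```

    For src = {3, 1, 4, 1, 5, 9, 2, 6} and k = 3 this fills dst with
    {4, 4, 5, 9, 9, 9}. A rectangular all-ones neighborhood like the one
    described in the first post is separable, so a 2-D dilation could be
    sketched as this pass over every row followed by the same pass over every
    column; border handling and SIMD are left out here.
    -->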
  </channel>
</rss>

