Hello,
The latest community version of IPP has a 4x performance regression in the ippiDilateBorder_16u_C1R function for largish neighborhoods. In our use case a 221x221 neighborhood is affected (with an image size of 7002x8998), though I'm sure the regression is measurable for smaller neighborhoods as well. I've seen it on Windows and haven't tested on Linux yet. This is on a Haswell CPU. I'm not sure how much it matters, but every value in the neighborhood mask is 1.
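On whether the all-ones mask matters: a flat rectangular neighborhood is a separable structuring element, so the 2D dilation can be computed as a horizontal max pass followed by a vertical max pass. The sketch below only illustrates that equivalence in plain Python. The function names are hypothetical, it covers the valid region only (no border handling of the kind ippiDilateBorder performs), and it says nothing about how IPP implements the operation internally.

```python
def dilate_naive(img, kh, kw):
    """Valid-region dilation with an all-ones kh x kw mask (hypothetical helper).

    Each output pixel is the max over a full kh x kw window.
    """
    rows, cols = len(img), len(img[0])
    return [
        [
            max(img[r + dr][c + dc] for dr in range(kh) for dc in range(kw))
            for c in range(cols - kw + 1)
        ]
        for r in range(rows - kh + 1)
    ]


def dilate_separable(img, kh, kw):
    """Same result as dilate_naive, computed as a row pass then a column pass."""
    rows, cols = len(img), len(img[0])
    out_cols = cols - kw + 1
    # Pass 1: horizontal max of width kw along each row.
    row_max = [[max(row[c:c + kw]) for c in range(out_cols)] for row in img]
    # Pass 2: vertical max of height kh down each column of the intermediate.
    return [
        [max(row_max[r + dr][c] for dr in range(kh)) for c in range(out_cols)]
        for r in range(rows - kh + 1)
    ]


if __name__ == "__main__":
    image = [[(7 * r + 13 * c) % 31 for c in range(8)] for r in range(6)]
    print(dilate_naive(image, 3, 4) == dilate_separable(image, 3, 4))
```

With direct window scans, the per-pixel work drops from kh*kw comparisons to about kh+kw, which for a 221x221 mask is 48,841 versus 442.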
Hi Adam.
Could you please send ippcvGetLibVersion output of both versions?
Thanks.
I don't currently have the original version; my binary was statically linked to it. I believe the version without the regression was 2017 (with the latest update). The version with it is 2018 (both the initial release and the update). This was specifically on Windows, though I imagine the regression may exist on other platforms.
Sorry to necrobump, but I ran into this today and the performance regression exists in versions as late as 2018 (I haven't checked anything newer). Watching this in a loop with perf, it would seem that the max filter routines optimized for SSE variants of the architecture (l9_ownFilterMaxRowVH_16u_C1R and l9_ownFilterMaxColumnVH_16u_C1R) are orders of magnitude faster when the kernel size is large enough (a 1706x1706 kernel with a 3709x5527 input).
Any idea what's going on? Is there maybe a way I can use newer IPP but force it to use these older versions to get around this regression?
After a tiny bit of research and speculation on my part, I'm assuming the "VH" in those function names signifies that the function performs the Van Herk algorithm (as in Van Herk/Gil-Werman). Also, somewhat surprisingly, the plain MaxFilterBorder calls take a pretty naive approach to computing the max filter instead of using the fast l9_ownFilterMaxRowVH_16u_C1R routines that dilation called in the IPP 9.0.3 of yore. Why did you guys rip out these functions, and why weren't they called in the MaxFilterBorder functions to begin with? Are they patent encumbered? I have half a mind to implement these myself with SIMD intrinsics, but earlier versions of IPP already seem to have them, so it feels like I'd be needlessly reinventing the wheel.
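For anyone who does end up writing this themselves, here is a minimal scalar sketch of the Van Herk/Gil-Werman running max along one dimension. It is plain Python for readability rather than SIMD, the function name is hypothetical, it returns only the fully covered ("valid") windows, and it is not a reconstruction of the l9_ownFilterMax*VH routines, whose internals are not public as far as this thread shows.

```python
def running_max_vhgw(values, w):
    """Van Herk/Gil-Werman running max (hypothetical helper).

    Returns out[i] = max(values[i:i + w]) for every full window.
    """
    n = len(values)
    if w < 1 or w > n:
        raise ValueError("window must satisfy 1 <= w <= len(values)")

    # Pad to a whole number of length-w blocks with a value that never wins.
    pad = (-n) % w
    x = list(values) + [float("-inf")] * pad
    m = len(x)

    g = x[:]  # g[i]: max from the start of i's block up to i
    h = x[:]  # h[i]: max from i up to the end of i's block

    for i in range(m):
        if i % w != 0:
            g[i] = max(g[i - 1], x[i])
    for i in range(m - 1, -1, -1):
        if i % w != w - 1:
            h[i] = max(h[i + 1], x[i])

    # A window [i, i + w - 1] touches at most two blocks: h covers the tail
    # of the first block and g covers the head of the second.
    return [max(h[i], g[i + w - 1]) for i in range(n - w + 1)]


if __name__ == "__main__":
    print(running_max_vhgw([3, 1, 4, 1, 5, 9, 2, 6], 3))
```

The point of the technique is that the cost is roughly three comparisons per sample regardless of w, so running it once along rows and once along columns keeps a flat rectangular dilation nearly independent of kernel size.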