Hi!
This problem is observed in IPP versions 2016.0.110 through 2019.3.203 when the AVX optimization is used.
To reproduce it, allocate the image memory with VirtualAlloc so that the end of the image coincides with the end of the allocated block. The image width must not be a multiple of the ymm register size (8 floats).
#include <cstdint>
#include <vector>
#include <ippcore.h>
#include <ipps.h>
#include <ippi.h>
#include <ippcv.h>
#include <Windows.h>

#pragma comment(lib, "ippcoremt.lib")
#pragma comment(lib, "ippsmt.lib")
#pragma comment(lib, "ippcvmt.lib")

int main()
{
    // image size chosen so that the buffer size in bytes is a multiple of the page size
    int const w = 1020;
    int const h = 256;
    IppiBorderType const border_type = ippBorderMirror;
    IppiSize roi_size{ w, h };
    int size_bytes = w * h * sizeof(float);

    // allocate memory; VirtualAlloc rounds the size up to the next page boundary,
    // so here the last image row ends exactly at the end of the committed region
    float * src = (float *)VirtualAlloc(nullptr, size_bytes, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    float * dst = (float *)VirtualAlloc(nullptr, size_bytes, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);

    // access violation with kernel size 3 in l9_ownFilterRowBorderPipeline_32f_C1R_3x3_G9E9cn()
    // std::vector<float> kernel_x = { 1, 2, 1};
    // access violation with kernel size 5 in l9_ownFilterRowBorderPipeline_32f_C1R_5x5_G9E9cn()
    std::vector<float> kernel_x = { 1, 2, 3, 2, 1 };

    int last_line_offset = w * (h - 1);
    int size_row = 0;
    auto sts = ippiFilterRowBorderPipelineGetBufferSize_32f_C1R(roi_size, static_cast<int>(kernel_x.size()), &size_row);
    if (ippStsNoErr != sts)
        return -1;

    std::vector<uint8_t> buff(size_row);
    float * pdst = dst + last_line_offset;

    // access violation: filtering only the last row reads past the end of src
    sts = ippiFilterRowBorderPipeline_32f_C1R(src + last_line_offset, w * sizeof(float), &pdst, { w, 1 },
        kernel_x.data(), static_cast<int>(kernel_x.size()), static_cast<int>(kernel_x.size()) / 2,
        border_type, 0, buff.data());
    if (ippStsNoErr != sts)
        return -1;

    return 0;
}
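The two preconditions described above can be checked with plain arithmetic. The following is a minimal sketch, not part of IPP: the helper name `check_repro_conditions` is hypothetical, and the 4096-byte page size is an assumption that holds on typical x86 Windows systems.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 4 KiB pages, as on typical x86/x64 Windows systems.
constexpr std::size_t kPageSize = 4096;
// A ymm register holds 8 single-precision floats.
constexpr std::size_t kFloatsPerYmm = 8;

struct ReproCheck {
    bool ends_on_page_boundary; // image ends exactly where the allocation ends
    bool has_partial_vector;    // row width leaves a remainder after full ymm loads
};

// Hypothetical helper: reports whether a w x h float image meets both
// conditions named in the post.
ReproCheck check_repro_conditions(std::size_t w, std::size_t h)
{
    assert(w > 0 && h > 0);
    std::size_t const bytes = w * h * sizeof(float);
    return { bytes % kPageSize == 0, w % kFloatsPerYmm != 0 };
}
```

For the 1020 x 256 image in the reproducer, the buffer is 1044480 bytes, which is exactly 255 pages, and 1020 is not a multiple of 8, so both flags come out true.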
What do you see if you allocate the memory with an ippiMalloc_32f_C1R call instead?
I don't know how much memory ippiMalloc_32f_C1R(size) allocates. If it guarantees that at least size + sizeof(__m256) bytes are allocated, there will be no access violation.
I don't understand why the function is allowed to read beyond the specified range of sizeof(float) * w bytes per row.
With srcStep = 1020 * sizeof(float) = 4080 bytes, the function needs to read n = srcStep / sizeof(__m256) = 127 full ymm vectors (integer division) and then process the remainder of r = (srcStep - n * sizeof(__m256)) / sizeof(float) = 4 floats separately.
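The split described above can be sketched in portable C++. This is only an illustration of a full-vector loop followed by a scalar tail that never touches memory past the row, not how IPP is implemented; `split_row` and `scale_row` are hypothetical names, and the inner 8-element loop stands in for a single ymm load and store.

```cpp
#include <cassert>
#include <cstddef>

// A ymm register holds 8 single-precision floats.
constexpr std::size_t kFloatsPerYmm = 8;

struct RowSplit {
    std::size_t full_vectors; // number of complete 8-float blocks
    std::size_t tail_floats;  // leftover elements handled one at a time
};

// Hypothetical helper: splits a row of `width` floats into full
// vectors and a scalar remainder.
RowSplit split_row(std::size_t width)
{
    return { width / kFloatsPerYmm, width % kFloatsPerYmm };
}

// Hypothetical row kernel: multiplies each element by k and reads only
// src[0 .. width-1], so it is safe when the row ends at a page boundary.
void scale_row(const float* src, float* dst, std::size_t width, float k)
{
    assert(src != nullptr && dst != nullptr);
    RowSplit const s = split_row(width);
    std::size_t i = 0;
    // Main loop: each iteration covers one 8-float block
    // (a real implementation would use one ymm load/store here).
    for (std::size_t v = 0; v < s.full_vectors; ++v)
        for (std::size_t lane = 0; lane < kFloatsPerYmm; ++lane, ++i)
            dst[i] = src[i] * k;
    // Tail: process the remaining elements without a full-width load.
    for (std::size_t t = 0; t < s.tail_floats; ++t, ++i)
        dst[i] = src[i] * k;
}
```

For width 1020, `split_row` yields 127 full vectors and a tail of 4 floats, matching the numbers in the post. A vectorized version could also handle the tail with a masked load, which is another way to avoid the out-of-bounds read.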
Hello, Vitaly!
Thank you for your feedback!
We have analyzed the problem and confirmed an issue in IPP, which we will fix in an upcoming release.
Best regards,
Ivan Galanin.