Hi,
Inspector XE 2011 reports invalid memory accesses in the ippiTranspose_8u_C3R and ippiTranspose_8u_C4R functions when they are called with negative strides. See the code below for an example.
This sometimes leads to access violations in our application.
I am running IPP 7.0 update 7 (the latest) on 64-bit Windows with an Intel Core i7-2720QM.
Can this be reproduced in your test environment?
Best regards,
Jurrien
[bash]
src = (char*) calloc(img_size8, sizeof(char));
dst = (char*) calloc(img_size8, sizeof(char));
// Top Down, ok
ippiTranspose_8u_C1R((Ipp8u *)src, w,(Ipp8u *)dst, h, sz);
// Bottom up, ok
src_end = src + w*(h-1);
dst_end = dst + h*(w-1);
ippiTranspose_8u_C1R((Ipp8u *)src_end, -w, (Ipp8u *)dst_end, -h, sz);
free(src);
free(dst);
//-- 3-Byte --//
src = (char*) calloc(img_size24, sizeof(char));
dst = (char*) calloc(img_size24, sizeof(char));
// Top Down, ok
ippiTranspose_8u_C3R((Ipp8u *)src, w*3,(Ipp8u *)dst, h*3, sz);
src_end = src + w*(h-1)*3;
dst_end = dst + h*(w-1)*3;
// Bottom Up, gives invalid Partial memory access in Inspector
ippiTranspose_8u_C3R((Ipp8u *)src_end, -w*3, (Ipp8u *)dst_end, -h*3, sz);
free(src);
free(dst);
//-- 4-Byte --//
src = (char*) calloc(img_size32, sizeof(char));
dst = (char*) calloc(img_size32, sizeof(char));
// Top Down, ok
ippiTranspose_8u_C4R((Ipp8u *)src, w*4,(Ipp8u *)dst, h*4, sz);
src_end = src + w*(h-1)*4;
dst_end = dst + h*(w-1)*4;
// Bottom Up, gives invalid memory access in Inspector
ippiTranspose_8u_C4R((Ipp8u *)src_end, -w*4, (Ipp8u *)dst_end, -h*4, sz);
free(src);
free(dst);
return 0;
}
[/bash]
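The bottom-up calls in this example rely on two things: the start pointer must address the first pixel of the last row, and every row reached with the negative step must stay inside the allocation. The following is a minimal sketch of that arithmetic in plain C, without IPP. The helper names `last_row_offset` and `rows_in_bounds` are hypothetical and are not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of the first pixel of the LAST row in a tightly packed image.
   A bottom-up call passes base + this offset together with a negative step. */
size_t last_row_offset(int width, int height, int channels)
{
    assert(width > 0 && height > 0 && channels > 0);
    return (size_t)width * (size_t)channels * (size_t)(height - 1);
}

/* Returns 1 if every row addressed as start + y*step, for y in [0, height),
   lies completely inside a buffer of buf_size bytes. start_offset is the
   offset of the start pointer from the buffer base; step may be negative. */
int rows_in_bounds(size_t buf_size, size_t start_offset, ptrdiff_t step,
                   int width, int height, int channels)
{
    size_t row_bytes = (size_t)width * (size_t)channels;
    assert(width > 0 && height > 0 && channels > 0);
    for (int y = 0; y < height; ++y) {
        ptrdiff_t first = (ptrdiff_t)start_offset + (ptrdiff_t)y * step;
        if (first < 0 || (size_t)first + row_bytes > buf_size)
            return 0;
    }
    return 1;
}
```

For the 3-channel case above (w = 1792, h = 2560), `last_row_offset(1792, 2560, 3)` matches the expression `w*(h-1)*3`, and `rows_in_bounds` confirms that the caller's own addressing is in range. Any access reported outside the buffer would then originate inside the called function rather than in the caller's pointer arithmetic.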
#include "stdafx.h"
#include <ipp.h>
int _tmain(int argc, _TCHAR* argv[])
{
char *src, *dst, *src_end, *dst_end;
int w = 1792;
int h = 2560;
int img_size8 = w*h*1;
int img_size24 = w*h*3;
int img_size32 = w*h*4;
IppiSize sz;
sz.height = h;
sz.width = w;
//-- 1-Byte --//
//src = (char*) calloc(img_size8, sizeof(char));
//dst = (char*) calloc(img_size8, sizeof(char));
src = (char*) ippMalloc(img_size8);
dst = (char*) ippMalloc(img_size8);
// Top Down, ok
ippiTranspose_8u_C1R((Ipp8u *)src, w,(Ipp8u *)dst, h, sz);
// Bottom up, ok
src_end = src + w*(h-1);
dst_end = dst + h*(w-1);
ippiTranspose_8u_C1R((Ipp8u *)src_end, -w, (Ipp8u *)dst_end, -h, sz);
//free(src);
//free(dst);
ippFree( (void *)src );
ippFree( (void *)dst );
//-- 3-Byte --//
//src = (char*) calloc(img_size24, sizeof(char));
//dst = (char*) calloc(img_size24, sizeof(char));
src = (char*) ippMalloc(img_size24);
dst = (char*) ippMalloc(img_size24);
// Top Down, ok
ippiTranspose_8u_C3R((Ipp8u *)src, w*3,(Ipp8u *)dst, h*3, sz);
src_end = src + w*(h-1)*3;
dst_end = dst + h*(w-1)*3;
// Bottom Up, gives invalid Partial memory access in Inspector
ippiTranspose_8u_C3R((Ipp8u *)src_end, -w*3, (Ipp8u *)dst_end, -h*3, sz);
//free(src);
//free(dst);
ippFree( (void *)src );
ippFree( (void *)dst );
//-- 4-Byte --//
//src = (char*) calloc(img_size32, sizeof(char));
//dst = (char*) calloc(img_size32, sizeof(char));
src = (char*) ippMalloc(img_size32);
dst = (char*) ippMalloc(img_size32);
// Top Down, ok
ippiTranspose_8u_C4R((Ipp8u *)src, w*4,(Ipp8u *)dst, h*4, sz);
src_end = src + w*(h-1)*4;
dst_end = dst + h*(w-1)*4;
// Bottom Up, gives invalid memory access in Inspector
ippiTranspose_8u_C4R((Ipp8u *)src_end, -w*4, (Ipp8u *)dst_end, -h*4, sz);
//free(src);
//free(dst);
ippFree( (void *)src );
ippFree( (void *)dst );
return 0;
}
So I think you were probably running into an alignment issue.
It probably is an alignment issue, but non-32-bit alignment is not supposed to cause access violations in an application, just a decrease in speed.
I cannot control how the source memory block processed in my application is allocated.
If you can also reproduce the invalid memory access with the Inspector, I would consider this a bug that needs to be fixed in an upcoming update.
>>...in an application, just a decrease in speed...
If you have a Microsoft Visual Studio 20xx Professional Edition, please look at the source code of the CRT function 'calloc'.
You will see that 'calloc' uses the Win32 API function 'HeapAlloc'. It is hard to believe that Microsoft developers
missed an alignment issue.
Also, the CRT functions 'calloc' and 'malloc' differ by nature. Take a look:
'malloc'
Allocates a memory block ( not initialized )
Declaration: void * malloc( size_t size )
Where, 'size' is a number of bytes to allocate
'calloc'
Allocates an array in memory with elements initialized to 0
Declaration: void * calloc( size_t num, size_t size )
Where, 'num' is a number of elements, and 'size' is a length in bytes of each element
IPP-function 'ippiMalloc' is similar to CRT-function 'malloc'.
>>...is not supposed to give access violations...
Some SSE instructions and intrinsic functions require aligned memory blocks, and if the blocks are
not aligned, an Access Violation exception is thrown.
Best regards,
Sergey
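Whether a given pointer meets the alignment that an aligned SSE load or store expects can be checked directly before calling into a library. The sketch below is plain C and does not use IPP. The helper name `is_aligned` is hypothetical, and the 16-byte figure used in the usage note is the natural alignment of a 128-bit SSE register, assumed here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is aligned to `alignment` bytes.
   `alignment` must be a non-zero power of two. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}
```

For example, `is_aligned(src_end, 16)` shows whether the bottom-up start pointer from the original post is 16-byte aligned. A 3-channel pointer offset by `w*(h-1)*3` bytes may or may not be, depending on `w` and `h`.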
The issue with ippiTranspose_8u_C3R remains, even after I applied your suggested changes. See the code below. To remove some 'uninitialized memory access' warnings from the Inspector output, I added additional memory initialization.
So with the code below I still get 'Uninitialized partial memory access' in ippiTranspose_8u_C3R with the Inspector, while running the 32-bit IPP 7.0 update 7 on Windows 7 64-bit. This is suspicious, as it leads to crashes in our application.
Can Intel verify that this is an issue in the implementation of ippiTranspose_8u_C3R?
What is your position on this?
I would expect the library to detect a pointer to a memory block that is not 32-bit aligned and to handle it appropriately, probably at the cost of some speed.
Best regards,
Jurrien
>>What is your position in this?
>>I would expect that the use of a non 32-bit aligned pointer to a memory block would be detected by the library and handled appropriately, probably at the cost of some speed.
[SergeyK] Hi Jurrien, Intel software engineers could have a different point of view regarding the usage
of 32-bit pointers. You're right that processing speed can suffer
when non-aligned pointers are used.
Best regards,
Sergey
There have been many support topics in this forum about using w*pixsize instead of step, about border issues, and so on.
So, think step, not w*pixsz.
With a bit of luck your problem might go away.
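The difference between `w*pixsz` and a step can be sketched as follows. When rows are padded to an alignment boundary, the distance between rows (the step) is larger than the visible row width in bytes, and all addressing must use the step. The helpers `padded_step` and `pixel_offset` are hypothetical, and the 32-byte alignment in the usage note is an assumption for illustration. With IPP, the actual step is the value returned through the `pStepBytes` argument of the `ippiMalloc_8u_C3` family and should be used as returned rather than recomputed.

```c
#include <assert.h>
#include <stddef.h>

/* Round row_bytes up to the next multiple of `alignment`.
   `alignment` must be a non-zero power of two. */
size_t padded_step(size_t row_bytes, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (row_bytes + alignment - 1) & ~(alignment - 1);
}

/* Byte offset of pixel (x, y) when consecutive rows are `step` bytes apart. */
size_t pixel_offset(int x, int y, size_t step, int channels)
{
    assert(x >= 0 && y >= 0 && channels > 0);
    return (size_t)y * step + (size_t)x * (size_t)channels;
}
```

A 100-pixel-wide, 3-channel row occupies 300 bytes, but with an assumed 32-byte alignment the step is 320. For a bottom-up traversal, the start pointer is then `base + (h-1)*step` and the step passed to the function is `-step`, not `-(w*3)`.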
I took your suggestion and used ippiMalloc and the stride in bytes. Unfortunately, the issue remains in ippiTranspose_8u_C3R. The Inspector still gives me an 'Uninitialized partial memory access'.
Is this an issue with the function, or am I doing something wrong? See the code below for the example.
[bash]#include "stdafx.h" #include
What happens if you let the src have a positive step, and what happens if you let the dst have a positive step?
(of course, also change end -> begin in that case)
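One way to run this experiment is to compare the library output against a simple reference for each combination of step signs. The following is a minimal plain-C sketch of such a reference. `transpose_ref` is a hypothetical name, and the code is not how IPP implements the operation; it only defines the expected result for signed steps.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C reference transpose for interleaved 8-bit pixels.
   src holds width x height pixels, dst receives height x width pixels.
   Steps are in bytes and may be negative; src and dst point at the row
   that is processed first (the last row in memory for a negative step). */
void transpose_ref(const unsigned char *src, ptrdiff_t src_step,
                   unsigned char *dst, ptrdiff_t dst_step,
                   int width, int height, int channels)
{
    assert(src && dst && width > 0 && height > 0 && channels > 0);
    for (int y = 0; y < height; ++y) {
        const unsigned char *s = src + (ptrdiff_t)y * src_step;
        for (int x = 0; x < width; ++x) {
            /* Source pixel (x, y) becomes destination pixel (y, x). */
            unsigned char *d = dst + (ptrdiff_t)x * dst_step
                                   + (ptrdiff_t)y * channels;
            for (int c = 0; c < channels; ++c)
                d[c] = s[x * channels + c];
        }
    }
}
```

Calling it once with both steps positive, once with only the source step negative, once with only the destination step negative, and once with both negative gives four expected buffers. A mismatch for exactly one combination would narrow down which side of the library call misbehaves.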
Could you provide some details on the function InitMemory(...)? It is not clear what it does internally.
(It's right in front of you!)
>>(It's right in front of you!)
Thanks, Thomas! I didn't notice it...
@Thomas, when the direction is top-down (and src_step is positive) there is no issue. It happens only with the negative step, which is the case where I see access violations in my application. I strongly suspect ippiTranspose_8u_C3R of causing this behaviour. Running with the Inspector hints that something bad is happening in this function...
Can you or someone at Intel reproduce this? I am running the latest 32-bit IPP on Windows 7 64-bit on a Sandy Bridge CPU.
Cheers, Jurrien
I'm also curious about your Inspector and how it can detect reads of uninitialized memory. How can it do that?
(less important)
It could be that IPP reads beyond the scanline because it uses SSE, but then the question is whether reading outside the buffer should be considered OK as long as all writes stay inside.
And you wrote that it crashes. That is more than a warning in an Inspector.
The crash could give you a hint if you look at the CPU view. The faulting memory address is then an address just before or after your src or your dst.
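Out-of-range writes, as opposed to reads, can also be caught without a special tool by surrounding the buffer with guard bytes and checking them after the call. The sketch below shows the idea in plain C. The names `guarded_alloc`, `guards_intact`, and `guarded_free`, as well as the guard size and pattern, are hypothetical choices for illustration. The technique cannot detect over-reads; those only show up in a memory checker or as a crash when the read crosses into an unmapped page.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTES 64    /* hypothetical guard size on each side */
#define GUARD_VALUE 0xA5  /* hypothetical canary pattern */

/* Allocate payload_size zeroed bytes with a canary-filled guard region
   before and after. Returns a pointer to the payload, or NULL on failure. */
unsigned char *guarded_alloc(size_t payload_size)
{
    unsigned char *raw = (unsigned char *)malloc(payload_size + 2 * GUARD_BYTES);
    if (raw == NULL)
        return NULL;
    memset(raw, GUARD_VALUE, GUARD_BYTES);
    memset(raw + GUARD_BYTES, 0, payload_size);
    memset(raw + GUARD_BYTES + payload_size, GUARD_VALUE, GUARD_BYTES);
    return raw + GUARD_BYTES;
}

/* Returns 1 if both guard regions still hold the canary pattern. */
int guards_intact(const unsigned char *payload, size_t payload_size)
{
    const unsigned char *lo;
    const unsigned char *hi;
    assert(payload != NULL);
    lo = payload - GUARD_BYTES;
    hi = payload + payload_size;
    for (size_t i = 0; i < GUARD_BYTES; ++i)
        if (lo[i] != GUARD_VALUE || hi[i] != GUARD_VALUE)
            return 0;
    return 1;
}

/* Release a buffer obtained from guarded_alloc. */
void guarded_free(unsigned char *payload)
{
    if (payload != NULL)
        free(payload - GUARD_BYTES);
}
```

Allocating `dst` with `guarded_alloc`, running the bottom-up transpose, and then calling `guards_intact` would show whether anything was written just before or just after the destination buffer.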
Hello Jurrien, have you ever solved this problem?
I had a similar problem in my source code a few days ago. The error message was "Aborted.(core dump)".
I checked the source code and found that I was accessing memory beyond the length I had allocated.
If the length of the data before transposition is L, you should allocate memory of length L*Nc for the output of the transposition.
Here Nc is the number of channels; for instance, Nc = 3 if ippiTranspose_xx_C3R is used.
Any method of memory allocation is OK.
Ipp8u *out = (Ipp8u*)malloc(L*Nc*sizeof(Ipp8u)); something like that.
The "malloc" function can ensure address alignment, so address alignment may not be the main cause of this problem.
Hope that will be helpful.
BG
Charls
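The size rule above can be written as a small helper that computes the byte count of a tightly packed image and guards against overflow. This is a sketch only. `image_bytes` is a hypothetical name, and it assumes 8-bit channels and no row padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes for a tightly packed width x height image with
   `channels` 8-bit channels per pixel. Returns 0 if the product
   would overflow size_t. */
size_t image_bytes(size_t width, size_t height, size_t channels)
{
    size_t pixels;
    assert(channels > 0);
    if (width != 0 && height > SIZE_MAX / width)
        return 0;
    pixels = width * height;
    if (pixels > SIZE_MAX / channels)
        return 0;
    return pixels * channels;
}
```

For the image in this thread, `image_bytes(1792, 2560, 3)` corresponds to `img_size24`. The transposed destination needs the same number of bytes, since transposition only swaps width and height.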