Hello.
Do I need to allocate memory for arrays only with ippiMalloc (it allocates 32-byte-aligned memory), or not?
Does using other memory allocation functions (malloc, ...) affect performance?
Code:
IppStatus GaussFilter (const Ipp32f* pSrc, const int nWidth, const int nHeight, const Ipp32f fSigma, Ipp32f** ppDst, int* pDstStepBytes)
{
    const Ipp32u nKernelSize = 7;
    IppiSize tWholePic = {nWidth, nHeight};
    // Hand the buffer and its row step back to the caller through the out-parameters.
    *ppDst = ippiMalloc_32f_C1 (nWidth, nHeight, pDstStepBytes);
    if (*ppDst == NULL)
        return ippStsMemAllocErr;
    int nBorderBufferSize = 0;
    ippiFilterGaussGetBufferSize_32f_C1R (tWholePic, nKernelSize, &nBorderBufferSize);
    Ipp8u* pBorderBuffer = ippsMalloc_8u (nBorderBufferSize);
    IppStatus tStatus = ippiFilterGaussBorder_32f_C1R (pSrc, nWidth * sizeof (Ipp32f)
        , *ppDst, nWidth * sizeof (Ipp32f) // Have I use this one or nStepBytes receipt from ippiMalloc?
        , tWholePic
        , nKernelSize, fSigma, ippBorderRepl, 0.
        , pBorderBuffer);
    ippsFree (pBorderBuffer);
    return tStatus;
}
int main (void)
{
    const int nWidth = 12, nHeight = 15;
    const Ipp32f fSigma = 1.f;
    Ipp32f pSrc[nWidth * nHeight];
    Ipp32f* pDst = NULL;
    int nDstStepBytes = 0;
    // Initialize pSrc
    GaussFilter (pSrc, nWidth, nHeight, fSigma, &pDst, &nDstStepBytes);
    // Do something with pDst.
    if (pDst != NULL)
        ippiFree (pDst);
    return 0;
}
Regards,
Mark
Hello,
Other malloc functions also work. ippsMalloc actually calls the system malloc function and aligns the returned memory to a 32-byte or 64-byte boundary. From a performance point of view, it is better if the input data address is 32-byte or 64-byte aligned (for machines that support AVX instructions).
Thanks,
Chao
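To illustrate the idea of building an aligned allocator on top of the system malloc, here is a minimal sketch. It is not the IPP implementation; the names aligned_malloc_sketch and aligned_free_sketch are hypothetical, and the technique shown (over-allocate, round the address up, remember the original pointer) is just one common way to do it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not the IPP implementation: over-allocate with malloc,
   round the address up to `alignment`, and store the pointer malloc returned
   just before the aligned block so it can be freed later. */
void *aligned_malloc_sketch(size_t size, size_t alignment)
{
    /* Assumption: alignment is a power of two and at least sizeof(void *). */
    assert(alignment >= sizeof(void *) && (alignment & (alignment - 1)) == 0);

    /* Extra room for the worst-case shift plus the saved pointer. */
    void *raw = malloc(size + alignment - 1 + sizeof(void *));
    if (raw == NULL)
        return NULL;

    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    ((void **)aligned)[-1] = raw;   /* remember what malloc returned */
    return (void *)aligned;
}

void aligned_free_sketch(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);     /* free the original malloc pointer */
}
```

A pointer obtained this way must be released with aligned_free_sketch, not free, for the same reason that IPP buffers are released with the matching IPP free function.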
Hi,
In the source code, where you ask "Have I use...", you need to use nStepBytes, because the step returned by ippiMalloc is not always equal to nWidth * sizeof (Ipp32f). Otherwise, you risk losing the benefits of memory alignment.
Regards,
Sergey
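As a rough illustration of why the step matters, the sketch below walks an image row by row using a byte step rather than the pixel width. It is plain C with no IPP calls; row_ptr and sum_image are hypothetical helper names, and the step is assumed to come from whatever allocated the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer to the first pixel of row y; step_bytes is the distance in bytes
   between the starts of two consecutive rows. */
const float *row_ptr(const float *base, size_t step_bytes, size_t y)
{
    return (const float *)((const unsigned char *)base + y * step_bytes);
}

/* Sum only the real pixels; the padding after each row is never read. */
double sum_image(const float *base, size_t step_bytes, size_t width, size_t height)
{
    /* A row cannot be shorter than the data it holds. */
    assert(step_bytes >= width * sizeof(float));

    double total = 0.0;
    for (size_t y = 0; y < height; ++y) {
        const float *row = row_ptr(base, step_bytes, y);
        for (size_t x = 0; x < width; ++x)
            total += row[x];
    }
    return total;
}
```

Only the first width elements of each row are touched, so whatever sits in the padding between rows does not affect the result.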
Thanks.
regards,
Mark.
Hello, guys.
I have one more question related to the topic above. :)
If I use memory alignment, how can I tell the memory holding real data from the trash memory (allocated only to reach the 32/64-byte boundary)? Are there helper functions that distinguish the necessary memory from the trash memory? Do other functions know about the trash memory? It seems that this is handled by the nStepBytes parameter, isn't it?
const int nHeight = 15, nWidth = 8;
IppiSize tsWholePic = {nWidth, nHeight};
Ipp32f pSrc[nHeight * nWidth];
ippiSet_32f_C1R (2.f, pSrc, nWidth * sizeof (Ipp32f), tsWholePic);
int nDstStepBytes = 0;
Ipp32f* pDst = ippiMalloc_32f_C1 (nWidth, nHeight, &nDstStepBytes); // Because of memory alignment, pDst has trash memory parts. See pic
ippiCopy_32f_C1R (pSrc, nWidth * sizeof (Ipp32f), pDst, nDstStepBytes, tsWholePic); // Is this the right way to copy? The step bytes differ for pSrc and pDst
Regards,
Mark
Mark,
That's correct. Any "step_bytes" parameter in an IPP image processing function defines how many bytes to add to the address of the beginning of the previous image row to reach the beginning of the next image row. So "nWidth*sizeof(Ipp32f)" and "nDstStepBytes" are both correct as the src and dst steps, respectively.
Regards,
Sergey
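The step semantics described above can be sketched in plain C as a row-by-row copy between two buffers with different steps. This is only an illustration of the idea, not how ippiCopy is implemented; copy_rows_32f is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical plain-C stand-in for a strided image copy: each buffer is
   advanced by its own step, so the source and destination steps may differ. */
void copy_rows_32f(const float *src, size_t src_step_bytes,
                   float *dst, size_t dst_step_bytes,
                   size_t width, size_t height)
{
    /* Each step must be at least as large as one row of real data. */
    assert(src_step_bytes >= width * sizeof(float));
    assert(dst_step_bytes >= width * sizeof(float));

    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dst;
    for (size_t y = 0; y < height; ++y) {
        /* Only width * sizeof(float) bytes per row are real data. */
        memcpy(d, s, width * sizeof(float));
        s += src_step_bytes;
        d += dst_step_bytes;
    }
}
```

With a densely packed source (step equal to width * sizeof(float)) and a padded destination, the padding bytes of the destination are simply left untouched.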
I see. Payment for speed and comfort. :)
Merry Christmas,
Thanks a lot,
Mark
Hello,
I see. Thanks a lot.
Merry Christmas, guys (yesterday I couldn't add a post to the forum; something happened with the site).
Regards,
Mark
Ok, thanks a lot.
regards,
Mark
Hello,
As I understand it, in structures allocated with ippiMalloc every image row is aligned to a 32/64-byte boundary, so padding bytes are added to the end of the previous row. The situation seems to be as follows: for static arrays and for dynamic arrays allocated with malloc, I can use nWidth * sizeof (Type_Of_Array) as the step. For dynamic arrays allocated with ippiMalloc, I have to use nStepBytes.
Layout of an array allocated with ippiMalloc:
xxxx1............a............b............c............xxxxxxxxxx2............a............b............c............xxxxxxxxxx
1, 2 - addresses of row starts, aligned to a 32/64-byte boundary
xxxxx - trash memory, added for alignment
2 - 1 = nStepBytes
(end of c) - 1 = nWidth * sizeof(Type_Of_Array)
regards,
Mark
Hello,
Thanks a lot.
Regards,
Mark
Mark,
There are aligned mallocs in various OSes (_aligned_malloc, posix_memalign and others), but they align only the very first byte of the allocated memory, whereas in image processing the beginning of each image line should be aligned for better performance.
Regards,
Sergey
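A minimal sketch of the difference: to align every row, not just the first byte, the row step itself has to be rounded up to the alignment. The code below assumes a 64-byte alignment and uses C11 aligned_alloc; alloc_pitched_32f and ROW_ALIGNMENT are hypothetical names, and the step that ippiMalloc actually returns may differ from what this computes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ROW_ALIGNMENT 64u  /* assumed alignment; IPP may choose differently */

/* Hypothetical pitched allocator: the base address is aligned and the step is
   a multiple of ROW_ALIGNMENT, so every row starts on an aligned boundary. */
float *alloc_pitched_32f(size_t width, size_t height, size_t *step_bytes)
{
    assert(width > 0 && height > 0 && step_bytes != NULL);

    /* Round the row size in bytes up to the next multiple of ROW_ALIGNMENT. */
    size_t step = (width * sizeof(float) + ROW_ALIGNMENT - 1)
                  & ~(size_t)(ROW_ALIGNMENT - 1);

    /* aligned_alloc (C11) expects a size that is a multiple of the alignment;
       step * height satisfies that because step already is one. */
    float *base = aligned_alloc(ROW_ALIGNMENT, step * height);
    if (base != NULL)
        *step_bytes = step;
    return base;
}
```

For a 12-pixel-wide row of 32-bit floats this yields a 64-byte step (48 bytes of data plus 16 bytes of padding). The buffer is released with free, and the returned step, not the pixel width, is what has to be passed to any routine that walks the rows.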
I see. I have made a lot of tests and discovered this feature of ippiMalloc. :)
Thanks a lot.
best regards,
Mark.