I'm getting strange crashes when deallocating memory, and it occurred to me that it may be the way I'm casting my buffers when passing them to the IPP functions. For example, I'm allocating memory as follows:
float* SourceBuffer = (float*)_mm_malloc(BufferSize, 16);
float* TargetBuffer = (float*)_mm_malloc(BufferSize, 16);
When passing to an IPP function, I cast as follows:
ippResult = ippsAbs_32f((const Ipp32f*)SourceBuffer, (Ipp32f*)TargetBuffer, NumElements);
Is this kosher?
Correct use of the _mm_malloc function should not cause issues. Are you sure you calculate BufferSize correctly? It should be something like BufferSize = NumElements * sizeof(Ipp32f).
Yes, Ipp32f is a 4-byte single-precision floating-point data type, so your calculation should be fine. Could you please check that using IPP memory allocation (which provides alignment on a 32-byte boundary) does not cause the issue?
You can try either ippMalloc(NumElements*sizeof(Ipp32f)) or ippsMalloc_32f(NumElements) instead of _mm_malloc call. Note you will need to use ippFree or ippsFree accordingly.
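As a rough illustration of the size calculation discussed above, here is a minimal sketch in plain C. It does not call IPP or _mm_malloc; the helper names bytes_for_floats and alloc_floats_aligned32 are hypothetical, and C11 aligned_alloc is used only as a stand-in for an aligned allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Bytes needed for num_elements single-precision values.
 * Returns 0 if the multiplication would overflow (or if num_elements is 0). */
static size_t bytes_for_floats(size_t num_elements)
{
    if (num_elements > SIZE_MAX / sizeof(float)) {
        return 0;
    }
    return num_elements * sizeof(float);
}

/* Hypothetical helper: allocate num_elements floats on a 32-byte boundary.
 * Stand-in for _mm_malloc / ippsMalloc_32f; release the result with free(). */
static float *alloc_floats_aligned32(size_t num_elements)
{
    const size_t alignment = 32;
    size_t bytes = bytes_for_floats(num_elements);
    if (bytes == 0 || bytes > SIZE_MAX - (alignment - 1)) {
        return NULL;
    }
    /* C11 aligned_alloc expects the size to be a multiple of the alignment. */
    size_t rounded = (bytes + alignment - 1) / alignment * alignment;
    float *p = (float *)aligned_alloc(alignment, rounded);
    assert(p == NULL || ((uintptr_t)p % alignment) == 0);
    return p;
}
```

The point is only that the byte count is the element count times the element size, and that each allocator must be paired with its own release function (free here, _mm_free for _mm_malloc, ippsFree for ippsMalloc_32f).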
Using the IPP allocations has no effect on the error. It still crashes at "_mm_free(Buffer)". Some buffers crash upon deallocation whereas others don't. I've noticed that the problematic ones are involved in functions that require packing/unpacking, and these happen to be the ones where I was a little unsure of the arguments. For example, I've got the following:
This looks good to me. Have I misunderstood something in the docs?
Well, did you pay attention to the different sizes of the source and destination images for the ippiPackToCplxExtend function you use? According to the IPP documentation:
PackToCplxExtend
Converts an image in packed format to a complex data image.
IppStatus ippiPackToCplxExtend_32s32sc_C1R(const Ipp32s* pSrc, IppiSize srcSize, int srcStep, Ipp32sc* pDst, int dstStep);
IppStatus ippiPackToCplxExtend_32f32fc_C1R(const Ipp32f* pSrc, IppiSize srcSize, int srcStep, Ipp32fc* pDst, int dstStep);
|pSrc|Pointer to the source image ROI.|
|srcSize|Size in pixels of the source image ROI.|
|srcStep|Distance in bytes between starts of consecutive lines in the source buffer.|
|pDst|Pointer to the destination image buffer.|
|dstStep|Distance in bytes between starts of consecutive lines in the destination image buffer.|
The function ippiPackToCplxExtend is declared in the ippi.h file. It operates with ROI (see Regions of Interest in Intel IPP).
This function converts the source image pSrc in RCPack2D format to complex data format and stores the results in pDst, which is a matrix with complete set of the Fourier coefficients. Note that if the pSrc in RCPack2D format is a real array of dimensions (NxM), then the pDst is a real array of dimensions (2xNxM). This should be taken into account when allocating memory for the function operation.
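To make the allocation note concrete, the following plain-C sketch works out the sizes and steps for the source and destination of such an unpacking call. It assumes densely packed rows (no row padding) and does not call IPP; Complex32f is a stand-in for Ipp32fc, and UnpackLayout and unpack_layout are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Ipp32fc (assumption: one complex element is two floats, re then im). */
typedef struct { float re; float im; } Complex32f;

/* Hypothetical bundle of the sizes a PackToCplxExtend-style call needs. */
typedef struct {
    size_t src_bytes; /* packed real source: width * height floats */
    size_t dst_bytes; /* complex destination: width * height complex values */
    int src_step;     /* bytes between consecutive source rows */
    int dst_step;     /* bytes between consecutive destination rows */
} UnpackLayout;

/* Compute buffer sizes and row steps, assuming rows are stored back to back. */
static UnpackLayout unpack_layout(int width, int height)
{
    assert(width > 0 && height > 0);
    UnpackLayout layout;
    layout.src_step = width * (int)sizeof(float);
    layout.dst_step = width * (int)sizeof(Complex32f);
    layout.src_bytes = (size_t)height * (size_t)layout.src_step;
    layout.dst_bytes = (size_t)height * (size_t)layout.dst_step;
    return layout;
}
```

For a 32 x 32 image this gives a 4096-byte source and an 8192-byte destination, so the destination needs twice the memory of the source and its step is twice the source step.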
Yes. Please note my posted code has:
TempBuffer = (float*)_mm_malloc(BufferSize, 16); // (M x N x NumBytesInFloat)
UnpackBuffer = (float*)_mm_malloc(BufferSize * 2, 16); // (M x N x 2 x NumBytesInFloat)
Anyway, I may have misunderstood the documentation regarding one crucial aspect. Consider the following prototype:
IppStatus ippsConj_32fc_I(Ipp32fc* pSrcDst, int len);
where the docs state "int len = Number of elements in the vector." This was a little ambiguous. I interpreted it as the sum of real and imaginary elements in the vector, i.e. I have
int rows = 32;
int columns = 32;
int NumberOfValues = rows * columns;
int NumberOfComplexElements = NumberOfValues * 2;
ippResult = ippsConj_32fc_I((Ipp32fc*)UnpackBuffer, NumberOfComplexElements);
Could I have misunderstood something here?
IPP functions working with complex data types consider elements as complex elements, so you should specify the number of complex elements as the length parameter for the ippsConj_32fc_I function (not the sum of real and imaginary parts).
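With the numbers from the code above, UnpackBuffer holds 32 x 32 x 2 floats, i.e. 1024 complex elements, so len should be 1024. Passing 2048 would make the function touch 4096 floats and run past the end of the buffer, which is the kind of overrun that can corrupt the heap and only show up later at deallocation. A minimal sketch of the counting rule in plain C (conj_in_place is a stand-in for the in-place conjugate, not the IPP function itself; both helper names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Ipp32fc (assumption: one complex element is two floats, re then im). */
typedef struct { float re; float im; } Complex32f;

/* Plain-C stand-in for an in-place conjugate; len counts COMPLEX elements. */
static void conj_in_place(Complex32f *src_dst, int len)
{
    assert(src_dst != NULL && len >= 0);
    for (int i = 0; i < len; ++i) {
        src_dst[i].im = -src_dst[i].im; /* negate only the imaginary part */
    }
}

/* Number of complex elements held in a buffer of float_count floats. */
static int complex_len_from_floats(size_t float_count)
{
    return (int)(float_count / 2);
}
```

In terms of the thread's variables, the length to pass would be NumberOfValues (rows * columns), not NumberOfValues * 2.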