Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

IPPMalloc size limitation

nyquist72
Beginner
329 Views
Hi, I am struggling with computational efficiency when handling a 3D buffer.

My goal is to use pointer operations as much as possible during the computation,
because a large amount of data is accumulated inside the 3D buffer.

I have already tried building the 3D cube from piecewise 2D array buffers,
but I suspect a truly contiguous, memory-aligned 3D buffer
would be faster and easier to handle.

From time to time, though, I need to allocate more than 2 GB of data before
calling any IPP APIs.

The problem is that the ippMalloc() API accepts an 'int' length instead of a 'long long' one.

Is there a way to solve this limitation when allocating huge contiguous memory?


Thanks in advance,


- Sunkyu Hwang
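
For reference, one way to get the contiguous, aligned cube described in this post without passing an 'int' length is to compute the size in size_t and index the flat buffer by hand. This is only a minimal sketch, assuming a 64-bit build: Cube16u, make_cube, free_cube, and the x-fastest layout are illustrative names and choices, not IPP APIs. It uses _aligned_malloc on MSVC and POSIX posix_memalign elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#ifdef _MSC_VER
#include <malloc.h>  // _aligned_malloc / _aligned_free
#endif

// Hypothetical helper type (not part of IPP): one contiguous 3D cube of
// 16-bit samples, stored with x as the fastest-varying index.
struct Cube16u {
    unsigned short* data;
    std::size_t nx, ny, nz;

    // Total element count, computed in size_t so it can exceed INT_MAX
    // on a 64-bit build. No overflow check on the product, for brevity.
    std::size_t count() const { return nx * ny * nz; }

    // Pointer to element (x, y, z) inside the flat buffer.
    unsigned short* at(std::size_t x, std::size_t y, std::size_t z) const {
        assert(x < nx && y < ny && z < nz);
        return data + (z * ny + y) * nx + x;
    }
};

// Allocates the cube with 32-byte alignment; data is nullptr on failure.
inline Cube16u make_cube(std::size_t nx, std::size_t ny, std::size_t nz) {
    Cube16u c{nullptr, nx, ny, nz};
    const std::size_t bytes = c.count() * sizeof(unsigned short);
#ifdef _MSC_VER
    c.data = static_cast<unsigned short*>(_aligned_malloc(bytes, 32));
#else
    void* p = nullptr;
    if (posix_memalign(&p, 32, bytes) == 0) {
        c.data = static_cast<unsigned short*>(p);
    }
#endif
    return c;
}

// Releases a cube obtained from make_cube.
inline void free_cube(Cube16u& c) {
#ifdef _MSC_VER
    _aligned_free(c.data);
#else
    std::free(c.data);
#endif
    c.data = nullptr;
}
```

With this layout, at(0, y, z) points at one contiguous row of nx elements, so an IPP function that takes an 'int' length could still be applied row by row (or slab by slab) as long as each slice stays below the 'int' limit.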


4 Replies
renegr
New Contributor I
329 Views
Do you know of other allocation functions, such as malloc?
nyquist72
Beginner
329 Views
I tried "_aligned_malloc()" and "_aligned_free()" yesterday.

They work well, but not perfectly. For example, the program below crashes badly:

//////////////////////

int numElements = 1024 * 1024 * 1024;

unsigned short* dat = (unsigned short*)_aligned_malloc(numElements * sizeof(unsigned short), 32);

unsigned short* source = dat;

for(int i = 0; i < numElements; i++, source++)
{
    (*source) = (i * 13) % 34;
}

_aligned_free(dat);

////////////////////// The version below works fine, but only as long as we stay under the 2 GB limit /////////////

unsigned short* dat = ippsMalloc_16u(numElements);
unsigned short* source = dat;

for(int i = 0; i < numElements; i++, source++)
{
    (*source) = (i * 13) % 34;
}

ippsFree(dat);

nyquist72
Beginner
329 Views
On second thought, I think I had forgotten the difference between the malloc and IPP calls.

The following call works fine for the problem above (my mistake!):

unsigned short* dat = (unsigned short*)_aligned_malloc(numElements*sizeof(unsigned short), 32);
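
For readers hitting the same issue, the arithmetic behind the limit can be made explicit. The sketch below is an illustration under stated assumptions, not code from this thread: fits_in_int, byte_count, and fill_pattern are hypothetical helper names. It checks whether an element count still fits an 'int' length parameter, computes the byte count in size_t with an overflow check, and fills the buffer with a size_t index. The last point matters for the test loop posted above, because with an 'int' index the product i * 13 leaves the 'int' range once i passes about 165 million. Checking the pointer returned by the allocator before writing is also worthwhile, since a failed multi-gigabyte request returns a null pointer.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// True when an element count can be passed to an API whose length
// parameter is a plain int.
inline bool fits_in_int(std::size_t n) {
    return n <= static_cast<std::size_t>(INT_MAX);
}

// Computes n * elemSize in size_t. Returns false (and leaves *out
// untouched) if the product would overflow size_t.
inline bool byte_count(std::size_t n, std::size_t elemSize, std::size_t* out) {
    if (elemSize != 0 && n > SIZE_MAX / elemSize) {
        return false;
    }
    *out = n * elemSize;
    return true;
}

// Fills dst with the (i * 13) % 34 test pattern using a size_t index,
// so the multiplication is done in unsigned arithmetic of pointer width.
inline void fill_pattern(unsigned short* dst, std::size_t n) {
    assert(dst != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = static_cast<unsigned short>((i * 13) % 34);
    }
}
```

For the 1024 * 1024 * 1024 element case in this thread, the element count still fits an 'int', but the byte count for 16-bit samples is 2^31, which is one more than INT_MAX.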
Chao_Y_Intel
Moderator
329 Views

Hi,

Thanks for sharing the code here. It looks like it works now. The ippsMalloc functions just use the system malloc call; in addition, they may pad a few bytes to make sure they provide a 32-byte aligned address.

Thanks,

Chao
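
To illustrate the "malloc plus a few padding bytes" idea mentioned in this reply, here is a minimal sketch of the general over-allocate-and-align technique. It is not IPP's actual implementation: padded_malloc and padded_free are hypothetical names, and the scheme assumes the alignment is a power of two no smaller than sizeof(void*). The byte count is a size_t, so the request is not capped at the 'int' range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: returns a pointer aligned to `align` bytes, built
// on plain malloc. Extra room is requested for the padding and for a
// hidden copy of the raw pointer stored just before the aligned address.
inline void* padded_malloc(std::size_t bytes, std::size_t align = 32) {
    assert(align >= sizeof(void*) && (align & (align - 1)) == 0);
    if (bytes > SIZE_MAX - align - sizeof(void*)) {
        return nullptr;  // the padded request would overflow size_t
    }
    void* raw = std::malloc(bytes + align + sizeof(void*));
    if (raw == nullptr) {
        return nullptr;
    }
    // Skip past the slot for the raw pointer, then round up to `align`.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned =
        (start + align - 1) & ~static_cast<std::uintptr_t>(align - 1);
    // Remember the address malloc returned so it can be freed later.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Frees a pointer obtained from padded_malloc by recovering the raw
// malloc pointer stored in front of it.
inline void padded_free(void* p) {
    if (p != nullptr) {
        std::free(reinterpret_cast<void**>(p)[-1]);
    }
}
```

Because the aligned pointer differs from the one malloc returned, it has to be released with the matching free routine, which is the same reason ippsMalloc buffers are released with ippsFree and _aligned_malloc buffers with _aligned_free.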
