nyquist72
Beginner
72 Views

IPPMalloc size limitation

Hi, I am struggling with computational efficiency when handling a 3D buffer.

My goal is to use pointer operations as much as possible during the computation,
because the work is dominated by heavy data accumulation inside the 3D buffer.

I have already tried building the 3D cube from piecewise 2D array buffers,
but I suspect a truly contiguous, memory-aligned 3D buffer
would be faster and easier to handle.

From time to time, though, I need to allocate more than 2 GB of data before
calling any IPP APIs.

The problem is that the IPPMalloc() API accepts an 'int' size instead of a 'long long'.

Is there a way around this limitation when allocating a huge contiguous block of memory?


Thanks in advance,


- Sunkyu Hwang


4 Replies
renegr
New Contributor I
72 Views

Have you considered other allocation functions, such as malloc?
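A minimal sketch of that suggestion, assuming a plain malloc-backed buffer and 64-bit element counts held in size_t. The names Volume16u, volume_alloc, volume_at and volume_free are hypothetical helpers made up for illustration, not IPP or CRT functions, and the x-fastest memory layout is an assumption.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for a contiguous 3D buffer of 16-bit samples. */
typedef struct {
    unsigned short *data;
    size_t nx, ny, nz;
} Volume16u;

/* Multiplies a * b into *out; returns 0 if the product would overflow size_t. */
static int checked_mul(size_t a, size_t b, size_t *out)
{
    if (a != 0 && b > SIZE_MAX / a)
        return 0;
    *out = a * b;
    return 1;
}

/* Allocates nx * ny * nz elements in one block. Returns 1 on success, 0 on failure. */
int volume_alloc(Volume16u *v, size_t nx, size_t ny, size_t nz)
{
    size_t count, bytes;
    if (!checked_mul(nx, ny, &count) ||
        !checked_mul(count, nz, &count) ||
        !checked_mul(count, sizeof(unsigned short), &bytes) ||
        bytes == 0)
        return 0;

    v->data = (unsigned short *)malloc(bytes);
    if (v->data == NULL)
        return 0;   /* large requests can fail, so the caller must check */

    v->nx = nx;
    v->ny = ny;
    v->nz = nz;
    return 1;
}

/* Returns a pointer to element (x, y, z); x varies fastest in memory. */
unsigned short *volume_at(const Volume16u *v, size_t x, size_t y, size_t z)
{
    return v->data + (z * v->ny + y) * v->nx + x;
}

void volume_free(Volume16u *v)
{
    free(v->data);
    v->data = NULL;
}
```

Because every size and index is a size_t, the element count is not capped by the range of int on a 64-bit build. If a specific alignment is needed, the malloc and free calls could be swapped for an aligned pair.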
nyquist72
Beginner
72 Views

I tried _aligned_malloc() and _aligned_free() yesterday.

They work well, but not perfectly. For example, the program below crashes badly:

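//////////////////////

int numElements = 1024 * 1024 * 1024;

// The byte count is computed in size_t: 2^30 elements * 2 bytes = 2 GB.
// The return value is not checked here; a failed request returns NULL.
unsigned short* dat = (unsigned short*)_aligned_malloc(numElements * sizeof(unsigned short), 32);

unsigned short* source = dat;

for(int i = 0; i < numElements; i++, source++)
{
    // Note: i * 13 exceeds INT_MAX for large i (signed overflow).
    (*source) = (i * 13) % 34;
}

_aligned_free(dat);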

////////////////////// The version below works fine, as long as we stay under the 2 GB limit /////////////

unsigned short* dat = ippsMalloc_16u(numElements);
unsigned short* source = dat;

for(int i = 0; i < numElements; i++, source++)
{
    (*source) = (i * 13) % 34;
}

ippsFree(dat);
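Once a buffer holds more elements than an int can count, any routine that takes an int length has to be applied piece by piece. The following is a rough sketch of that idea only: chunk_fn, process_in_chunks and add_one_16u are hypothetical names, and the callback merely stands in for whatever int-length function would be called on each piece.

```c
#include <stddef.h>

/* Hypothetical callback type: any routine that processes `len` elements
   and returns 0 on success. */
typedef int (*chunk_fn)(unsigned short *data, int len);

/* Calls fn on consecutive pieces of at most max_chunk elements.
   Returns 0 on success, -1 on bad arguments, or the first nonzero
   status reported by fn. */
int process_in_chunks(unsigned short *data, size_t total, int max_chunk, chunk_fn fn)
{
    if (data == NULL || fn == NULL || max_chunk <= 0)
        return -1;

    size_t done = 0;
    while (done < total) {
        size_t remaining = total - done;
        /* Each piece is small enough to be described by an int. */
        int len = remaining > (size_t)max_chunk ? max_chunk : (int)remaining;

        int status = fn(data + done, len);
        if (status != 0)
            return status;

        done += (size_t)len;
    }
    return 0;
}

/* Example callback: adds 1 to every element of the piece. */
int add_one_16u(unsigned short *data, int len)
{
    for (int i = 0; i < len; i++)
        data[i] = (unsigned short)(data[i] + 1);
    return 0;
}
```

Calling process_in_chunks(dat, total, INT_MAX, add_one_16u) would walk the whole buffer while each individual call stays within the int range. This only suits operations that can be applied to independent pieces.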

nyquist72
Beginner
72 Views

On second thought, I think I had forgotten the difference between the malloc and IPP calls.

The following call works fine for the problem above (my mistake!):

unsigned short* dat = (unsigned short*)_aligned_malloc(numElements*sizeof(unsigned short), 32);
Chao_Y_Intel
Employee
72 Views

Hi,

Thanks for sharing the code here. It looks like it works now. The ippsMalloc functions simply call the system malloc function; in addition, they may pad the request by a few bytes to make sure the returned address is 32-byte aligned.
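To illustrate the padding idea in general terms, here is a minimal sketch of one common way to get an aligned address out of plain malloc. It is not the actual IPP implementation; padded_aligned_malloc and padded_aligned_free are hypothetical names, and storing the original pointer just in front of the aligned block is an assumption of this sketch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns `size` bytes whose address is a multiple of
   `alignment` (which must be a power of two), or NULL on failure. */
void *padded_aligned_malloc(size_t size, size_t alignment)
{
    if (alignment == 0 || (alignment & (alignment - 1)) != 0)
        return NULL;

    /* Extra room: worst-case shift plus space to remember the raw pointer. */
    size_t extra = alignment - 1 + sizeof(void *);
    if (size > SIZE_MAX - extra)
        return NULL;

    void *raw = malloc(size + extra);
    if (raw == NULL)
        return NULL;

    /* Round up to the next aligned address past the stored pointer. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~((uintptr_t)alignment - 1);

    /* Keep the raw pointer right before the aligned block for the free call. */
    memcpy((unsigned char *)aligned - sizeof(void *), &raw, sizeof raw);
    return (void *)aligned;
}

/* Releases a block obtained from padded_aligned_malloc. */
void padded_aligned_free(void *p)
{
    if (p == NULL)
        return;

    void *raw;
    memcpy(&raw, (unsigned char *)p - sizeof(void *), sizeof raw);
    free(raw);
}
```

A block from padded_aligned_malloc must be released with padded_aligned_free, never with free directly, in the same way that memory from ippsMalloc is released with ippsFree.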

Thanks,

Chao
