My goal is to use pointer operations as much as possible during the computation,
because I have a heavy-duty data accumulation problem inside a 3D buffer.
I already tried building the 3D cube from piecewise 2D array buffers,
but I feel that a truly contiguous, memory-aligned 3D solution
might be faster and easier to handle.
From time to time, though, I need to allocate more than 2 GB of elements before
applying any IPP APIs.
The problem is that the ippMalloc() API accepts an 'int' length instead of a 'long long'.
Is there a way to work around this limitation when allocating huge contiguous memory?
Thanks in advance,
- Sunkyu Hwang
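For readers with the same problem, here is a minimal sketch of a contiguous 3D buffer whose size and offsets are computed in size_t rather than int. The helper names (cube_count, cube_index, cube_alloc_16u) are hypothetical and not part of IPP, and plain malloc stands in for an aligned allocator to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: number of elements in an nx * ny * nz cube,
   or 0 if any dimension is 0 or the product would overflow size_t. */
static size_t cube_count(size_t nx, size_t ny, size_t nz)
{
    if (nx == 0 || ny == 0 || nz == 0) return 0;
    if (ny > SIZE_MAX / nx) return 0;
    if (nz > SIZE_MAX / (nx * ny)) return 0;
    return nx * ny * nz;
}

/* Hypothetical helper: linear offset of voxel (x, y, z), with x varying fastest. */
static size_t cube_index(size_t nx, size_t ny, size_t x, size_t y, size_t z)
{
    return (z * ny + y) * nx + x;
}

/* Allocate one contiguous block of 16-bit elements; returns NULL on failure.
   Assumption: plain malloc is used here, so no 32-byte alignment is guaranteed. */
static unsigned short *cube_alloc_16u(size_t nx, size_t ny, size_t nz)
{
    size_t n = cube_count(nx, ny, nz);
    if (n == 0 || n > SIZE_MAX / sizeof(unsigned short)) return NULL;
    return (unsigned short *)malloc(n * sizeof(unsigned short));
}
```

With this layout a pointer can walk a whole row, slice, or the full cube by simple increments, and a voxel is reached as buf[cube_index(nx, ny, x, y, z)].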
Great, but not perfect. For example, the program below crashes badly:
//////////////////////
int numElements = 1024 * 1024 * 1024;
unsigned short* dat = (unsigned short*)_aligned_malloc(numElements * sizeof(unsigned short), 32);
unsigned short* source = dat;
for(int i = 0; i < numElements; i++, source++)
{
    (*source) = (i * 13) % 34;
}
_aligned_free(dat);
////////////////////// The version below works fine, except that it stays under the 2 GB quota /////////////
unsigned short* dat = ippsMalloc_16u(numElements);
unsigned short* source = dat;
for(int i = 0; i < numElements; i++, source++)
{
    (*source) = (i * 13) % 34;
}
ippsFree(dat);
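A more defensive variant of the same fill loop might look like the sketch below. It is only an illustration under a few assumptions: the wrapper names (alloc_aligned32, free_aligned32, alloc_and_fill) are hypothetical, C11 aligned_alloc is used when not building on Windows, and the pattern is computed as ((i % 34) * 13) % 34, which gives the same value as (i * 13) % 34 but cannot overflow a signed int for large i. The allocation result is checked before it is dereferenced.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h>   /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical wrapper: 32-byte aligned allocation, NULL on failure. */
static void *alloc_aligned32(size_t bytes)
{
    if (bytes == 0 || bytes > SIZE_MAX - 31) return NULL;
#ifdef _WIN32
    return _aligned_malloc(bytes, 32);
#else
    /* C11 aligned_alloc expects a size that is a multiple of the alignment. */
    return aligned_alloc(32, (bytes + 31) & ~(size_t)31);
#endif
}

/* Hypothetical wrapper: release memory obtained from alloc_aligned32. */
static void free_aligned32(void *p)
{
#ifdef _WIN32
    _aligned_free(p);
#else
    free(p);
#endif
}

/* Allocate n 16-bit elements and fill them with the pattern from the post.
   Returns NULL if n is 0, the byte count overflows, or allocation fails. */
static unsigned short *alloc_and_fill(size_t n)
{
    if (n > SIZE_MAX / sizeof(unsigned short)) return NULL;
    unsigned short *dat = (unsigned short *)alloc_aligned32(n * sizeof(unsigned short));
    if (dat == NULL) return NULL;
    for (size_t i = 0; i < n; i++)
        dat[i] = (unsigned short)(((i % 34u) * 13u) % 34u);
    return dat;
}
```

A caller would test the result, for example `if ((dat = alloc_and_fill(n)) == NULL) { /* report the failure */ }`, and release it with free_aligned32(dat).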
The following call works fine for the problem above (my mistake!):
unsigned short* dat = (unsigned short*)_aligned_malloc(numElements*sizeof(unsigned short), 32);
Hi,
Thanks for sharing the code here. It looks like it works now. The ippsMalloc functions just call the system malloc; beyond that, they may pad a few bytes to make sure the returned address is 32-byte aligned.
Thanks,
Chao
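To illustrate the idea of "malloc plus a few bytes of padding", here is a generic sketch of that technique. It is not IPP's actual implementation; the names padded_malloc and padded_free are hypothetical, and the layout (saving the raw pointer just below the aligned address) is one common way to do it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ALIGNMENT 32u   /* must be a power of two */

/* Over-allocate with malloc, then round the address up to a 32-byte boundary.
   The pointer returned by malloc is saved just below the aligned address so
   that padded_free can hand it back to free. Returns NULL on failure. */
static void *padded_malloc(size_t bytes)
{
    size_t extra = (ALIGNMENT - 1) + sizeof(void *);
    if (bytes > SIZE_MAX - extra) return NULL;

    unsigned char *raw = (unsigned char *)malloc(bytes + extra);
    if (raw == NULL) return NULL;

    /* Leave room for the saved pointer, then round up to the boundary. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + (ALIGNMENT - 1)) & ~(uintptr_t)(ALIGNMENT - 1);

    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Free a block obtained from padded_malloc; NULL is ignored. */
static void padded_free(void *p)
{
    if (p != NULL) free(((void **)p)[-1]);
}
```

Because the size is a size_t here, a wrapper like this is not tied to an 'int' length; whether a request of more than 2 GB succeeds still depends on the process being 64-bit and on available memory.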
