My goal is to use pointer operations as much as possible during the computation,
because a large amount of data is accumulated inside the 3D buffer.
I already tried building the 3D cube from piecewise 2D array buffers,
but I suspect a truly contiguous, memory-aligned 3D buffer
would be faster and easier to handle.
From time to time, though, I need to allocate more than 2 GB of elements before
calling any IPP APIs.
The problem is that the IPPMalloc() API accepts an 'int' length instead of a 'long long'.
Is there a way around this limit when allocating a huge contiguous block of memory?
Thanks in advance,
- Sunkyu Hwang
Great, but not perfect. For example, the program below crashes badly:
//////////////////////
int numElements = 1024 * 1024 * 1024;
unsigned short* dat = (unsigned short*)_aligned_malloc(numElements * sizeof(unsigned short), 32);
unsigned short* source = dat;
for(int i = 0; i < numElements; i++, source++)
{
    (*source) = (i * 13) % 34;
}
_aligned_free(dat);
////////////////////// The version below works fine, except that it is limited to the 2 GB quota /////////////
unsigned short* dat = ippsMalloc_16u(numElements);
unsigned short* source = dat;
for(int i = 0; i < numElements; i++, source++)
{
    (*source) = (i * 13) % 34;
}
ippsFree(dat);
Correction: the following call from the program above works fine after all (my mistake!):
unsigned short* dat = (unsigned short*)_aligned_malloc(numElements*sizeof(unsigned short), 32);
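For readers who want to go past the 2 GB mark, here is a minimal sketch of the same idea with the element count held in a size_t instead of an int. It uses the C11 aligned_alloc function so that it also builds outside MSVC; on MSVC, _aligned_malloc and _aligned_free as used above play the same role. The helper names alloc_u16_aligned32 and fill_pattern are hypothetical and are not part of IPP or the CRT. A 64-bit build is assumed for buffers larger than 2 GB.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `count` 16-bit elements on a 32-byte boundary.
   Returns NULL if the byte size would overflow size_t or the allocation fails. */
static unsigned short *alloc_u16_aligned32(size_t count)
{
    const size_t alignment = 32;

    /* Reject counts whose padded byte size would not fit in size_t. */
    if (count > (SIZE_MAX - (alignment - 1)) / sizeof(unsigned short))
        return NULL;

    /* C11 aligned_alloc expects the size to be a multiple of the alignment. */
    size_t bytes = count * sizeof(unsigned short);
    bytes = (bytes + alignment - 1) / alignment * alignment;
    if (bytes == 0)
        bytes = alignment;

    return (unsigned short *)aligned_alloc(alignment, bytes);
}

/* Fill the buffer with the (i * 13) % 34 pattern, using a size_t index so
   the loop and the multiplication stay valid beyond 2^31 elements. */
static void fill_pattern(unsigned short *dat, size_t count)
{
    assert(dat != NULL);
    for (size_t i = 0; i < count; i++)
        dat[i] = (unsigned short)((i * 13) % 34);
}
```

A caller would check the returned pointer for NULL before calling fill_pattern, and release the buffer with free() when done.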
Hi,
Thanks for sharing the code here. It looks like it works now. The ippsMalloc functions simply call the system malloc function; beyond that, they may pad a few bytes to make sure they return a 32-byte aligned address.
Thanks,
Chao
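The "malloc plus a few bytes of padding" approach described above can be sketched roughly as follows. This is only an illustration of the general technique, not IPP's actual implementation; the names padded_malloc and padded_free are hypothetical. Because the size is a size_t, the int limit of the ippsMalloc functions does not apply.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not IPP code: over-allocate with malloc, round the
   address up to `alignment` (must be a power of two), and store the raw
   malloc pointer just below the aligned block so it can be freed later. */
static void *padded_malloc(size_t bytes, size_t alignment)
{
    if (alignment == 0 || (alignment & (alignment - 1)) != 0)
        return NULL;

    /* Worst-case padding plus room for the saved raw pointer. */
    size_t extra = alignment - 1 + sizeof(void *);
    if (bytes > SIZE_MAX - extra)
        return NULL;

    void *raw = malloc(bytes + extra);
    if (raw == NULL)
        return NULL;

    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    assert(aligned - sizeof(void *) >= (uintptr_t)raw);

    /* Remember the original pointer immediately below the aligned address. */
    memcpy((void *)(aligned - sizeof(void *)), &raw, sizeof raw);
    return (void *)aligned;
}

/* Release a block obtained from padded_malloc. */
static void padded_free(void *ptr)
{
    if (ptr == NULL)
        return;
    void *raw;
    memcpy(&raw, (unsigned char *)ptr - sizeof(void *), sizeof raw);
    free(raw);
}
```

A pointer returned by padded_malloc must be released with padded_free, not free, since the aligned address is generally not the address malloc returned.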