Beginner

Alternatives to _mm_malloc.

Hi,

I am working on a project that has its own memory allocator: it gets 400 MB from the system (via a single malloc call) and then hands out memory from that block. It is written in C (so no placement new operators). I control how memory is allocated, meaning I can align the memory given to different structures to a certain boundary, say 64 bytes. However, I cannot use _mm_malloc. I can use __assume_aligned.
So my question is: how do I make the icc compiler see that I have aligned the memory, so that it vectorizes the for loops with aligned accesses? Is there an intrinsic function I can call to tell the compiler about the alignment of a pointer? Is there a flag I need to set to let the compiler see that the memory is aligned? Or is there some other way to do this? Any pointers will be helpful.

EDIT: I should have mentioned that I have been using the vectorization reports. For some reason they show that the pointers are unaligned, but the loop was still vectorized using unaligned memory accesses. I want to gain performance by using aligned memory accesses. I have used "#pragma vector aligned", but it hasn't yielded any performance improvement. I have to use around 12 vectors in the array. Also, I can align the memory to a boundary, but I cannot guarantee that the lower bound of the loop satisfies lower_bound % 64 == 0.

Regards,
Gaurav 

2 Replies
Moderator

Hi Gaurav,

If your data is aligned on a certain boundary and there is no dependency between loop iterations, the compiler will vectorize the loop automatically.

You can turn on the optimization report (-qopt-report-phase=vec) to get more details about your loop (i.e., whether it was vectorized or not).
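
For the case raised in the EDIT, where the array bases are aligned but the loop's lower bound is not, one possible approach is to peel the leading elements by hand and give the compiler the alignment hint only for the main loop. The following is a minimal sketch, not a tested icc recipe: axpy_range, ASSUME_ALIGNED and VEC_BYTES are hypothetical names, the loop body is a stand-in, and it assumes every array in the loop is 64-byte aligned at its base and indexed with the same i. On icc the hint is spelled __assume_aligned; the GCC/Clang branch uses __builtin_assume_aligned so the sketch also builds elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_BYTES 64                                  /* assumed alignment */
#define FLOATS_PER_VEC (VEC_BYTES / sizeof(float))    /* 16 floats per 64 bytes */

/* Alignment hint. icc: __assume_aligned(p, n) is a statement.
   GCC/Clang: __builtin_assume_aligned returns the annotated pointer. */
#if defined(__INTEL_COMPILER)
#  define ASSUME_ALIGNED(p, n) __assume_aligned((p), (n))
#elif defined(__GNUC__)
#  define ASSUME_ALIGNED(p, n) ((p) = __builtin_assume_aligned((p), (n)))
#else
#  define ASSUME_ALIGNED(p, n) ((void)0)
#endif

/* Hypothetical kernel: y[i] += s * x[i] for i in [lo, hi).
   The bases x and y must be 64-byte aligned; lo may be anything. */
static void axpy_range(float *restrict y, const float *restrict x,
                       float s, size_t lo, size_t hi)
{
    assert((uintptr_t)x % VEC_BYTES == 0);
    assert((uintptr_t)y % VEC_BYTES == 0);

    /* First index >= lo that is a multiple of 16, clamped to hi. */
    size_t start = (lo + FLOATS_PER_VEC - 1) / FLOATS_PER_VEC * FLOATS_PER_VEC;
    if (start > hi)
        start = hi;

    /* Scalar peel: the elements in front of the first aligned index. */
    for (size_t i = lo; i < start; ++i)
        y[i] += s * x[i];

    /* Main loop: x + start and y + start sit on 64-byte boundaries,
       so the hint given to the compiler here is true. */
    const float *xa = x + start;
    float *ya = y + start;
    ASSUME_ALIGNED(xa, VEC_BYTES);
    ASSUME_ALIGNED(ya, VEC_BYTES);
    size_t n = hi - start;
    for (size_t i = 0; i < n; ++i)
        ya[i] += s * xa[i];
}
```

The hint is a promise, not a check: if a pointer is not actually aligned, the behavior is undefined, which is why the sketch asserts the base alignment first. Whether this beats the unaligned code the compiler already generates has to be measured; the vectorization report should at least show whether the main loop's accesses are now treated as aligned.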

Regards,

Viet 

Black Belt

_mm_malloc is probably a wrapper around standard malloc that takes a block of sufficient size and returns a pointer to aligned memory within it, "wasting" any misaligned leading bytes. If you have a reason to do so, you could write such a function yourself without requiring aligned_malloc, but the compiler will not recognize the alignment and use it to improve optimization. So in general you would still need an assertion or SIMD intrinsics to get the optimization.
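
A hand-written version of that idea, adapted to a pool allocator like the one in the question, could look roughly like this. It is only a sketch under assumptions: pool_t, pool_init, pool_alloc_aligned and POOL_ALIGN are made-up names, and the real allocator's bookkeeping will differ. It rounds each returned address up to the next 64-byte boundary and skips ("wastes") the bytes in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_ALIGN 64   /* assumed alignment; must be a power of two */

/* Hypothetical bump allocator over one large malloc'd block. */
typedef struct {
    unsigned char *base;  /* pointer returned by malloc, kept for free() */
    size_t size;          /* total bytes in the pool */
    size_t used;          /* bytes handed out so far, padding included */
} pool_t;

/* Grab the whole pool with a single malloc call. Returns 0 on failure. */
static int pool_init(pool_t *p, size_t size)
{
    p->base = malloc(size);
    p->size = size;
    p->used = 0;
    return p->base != NULL;
}

/* Return a POOL_ALIGN-aligned block of nbytes, or NULL if the pool is full. */
static void *pool_alloc_aligned(pool_t *p, size_t nbytes)
{
    unsigned char *cur = p->base + p->used;

    /* Bytes to skip so that the next address is a multiple of POOL_ALIGN. */
    size_t mis = (size_t)((uintptr_t)cur % POOL_ALIGN);
    size_t pad = mis ? (size_t)POOL_ALIGN - mis : 0;

    size_t remaining = p->size - p->used;
    if (pad > remaining || nbytes > remaining - pad)
        return NULL;

    unsigned char *out = cur + pad;
    assert((uintptr_t)out % POOL_ALIGN == 0);
    p->used += pad + nbytes;
    return out;
}
```

As noted above, nothing here tells the compiler about the alignment. The pointer that comes back is an ordinary void *, so the code that uses it still needs a hint such as icc's __assume_aligned(ptr, 64) at the point of use.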