Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Alternatives to _mm_malloc.



I am working on a project that has its own memory allocator: it gets 400 MB from the system (via a single malloc call) and then hands out memory from that block. It is written in C (so no placement new operators). I can control how memory is allocated, meaning I can align the memory given to different structures to a certain boundary, say 64 bytes. However, I cannot use _mm_malloc. I can use __assume_aligned.
So my question is: how do I make the icc compiler see that I have aligned the memory, so that it vectorizes the for loops with aligned accesses? Is there an intrinsic function I can call to tell the compiler about the alignment of a pointer? Is there a flag I need to set so the compiler knows the memory is aligned? Or is there some other way to do this? Any pointers will be helpful.

EDIT: I should have added that I have been using the vectorization reports. For some reason they show the pointers as unaligned, although the loop was vectorized using unaligned memory accesses. I want to gain performance by using aligned memory accesses. I have tried "#pragma vector aligned", but it hasn't yielded any performance improvement. The loop uses around 12 arrays. Also, I can align the memory to a boundary, but I cannot guarantee that the lower bound of the loop is a multiple of 64 (lower bound % 64 == 0).
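
For the lower-bound concern, one option is to peel scalar iterations by hand until the address reaches a 64-byte boundary and only then run the main loop with an alignment hint. The following is a minimal sketch, not something from the original post: the function name scale_range and the ALIGNMENT constant are hypothetical, __assume_aligned is the classic Intel compiler hint, and __builtin_assume_aligned is the GCC/Clang equivalent used as a fallback. Compilers may already do this kind of peeling on their own, so it is worth checking the vectorization report before and after.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALIGNMENT 64  /* assumed boundary, in bytes */

/* Multiplies a[lo..hi) by s. The name and signature are illustrative only. */
static void scale_range(float *a, size_t lo, size_t hi, float s) {
    size_t i = lo;

    /* Peel: scalar iterations until &a[i] sits on an ALIGNMENT boundary. */
    while (i < hi && ((uintptr_t)&a[i] % ALIGNMENT) != 0) {
        a[i] *= s;
        ++i;
    }
    if (i >= hi) return;  /* nothing left, so make no alignment promise */

    /* Main loop: p is now known to be aligned, so tell the compiler. */
    float *p = a + i;
    size_t n = hi - i;
    assert(((uintptr_t)p % ALIGNMENT) == 0);
#if defined(__INTEL_COMPILER)
    __assume_aligned(p, ALIGNMENT);
#elif defined(__GNUC__)
    p = __builtin_assume_aligned(p, ALIGNMENT);
#endif
    for (size_t k = 0; k < n; ++k) {
        p[k] *= s;
    }
}
```

With 4-byte floats the peel loop runs at most 15 iterations, so its cost is small compared with a long main loop. When a loop touches several arrays, this only helps if all of them reach a boundary at the same index, which the allocator would have to arrange.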


2 Replies


Hi Gaurav,

If your data is aligned on a certain boundary and the loop carries no dependencies, the compiler will vectorize it automatically.

You can turn on the optimization report (-qopt-report-phase=vec) to get more details about your loop (i.e., whether it was vectorized or not).
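
To make the compiler aware of alignment that comes from a custom allocator, the pointer can be annotated at the point of use. Below is a minimal sketch under stated assumptions: arena_t, arena_init, arena_alloc, add_arrays and the ASSUME_ALIGNED macro are hypothetical names, not part of any Intel API. The macro maps to __assume_aligned on the classic Intel compiler and to __builtin_assume_aligned on GCC/Clang-based compilers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ALIGNMENT 64  /* assumed boundary, in bytes */

/* Alignment hint; expands to nothing on compilers that support neither form. */
#if defined(__INTEL_COMPILER)
#define ASSUME_ALIGNED(p, n) __assume_aligned((p), (n))
#elif defined(__GNUC__)
#define ASSUME_ALIGNED(p, n) ((p) = __builtin_assume_aligned((p), (n)))
#else
#define ASSUME_ALIGNED(p, n) ((void)0)
#endif

/* Hypothetical bump allocator over one large malloc'ed block. */
typedef struct {
    unsigned char *base;  /* pointer returned by malloc */
    size_t size;          /* total bytes in the block */
    size_t used;          /* bytes handed out so far, including padding */
} arena_t;

static int arena_init(arena_t *a, size_t size) {
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

/* Returns a pointer whose address is a multiple of ALIGNMENT, or NULL. */
static void *arena_alloc(arena_t *a, size_t bytes) {
    uintptr_t cur = (uintptr_t)(a->base + a->used);
    uintptr_t aligned = (cur + (ALIGNMENT - 1)) & ~(uintptr_t)(ALIGNMENT - 1);
    size_t offset = (size_t)(aligned - (uintptr_t)a->base);
    if (offset > a->size || bytes > a->size - offset) return NULL;
    a->used = offset + bytes;
    return a->base + offset;
}

/* The hints are promises: the asserts check them in debug builds. */
static void add_arrays(float *dst, const float *x, const float *y, size_t n) {
    assert(((uintptr_t)dst % ALIGNMENT) == 0);
    assert(((uintptr_t)x % ALIGNMENT) == 0);
    assert(((uintptr_t)y % ALIGNMENT) == 0);
    ASSUME_ALIGNED(dst, ALIGNMENT);
    ASSUME_ALIGNED(x, ALIGNMENT);
    ASSUME_ALIGNED(y, ALIGNMENT);
    for (size_t i = 0; i < n; ++i) {
        dst[i] = x[i] + y[i];
    }
}
```

The allocator rounds the absolute address rather than the offset, because the block returned by malloc is not guaranteed to start on a 64-byte boundary. A hint on a pointer that is not actually aligned can lead to faults or wrong results, which is why the sketch checks each pointer before making the promise.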



Black Belt
_mm_malloc is probably a wrapper around the standard malloc that takes a block of sufficient size to return a pointer to aligned memory, "wasting" any misaligned leading bytes. If you have a reason to do so, you could write such a function yourself without requiring aligned_malloc, but the compiler will not recognize the alignment and use it to improve optimization. So, in general, you would still need an alignment assertion or SIMD intrinsics to get optimized code.
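
A wrapper of the kind described could look roughly like this. It is a sketch of the over-allocate-and-round-up idea, not the actual implementation of _mm_malloc; the names my_aligned_malloc and my_aligned_free are hypothetical, and the alignment is assumed to be a power of two no smaller than sizeof(void *).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns bytes of storage aligned to alignment, or NULL on failure. */
static void *my_aligned_malloc(size_t bytes, size_t alignment) {
    assert(alignment >= sizeof(void *));
    assert((alignment & (alignment - 1)) == 0);  /* power of two */

    /* Room for worst-case padding plus one saved pointer. */
    size_t extra = alignment - 1 + sizeof(void *);
    if (bytes > SIZE_MAX - extra) return NULL;

    unsigned char *raw = malloc(bytes + extra);
    if (raw == NULL) return NULL;

    /* Skip the slot for the saved pointer, then round up to the boundary. */
    uintptr_t start = (uintptr_t)(raw + sizeof(void *));
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);

    /* Store the pointer malloc returned just below the aligned block. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Frees a block obtained from my_aligned_malloc; NULL is accepted. */
static void my_aligned_free(void *p) {
    if (p != NULL) {
        free(((void **)p)[-1]);
    }
}
```

The bytes between the raw pointer and the aligned address are the "wasted" part. As noted above, the compiler cannot see through this arithmetic, so loops over the returned pointer would still need an alignment assertion at the point of use.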