Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Alternatives to _mm_malloc.

Kadyan__Gaurav
Beginner
1,292 Views

Hi,

I am working on a project which has its own memory allocator. It obtains 400 MB from the system (via a single malloc call) and then hands out memory from that block. It's written in C (so no placement new operators). I can control how memory is allocated; by that I mean I can align the memory given to different structures to a certain boundary, say 64 bytes. However, I cannot use _mm_malloc. I can use __assume_aligned.
So my question is: how do I make the icc compiler see that I have aligned the memory, so that it vectorizes the for loops using aligned accesses? Is there an intrinsic function I can call to tell the compiler about the alignment of a pointer? Is there a flag I need to set to let the compiler see that the memory is aligned? Or is there any other way to do so? Any pointers will be helpful.

EDIT: I should have added that I have been using the vectorization reports. For some reason they show the pointers as unaligned, although the loop was vectorized using unaligned memory accesses. I want to gain performance by using aligned memory accesses. I have used "pragma vector aligned" but it hasn't yielded any performance improvement. I have to use around 12 vectors in the array. Also, I can align the memory to a boundary, but I cannot guarantee that the lower bound of the loop is going to be a multiple of 64 (% 64 == 0).

Regards,
Gaurav 

2 Replies
Viet_H_Intel
Moderator

 

Hi Gaurav,

If your data is aligned on a certain boundary and the loop has no dependencies, the compiler will vectorize it automatically.

You can turn on the optimization report (-qopt-report-phase=vec) to get more details about your loop (e.g. whether it was vectorized or not).

Regards,

Viet 

TimP
Honored Contributor III
_mm_malloc is probably a wrapper around the standard malloc that takes a block large enough to return a pointer to aligned memory, "wasting" any misaligned leading bytes. If you have a reason to do so, you could write such a function yourself without requiring aligned_malloc, but the compiler will not recognize the alignment to improve optimization. So, in general, you would still need an alignment assertion or SIMD intrinsics to get optimized code.