Paulius_V_
Beginner

KNL Fortran 64 byte alignment

Hello all. 

I encountered an error after recompiling my code with -align array64byte and using the !dir$ vector aligned directive.

Serial job on x86: OK. MPI job on x86: OK. Serial job on MIC: OK. MPI job on MIC: array index out of bounds.

I suspect that this might be due to the MPI implementation using some C++ modules. Why is there no 64-byte alignment option in icc? Is there an easy way of addressing this issue through compiler flags? If not, and C++ alignment is indeed the cause of this failure, then I can exclude such variables, but this would be a labor-intensive endeavour.

Any suggestions?

Many thanks

 

4 Replies
jimdempseyatthecove
Black Belt

Paulius,

Consider overloading operator new, and likewise overriding malloc.

See: http://stackoverflow.com/questions/16270891/can-we-overload-malloc

Jim Dempsey

Paulius_V_
Beginner

Thanks for the tip, Jim. So if I use polymorphism to address this, I'd need to implement my own version of malloc that ends up allocating on a 64-byte boundary? How would I make it so that my malloc takes precedence over the standard malloc? Especially if I don't want everything to be aligned?

If that's not an option, would I not just use _mm_malloc?

Also, do you have any insight as to why there is no compiler flag for this?

 

Thanks

jimdempseyatthecove
Black Belt

Both Windows and Linux have an API to perform aligned allocation (you also have the Intel _mm_malloc, as you have mentioned). You only need to write a few shell functions that receive the malloc/free arguments and make the appropriate call to the aligned variants (e.g. _mm_malloc). You can do the same with the C++ new, new[], delete, and delete[].
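A rough sketch of such shell functions, assuming Linux and the POSIX posix_memalign call as the aligned variant (on Windows a different API would be needed). The names kAlign, is_aligned, aligned_malloc_shim and aligned_free_shim are made up for illustration, and the 64-byte value is taken from the question. Replacing the global operator new this way aligns every C++ heap allocation in the program; plain malloc calls made from C code are not affected by it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdlib.h>   // posix_memalign, free (POSIX, assumed available)

// Assumed alignment: 64 bytes, as in the question. Hypothetical name.
constexpr std::size_t kAlign = 64;

// Helper (hypothetical): true if p sits on an `alignment`-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment = kAlign) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Shell function: takes the usual malloc argument and forwards it to
// the aligned variant. Returns nullptr on failure, like malloc.
void* aligned_malloc_shim(std::size_t size) {
    void* p = nullptr;
    if (posix_memalign(&p, kAlign, size ? size : 1) != 0) return nullptr;
    assert(is_aligned(p));
    return p;
}

// Memory obtained from posix_memalign is released with plain free.
void aligned_free_shim(void* p) noexcept { free(p); }

// Replacement global allocation functions: every new / new[] in the
// program now goes through the aligned shim.
void* operator new(std::size_t size) {
    if (void* p = aligned_malloc_shim(size)) return p;
    throw std::bad_alloc();
}
void* operator new[](std::size_t size) { return ::operator new(size); }
void operator delete(void* p) noexcept { aligned_free_shim(p); }
void operator delete[](void* p) noexcept { aligned_free_shim(p); }
```

With these definitions linked into the program, something like `int* a = new int[16];` should come back on a 64-byte boundary, which can be checked with `is_aligned(a)`.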

Jim Dempsey

jimdempseyatthecove
Black Belt

>> How would I make it so that my malloc takes precedence over the standard malloc? Especially if I don't want to allocate everything to be aligned?

The two have contradictory objectives.

Define your overload to suit your needs or use the alternate allocation/deallocation.
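As an illustration of the second route (leave malloc and new alone, and call an alternate allocation/deallocation pair only for the arrays that need it), here is a minimal sketch. It again assumes Linux with posix_memalign; FreeDeleter, AlignedArray, make_aligned_array and is_aligned_to are hypothetical names, and the sketch only handles trivial element types since no constructors are run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <stdlib.h>      // posix_memalign, free (POSIX, assumed available)
#include <type_traits>

// Deleter that returns posix_memalign memory with free. Hypothetical name.
struct FreeDeleter {
    void operator()(void* p) const noexcept { free(p); }
};

// Owning handle for an aligned array; freed automatically on scope exit.
template <typename T>
using AlignedArray = std::unique_ptr<T[], FreeDeleter>;

// Opt-in aligned allocation: only arrays created through this function
// are aligned; everything else keeps using the standard allocator.
template <typename T>
AlignedArray<T> make_aligned_array(std::size_t n, std::size_t alignment = 64) {
    static_assert(std::is_trivial<T>::value,
                  "sketch: elements are left uninitialized, trivial types only");
    // posix_memalign needs a power-of-two alignment that is a multiple
    // of sizeof(void*); 64 satisfies both.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    void* p = nullptr;
    if (posix_memalign(&p, alignment, (n ? n : 1) * sizeof(T)) != 0)
        throw std::bad_alloc();
    return AlignedArray<T>(static_cast<T*>(p));
}

// Helper (hypothetical): true if p sits on an `alignment`-byte boundary.
inline bool is_aligned_to(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Usage would look like `auto buf = make_aligned_array<double>(1000);`, after which `buf.get()` can be handed to code that expects 64-byte alignment while the rest of the program is left untouched.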

Jim Dempsey
