Kevin_F_
Beginner
98 Views

How to align a 3D array dynamically allocated with operator new to 64 bytes?

Hi!

I am using pointers to dynamically allocate a 3D array like this:

double ***p;

p = new double**[x_dimension];
for (int i = 0; i < x_dimension; i++)
{
    p[i] = new double*[y_dimension];        // one table of row pointers per x index
    for (int j = 0; j < y_dimension; j++)
    {
        p[i][j] = new double[z_dimension];  // one z-row per (i, j) pair
    }
}

How can I rewrite the code to make sure the allocated memory is 64-byte aligned?

Thanks!

7 Replies
jimdempseyatthecove
Black Belt
98 Views

Andrey_Vladimirov
New Contributor III
98 Views

At least on Linux, the Intel compilers provide their own aligned allocator, _mm_malloc. The syntax is:

phi = (double*) _mm_malloc(sizeof(double)*z_dimension, 64);

// ...

_mm_free(phi);
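For the 3D case in the question, one option is to replace the nested allocations with a single aligned block and compute indices by hand. The following is a minimal sketch under stated assumptions, not the method proposed in this thread: posix_memalign stands in for _mm_malloc so that the example does not depend on x86 intrinsics headers, sizeof(double) is assumed to be 8, and the helper names padded_z, alloc_grid, at, and is_aligned64 are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign, free (POSIX, not standard C++)

// Hypothetical helper: round the z extent up to a multiple of 8 doubles so
// that every z-row spans a whole number of 64-byte lines (8 * 8 bytes).
inline std::size_t padded_z(std::size_t nz) {
    return (nz + 7) / 8 * 8;
}

// Allocate one 64-byte-aligned block of nx * ny * padded_z(nz) doubles.
// Returns nullptr on failure. Release the block with free().
inline double* alloc_grid(std::size_t nx, std::size_t ny, std::size_t nz) {
    void* raw = nullptr;
    const std::size_t bytes = nx * ny * padded_z(nz) * sizeof(double);
    if (posix_memalign(&raw, 64, bytes) != 0) return nullptr;
    return static_cast<double*>(raw);
}

// Address of element (i, j, k) inside the flat block.
inline double* at(double* base, std::size_t ny, std::size_t nz,
                  std::size_t i, std::size_t j, std::size_t k) {
    assert(k < nz);
    return base + (i * ny + j) * padded_z(nz) + k;
}

// True if p lies on a 64-byte boundary.
inline bool is_aligned64(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 64 == 0;
}
```

With the padding, the start of every z-row (k = 0) is 64-byte aligned, not only the start of the whole block. Without it, only the base address would be guaranteed to be aligned.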

 

If you must use operator new, you can use the placement version, but you have to call _mm_free explicitly to release memory:

 

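#include <new>          // placement new
#include <immintrin.h>  // _mm_malloc, _mm_free

// _mm_malloc takes the size in bytes and the alignment in bytes
phi = new ( (double*) _mm_malloc(sizeof(double)*z_dimension, 64) ) double[z_dimension];

// ...

_mm_free(phi);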

// ...

The placement version of operator new is actually more useful for aligned class objects than for arrays. Note that you cannot release the memory for such an object with delete in the usual way. You have to call the destructor explicitly and then call _mm_free:

#include <new>

class MyClassType {
 // ...
};

MyClassType* c = new ( (MyClassType*) _mm_malloc(sizeof(MyClassType)) ) MyClassType;

// ...

c->~MyClassType();
_mm_free(c);
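The allocate, construct, destroy, free sequence can be wrapped in two small helpers so that the destructor call and the release cannot be forgotten separately. This is only a sketch under assumptions: posix_memalign and free stand in for _mm_malloc and _mm_free to keep it independent of x86 headers, and create_aligned64, destroy_aligned64, and the Tracked type are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <new>        // placement new, std::bad_alloc
#include <stdlib.h>   // posix_memalign, free (POSIX, not standard C++)
#include <utility>    // std::forward

// Hypothetical example type that counts how many objects are alive.
struct Tracked {
    static int live;
    double value;
    explicit Tracked(double v) : value(v) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Allocate 64-byte-aligned storage and construct a T in it.
template <class T, class... Args>
T* create_aligned64(Args&&... args) {
    void* raw = nullptr;
    if (posix_memalign(&raw, 64, sizeof(T)) != 0) throw std::bad_alloc();
    try {
        return new (raw) T(std::forward<Args>(args)...);  // placement new
    } catch (...) {
        free(raw);  // the constructor threw: give the storage back
        throw;
    }
}

// Run the destructor explicitly, then release the storage.
template <class T>
void destroy_aligned64(T* p) {
    if (!p) return;
    p->~T();
    free(p);
}
```

A call such as create_aligned64<Tracked>(2.5) returns a pointer on a 64-byte boundary, and destroy_aligned64 must be used instead of delete to release it.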

 

 

SergeyKostrov
Valued Contributor II
98 Views

>>...How can I rewrite the code to make sure the allocated memory is 64-byte aligned?

As already mentioned, Intel intrinsics or CRT functions such as _mm_malloc, _aligned_malloc, or _aligned_offset_malloc let you allocate aligned memory blocks. Here are a couple of comments on this line:

>>...
>>...MyClassType* c = new ( (MyClassType*) _mm_malloc( sizeof( MyClassType ) ) ) MyClassType;
>>...

The intrinsic function _mm_malloc always takes two arguments, and the alignment boundary value is missing in the example above. It should look like:

>>...
>>...MyClassType* c = new ( (MyClassType*)_mm_malloc( sizeof(MyClassType), 64 ) ) MyClassType;
>>...

However, this only creates an aligned object of type MyClassType. It does not guarantee that memory reached through a member of the class, such as double **p (and all the other memory blocks), will be aligned on 64-byte boundaries.
SergeyKostrov
Valued Contributor II
98 Views

For a pure C++ solution, a class could look like this:

class CAlignedObject
{
public:
    CAlignedObject( RTvoid )
    {
        //...
    };
    virtual ~CAlignedObject( RTvoid )
    {
        //...
    };
    RTvoid * operator new( RTsize_t nSize, RTusize_t nAlignment )
    {
        //...
    };
    RTvoid operator delete( RTvoid *pObject, RTusize_t nAlignment )
    {
        //...
    };
protected:
    //...
};

Take into account that the number of arguments of the C++ operator delete should match that of the C++ operator new.
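A class-specific operator new that takes an alignment can be sketched as follows. This is not the implementation behind CAlignedObject, whose bodies are elided; it is a minimal illustration under assumptions. The names Alignment and AlignedObject are hypothetical, posix_memalign stands in for whatever aligned allocator the class would really use, and the alignment is passed through a small tag type because a member operator delete(void*, std::size_t) would be treated as the usual sized delete rather than as a placement form.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>        // std::bad_alloc
#include <stdlib.h>   // posix_memalign, free (POSIX, not standard C++)

// Hypothetical tag type carrying the requested alignment in bytes.
struct Alignment { std::size_t bytes; };

class AlignedObject {
public:
    double payload[8];

    // Placement form: used by `new (Alignment{64}) AlignedObject`.
    static void* operator new(std::size_t size, Alignment a) {
        void* raw = nullptr;
        if (posix_memalign(&raw, a.bytes, size) != 0) throw std::bad_alloc();
        return raw;
    }

    // Matching placement delete: the compiler calls it only if the
    // constructor throws during the new-expression above.
    static void operator delete(void* p, Alignment) noexcept { free(p); }

    // Usual delete: called by a plain `delete ptr;` expression.
    static void operator delete(void* p) noexcept { free(p); }
};
```

An object created with new (Alignment{64}) AlignedObject can then be released with an ordinary delete expression, because both deallocation functions hand the storage back to free.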
SergeyKostrov
Valued Contributor II
98 Views

A more powerful solution, in terms of software engineering, is a template-based approach, and overall it looks like this:

template < class T > class _RTALIGNED TDataSet
{
public:
    TDataSet( RTvoid )
    {
        //...
    };
    TDataSet( const TDataSet< T > &rtDs )
    {
        //...
    };
    virtual ~TDataSet( RTvoid )
    {
        //...
    };
    _RTINLINE RTbool InitData( RTssize_t iM, RTssize_t iN )
    {
        //...Allocate aligned memory for m_ptData1D and m_ptData2D pointers
    };
    //...
protected:
    //...
    _RTALIGNED T * _RTRESTRICT m_ptData1D;
    _RTALIGNED T ** _RTRESTRICT m_ptData2D;
    RTssize_t m_iRows;
    RTssize_t m_iCols;
    //...
};

1. The pointer m_ptData1D allows 1-D access to the data.
2. The pointer m_ptData2D allows 2-D access to the data.
3. Initialization of the template class for the 1-D and 2-D case is as follows:
...
TDataSet< RTfloat > tDs;
tDs.InitData( 16384, 16384 );
...
4. Initialization of the template class for the 1-D, 2-D, and 3-D case is as follows:
...
TDataSet< __m512 > tDs;
tDs.InitData( 16384, 16384 );
...
In that case there will be 16 float (single-precision floating-point) values in the Z direction. The dimensions of the data set are 16384x16x16384.
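The body of InitData is elided above, so the following is only a guess at the general idea, not the author's code: one 64-byte-aligned contiguous block serves 1-D access, and a table of row pointers into that block serves 2-D access. The class name DataSet2D and its members are hypothetical, posix_memalign is an assumed stand-in for the aligned allocator, and T is assumed to be a trivially constructible type such as float, since no constructors are run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign, free (POSIX, not standard C++)

template <class T>
class DataSet2D {
public:
    DataSet2D() : flat_(nullptr), table_(nullptr), rows_(0), cols_(0) {}
    ~DataSet2D() { release(); }
    DataSet2D(const DataSet2D&) = delete;
    DataSet2D& operator=(const DataSet2D&) = delete;

    // Allocate rows * cols elements in one aligned block and build the
    // row-pointer table. Returns false if either allocation fails.
    bool init(std::size_t rows, std::size_t cols) {
        release();
        void* block = nullptr;
        void* table = nullptr;
        if (posix_memalign(&block, 64, rows * cols * sizeof(T)) != 0) return false;
        if (posix_memalign(&table, 64, rows * sizeof(T*)) != 0) {
            free(block);
            return false;
        }
        flat_ = static_cast<T*>(block);
        table_ = static_cast<T**>(table);
        for (std::size_t r = 0; r < rows; ++r) table_[r] = flat_ + r * cols;
        rows_ = rows;
        cols_ = cols;
        return true;
    }

    T* flat() { return flat_; }     // 1-D access: flat()[r * cols + c]
    T** rows() { return table_; }   // 2-D access: rows()[r][c]
    std::size_t row_count() const { return rows_; }
    std::size_t col_count() const { return cols_; }

private:
    void release() {
        free(table_);
        free(flat_);
        flat_ = nullptr;
        table_ = nullptr;
        rows_ = cols_ = 0;
    }

    T* flat_;
    T** table_;
    std::size_t rows_;
    std::size_t cols_;
};
```

Only the start of the block is guaranteed to be 64-byte aligned. Each row start is also aligned when cols * sizeof(T) is a multiple of 64, for example 16 floats per row.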