Intel® C++ Compiler

How to tell the compiler an array is aligned?

Intel_C_Intel
Employee
713 Views

Hi guys,

I've read many times that data alignment is critical for maximising speed, but I am not quite sure how to communicate that information to the compiler. I've attached a file that highlights the main characteristics of my app.

In a nutshell, I am dealing with images, which are represented by a class that contains a pointer to the data. This pointer is allocated with _aligned_malloc(), and each row is padded to a multiple of "MemoryAlignment" (= 64 bytes) so that the beginning of each row is also aligned. I access the 2D data in the standard way, i.e. two nested loops (for all rows, for all cols, ...).
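For readers without the attachment, the layout described above might look roughly like the sketch below. It is not the attached file: the names `Image`, `kMemoryAlignment`, `padded_row_bytes`, `make_image`, `free_image` and `is_aligned` are hypothetical, only the value 64 comes from the post, and C++17 `std::aligned_alloc` is used as a portable stand-in for `_aligned_malloc()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical name; only the value 64 comes from the post.
constexpr std::size_t kMemoryAlignment = 64;

// Round a row length in bytes up to the next multiple of the alignment.
inline std::size_t padded_row_bytes(std::size_t width, std::size_t elem_size) {
    const std::size_t raw = width * elem_size;
    return (raw + kMemoryAlignment - 1) / kMemoryAlignment * kMemoryAlignment;
}

struct Image {
    float*      data;    // start of the aligned buffer
    std::size_t width;   // valid pixels per row
    std::size_t height;  // number of rows
    std::size_t stride;  // padded row length in bytes

    // Pointer to the first pixel of row y.
    float* row(std::size_t y) const {
        assert(y < height);
        return reinterpret_cast<float*>(
            reinterpret_cast<unsigned char*>(data) + y * stride);
    }
};

inline Image make_image(std::size_t width, std::size_t height) {
    Image img{nullptr, width, height, padded_row_bytes(width, sizeof(float))};
    // Stand-in for _aligned_malloc(size, 64). The total size is a multiple
    // of the alignment because every row is padded to a multiple of 64.
    img.data = static_cast<float*>(
        std::aligned_alloc(kMemoryAlignment, img.stride * height));
    return img;
}

inline void free_image(Image& img) {
    std::free(img.data);
    img.data = nullptr;
}

// True when p lies on an `alignment`-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Because the base pointer is 64-byte aligned and the stride is a multiple of 64, `row(y)` is aligned for every `y`, which is the property the question is about.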

How can I tell the compiler that all data are aligned?

Is it possible/necessary to inform the compiler that each row is aligned? Or is it sufficient to simply state before the nested loops that myImage.data is aligned?

Thanks in advance

Alex

7 Replies
finjulhich
Beginner

I believe

__declspec(align(N)) float FARR;

for example, will align the variable on an N-byte boundary. I think N (a power of 2) is limited, though.

On x86 processors with SSE registers, many instructions require their memory operands to be aligned; otherwise they raise a general-protection exception. Some instructions have unaligned equivalents, but those can be much slower.

If you use a 2D array, all its elements are adjacent, so every row will be aligned only if the array itself is aligned and the row size in bytes is a multiple of N.
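A small check of that row-size rule, as a sketch only: `Grid` and `all_rows_aligned` are hypothetical names, and standard C++11 `alignas(64)` is used here in place of the `__declspec` form so the snippet is not tied to one compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A statically sized 2D array whose first element is 64-byte aligned.
template <std::size_t Rows, std::size_t Cols>
struct Grid {
    alignas(64) float cells[Rows][Cols];
};

// True when every row of the grid starts on an n-byte boundary.
template <std::size_t Rows, std::size_t Cols>
bool all_rows_aligned(const Grid<Rows, Cols>& g, std::size_t n) {
    assert(n != 0);
    for (std::size_t r = 0; r < Rows; ++r) {
        if (reinterpret_cast<std::uintptr_t>(&g.cells[r][0]) % n != 0) {
            return false;
        }
    }
    return true;
}
```

With 16 floats per row (64 bytes) every row starts on a 64-byte boundary; with 20 floats per row (80 bytes) the second row starts at byte offset 80 and is not aligned.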

The Intel software optimization manuals have examples.

Intel_C_Intel
Employee
Thanks for the answer, but the arrays are dynamically allocated.
TimP
Honored Contributor III
Do you mean
#pragma vector aligned
?
This is equivalent to
#pragma vector always
with the additional assertion of aligned arrays. So it will attempt to vectorize the following for() without applying cost-benefit analysis, and without allowing for unaligned arrays. The first element of each array section in use in the loop must be aligned.
I doubt it will work on outer loops.
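As a rough illustration of placing the pragma on the inner loop only, here is a minimal sketch. The function `scale_image` and its parameters are hypothetical, and the pragma is guarded so that other compilers simply skip it. The caller is assumed to guarantee that `data` is 64-byte aligned and that each row stride in bytes is a multiple of 64, so the first element of every row is aligned.

```cpp
#include <cassert>
#include <cstddef>

// Scale every valid pixel of a row-padded image in place.
// stride_floats is the padded row length, counted in floats.
inline void scale_image(float* data, std::size_t width, std::size_t height,
                        std::size_t stride_floats, float factor) {
    assert(stride_floats >= width);
    for (std::size_t y = 0; y < height; ++y) {
        float* row = data + y * stride_floats;  // first element of this row
        // The pragma applies only to the loop that immediately follows it.
        // It asserts alignment; the compiler does not verify the claim.
#if defined(__INTEL_COMPILER)
#pragma vector aligned
#endif
        for (std::size_t x = 0; x < width; ++x) {
            row[x] *= factor;
        }
    }
}
```

If the alignment promise is broken, the generated aligned loads can fault at run time, so the padding of each row matters as much as the alignment of the base pointer.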
Intel_C_Intel
Employee

>> The first element of each array section in use in the loop must be aligned.

Thanks, this answers my question perfectly.

Just one more question to completely satisfy my curiosity: I included "#pragma vector aligned", but it did not improve the speed. Would it be correct to conclude that previously, without this pragma, the same machine code was used anyway, i.e. load instructions for aligned arrays?

Then is it correct to say that without that pragma, two versions of the inner loop are generated, one for aligned arrays and one for unaligned arrays, and the correct version is selected at run time? And if that pragma is included, only the aligned version is generated?

Thanks again,

Alex

TimP
Honored Contributor III
Yes, if a loop is vectorized without the compiler knowing about alignment, there may be multiple vector code versions, with run-time selection according to alignment. It is certainly possible that the pragma simply suppressed a version which you aren't using. You could see this clearly by comparing the -S compilation results with and without the pragma.
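To make the idea of run-time selection concrete, here is a hand-written analogue using SSE intrinsics. It only illustrates the concept and is not what the compiler actually emits; `add_arrays`, `Path` and the `HAVE_SSE` macro are hypothetical names, and a scalar fallback is included for targets without SSE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE 1  // hypothetical macro, local to this sketch
#endif

enum class Path { AlignedVector, UnalignedVector, Scalar };

// Adds src into dst element by element and reports which version ran.
inline Path add_arrays(float* dst, const float* src, std::size_t n) {
#ifdef HAVE_SSE
    // Run-time test: are both pointers on a 16-byte boundary?
    const bool aligned =
        reinterpret_cast<std::uintptr_t>(dst) % 16 == 0 &&
        reinterpret_cast<std::uintptr_t>(src) % 16 == 0;
    std::size_t i = 0;
    if (aligned) {
        // Version 1: aligned loads and stores.
        for (; i + 4 <= n; i += 4) {
            _mm_store_ps(dst + i,
                         _mm_add_ps(_mm_load_ps(dst + i), _mm_load_ps(src + i)));
        }
    } else {
        // Version 2: unaligned loads and stores.
        for (; i + 4 <= n; i += 4) {
            _mm_storeu_ps(dst + i,
                          _mm_add_ps(_mm_loadu_ps(dst + i), _mm_loadu_ps(src + i)));
        }
    }
    for (; i < n; ++i) dst[i] += src[i];  // scalar remainder
    return aligned ? Path::AlignedVector : Path::UnalignedVector;
#else
    for (std::size_t i = 0; i < n; ++i) dst[i] += src[i];
    return Path::Scalar;
#endif
}
```

Asserting alignment with the pragma corresponds to keeping only the first branch, which would explain an unchanged run time when the data was already aligned and the first branch was being taken anyway.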
levicki
Valued Contributor I

There is also a chance that the code cannot be vectorized automatically by the compiler. In that case the #pragma won't help; you could try to vectorize it by hand if it comes to that.

JenniferJ
Moderator
I should add that the Intel C++ compiler supports "_aligned_malloc()" from VC and also provides its own _mm_malloc().
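A minimal sketch of the `_mm_malloc()` route, assuming an x86 target where `<xmmintrin.h>` declares `_mm_malloc` and `_mm_free`. The wrapper names `alloc_floats` and `free_floats` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <xmmintrin.h>  // assumed to declare _mm_malloc and _mm_free on this target

// Allocate `count` floats starting on a 64-byte boundary.
// Returns a null pointer if the allocation fails.
inline float* alloc_floats(std::size_t count) {
    assert(count > 0);
    return static_cast<float*>(_mm_malloc(count * sizeof(float), 64));
}

// Memory obtained from _mm_malloc must be released with _mm_free.
inline void free_floats(float* p) {
    _mm_free(p);
}
```

Mixing the allocators is a common mistake: a pointer returned by `_mm_malloc()` should not be passed to `free()`, and one returned by `_aligned_malloc()` should go to `_aligned_free()`.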