Royi
Novice
86 Views

Optimization of memcpy Use - Tell Compiler all Pointers are Aligned

Hello,

I have a simple loop as follows:

__assume_aligned(mO, 32);
__assume(numColsPad % 32 == 0);
__assume_aligned(vO, 32);
/* The pragma must immediately precede the for loop it applies to. */
#pragma omp parallel for
for (ii = 0; ii < numRows; ii++)
{
	memcpy(&mO[ii * numColsPad], vO, numCols * sizeof(float));
}

Although I give the compiler all the information it needs to use the optimized memcpy, it complains that the destination should be aligned.
Is there a way to tell it that mO[ii * numColsPad] is aligned for any ii within any thread?

7 Replies
TimP
Black Belt

Did you try assigning a name to the calculated pointer and asserting alignment of its target? The main point of the assertions would be to skip the alignment adjustments of generic memcpy, so it would be most effective for short moves.
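
A minimal sketch of the suggestion above: give the computed destination a name and assert alignment on that name. The helper name broadcast_row and the ASSUME_ALIGNED shim are assumptions made for illustration, not from the thread. The shim maps to __assume_aligned on the classic Intel compiler and to __builtin_assume_aligned on GCC/Clang-based compilers. Whether this silences the diagnostic depends on the compiler version.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Portability shim (an assumption for this sketch): the classic Intel
   compiler uses the statement form __assume_aligned(ptr, n); GCC and Clang
   use __builtin_assume_aligned, which returns the annotated pointer. */
#if defined(__INTEL_COMPILER)
#define ASSUME_ALIGNED(p, n) __assume_aligned((p), (n))
#elif defined(__GNUC__) || defined(__clang__)
#define ASSUME_ALIGNED(p, n) ((p) = __builtin_assume_aligned((p), (n)))
#else
#define ASSUME_ALIGNED(p, n) ((void)0)
#endif

/* Hypothetical helper: copy the first numCols floats of vO into every
   padded row of mO. The caller must guarantee that mO and vO are 32-byte
   aligned and that numColsPad * sizeof(float) is a multiple of 32. */
void broadcast_row(float *mO, const float *vO,
                   size_t numRows, size_t numCols, size_t numColsPad)
{
    assert(numCols <= numColsPad);
    ASSUME_ALIGNED(vO, 32);

    #pragma omp parallel for
    for (long ii = 0; ii < (long)numRows; ii++) {
        float *pRow = &mO[(size_t)ii * numColsPad]; /* named destination */
        ASSUME_ALIGNED(pRow, 32);                   /* alignment asserted on the name */
        memcpy(pRow, vO, numCols * sizeof(float));
    }
}
```

The alignment promise is unchecked: if a row pointer is not actually 32-byte aligned, the behavior is undefined.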

jimdempseyatthecove
Black Belt

You may also need to declare that the source and destination do not overlap when copied using SIMD. (see restrict and/or loop invariant coding).

Jim Dempsey

jimdempseyatthecove
Black Belt

Also, omp has a simd clause.

Jim Dempsey
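
One way the simd suggestion might look, as a sketch only: the memcpy is replaced by an explicit inner loop, and the OpenMP aligned clause carries the alignment promise for the row pointer. The function name broadcast_row_simd is hypothetical. Without OpenMP enabled, the pragmas are ignored and the code runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: same copy as the original loop, written so the
   compiler vectorizes the inner loop itself instead of calling memcpy.
   The caller must guarantee that mO and vO are 32-byte aligned and that
   numColsPad * sizeof(float) is a multiple of 32. */
void broadcast_row_simd(float *mO, const float *vO,
                        size_t numRows, size_t numCols, size_t numColsPad)
{
    assert(numCols <= numColsPad);

    #pragma omp parallel for
    for (long ii = 0; ii < (long)numRows; ii++) {
        float *pRow = &mO[(size_t)ii * numColsPad];

        /* aligned(...) tells the compiler both pointers are 32-byte aligned */
        #pragma omp simd aligned(pRow, vO : 32)
        for (size_t jj = 0; jj < numCols; jj++)
            pRow[jj] = vO[jj];
    }
}
```

For long rows a library memcpy may still be faster, so this is worth measuring rather than assuming.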

Royi
Novice

Hi Jim

What do you mean by "see restrict and/or loop invariant coding"?
Do you have links to what I should read?
I really want to know how to tell the compiler that data is independent.
I know I can use #pragma ivdep and the restrict keyword.
Are there more methods?

In the case above, since it is a loop, each thread gets its own values of 'ii'.
Using the assumes, I gave the compiler all the information needed to infer that, for any ii, the pointer &mO[ii * numColsPad] points to aligned data.
I expect the compiler to infer this from the information above.
If it doesn't, this is a basic feature request.
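
The inference can be spelled out: row ii starts ii * numColsPad * sizeof(float) bytes past mO. With numColsPad a multiple of 32 and 4-byte floats, that offset is a multiple of 128 bytes, so a 32-byte aligned base gives 32-byte aligned rows. A small runtime check of this, as a sketch (the helper names is_aligned and all_rows_aligned are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is aligned to `alignment` bytes.
   `alignment` must be a nonzero power of two. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical check: returns 1 if every row start &mO[ii * numColsPad]
   is 32-byte aligned, 0 otherwise. */
int all_rows_aligned(const float *mO, size_t numRows, size_t numColsPad)
{
    for (size_t ii = 0; ii < numRows; ii++) {
        if (!is_aligned(&mO[ii * numColsPad], 32))
            return 0;
    }
    return 1;
}
```

Such a check can sit in a debug build next to the assumes, so that a false alignment promise is caught before it becomes undefined behavior.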

 

jimdempseyatthecove
Black Belt

Linux/macOS: -restrict, -no-restrict
Windows: /Qrestrict, /Qrestrict-

Description

This option determines whether pointer disambiguation is enabled with the restrict qualifier. Option -restrict and /Qrestrict enable the recognition of the restrict keyword as defined by the ANSI standard.

By qualifying a pointer with the restrict keyword, you assert that an object accessed by the pointer is only accessed by that pointer in the given scope. You should use the restrict keyword only when this is true. When the assertion is true, the restrict option will have no effect on program correctness, but may allow better optimization.

IDE Equivalent

Visual Studio: Language > Recognize The Restrict Keyword

Eclipse: Language > Recognize The Restrict Keyword

Xcode: Language > Recognize RESTRICT keyword

Also see: https://stackoverflow.com/questions/1965487/does-the-restrict-keyword-provide-significant-benefits-i...

Jim Dempsey
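
To illustrate the restrict assertion described above, here is a minimal sketch in C99. The names copy_row and ranges_overlap are hypothetical. The overlap check is only a debug aid added for this example and is not something the keyword provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Debug aid (an assumption for this sketch): returns 1 if the two byte
   ranges [a, a + bytes) and [b, b + bytes) overlap. */
int ranges_overlap(const void *a, const void *b, size_t bytes)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    return pa < pb + bytes && pb < pa + bytes;
}

/* restrict asserts that, inside this function, the floats written through
   dst are accessed only through dst, so the loads from src cannot be
   affected by the stores. That lets the compiler vectorize freely. */
void copy_row(float *restrict dst, const float *restrict src, size_t n)
{
    /* The assertion must actually be true; the compiler does not check it. */
    assert(!ranges_overlap(dst, src, n * sizeof(float)));

    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```

Calling copy_row with overlapping buffers breaks the promise and is undefined behavior, which is why the quoted description says to use the keyword only when the assertion holds.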

jimdempseyatthecove
Black Belt

Note,

The C++ restrict keyword (when enabled) will only (potentially) affect code generated by the compiler. In other words, if the generated code makes a function call to __intel_fast_memcpy (spelling), then the restrict keyword has no effect. Note, however, that internally this function makes its own determinations about potentially adverse situations.

Jim Dempsey

Royi
Novice

The above has nothing to do with aliasing.
So restrict, in the context you raised, is not relevant here.

We need someone from the compiler team to notice that there is a simple case the compiler doesn't figure out even though it has all the data needed to do so.

Is there a way to make a direct call to Intel's Fast Memcpy?
