Adding vectors with MIC

Anwar_Ludin · 07-02-2013

Hello,

Is the following code sample for adding vectors correct? Can I make it even faster using vectorized operations?

void vectorAdd(float *a, float *b, float *r, int size)
{
    #pragma offload target(mic) in(a:length(size)) in(b:length(size)) inout(r:length(size))
    /* i is declared in the for statement, so it is already private to each thread */
    #pragma omp parallel for shared(a,b,r)
    for (int i = 0; i < size; ++i)
    {
        r[i] = a[i] + b[i];
    }
}

TimP · 07-02-2013

You probably need to declare the pointers as float *restrict a, float *restrict b, float *restrict r (or use one of the ivdep pragmas) to get auto-vectorization. Alignment would also help, provided you make all the OpenMP chunks a multiple of 32.
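As a rough illustration of the restrict suggestion, the function might look like the sketch below. The name vector_add_restrict is hypothetical, the __INTEL_OFFLOAD guard assumes the Intel compiler defines that macro when offload is enabled, and out(r) is used here instead of inout(r) on the assumption that the old contents of r are not needed on the coprocessor. With other compilers the offload pragma is skipped and the loop runs on the host.

```c
#include <assert.h>

/* Sketch only: restrict promises the compiler that a, b and r never
   overlap, which is what lets it auto-vectorize the loop safely.
   vector_add_restrict is a hypothetical name, not from the thread. */
void vector_add_restrict(const float *restrict a, const float *restrict b,
                         float *restrict r, int size)
{
    assert(size >= 0);
#ifdef __INTEL_OFFLOAD
    /* Assumption: this macro is defined only when the Intel compiler
       builds with offload support, so other compilers skip the pragma. */
    #pragma offload target(mic) in(a:length(size)) in(b:length(size)) out(r:length(size))
#endif
    #pragma omp parallel for
    for (int i = 0; i < size; ++i) {
        r[i] = a[i] + b[i];
    }
}
```

The caller must make sure the three arrays really do not overlap, otherwise the behavior is undefined.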

A single offloaded vector operation like this would spend most of its time on data transfer.
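To see why transfer dominates, it may help to count the bytes that the clauses in the original pragma move for each addition. The helper below is a hypothetical sketch (offload_bytes is not part of any library): in(a) and in(b) each copy the array once, and inout(r) copies it in both directions.

```c
#include <stddef.h>

/* Hypothetical helper: bytes moved between host and coprocessor for
   one call with in(a), in(b) and inout(r), each of `size` floats. */
size_t offload_bytes(size_t size)
{
    size_t in_a    = size * sizeof(float);      /* host -> card */
    size_t in_b    = size * sizeof(float);      /* host -> card */
    size_t inout_r = 2 * size * sizeof(float);  /* both directions */
    return in_a + in_b + inout_r;
}
```

With 4-byte floats that is 16 bytes of traffic for every single addition, so the work done per byte transferred is very small. Changing inout(r) to out(r) would lower it to 12 bytes per addition, which is still heavily transfer-bound.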