Intel Xeon Phi performance does not work?

JGrad · 02-12-2016

I have a Xeon processor with a Xeon Phi coprocessor. I would like to know whether vectorized code on the Xeon Phi performs as well as, or better than, on the host Xeon. To test this, I ran the following code:

# include

// using namespace std;

int main() {
    const int SIZE = 300000;
    int x[SIZE];
    int y[SIZE];
    int z[SIZE];

    // Initialize loop
    int i;
    for (i = 0; i < SIZE; i++) {
        x[i] = i;
        y[i] = i * 2;
        z[i] = 0;
    }

    // Main loop
    #pragma omp simd
    for (i = 0; i < SIZE; i++) {
        z[i] = x[i] + y[i];
    }

    return 0;
}

I compile it with mpiicc, once with the -openmp flag and once with the -qopenmp flag, for both the coprocessor (-mmic) and the host:

$ mpiicc -o example_openmp.mic example1.c -lm -openmp -mmic
$ mpiicc -o example_qopenmp.mic example1.c -lm -qopenmp -mmic
$ mpiicc -o example_openmp example1.c -lm -openmp
$ mpiicc -o example_qopenmp example1.c -lm -qopenmp

The run times of the four executables are:

$ time ./example_openmp.mic
real 0m0.151s
user 0m0.000s
sys 0m0.030s

$ time ./example_qopenmp.mic
real 0m0.033s
user 0m0.010s
sys 0m0.010s

$ time ./example_openmp
real 0m0.008s
user 0m0.000s
sys 0m0.005s

$ time ./example_qopenmp
real 0m0.007s
user 0m0.001s
sys 0m0.003s

In other words, the executables built for the MIC perform worse than the ones built for the host Xeon. My question is: how do I get good performance out of the Intel coprocessor?