I have a Xeon processor with a Xeon Phi coprocessor. I would like to know whether vectorized code on the Xeon Phi performs as well as, or better than, on the host Xeon. To test that, I ran this code:
# include
# include
// using namespace std;
int main() {
    const int SIZE = 300000;
    int x[SIZE];
    int y[SIZE];
    int z[SIZE];
    // Initialize loop
    int i;
    for (i = 0; i < SIZE; i++) {
        x[i] = i;
        y[i] = i * 2;
        z[i] = 0;
    }
    // Main loop
    #pragma omp simd
    for (i = 0; i < SIZE; i++) {
        z[i] = x[i] + y[i];
    }
    return 0;
}
I compile it with the MPI compiler wrapper mpiicc, once with the -openmp flag and once with -qopenmp, both for the coprocessor (-mmic) and for the host:
$ mpiicc -o example_openmp.mic example1.c -lm -openmp -mmic
$ mpiicc -o example_qopenmp.mic example1.c -lm -qopenmp -mmic
$ mpiicc -o example_openmp example1.c -lm -openmp
$ mpiicc -o example_qopenmp example1.c -lm -qopenmp
The run times of the four executables are:
$ time ./example_openmp.mic
real 0m0.151s
user 0m0.000s
sys 0m0.030s
$ time ./example_qopenmp.mic
real 0m0.033s
user 0m0.010s
sys 0m0.010s
$ time ./example_openmp
real 0m0.008s
user 0m0.000s
sys 0m0.005s
$ time ./example_qopenmp
real 0m0.007s
user 0m0.001s
sys 0m0.003s
In other words, the executables run on the MIC perform worse than the ones run on the host Xeon. My question is: how do I get good performance out of the Intel coprocessor?