Intel® oneAPI Data Parallel C++
Support for Intel® oneAPI DPC++ Compiler, Intel® oneAPI DPC++ Library, Intel® DPC++ Compatibility Tool, and GDB*

## Parallel Version of Code not as efficient as Serial Version of Code

Hi there!

I wrote code for the restriction operator used in multigrid algorithms. The code is given below:

```cpp
#include <iostream>
#include <cmath>
#include <vector>
#include <CL/sycl.hpp>

using namespace sycl;

std::vector<float> Restriction2D(std::vector<float>& vec_h) {
    int vec_h_dim = int(std::sqrt(vec_h.size()));
    int vec_2h_dim = (vec_h_dim - 1) / 2;
    std::vector<float> vec_2h(vec_2h_dim * vec_2h_dim, 0.0f);
    for (int i_2h = 1; i_2h <= vec_2h_dim; i_2h++) {
        for (int j_2h = 1; j_2h <= vec_2h_dim; j_2h++) {
            // Full-weighting stencil. Note 1.0f / 16: the original (1 / 16)
            // is integer division and evaluates to 0, zeroing every output.
            vec_2h[(i_2h - 1) * vec_2h_dim + j_2h - 1] = (1.0f / 16) * (
                  vec_h[(2 * i_2h - 2) * vec_h_dim + 2 * j_2h - 2]
                + vec_h[(2 * i_2h - 2) * vec_h_dim + 2 * j_2h]
                + vec_h[2 * i_2h * vec_h_dim + 2 * j_2h - 2]
                + vec_h[2 * i_2h * vec_h_dim + 2 * j_2h]
                + 2 * (vec_h[(2 * i_2h - 1) * vec_h_dim + 2 * j_2h - 2]
                     + vec_h[(2 * i_2h - 1) * vec_h_dim + 2 * j_2h]
                     + vec_h[(2 * i_2h - 2) * vec_h_dim + 2 * j_2h - 1]
                     + vec_h[2 * i_2h * vec_h_dim + 2 * j_2h - 1])
                + 4 * vec_h[(2 * i_2h - 1) * vec_h_dim + 2 * j_2h - 1]);
        }
    }
    return vec_2h;
}

std::vector<float> Restriction2D_parallel(std::vector<float>& vec_h) {
    size_t vec_h_dim = size_t(std::sqrt(vec_h.size()));
    size_t vec_2h_dim = (vec_h_dim - 1) / 2;
    std::vector<float> vec_2h(vec_2h_dim * vec_2h_dim, 0.0f);
    queue q;
    {
        buffer<float, 2> vec_2h_buf(vec_2h.data(), range<2>{vec_2h_dim, vec_2h_dim});
        buffer<float, 2> vec_h_buf(vec_h.data(), range<2>{vec_h_dim, vec_h_dim});

        q.submit([&](handler& h) {
            accessor vec_2h_acc{ vec_2h_buf, h, write_only };
            accessor vec_h_acc{ vec_h_buf, h, read_only };

            // The deprecated program/build_with_kernel_type API has been
            // dropped; a plain parallel_for is sufficient.
            h.parallel_for<class Restriction>(range<2>{vec_2h_dim, vec_2h_dim}, [=](id<2> idx) {
                int i_2h = idx[0]; // 0 to vec_2h_dim - 1
                int j_2h = idx[1]; // 0 to vec_2h_dim - 1
                // The 0-based indices are shifted by +1 relative to the
                // serial loop, so the stencil matches Restriction2D and the
                // original out-of-bounds access at index -1 is avoided.
                vec_2h_acc[i_2h][j_2h] = (1.0f / 16) * (
                      vec_h_acc[2 * i_2h][2 * j_2h]
                    + vec_h_acc[2 * i_2h][2 * j_2h + 2]
                    + vec_h_acc[2 * i_2h + 2][2 * j_2h]
                    + vec_h_acc[2 * i_2h + 2][2 * j_2h + 2]
                    + 2 * (vec_h_acc[2 * i_2h + 1][2 * j_2h]
                         + vec_h_acc[2 * i_2h + 1][2 * j_2h + 2]
                         + vec_h_acc[2 * i_2h][2 * j_2h + 1]
                         + vec_h_acc[2 * i_2h + 2][2 * j_2h + 1])
                    + 4 * vec_h_acc[2 * i_2h + 1][2 * j_2h + 1]);
            });
        });
    } // buffer destruction waits for the kernel and writes back into vec_2h
    return vec_2h;
}

int main() {
    std::size_t size = 11108889; // 3333 x 3333 fine grid
    std::vector<float> test_vec(size, 0.0f);
    for (std::size_t i = 0; i < test_vec.size(); i++) {
        test_vec[i] = i / 4.0f;
    }
    std::vector<float> test_vec_restricted = Restriction2D_parallel(test_vec);
    //std::cout << test_vec_restricted.size();
    return 0;
}
```

When running Restriction2D and Restriction2D_parallel, the serial version of the code performs better than the parallel version. I have also attached the VTune HPC analysis results for both.

Can someone explain why this is happening? What knowledge am I lacking here?

3 Replies

Hi Nikhil,

Thanks for reaching out to us.

From your code, we can see that you are performing simple element-wise operations on vectors and summing them to get the desired result. Since each element is accessed in constant time and combined with only a few arithmetic operations, your code is essentially comparable to a vector add. As the loop complexity does not exceed O(size), the sequential version completes in well under a second.

So this small workload is not ideal for comparing sequential and parallel execution: the fixed overhead of launching the kernel and transferring data between host and device outweighs the time the computation itself saves. This is the reason for the difference in performance between the parallel and sequential versions.

To get meaningful performance statistics for sequential versus parallel execution, try increasing your workload and making your code more compute-intensive.

Warm Regards,

Abhishek

Hi Nikhil,

Please give us an update on whether the details provided above helped.

Warm Regards,

Abhishek

Hi Nikhil,

We haven't heard back from you for a long time, so we assume the provided solution helped you resolve your issue. We are no longer monitoring this thread.