andry954
Beginner
272 Views

[SYCL][CUDA] unnecessary memcpy for write buffers.

When using write and discard_write accessors with the CUDA backend, a host-to-device copy is made. Since these accessors are write-only, the copy should not be needed, so it wastes resources.

queue.submit([&](cl::sycl::handler& cgh) {
    auto input_acc = input.get_access<sycl::access::mode::read>(cgh);
    auto output_acc = output.get_access<sycl::access::mode::discard_write>(cgh);
    auto maxRange = sycl::nd_range<2>(sycl::range<2>{height, width / 4}, sycl::range<2>(1, 128));
    cgh.parallel_for(maxRange, [=](sycl::nd_item<2> item) {
        output_acc[item.get_global_id()] = 0;
    });
});

Am I doing something wrong or is this a quirk of the current implementation of the SYCL specification?

I have never written an issue like this before, so if you need additional information, just ask.

5 Replies
RahulV_intel
Moderator
237 Views

Hi,


Would you mind sharing your complete source file so that I can better understand your claim?


In general, the host-to-device copy happens for accessors in read mode. Buffers only claim the memory space; the actual copy happens when an accessor is requested.
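
To make the point about access modes concrete, here is a toy model in plain C++ (not SYCL, and not how the DPC++ runtime is actually implemented). It assumes the SYCL 1.2.1 meaning of the modes: write keeps the buffer's previous contents, while discard_write and discard_read_write allow the runtime to throw them away. The names AccessMode and needs_copy_in are made up for illustration only.

```cpp
#include <cassert>

// Toy stand-in for sycl::access::mode; NOT the real SYCL enum.
enum class AccessMode { read, write, read_write, discard_write, discard_read_write };

// Hypothetical helper: would the device need the buffer's previous contents
// before the kernel starts? This is a sketch of the rule, not runtime code.
bool needs_copy_in(AccessMode mode, bool host_has_latest_data) {
    if (!host_has_latest_data) {
        return false;  // nothing newer on the host to transfer
    }
    switch (mode) {
        case AccessMode::discard_write:
        case AccessMode::discard_read_write:
            return false;  // previous contents may be discarded
        case AccessMode::read:
        case AccessMode::write:       // unwritten elements must keep old values
        case AccessMode::read_write:
            return true;
    }
    return true;  // conservative default for unknown values
}
```

Under this model, needs_copy_in(AccessMode::write, true) is true, because any element the kernel does not write has to keep its old value, while needs_copy_in(AccessMode::discard_write, true) is false. Whether a particular backend actually skips the copy in the discard case is an implementation detail that this sketch does not answer.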


I would like to see the buffers which you have declared in your program.


Thanks,

Rahul


RahulV_intel
Moderator
217 Views

Hi,


Just a quick reminder to share your complete source code.


Thanks,

Rahul


andry954
Beginner
190 Views

Hi,

I already got an answer to my question here: https://github.com/intel/llvm/issues/1992

Thanks for your time.

RahulV_intel
Moderator
179 Views

Hi,


Thanks for providing the link. Good to know that your query got addressed.


Let me know if I can close the thread.


Thanks,

Rahul


RahulV_intel
Moderator
160 Views

Intel will no longer monitor this thread. However, this thread will remain open for community discussion. If you still have any issue, feel free to post a new question.