Hello all,
I'm playing with some SYCL code and I get this exception every time I try to access buffers. I've tried changing selectors to see whether anything changes, but no: every time I access a buffer by indexing, I get it. With the buffer accesses commented out, the kernel runs fine (of course, without doing the work I intend it to do).
This is my kernel:
vector<uint8_t> arr(m*n);
// Create buffer from data
cl::sycl::buffer<uint8_t, 2> buf(data, cl::sycl::range<2>{m, n});
cl::sycl::buffer<uint8_t, 2> res(arr.data(), cl::sycl::range<2>{m, n});
cl::sycl::gpu_selector selector;
auto dev = selector.select_device();
// Device information
cout<<"Name: "<<dev.get_info<cl::sycl::info::device::name>()<<endl;
cout<<"Version: "<<dev.get_info<cl::sycl::info::device::version>()<<endl;
cout<<"Vendor: "<<dev.get_info<cl::sycl::info::device::vendor>()<<endl;
cout<<"Driver version: "<<dev.get_info<cl::sycl::info::device::driver_version>()<<endl;
auto edge = cl::sycl::range<2>{m-2, n-2};
try
{
cl::sycl::queue queue(selector);
auto bu = buf.get_access<cl::sycl::access::mode::read>();
auto re = res.get_access<cl::sycl::access::mode::write>();
queue.submit([&](cl::sycl::handler &h)
{
h.parallel_for<class edge>(cl::sycl::range<2>(m-2, n-2),
cl::sycl::id<2>(1, 1),
[=](cl::sycl::id<2> idx)
{
//cout<<"("<<idx[0]<<")("<<idx[1]<<") ";
//auto sumf = 254.0;
auto sumf = (bu[{idx[0]-1, idx[1]-1}] +
bu[{idx[0], idx[1]-1}] +
bu[{idx[0]+1, idx[1]-1}] +
bu[{idx[0]-1, idx[1]}] +
bu[{idx[0], idx[1]}] +
bu[{idx[0]+1, idx[1]}] +
bu[{idx[0]-1, idx[1]+1}] +
bu[{idx[0], idx[1]+1}] +
bu[{idx[0]+1, idx[1]+1}]
)/9;
auto sum = static_cast<uint8_t>(sumf);
re[idx] = static_cast<uint8_t>(std::min(std::max(0, sum - (9 * bu[idx])), 255));
});
});
queue.wait();
} // end try
catch (std::exception& ex)
{
std::cerr<<"exception caught: "<< ex.what() << std::endl;
return 1;
}
Is there anything I am clearly missing?
If I can't access buffers the way I'm trying to, I would appreciate suggestions.
Thanks.
Hi,
Thanks for reaching out to us.
We are moving your query to the Intel® oneAPI Data Parallel C++ forum for a faster response.
Thanks
Hi Vitor,
We have skimmed through the code snippet you shared.
Could you please try changing the way you index the accessor?
From
bu[{idx[0]-1, idx[1]-1}]
to
bu[idx[0]-1][idx[1]-1]
and similarly everywhere else you index the accessor.
If you still face the same error, please share the complete source code, error logs and steps to reproduce if possible.
Have a Good day!
Thanks & Regards
Goutham
Dear Goutham,
Thanks for your reply.
Well, I made progress on the error (-59 CL_INVALID_OPERATION): the problem was that my buffers AND my accessors were out of scope. I simply moved them into the queue.submit call and passed the handler to the accessor definitions.
I also had to capture by value on the lambda inside my parallel for.
These steps made my former code work.
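The by-value capture requirement is not specific to SYCL and can be illustrated in plain C++. The sketch below is only an analogy, not SYCL code: `make_deferred_task` is a hypothetical helper that builds a callable to be run after its enclosing scope has ended, which is roughly the situation of a kernel lambda that executes later on a device.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not from the thread): builds a task that is run
// later, after the scope that created it has already ended.
std::function<int(std::size_t)> make_deferred_task()
{
    std::vector<int> local = {10, 20, 30};  // destroyed when this function returns

    // [=] copies `local` into the closure, so the task owns its own data.
    // With [&] the closure would keep a dangling reference to `local`
    // once this function returns, and calling it would be undefined behavior.
    return [=](std::size_t i) { return local[i] * 2; };
}
```

Calling `make_deferred_task()(1)` yields 40, because the closure carries its own copy of the vector. A SYCL kernel lambda is in a similar position: it has to carry copies of its accessors rather than references to host-side variables.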
However, I'm glad you replied, because after that I got stuck in a different way. I slightly changed my code so that my kernel lives in a library and is defined as a class. There are no compilation issues, but when I run the code I get the following error. (The tar.gz file has the new code, the logs, and the image I use to test it. I also changed the code the way you suggested, just in case; it compiles fine, but the error is still there with the new syntax.)
pure virtual method called
terminate called without an active exception
/var/spool/torque/mom_priv/jobs/721101.v-qsvr-1.aidevcloud.SC: line 5: 1631 Aborted ./main form.png
I quickly debugged it and found that the error arises when I call queue.submit.
My naive guess is that there is something wrong with defining my class at the parallel_for. But it is just a blind guess, since my class definition is absolutely basic and I can't see what's wrong.
Thank you very much for your help.
Regards,
Vitor
Hi Vitor,
Thanks for providing the source code and the logs, we looked into it.
Could you please bring buffers out of the queue.submit scope and try running the code?
Also, just in case, I'm providing an edited version of the code, which we tested and which executes without any errors.
// bufferlibclass.cpp
#define STB_IMAGE_IMPLEMENTATION
#define STB_IMAGE_WRITE_IMPLEMENTATION
#include"bufferlibclass.h"
//#include"../stb/stb_image.h"
//#include"../stb/stb_image_write.h"
using namespace std;
size_t cc::height()
{
return this->h;
}
size_t cc::width()
{
return this->w;
}
int cc::image()
// int cc::image(size_t m, size_t n, uint8_t* data, uint8_t* target)
{
float pi = 0.0;
// vector<uint8_t> arr(m*n);
cl::sycl::default_selector selector;
//cl::sycl::device dev = selector.select_device();
auto dev = selector.select_device();
// Device information
cout<<"Name: "<<dev.get_info<cl::sycl::info::device::name>()<<endl;
cout<<"Version: "<<dev.get_info<cl::sycl::info::device::version>()<<endl;
cout<<"Vendor: "<<dev.get_info<cl::sycl::info::device::vendor>()<<endl;
cout<<"Driver version: "<<dev.get_info<cl::sycl::info::device::driver_version>()<<endl;
// cout<<"Image support: "<<dev.get_info<cl::sycl::info::device::image_support>()<<endl;
// cout<<"Max mem alloc size: "<<dev.get_info<cl::sycl::info::device::max_mem_alloc_size>()<<endl;
// cout<<"Vendor ID: "<<dev.get_info<cl::sycl::info::device::vendor_id>()<<endl;
try
{
cl::sycl::queue queue(selector);
cl::sycl::buffer<uint8_t, 2> buf(data, cl::sycl::range<2>{w, h}); //EDITED
cl::sycl::buffer<uint8_t, 2> res(result, cl::sycl::range<2>{w, h}); //EDITED
queue.submit([&](cl::sycl::handler &hand)
{
// Try to create buffer from data
// cl::sycl::buffer<uint8_t, 2> buf(data, cl::sycl::range<2>{w, h});
// cl::sycl::buffer<uint8_t, 2> res(result, cl::sycl::range<2>{w, h});
auto edge = cl::sycl::range<2>{w-2, h-2};
// The accessors must be created inside the submit (and reference the handler)
auto bu = buf.get_access<cl::sycl::access::mode::read>(hand);
auto re = res.get_access<cl::sycl::access::mode::write>(hand);
// Use this parallel_for overload. Its second parameter
// allows an offset, which is exactly what is needed to skip the borders
//
// template <typename KernelName,
// typename KernelType, int dimensions>
// void parallel_for(range<dimensions> numWorkItems,
// id<dimensions> workItemOffset, KernelType kernelFunc);
hand.parallel_for<class edge>
(cl::sycl::range<2>(w-2, h-2),
cl::sycl::id<2>(1, 1),
[=](cl::sycl::id<2> idx)
{
// idx[0] ranges over m
// idx[1] ranges over n
// if(bu[idx]>0) re[idx] = 1; //bu[idx];
// re[idx] = bu[idx];
//cout<<"("<<idx[0]<<")("<<idx[1]<<") ";
// Following Celerity, compute a Laplacian
float sumf = (bu[idx[0]-1][idx[1]-1] +
bu[idx[0]][idx[1]-1] +
bu[idx[0]+1][idx[1]-1] +
bu[idx[0]-1][idx[1]] +
bu[idx[0]][idx[1]] +
bu[idx[0]+1][idx[1]] +
bu[idx[0]-1][idx[1]+1] +
bu[idx[0]][idx[1]+1] +
bu[idx[0]+1][idx[1]+1]
)/9;
int sum = static_cast<uint8_t>(sumf);
re[idx] = static_cast<uint8_t>(std::min(std::max(0, sum - (9 * bu[idx])), 255));
});
});
queue.wait();
} // end try
catch (std::exception& ex)
{
std::cerr<<"exception caught: "<< ex.what() << std::endl;
return 1;
}
return 0;
}
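To check the kernel's output, a host-side reference of the same arithmetic can be handy. The following is a minimal sketch in plain C++, not part of the tested code above: `edge_reference` is a hypothetical helper name, the image is assumed to be stored row-major, and `rows` and `cols` are assumed to correspond to the first and second components of the buffer range. It mirrors the posted expression as written, including the integer division by 9, and leaves border pixels at 0.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference of the posted kernel arithmetic.
// `in` holds a rows x cols image in row-major order.
std::vector<std::uint8_t> edge_reference(const std::vector<std::uint8_t>& in,
                                         std::size_t rows, std::size_t cols)
{
    std::vector<std::uint8_t> out(rows * cols, 0);
    if (rows < 3 || cols < 3) return out;  // no interior pixels to process

    // Visit interior pixels only, like the offset id<2>(1, 1) in the kernel.
    for (std::size_t r = 1; r + 1 < rows; ++r) {
        for (std::size_t c = 1; c + 1 < cols; ++c) {
            int total = 0;  // uint8_t operands promote to int, as in the kernel
            for (std::size_t dr = 0; dr < 3; ++dr)
                for (std::size_t dc = 0; dc < 3; ++dc)
                    total += in[(r - 1 + dr) * cols + (c - 1 + dc)];

            const int mean = total / 9;                      // integer division
            const int value = mean - 9 * in[r * cols + c];   // may be negative
            out[r * cols + c] =
                static_cast<std::uint8_t>(std::min(std::max(0, value), 255));
        }
    }
    return out;
}
```

Note that with this expression the result clamps to 0 unless the center pixel is much darker than the mean of its neighborhood, so a mostly black output image is expected behavior of the formula rather than a sign of a buffer problem.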
Please let us know if this resolves your issue.
Have a Good day!
Thanks & Regards
Goutham
Hi Goutham,
It works like a charm, thanks.
I still need to learn more about accessors, buffers and their scopes, but today I moved one step further.
All the best,
Vitor
Hi Vitor,
Glad to know that your issue is resolved!
To get started you can look at the below playlist of oneAPI.
As this issue has been resolved, we will no longer respond to this thread.
If you require any additional assistance from Intel, please start a new thread.
Any further interaction in this thread will be considered community only.
Thanks & Regards
Goutham