I wrote a program based on the vector-add sample (https://github.com/intel/BaseKit-code-samples/blob/master/DPC%2B%2BCompiler/vector-add/src/vector-add.cpp) that tracks the execution time of a SYCL kernel.
The code works for CPU, but when I switch the selector to GPU I receive the following error message:
terminate called after throwing an instance of 'cl::sycl::invalid_object_error' what(): Profiling info is not available. 0 (CL_SUCCESS)
Is this behavior expected? The selected GPU is: Device: Intel(R) Gen9 HD Graphics NEO
The related code that I am using is:
    // Enable profiling on the queue so that event timestamps can be queried
    auto propList = cl::sycl::property_list{ cl::sycl::property::queue::enable_profiling() };
    gpu_selector selector;
    std::unique_ptr<queue> device_queue;
    device_queue.reset( new queue(selector, propList) );

    // Submit the kernel and keep the returned event for profiling
    event e = device_queue->submit([&](handler &cgh){
        auto accessorA = bufA.get_access<access::mode::read>(cgh);
        auto accessorB = bufB.get_access<access::mode::read>(cgh);
        auto accessorC = bufC.get_access<access::mode::write>(cgh);
        cgh.parallel_for<class VectorAdd>(range<1>(DATA_SIZE), [=](id<1> wiID) {
            accessorC[wiID] = accessorA[wiID] + accessorB[wiID];
        });
    });
    e.wait();

    // This is the call that throws on the GPU
    cl_ulong execution_start = e.template get_profiling_info<info::event_profiling::command_start>();
From gdb-oneapi in DevCloud I get the following information:
#0 0x00007ffff6dd6e97 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff6dd8801 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ffff7d4407d in __gnu_cxx::__verbose_terminate_handler () at ../../../../gcc-5.5.0/libstdc++-v3/libsupc++/vterminate.cc:95
#3 0x00007ffff7cd9186 in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../gcc-5.5.0/libstdc++-v3/libsupc++/eh_terminate.cc:47
#4 0x00007ffff7cd91d1 in std::terminate () at ../../../../gcc-5.5.0/libstdc++-v3/libsupc++/eh_terminate.cc:57
#5 0x00007ffff7cda059 in __cxxabiv1::__cxa_throw (obj=0x7849e0, tinfo=0x7ffff7dc5710 <typeinfo for cl::sycl::invalid_object_error>, dest=0x7ffff7c871d0 <cl::sycl::exception::~exception()>) at ../../../../gcc-5.5.0/libstdc++-v3/libsupc++/eh_throw.cc:87
#6 0x00007ffff7c9171f in cl::sycl::info::param_traits<cl::sycl::info::event_profiling, (cl::sycl::info::event_profiling)4738>::return_type cl::sycl::detail::event_impl::get_profiling_info<(cl::sycl::info::event_profiling)4738>() const () from /opt/intel/inteloneapi/compiler/latest/linux/lib/libsycl.so
#7 0x00007ffff7cc7e25 in cl::sycl::info::param_traits<cl::sycl::info::event_profiling, (cl::sycl::info::event_profiling)4738>::return_type cl::sycl::event::get_profiling_info<(cl::sycl::info::event_profiling)4738>() const ()
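The snippet in the question reads only `command_start`; to get an execution time, the matching `info::event_profiling::command_end` value is needed as well. SYCL reports both as timestamps in nanoseconds. A minimal sketch of the conversion, in plain C++ so that it can be checked without a device (the helper name `elapsed_ms` is hypothetical and not part of SYCL or the sample):

```cpp
#include <cstdint>

// Hypothetical helper: convert a pair of profiling timestamps, given in
// nanoseconds (the unit SYCL uses for command_start / command_end),
// into elapsed milliseconds. Returns 0.0 if the timestamps are out of order.
constexpr double elapsed_ms(std::uint64_t start_ns, std::uint64_t end_ns) {
    return end_ns >= start_ns
               ? static_cast<double>(end_ns - start_ns) / 1.0e6
               : 0.0;
}

// Compile-time sanity check: 2,500,000 ns is 2.5 ms.
static_assert(elapsed_ms(1000000, 3500000) == 2.5, "expected 2.5 ms");
```

In the program from the question, the two arguments would be the values returned by `e.get_profiling_info<...>()` for `command_start` and `command_end` after `e.wait()`.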
Hi George,
Thanks for reaching out to us.
We are working on this issue and will get back to you.
Kaleem
Hi George,
There is an error in the code you posted.
Could you please share a modified version of the source code so that we can try to reproduce this issue?
Kaleem
Hello Kaleem,
As I was cleaning up my code to upload it here, I noticed that each of the 3 arrays I had created held 1 billion integers. When I reduced the size of the arrays to 100 million integers each, the code worked well.
I have also noticed that the original vector-add sample does not handle errors during execution. Maybe the vectors were too big to be handled by the GPU, so the execution failed. On the CPU I had no problem, as about 12 GB in total (assuming 4-byte integers) is not an issue for the Xeon servers in DevCloud.
What is the data size limit for the "Intel(R) Gen9 HD Graphics NEO" GPU that is part of the Xeon E-2176G?
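To make the sizes in this post concrete, here is a small sketch of the arithmetic. It assumes the arrays hold 4-byte integers (`std::int32_t`); the helper `buffer_bytes` is a hypothetical name used only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: number of bytes needed to store `count`
// elements of type T.
template <typename T>
constexpr std::uint64_t buffer_bytes(std::uint64_t count) {
    return count * sizeof(T);
}

// One array of 1 billion 32-bit integers: 4,000,000,000 bytes.
constexpr std::uint64_t kLargeArray = buffer_bytes<std::int32_t>(1000000000ULL);

// One array of 100 million 32-bit integers: 400,000,000 bytes.
constexpr std::uint64_t kSmallArray = buffer_bytes<std::int32_t>(100000000ULL);

// Three arrays (A, B, C) of the larger size: 12,000,000,000 bytes in total.
constexpr std::uint64_t kLargeTotal = 3 * kLargeArray;
```

Under that assumption, the failing configuration needs about 4 GB per array and 12 GB in total, while the working one needs about 0.4 GB per array.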
Hi George,
The maximum size of a single memory allocation on the "Intel(R) Gen9 HD Graphics NEO" GPU is 4294959104 bytes (just under 4 GiB).
You can use the "clinfo" command to see the specifications of this GPU.
Please let us know if the solution provided helped.
Kaleem
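As an illustration of how the limit quoted above could be used, the following sketch checks whether a single buffer fits within it. The constant is the value from this reply; in a real program it would presumably be queried from the device (SYCL exposes it as `info::device::max_mem_alloc_size`). The function names are hypothetical.

```cpp
#include <cstdint>

// Per-allocation limit quoted in this thread for the Gen9 GPU, in bytes.
// Assumption: a real program would query this from the device instead.
constexpr std::uint64_t kMaxAllocBytes = 4294959104ULL;

// Largest number of elements of size `elem_size` bytes that fits in one
// allocation of at most `max_alloc_bytes` bytes.
constexpr std::uint64_t max_elements(std::uint64_t max_alloc_bytes,
                                     std::uint64_t elem_size) {
    return elem_size == 0 ? 0 : max_alloc_bytes / elem_size;
}

// True if `count` elements of size `elem_size` fit in a single allocation.
constexpr bool fits_in_one_allocation(std::uint64_t count,
                                      std::uint64_t elem_size,
                                      std::uint64_t max_alloc_bytes) {
    return count <= max_elements(max_alloc_bytes, elem_size);
}

// With 4-byte elements, the limit allows 1,073,739,776 elements per buffer.
static_assert(max_elements(kMaxAllocBytes, 4) == 1073739776ULL, "limit / 4");
```

This only checks one buffer against the per-allocation limit. The total memory available to the device is a separate limit and is not covered by this sketch.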
Thank you Kaleem,
That solved my problem. I created 3 arrays of 4 GB each without any issue. It looks like one can allocate up to 4 GB per buffer, up to a combined total of 50 GB.
Best regards.
Thank you, George
Glad to hear that the solution provided helped. We are closing this thread as the issue has been resolved. Feel free to open a new thread if you run into any further issues.
Have a good day!
Regards,
Goutham