Intel® oneAPI DPC++/C++ Compiler

profiling info not available on GPU execution

George_Silva_Intel

I wrote some code based on the vector-add sample (https://github.com/intel/BaseKit-code-samples/blob/master/DPC%2B%2BCompiler/vector-add/src/vector-add.cpp) that tracks the execution time of a kernel in SYCL.

The code works for CPU, but when I switch the selector to GPU I receive the following error message:

terminate called after throwing an instance of 'cl::sycl::invalid_object_error'
  what():  Profiling info is not available. 0 (CL_SUCCESS)

Is this behavior expected? The selected GPU is: Device: Intel(R) Gen9 HD Graphics NEO

The related code that I am using is:

// Enable profiling on the queue so that events record timestamps.
auto propList = cl::sycl::property_list{ cl::sycl::property::queue::enable_profiling() };
gpu_selector selector;
std::unique_ptr<queue> device_queue;
device_queue.reset( new queue(selector, propList) );

// submit() returns the event associated with this command group.
event e = device_queue->submit([&](handler &cgh){
    auto accessorA = bufA.get_access<access::mode::read>(cgh);
    auto accessorB = bufB.get_access<access::mode::read>(cgh);
    auto accessorC = bufC.get_access<access::mode::write>(cgh);
    cgh.parallel_for<class VectorAdd>(range<1>(DATA_SIZE), [=](id<1> wiID) {
        accessorC[wiID] = accessorA[wiID] + accessorB[wiID];
    });
});

e.wait();
// This is the call that throws on the GPU.
cl_ulong execution_start = e.get_profiling_info<info::event_profiling::command_start>();
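
For readers following along, here is a minimal sketch of how the two profiling timestamps could be turned into an elapsed time. It assumes, as the SYCL specification describes, that the values are in nanoseconds. The helper name elapsed_ms is hypothetical and not part of SYCL or the sample.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of SYCL): converts two profiling
// timestamps, assumed to be in nanoseconds, into elapsed milliseconds.
inline double elapsed_ms(std::uint64_t start_ns, std::uint64_t end_ns) {
    // A kernel cannot finish before it starts; catch swapped arguments early.
    assert(end_ns >= start_ns && "end timestamp precedes start timestamp");
    return static_cast<double>(end_ns - start_ns) / 1.0e6;
}
```

In the snippet above, start_ns would come from info::event_profiling::command_start and end_ns from info::event_profiling::command_end, both queried on the same event after e.wait().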

From gdb-oneapi in DevCloud I get the following information:

#0  0x00007ffff6dd6e97 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff6dd8801 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff7d4407d in __gnu_cxx::__verbose_terminate_handler () at ../../../../gcc-5.5.0/libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x00007ffff7cd9186 in __cxxabiv1::__terminate (handler=<optimized out>)
    at ../../../../gcc-5.5.0/libstdc++-v3/libsupc++/eh_terminate.cc:47
#4  0x00007ffff7cd91d1 in std::terminate () at ../../../../gcc-5.5.0/libstdc++-v3/libsupc++/eh_terminate.cc:57
#5  0x00007ffff7cda059 in __cxxabiv1::__cxa_throw (obj=0x7849e0, tinfo=0x7ffff7dc5710 <typeinfo for cl::sycl::invalid_object_error>,
    dest=0x7ffff7c871d0 <cl::sycl::exception::~exception()>) at ../../../../gcc-5.5.0/libstdc++-v3/libsupc++/eh_throw.cc:87
#6  0x00007ffff7c9171f in cl::sycl::info::param_traits<cl::sycl::info::event_profiling, (cl::sycl::info::event_profiling)4738>::return_type cl::sycl::detail::event_impl::get_profiling_info<(cl::sycl::info::event_profiling)4738>() const ()
   from /opt/intel/inteloneapi/compiler/latest/linux/lib/libsycl.so
#7  0x00007ffff7cc7e25 in cl::sycl::info::param_traits<cl::sycl::info::event_profiling, (cl::sycl::info::event_profiling)4738>::return_type cl::sycl::event::get_profiling_info<(cl::sycl::info::event_profiling)4738>() const ()

Kaleem_A_Intel
Employee

Hi George,

Thanks for reaching out to us.

We are working on this issue and will get back to you.

 

Kaleem

Kaleem_A_Intel
Employee

Hi George,

There is an error in the code.

Could you please share a modified version of the source code so that we can try to reproduce this issue?

 

Kaleem

George_Silva_Intel

Hello Kaleem,

As I was cleaning up my code to upload it here, I noticed that the 3 arrays I had created held 1 billion integers each. When I reduced the size of the arrays to 100 million integers, the code worked well.

I have noticed that the original vector-add sample does not handle problems during execution. Maybe the vectors were too big to be handled by the GPU, and so the execution failed. On the CPU I had no problem, as 3 GB is not an issue for the Xeon servers in DevCloud.

What is the data size limit for the "Intel(R) Gen9 HD Graphics NEO" GPU that is part of the Xeon E-2176G?

Kaleem_A_Intel
Employee

Hi George,

Max memory allocation on the "Intel(R) Gen9 HD Graphics NEO" GPU is 4294959104 bytes (just under 4 GiB).
You can use the "clinfo" command to see the full specification of the "Intel(R) Gen9 HD Graphics NEO" GPU.
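
As a rough illustration, the limit quoted above could be checked before creating a buffer. This is only a sketch: the constant is the value reported in this thread for this particular GPU, the helper names are hypothetical, and a real program would more likely query the device (for example via info::device::max_mem_alloc_size in SYCL) than hard-code the number.

```cpp
#include <cassert>
#include <cstdint>

// Max memory allocation reported in this thread for the Gen9 GPU, in bytes.
// Assumption: other devices and driver versions will report other values.
constexpr std::uint64_t kMaxAllocBytes = 4294959104ULL;

// Hypothetical helper: bytes needed for `count` elements of `elem_size` bytes.
// 64-bit arithmetic avoids overflowing a 32-bit int for billion-element arrays.
constexpr std::uint64_t buffer_bytes(std::uint64_t count, std::uint64_t elem_size) {
    return count * elem_size;
}

// Hypothetical helper: true if one buffer of `count` elements fits within
// a single allocation of at most `max_alloc` bytes.
inline bool fits_in_single_allocation(std::uint64_t count,
                                      std::uint64_t elem_size,
                                      std::uint64_t max_alloc = kMaxAllocBytes) {
    assert(elem_size > 0);
    // Comparing against max_alloc / elem_size avoids overflow in count * elem_size.
    return count <= max_alloc / elem_size;
}
```

With 4-byte integers, a buffer of 1 billion elements needs 4,000,000,000 bytes, which is just under the reported limit, while 2^30 elements (exactly 4 GiB) would exceed it. This check covers a single buffer only; it says nothing about the combined size of several buffers.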

Please let us know if the solution provided helped.

 

Kaleem

George_Silva_Intel

Thank you Kaleem,

That solved my problem. I created 3 arrays of 4 GB each without any issue. It looks like one should be able to allocate 4 GB per buffer, up to a combined total of 50 GB.

Best regards.

GouthamK_Intel
Moderator

Thank you, George.

Glad to hear that the solution provided helped. We are closing this thread as the issue has been resolved. Feel free to start a new thread if you run into any further issues.

Have a good day!

 

Regards,

Goutham
