Hi,
I'm trying to profile my application using clGetEventProfilingInfo(). Rather than just looking at the kernel execution time (CL_PROFILING_COMMAND_END - CL_PROFILING_COMMAND_START), I'd like to know the absolute times at which the kernels were executed, so I can draw a timeline. On the CPU device this works just fine, but on the GPU device (HD4000) the timer seems to be reset for every event.
I wrote a simple program that calls a kernel N times, each call followed by clFinish. This is the information I get from the profiler (one row per kernel launch):
0 .. 15120 .. 11875440 .. 18972640
0 .. 6960 .. 917520 .. 7889600
0 .. 5760 .. 148800 .. 7247600
0 .. 5920 .. 159520 .. 7335520
0 .. 5840 .. 154000 .. 7310320
0 .. 5840 .. 199520 .. 7343760
0 .. 5920 .. 156640 .. 7296240
0 .. 6000 .. 148720 .. 7254880
0 .. 6080 .. 150240 .. 7294000
0 .. 6240 .. 149680 .. 7293920
The numbers are the return values of clGetEventProfilingInfo with CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END, respectively.
As you can see, the timer always starts at 0 again. It seems like each event has its own counter that starts at 0 when the command is queued. Why is the behaviour on the GPU different to the behaviour on the CPU?
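One possible way to still draw a timeline from per-event values like these is to record a host-side timestamp when each command is enqueued and shift that event's values by it. The following is only a sketch under stated assumptions: `event_times`, `to_timeline`, `kernel_duration_ns` and the `host_enqueue_ns` parameter are hypothetical names, `uint64_t` stands in for `cl_ulong` so that the code compiles without the OpenCL headers, and the host clock is assumed to be close enough to the device clock for plotting purposes. It is not a fix confirmed anywhere in this thread.

```c
#include <stdint.h>

/* Stand-in for the four values returned by clGetEventProfilingInfo
   (nanoseconds). uint64_t replaces cl_ulong for this sketch only. */
typedef struct {
    uint64_t queued;
    uint64_t submit;
    uint64_t start;
    uint64_t end;
} event_times;

/* Hypothetical helper: shift an event's timestamps so that QUEUED lines up
   with a host timestamp taken when the command was enqueued. The offsets
   between the four stages are preserved unchanged. */
event_times to_timeline(event_times ev, uint64_t host_enqueue_ns)
{
    event_times out;
    out.queued = host_enqueue_ns;
    out.submit = host_enqueue_ns + (ev.submit - ev.queued);
    out.start  = host_enqueue_ns + (ev.start  - ev.queued);
    out.end    = host_enqueue_ns + (ev.end    - ev.queued);
    return out;
}

/* Kernel execution time, which does not depend on the timer origin. */
uint64_t kernel_duration_ns(event_times ev)
{
    return ev.end - ev.start;
}
```

With the first row above (0, 15120, 11875440, 18972640), the kernel duration is 7097200 ns regardless of where the counter starts. The accuracy of the shifted values is limited by the delay between taking the host timestamp and the command actually being queued.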
Dominik
Considering this post is two years old, I don't really expect a reply, but I just ran into this problem myself and was wondering if there was any resolution?
Thanks, ~ben
Ben,
What processor, OS (including OS version), and driver version are you using? Which version of OpenCL?
Thanks!
Robert
Ben,
I just tried the latest and greatest driver on Windows 8 and even in OpenCL 1.2, five counters are supported: CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START, CL_PROFILING_COMMAND_END, and CL_PROFILING_COMMAND_COMPLETE.
I query and print them in the following way and they seem to return reasonable increasing values (at least for QUEUED, START, and COMPLETE).
ciErrNum = clGetEventProfilingInfo(pmy_events, CL_PROFILING_COMMAND_QUEUED, sizeof(cl_ulong), &param_command_queued, NULL);
ciErrNum = clGetEventProfilingInfo(pmy_events, CL_PROFILING_COMMAND_SUBMIT, sizeof(cl_ulong), &param_command_submit, NULL);
ciErrNum = clGetEventProfilingInfo(pmy_events, CL_PROFILING_COMMAND_START, sizeof(cl_ulong), &param_command_start, NULL);
ciErrNum = clGetEventProfilingInfo(pmy_events, CL_PROFILING_COMMAND_END, sizeof(cl_ulong), &param_command_end, NULL);
ciErrNum = clGetEventProfilingInfo(pmy_events, CL_PROFILING_COMMAND_COMPLETE, sizeof(cl_ulong), &param_command_complete, NULL);
/* cl_ulong is 64 bits wide; %lu would truncate it where unsigned long is 32 bits (e.g. Windows) */
printf("iteration: %d, QUEUED: %llu, SUBMIT: %llu, START: %llu, END: %llu, COMPLETE: %llu\n", i,
       (unsigned long long)param_command_queued, (unsigned long long)param_command_submit,
       (unsigned long long)param_command_start, (unsigned long long)param_command_end,
       (unsigned long long)param_command_complete);
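A note on printing these values: `cl_ulong` is a 64-bit unsigned integer, while `unsigned long` is only 32 bits on Windows, so the format specifier matters when judging whether the counters look reasonable. Below is a minimal sketch of a width-safe formatter using `PRIu64` from `<inttypes.h>`. The function name `format_profile_line` is hypothetical, and `uint64_t` stands in for `cl_ulong` so the example builds without the OpenCL headers.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format one line of profiling output into buf.
   PRIu64 expands to the correct printf conversion for uint64_t on the
   current platform, so values above 2^32 are not truncated.
   Returns the value of snprintf (characters that would be written). */
int format_profile_line(char *buf, size_t size, int iteration,
                        uint64_t queued, uint64_t submit,
                        uint64_t start, uint64_t end)
{
    return snprintf(buf, size,
                    "iteration: %d, QUEUED: %" PRIu64 ", SUBMIT: %" PRIu64
                    ", START: %" PRIu64 ", END: %" PRIu64,
                    iteration, queued, submit, start, end);
}
```

A caller would cast each `cl_ulong` to `uint64_t` before passing it in, then print `buf` with `puts` or write it to a log.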