I am seeing unexpected behavior when using clWaitForEvents or clFinish. My host code is structured as follows during kernel launch:

for (i = 0; i < 10; i++) {
    // Set kernel args
    ... .. ..
    clEnqueueNDRangeKernel(queue1, kernel1, ....., &event1);
    clEnqueueTask(queue2, kernel2, ............ &event2);
    clEnqueueTask(queue3, kernel3, .......... &event3);

    start = time();
    clFinish(queue3);    // or clWaitForEvents(..., event3);
    end = time();
    run_time = end - start;

    event1_time += getKernelStartEndTime(event1);
    event2_time += getKernelStartEndTime(event2);
    event3_time += getKernelStartEndTime(event3);

    // Release events
    ... .... ...
}

In every iteration, the measured run time (8 ms) is about twice event3_time (4 ms). I tried using clFinish() as recommended in https://www.alteraforum.com/forum/showthread.php?t=56633, but I get the same results. This prevents me from launching the kernels of the next iteration right after the previous iteration finishes: there is a large time gap between the kernel launches of one iteration and the next. Is this much overhead common? Am I missing anything when launching the kernels?
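One way to narrow down where the extra time goes is to compare the host-side wait time against the event's own profiling timestamps. The following is only a minimal sketch of that arithmetic. It assumes the four timestamps (in nanoseconds) have already been read with clGetEventProfilingInfo using CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END. The struct and function names are hypothetical, and the OpenCL calls themselves are left out so the calculation stands on its own.

```c
#include <stdint.h>

/* Hypothetical container for the four profiling timestamps of one event,
   in nanoseconds, as reported by clGetEventProfilingInfo. */
typedef struct {
    uint64_t queued_ns;  /* CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit_ns;  /* CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start_ns;   /* CL_PROFILING_COMMAND_START  */
    uint64_t end_ns;     /* CL_PROFILING_COMMAND_END    */
} event_times;

/* Kernel execution time (START to END) in milliseconds. */
double exec_ms(const event_times *t)
{
    return (double)(t->end_ns - t->start_ns) / 1.0e6;
}

/* Time the command spent enqueued and submitted before it started
   executing (QUEUED to START), in milliseconds. */
double pre_start_ms(const event_times *t)
{
    return (double)(t->start_ns - t->queued_ns) / 1.0e6;
}

/* Rough host-side overhead: the wall-clock time spent waiting minus the
   time the kernel actually executed. */
double host_overhead_ms(double host_wait_ms, double kernel_ms)
{
    return host_wait_ms - kernel_ms;
}
```

With the numbers from the post, host_overhead_ms(8.0, 4.0) gives 4 ms of unexplained time per iteration. If pre_start_ms accounts for most of it, the delay is before the kernel starts; otherwise it is after the kernel ends and before the wait call returns.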
1 Reply
Mistake on my side: I had not removed the -profile option when compiling.