(The following is based on some recent experiments on a GEN8 IGP.)
FYI -- one gotcha to watch out for when porting from CUDA to Intel IGP is that the OpenCL barrier()/work_group_barrier() operation doesn't support either work items or subgroups exiting early.
For example, if a subgroup returns early and the remaining work items in the work group then synchronize in a barrier(), your kernel is going to hang on the IGP.
Early exit of some threads (work items) at the end of a grid is a pretty common use case in CUDA.
Fortunately, OpenCL 2.0 has a feature that doesn't exist in CUDA and that might help you work around this issue: non-uniform work-groups.
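To make the idea concrete, here is a minimal sketch, not a tested port. It assumes the program is built with `-cl-std=CL2.0` (and without `-cl-uniform-work-group-size`), so the global size can be exactly the element count instead of being rounded up CUDA-style. The kernel name `reverse_in_group` and the helper functions are hypothetical names made up for illustration. Because no work item falls past the end of the data, nothing has to return before the barrier.

```c
#include <stddef.h>

/* Hypothetical OpenCL C kernel: reverses the elements inside each work group.
 * It assumes global size == element count, so there is no bounds check and
 * no early return; every work item reaches the barrier. */
const char *reverse_in_group_src =
    "__kernel void reverse_in_group(__global const float *in,\n"
    "                               __global float *out,\n"
    "                               __local float *tile)\n"
    "{\n"
    "    size_t gid = get_global_id(0);\n"
    "    size_t lid = get_local_id(0);\n"
    "    /* Actual group size; smaller than the enqueued size in the last group. */\n"
    "    size_t lsz = get_local_size(0);\n"
    "    tile[lid] = in[gid];\n"
    "    barrier(CLK_LOCAL_MEM_FENCE);\n"
    "    out[gid] = tile[lsz - 1 - lid];\n"
    "}\n";

/* CUDA-style launch: global size rounded up to a multiple of the local size. */
size_t padded_global_size(size_t n, size_t local)
{
    return (n + local - 1) / local * local;
}

/* Surplus work items in the padded launch; these are the ones that would
 * normally return early and so never reach the barrier. */
size_t surplus_items(size_t n, size_t local)
{
    return padded_global_size(n, local) - n;
}

/* Non-uniform launch (global size == n): size of the last work group. */
size_t last_group_size(size_t n, size_t local)
{
    size_t rem = n % local;
    return rem ? rem : local;
}
```

For 1000 elements and a local size of 256, the padded launch has 24 surplus work items, while the non-uniform launch simply ends with a smaller work group of 232 items. On the host side you would pass the unpadded element count as the global work size to `clEnqueueNDRangeKernel`; under OpenCL 1.x rules that call would be rejected when the global size is not a multiple of the local size.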
1 Reply
Thanks for this report. We will see how to get updates on this topic into the documentation.