Hi everyone:
Is there a way to figure out the work-group-to-EU mapping? More specifically, are there any registers on the EU that contain the EU ID and that a kernel can read during execution?
Thanks!
Dong
As far as I know, there is no way to do this in OpenCL. While there are many system queries you can make during initialization to learn about the type of system the code is running on, I have not heard of any way for a kernel to learn the ID of the specific EU it is running on.
What is your final goal? Perhaps if we knew more about that we could provide more options.
Hi Jeffrey:
Thanks for your prompt reply.
My final goal is to figure out how the hardware scheduler assigns work groups to EUs. If the kernel can read the EU ID during execution, that will probably solve my problem.
This afternoon I found that the SR0 register contains the EU ID. If I can get the address of that register and embed inline assembly in the kernel with __asm__(), that might do it. I'm not sure yet; I will try to find the address of SR0 and see whether this works.
Thanks! Other options and thoughts about this are welcome!
Dong
Knowing the EU and hardware thread IDs would be a useful feature. It's probably well outside the spec for current revisions of OpenCL, but might it be an easy intel_cl extension?
Note that this can be done in CUDA by reading the %smid special register through inline PTX (but the IDs aren't always sequential).
My observations so far lead me to believe that a Gen workgroup has its subgroups spread across the EUs in a subslice.
That is, if you have a workgroup with 8 subgroups on a Gen8+ IGP then one subgroup will be assigned to each EU in the subslice.
Your mileage may vary.
If you just want controlled scheduling of work onto a partial subslice, then a workaround for what you're asking for might be to implement your kernel with a "grid-stride loop". Note that for some strange reason, a subslice has more total EU threads than you can possibly cover with a single workgroup so you may have to use two workgroups to fully cover a subslice.
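For anyone unfamiliar with the term, here is a minimal sketch of the grid-stride pattern, written as plain C that emulates the work-items sequentially so it can be run without a GPU. The function names `grid_stride_item` and `launch` and the doubling operation are hypothetical, chosen only for illustration. In a real OpenCL kernel, `gid` would come from `get_global_id(0)` and `stride` from `get_global_size(0)`.

```c
#include <stddef.h>

/* Emulates the body of ONE work-item running a grid-stride loop.
 * Assumption: in an OpenCL kernel, gid = get_global_id(0) and
 * stride = get_global_size(0). The doubling is a placeholder workload. */
static void grid_stride_item(size_t gid, size_t stride, size_t n,
                             const float *in, float *out)
{
    /* Each work-item starts at its own ID and jumps ahead by the total
     * number of work-items, so a fixed-size launch covers any n. */
    for (size_t i = gid; i < n; i += stride) {
        out[i] = 2.0f * in[i];
    }
}

/* Hypothetical stand-in for a kernel launch: runs every work-item
 * one after another on the CPU. */
static void launch(size_t global_size, size_t n,
                   const float *in, float *out)
{
    for (size_t gid = 0; gid < global_size; ++gid) {
        grid_stride_item(gid, global_size, n, in, out);
    }
}
```

Because the launch size is decoupled from the problem size, you can pick the number of workgroups (one, two, or however many you want resident) and each work-item simply loops until the data is exhausted.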