Hi,
I would like to know whether the emulator takes multiple compute units into account. I do not have a physical board to work with, so if my kernel has more than one compute unit, how does the emulator handle it? Thanks for your help.
Can you please clarify what you mean by "multiple compute units"?
Yes, of course. I mean using more than one compute unit via the num_compute_units(N) attribute. That way, I should be able to execute different work-groups simultaneously. I do not know whether this is possible with the emulator. Thanks.
num_compute_units(N) is a basic compiler feature for NDRange kernels and it is fully supported in the emulator. The user does not need to change anything in the host or kernel code, other than adding the attribute to the kernel. The compiler automatically handles pipeline replication and the distribution of work-groups over the multiple compute units. Functionally, the emulator behaves exactly as the actual hardware does. However, you should not expect to see a speed-up in the emulator from this attribute, since the emulator is not timing-accurate.
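To illustrate the "same functional result, no speed-up" point, here is a minimal plain-C sketch of a vector-add NDRange whose work-groups are handed out to several compute units. It is only a conceptual model: the function names and the round-robin assignment are assumptions made for illustration, not the actual scheduling used by the compiler, the emulator, or the hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Work done by one work-group of a vector-add NDRange kernel:
 * each work-item adds one pair of elements. */
static void run_work_group(const float *a, const float *b, float *c,
                           size_t group_id, size_t local_size, size_t n)
{
    for (size_t lid = 0; lid < local_size; ++lid) {
        size_t gid = group_id * local_size + lid; /* global work-item index */
        if (gid < n)
            c[gid] = a[gid] + b[gid];
    }
}

/* Hypothetical model: work-groups are assigned to num_cus compute units
 * round-robin. The compute units are simulated one after another, so this
 * models functional behaviour only and says nothing about timing. */
void vector_add_model(const float *a, const float *b, float *c,
                      size_t n, size_t local_size, size_t num_cus)
{
    assert(local_size > 0 && num_cus > 0);
    size_t num_groups = (n + local_size - 1) / local_size;

    for (size_t cu = 0; cu < num_cus; ++cu)
        for (size_t g = cu; g < num_groups; g += num_cus)
            run_work_group(a, b, c, g, local_size, n);
}
```

Calling vector_add_model with num_cus set to 1 or to 3 fills c with the same values, because each work-group is executed exactly once and work-groups are independent of one another. Only the assignment of work-groups to compute units differs.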
OK, thanks. However, when I open report.html to see the details of the kernel, the compute unit section always shows 1 CU, even though I use the attribute to increase the number. I am referring to the Vector Add example, in which I used more than one work-group. Should this number change according to the number of CUs I request?
I think it should. However, the vector_add example only uses one work-group (there is no get_local_id() in the kernel), and hence there is no point in using num_compute_units. I guess that is why the number of CUs does not increase in the report. However, for some reason, the area utilization goes up when you increase the number of CUs, which means the compiler is changing the circuit.
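The reasoning about work-groups can be checked with simple arithmetic. The sketch below assumes, as in OpenCL 1.x, that the global size is a multiple of the local size. The helper names are hypothetical, and the code says nothing about what the compiler writes into report.html.

```c
#include <assert.h>
#include <stddef.h>

/* Number of work-groups in a 1-D NDRange. */
size_t work_group_count(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Upper bound on how many compute units can be busy at once:
 * a work-group runs on exactly one compute unit, so any compute
 * units beyond the number of work-groups have nothing to do. */
size_t usable_compute_units(size_t num_groups, size_t num_cus)
{
    return num_groups < num_cus ? num_groups : num_cus;
}
```

For example, a global size of 1024 with a local size of 256 gives 4 work-groups, which can keep up to 4 compute units busy. If the local size equals the global size, there is a single work-group and only one compute unit can do useful work, however many are instantiated.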
I ran the "Hello World" example on the emulator, and it prints information about the emulator itself. Judging by the results, the emulator seems to support only one compute unit: "CL_DEVICE_MAX_COMPUTE_UNITS=1". What do you think? Thanks for your help. https://www.alteraforum.com/forum/attachment.php?attachmentid=14900
The CL_DEVICE_MAX_COMPUTE_UNITS value reported by the emulator or the FPGA itself is always "1", regardless of how many compute units you have in your kernel. This value depends on the characteristics of the OpenCL device (and not the kernel running on it) and it has no meaning in the particular case of FPGAs, since these devices do not have a fixed architecture.