Intel® Quartus® Prime Software

Compute units on emulator

Altera_Forum
Honored Contributor II
1,441 Views

Hi, 

I would like to know whether the emulator takes multiple compute units into account. I do not have a physical board to work with, so if my kernel had more than one compute unit, how would the emulator handle it? Thanks for your help.
7 Replies
Altera_Forum

Can you please clarify what you mean by "multiple compute units"?

Altera_Forum

Yes, of course. I mean using more than one compute unit via the kernel attribute "num_compute_units(N)". That way, I should be able to process different work-groups simultaneously. I do not know whether this is possible with the emulator. Thanks

Altera_Forum

num_compute_units(N) is a basic compiler feature for NDRange kernels, and it is fully supported in the emulator. The user does not need to change anything in the host or kernel code other than adding the attribute to the kernel. The compiler automatically handles pipeline replication and the distribution of work-groups over the compute units. Functionally, the emulator behaves exactly as the actual hardware does. However, you should not expect a speed-up in the emulator from this attribute, since the emulator is not timing-accurate.
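As a rough illustration of "adding the associated attribute", here is a minimal sketch of a kernel that requests two compute units. It is not the shipped vector_add example: the kernel name vector_add_2cu, the KERNEL_ATTRS macro and the value 2 are assumptions chosen for illustration. The block under #ifndef is only a hypothetical host-side shim so that the same file also compiles as plain C for a quick functional check; an OpenCL compiler defines __OPENCL_VERSION__ and skips it.

```c
#ifndef __OPENCL_VERSION__
/* Host-side shim (assumption, plain-C testing only): strips the OpenCL
   qualifiers and fakes get_global_id() with a variable set by the caller. */
#include <stddef.h>
#define __kernel
#define __global
#define KERNEL_ATTRS
static size_t fake_global_id;
static size_t get_global_id(unsigned dim) { (void)dim; return fake_global_id; }
#else
/* OpenCL build: ask the compiler to replicate the kernel pipeline
   into 2 compute units. The host code stays unchanged. */
#define KERNEL_ATTRS __attribute__((num_compute_units(2)))
#endif

KERNEL_ATTRS
__kernel void vector_add_2cu(__global const float *a,
                             __global const float *b,
                             __global float *c)
{
    /* One work-item per element; work-groups are spread over the
       compute units by the generated hardware, not by this code. */
    size_t i = get_global_id(0);
    c[i] = a[i] + b[i];
}
```

The kernel body is identical with or without the attribute, which is why the emulator produces the same results either way.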

Altera_Forum

OK, thanks. However, when I open report.html to see the details of the kernel, the compute unit section always shows 1 CU, even though I use the attribute to increase the number. I am referring to the Vector Add example, in which I used more than one work-group. Should this number change according to the number of CUs I request?

Altera_Forum

I think it should. However, the vector_add example only uses one work-group (there is no get_local_id() in the kernel), and hence there is no point in using num_compute_units. I guess that is the reason why the number of CUs does not increase in the report. Still, for some reason, the area utilization goes up when you increase the number of CUs, which means the compiler is changing the circuit.

Altera_Forum

I ran the "Hello World" example on the emulator, and it prints information about the emulator itself. Judging by the results, the emulator seems to support only one compute unit: "CL_DEVICE_MAX_COMPUTE_UNITS=1". What do you think? Thanks for your help. https://www.alteraforum.com/forum/attachment.php?attachmentid=14900

Altera_Forum

The CL_DEVICE_MAX_COMPUTE_UNITS value reported by the emulator or the FPGA itself is always "1", regardless of how many compute units you have in your kernel. This value depends on the characteristics of the OpenCL device (and not the kernel running on it) and it has no meaning in the particular case of FPGAs, since these devices do not have a fixed architecture.
