I know that I can use the Early Power Estimator or the Power Analyzer to analyze the power consumption of RTL code in Quartus.
My question is: I now have OpenCL code, so how do I get its power consumption? When I run "aoc xx.cl -o xx.aocx", it automatically generates RTL code, but that code is very complex and includes DDR-related files, components, libraries, and other Verilog files. Can I use Quartus to analyze the automatically generated RTL code? I tried adding those RTL files to Quartus, but that produced a lot of errors. Can anyone give me guidance or an example of how to correctly analyze the RTL files generated from OpenCL code? Thank you so much!
What board are you using? Most Arria 10 boards provide on-board power sensors, and depending on the board, you might be able to read the power sensor from your OpenCL host code alongside kernel execution to measure power consumption accurately.
Well, that is a problem. If you were using Bittware or Nallatech boards, I could give you code to read the power sensor on those boards right away. Intel's reference board also apparently has a GUI application that reports the board power consumption (but I have not used it myself). However, as far as I know, there is no easy way to read power consumption on the Terasic board. Even though their board also has a power sensor, it is not ready to use. You would need to add the necessary logic for reading the sensor in HDL yourself, and integrating all that into OpenCL will be difficult. Based on their documentation, they have a NIOS demo that can read the board power consumption, but again, that is useless in the context of OpenCL. You can try contacting Terasic's support to see if they provide a library or something similar for reading the power sensor from a C application.
I had success with running quartus_pow on placed-and-routed OpenCL designs and getting estimated power consumption, but that was on Stratix V. I tried it once long ago on Arria 10 but it didn't work.
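For anyone who wants to script this step, here is a minimal sketch in Python of launching quartus_pow on an already placed-and-routed project. The general form "quartus_pow <project> [-c <revision>]" is the standard Quartus command-line usage, but the project name "top" and the build directory "xx" in the usage note are assumptions for illustration only; check which .qpf file your own aoc output folder actually contains.

```python
import shutil
import subprocess


def build_quartus_pow_cmd(project, revision=None):
    """Return the argument list for one quartus_pow run."""
    cmd = ["quartus_pow", project]
    if revision is not None:
        # -c selects a specific revision of the project
        cmd += ["-c", revision]
    return cmd


def run_power_analysis(build_dir, project, revision=None):
    """Run quartus_pow inside build_dir and return its console output."""
    if shutil.which("quartus_pow") is None:
        raise RuntimeError("quartus_pow was not found on PATH")
    done = subprocess.run(
        build_quartus_pow_cmd(project, revision),
        cwd=build_dir,          # folder that holds the compiled project
        capture_output=True,
        text=True,
        check=False,            # inspect the log even if the tool reports errors
    )
    return done.stdout
```

Something like run_power_analysis("xx", "top") would then return the console log as a string, with "xx" and "top" being hypothetical names.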
Thank you for your reply, HRZ. Is it possible to use the automatically generated RTL code to analyze the power? That RTL code is very complex, and I don't know how to handle it in Quartus. Thanks.
Oh my god, HRZ! I followed your comment and tried "quartus_pow", and I got results! It took more than 8 minutes.
Warning (222013): Relative toggle rates could not be calculated because no clock domain could be identified for some nodes
Info (223001): Completed Vectorless Power Activity Estimation
Info (218000): Using Advanced I/O Power to simulate I/O buffers with the specified board trace model
Info (215049): Average toggle rate for this design is 25.123 millions of transitions / sec
Info (215031): Total thermal power estimate for the design is 27942.52 mW
Info: Quartus Prime Power Analyzer was successful. 0 errors, 96 warning
Info: Peak virtual memory: 14778 megabytes
Info: Processing ended: Sat Oct 20 01:52:26 2018
Info: Elapsed time: 00:08:18
Info: Total CPU time (on all processors): 00:27:38
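If you want to pull the headline numbers out of a log like the one above automatically, a small parser is enough. This is only a sketch: the two regular expressions are written against the message wording shown in this particular log (messages 215049 and 215031), and the function and key names are made up for illustration. Other Quartus versions may word these messages differently.

```python
import re

# Patterns follow the wording of the log lines shown above.
TOTAL_POWER_RE = re.compile(
    r"Total thermal power estimate for the design is\s+([0-9]+(?:\.[0-9]+)?)\s*mW"
)
TOGGLE_RATE_RE = re.compile(
    r"Average toggle rate for this design is\s+([0-9]+(?:\.[0-9]+)?)\s+millions"
)


def parse_power_log(text):
    """Extract total thermal power and average toggle rate from log text.

    Returns a dict; a key is omitted if its message is not in the log.
    """
    result = {}
    power = TOTAL_POWER_RE.search(text)
    if power:
        milliwatts = float(power.group(1))
        result["total_thermal_mw"] = milliwatts
        result["total_thermal_w"] = milliwatts / 1000.0
    toggle = TOGGLE_RATE_RE.search(text)
    if toggle:
        result["avg_toggle_mtps"] = float(toggle.group(1))
    return result
```

Feeding it the text of the log gives the total in both mW and W, which saves converting by hand when comparing several builds.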
Do you think this value (27.9 W) is reasonable? It seems a little high, doesn't it? Also, it only shows the total power. How do I get the separate values, such as dynamic, static, memory, and I/O power? Many thanks.
I am glad it worked for you on Arria 10; at least it is better than nothing. If you sort the OpenCL output folder by date, you should find a file named "*.pow.summary", which includes a breakdown of static, dynamic and I/O power. You should also be able to find the full power report.
27.9 Watts is not very high for Arria 10, but that largely depends on the size, activity and operating frequency of your design. I have pushed close to 70 Watts on Arria 10 with large designs running above 300 MHz. You should note that this power estimation is highly inaccurate, since it does not consider actual signal activity (it assumes a default 12.5% switching rate) and only includes the FPGA power, while there are a lot of other components on the board, especially the DDR memory, that add quite a bit to the power consumption.
The report includes I/O power, transceiver power, static power and dynamic power. The I/O power and transceiver power are each around 6 W, static power is ~6 W, and dynamic power is ~9 W. In my understanding, the I/O power is caused by data transfer between the FPGA and DDR3 (DDR3 is the memory on the Arria 10 board). How should I understand transceiver power? I don't know exactly what it means or what causes it. In my OpenCL code, I defined the inputs and output as (__global int * input_a, __global int * input_b, __global int *output_c).
Thank you so much for your help.
All data from high-speed peripherals outside of the FPGA needs to go through the high-speed transceivers inside the FPGA so that data width and signal frequency are adjusted to values that can be processed by the controllers inside the FPGA; this includes PCI-E, DDR memory, network ports, etc. I don't think you will be able to find any meaningful mapping between the way you write your kernel code and the breakdown of the power consumption.