Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™

Question about the System viewer in the OpenCL HTML report

zjinf

Quartus Prime Pro 17.1

Board: a10gx development board

Ubuntu 16.04

 

The OpenCL kernel ran successfully on my PC, but some things in the *.html report are unclear to me, so I have uploaded the report with my questions marked in the PNG screenshots. Can anyone also recommend a document that explains the System viewer in more detail?

(BTW, the AOCL Best Practices Guide only shows that the feature exists; it does not explain in detail how it relates to kernel cycles.)

 

Any help would be appreciated!

 

2019-01-17_150929___1.png

2019-01-17_151814__3.png

2019-01-17_151505___2.png

2019-01-17_152950__4.png

HRZ

This information is not explained properly (or at all) in Intel's documents. The following is MY understanding of what it means, but it is NOT necessarily correct:

 

  • The "latency" of each block is the depth of the pipeline generated for that block. It is NOT the time it takes to execute the block, because that time also depends on the initiation interval (II), the loop trip count (which is not necessarily known at compile time) and possible run-time stalls. For loads/stores, the latency is the depth of the pipeline the compiler generates to absorb stalls from these accesses; this is basically the minimum latency of these operations. If an operation takes longer at run time, the pipeline stalls. Pipeline depth/latency is not directly controllable by the user, but it can be reduced by simplifying the loop body/operation.
  • I think the "starting cycle" of an operation is the minimum latency from the start of kernel execution until that specific operation is reached. This corresponds to the case where all loop trip counts are one and no stalls occur on memory/channel accesses.
  • Latency of local memory accesses is obviously less than global memory accesses.
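
To make the difference between a block's "latency" and its actual execution time concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch of the usual pipelined-loop estimate (pipeline depth, plus II cycles for every iteration after the first, plus any stalls), not something taken from Intel's documentation. The function name `estimated_cycles` and its parameters are made up for illustration, and the stall count is something you would have to measure or guess, since the report does not give it.

```python
def estimated_cycles(latency, ii, trip_count, stall_cycles=0):
    """Rough cycle estimate for one pipelined loop/block.

    latency      -- pipeline depth reported for the block (cycles)
    ii           -- initiation interval (cycles between successive iterations)
    trip_count   -- number of loop iterations (often only known at run time)
    stall_cycles -- assumed extra cycles lost to memory/channel stalls
    """
    if trip_count <= 0:
        return 0
    # The first iteration needs `latency` cycles to drain through the pipeline;
    # each further iteration enters `ii` cycles after the previous one.
    return latency + ii * (trip_count - 1) + stall_cycles


# With a trip count of one and no stalls, the estimate is just the latency,
# which is the situation the report's numbers seem to describe.
print(estimated_cycles(latency=100, ii=1, trip_count=1))     # 100

# With many iterations, the trip count and II dominate, not the latency.
print(estimated_cycles(latency=100, ii=1, trip_count=1000))  # 1099
print(estimated_cycles(latency=100, ii=2, trip_count=1000))  # 2098
```

In other words, halving the reported latency of a block barely changes the run time of a long loop, while going from an II of 2 to an II of 1 roughly halves it.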