Hi, I'm new to using Intel's OpenCL SDK for FPGAs.
I'm developing a single work-item kernel and experimenting with code changes, and I would like to see the effect of these changes on my total kernel pipeline latency (in clock cycles). As recommended in the programming guide, I use: aoc -c mykernel.cl -report
This generates a report.html. Under the System Viewer I can see the latency of each block of code and the loop initiation intervals. However, if the workload I want to run is known at kernel compile time, I would like to estimate the total number of clock cycles needed to complete a given task (for example, I hardcode the number of iterations of a loop, so its trip count is known at compile time). I would really appreciate it if anyone could share their knowledge about this, or suggest how I can refine my question. Thanks :D
5 Replies
For a pipeline depth of P, a loop trip count of L, and an initiation interval of I, the execution time of the pipeline in clock cycles is:
P + (L - 1) * I
You can calculate the execution time of each block separately using this formula and add them up to get the total number of cycles. Dividing that by the operating frequency gives the total run time in seconds. Of course, this does not take stalls from channel and global memory operations into account; it is not easy to model such stalls.
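As a rough illustration of the per-block sum described above, here is a minimal Python sketch. The block names, depths, trip counts, initiation intervals, and the 250 MHz clock are made-up placeholder values, not numbers from any real report.html; you would copy the actual figures by hand from the System Viewer. Like the formula itself, it ignores stalls.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Block:
    """One pipelined block, with values read manually from the report."""
    name: str
    depth: int           # pipeline depth P, in cycles
    trip_count: int = 1  # loop trip count L (1 for straight-line code)
    ii: int = 1          # initiation interval I


def block_cycles(block: Block) -> int:
    """Stall-free cycle count of one block: P + (L - 1) * I."""
    return block.depth + (block.trip_count - 1) * block.ii


def total_cycles(blocks: Iterable[Block]) -> int:
    """Sum the per-block estimates, assuming blocks run one after another."""
    return sum(block_cycles(b) for b in blocks)


def runtime_seconds(cycles: int, frequency_hz: float) -> float:
    """Convert a cycle count to seconds at the given operating frequency."""
    return cycles / frequency_hz


# Hypothetical example values (assumptions, not taken from a real design).
blocks = [
    Block("init", depth=12),
    Block("main_loop", depth=150, trip_count=1024, ii=2),
    Block("writeback", depth=40, trip_count=256, ii=1),
]

cycles = total_cycles(blocks)
print(cycles)                          # 2503
print(runtime_seconds(cycles, 250e6))  # roughly 1.0e-05 seconds
```

The result is only a lower bound on the real run time, since channel and global memory stalls add cycles that this model does not capture.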
Hi @HRZ,
Really appreciate your quick response! Thank you for the tip. I was hoping there was some way to do this automatically with the offline compiler / Quartus toolset, as it gets really tricky for complex models (as you said) :(
@HRZ really appreciate your quick response!
I was kind of hoping there was a way to extract this from the offline compiler, the generated files, or the Quartus project, as this gets tricky for complex designs with stalls (as you said) :(
Altera does not provide an automated way of doing this, but you can automate it yourself by combining static/dynamic code analysis with a custom program that parses the information from the report.