Hi All,
We have implemented four kernels in one .cl file and are trying to optimize them. The AOCL Best Practices Guide suggests using the restrict keyword on pointer arguments whenever possible, so we added it to all four kernels. However, resource utilization increased from 58% to 135%. If we instead use restrict on only one kernel, performance improves: kernel execution time drops from 98 ms to 50 ms. Is there any alternative to the restrict keyword?
Thanks
3 Replies
restrict should not increase resource usage that much, unless the kernel runs fully sequentially without restrict and becomes pipelined with a high II (initiation interval) once restrict is added. In that case the pipelined version needs extra resources to buffer data and accommodate the high II. Can you post your kernel area report before and after adding restrict?
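For readers unsure what restrict actually promises the compiler, here is a minimal sketch in plain C99 rather than OpenCL. It is not the poster's kernel; the function names `scale_may_alias` and `scale_no_alias` are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Without restrict: the compiler must assume dst and src may overlap,
   so a store to dst[i] could change a value that a later iteration
   loads from src. That possible dependency limits reordering. */
void scale_may_alias(float *dst, const float *src, size_t n, float k)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * k;
}

/* With restrict: the caller promises that dst and src never overlap,
   so the compiler is free to treat loads and stores as independent. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    size_t n, float k)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * k;
}
```

Both functions return the same results when the buffers are distinct. Calling the restrict version with overlapping buffers is undefined behavior, so the qualifier is only safe when the no-overlap promise really holds. In an OpenCL kernel the same qualifier goes on the pointer arguments, and removing the assumed dependency is what lets the compiler pipeline the loop.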
Please find below the resource utilisation numbers with and without restrict.

With restrict:
Family : Arria 10
Device : 10AX115H3F34E2SG
Timing Models : Final
Logic utilization (in ALMs) : 258,918 / 427,200 ( 61 % )
Total registers : 456008
Total pins : 155 / 618 ( 25 % )
Total virtual pins : 0
Total block memory bits : 19,882,664 / 55,562,240 ( 36 % )
Total RAM Blocks : 2,312 / 2,713 ( 85 % )
Total DSP Blocks : 192 / 1,518 ( 13 % )
Total HSSI RX channels : 4 / 24 ( 17 % )
Total HSSI TX channels : 4 / 24 ( 17 % )
Total PLLs : 30 / 80 ( 38 % )

Without restrict:
Family : Arria 10
Device : 10AX115H3F34E2SG
Timing Models : Final
Logic utilization (in ALMs) : 249,039 / 427,200 ( 58 % )
Total registers : 453705
Total pins : 155 / 618 ( 25 % )
Total virtual pins : 0
Total block memory bits : 12,301,448 / 55,562,240 ( 22 % )
Total RAM Blocks : 1,755 / 2,713 ( 65 % )
Total DSP Blocks : 192 / 1,518 ( 13 % )
Total HSSI RX channels : 4 / 24 ( 17 % )
Total HSSI TX channels : 4 / 24 ( 17 % )
Total PLLs : 30 / 80 ( 38 % )
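To compare the two fitter summaries at a glance, the percentages can be recomputed from the raw counts. The sketch below is only an arithmetic aid: the helper `percent_rounded` is hypothetical, and rounding to the nearest whole percent is an assumption that happens to reproduce the posted figures.

```c
/* Utilization as a whole percent, rounded to nearest, using integer
   arithmetic only. Assumes total > 0 and used >= 0. */
long long percent_rounded(long long used, long long total)
{
    return (100 * used + total / 2) / total;
}

/* Change in utilization, in percentage points, between two builds
   measured against the same device total. */
long long percent_point_delta(long long used_after, long long used_before,
                              long long total)
{
    return percent_rounded(used_after, total)
         - percent_rounded(used_before, total);
}
```

With the posted counts, logic utilization differs by about 3 percentage points (249,039 vs. 258,918 of 427,200 ALMs), while RAM blocks differ by about 20 points (1,755 vs. 2,312 of 2,713).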
I was talking about the compiler's resource estimation report, which includes the loop analysis and line-by-line estimated area usage; that is the report showing the estimated area usage increasing from 58% to 135%.
In any case, the post-place-and-route area utilization is not much different in your case.