Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)

Using the restrict keyword

Altera_Forum
Honored Contributor II
878 Views

Hi All, 

We have implemented four kernels in one .cl file and are trying to optimize them. The AOCL Best Practices Guide suggests using the restrict keyword on pointer arguments whenever possible, so we applied it in all four kernels. However, resource utilization increased from 58% to 135%. If we instead apply it to only one kernel, performance improves (kernel execution time decreases from 98 ms to 50 ms). Is there an alternative to the restrict keyword?

 

Thanks
3 Replies
Altera_Forum
Honored Contributor II
88 Views

restrict should not increase resource usage that much, unless the kernel runs fully sequentially without restrict but becomes pipelined with a high initiation interval (II) with restrict. In that case, extra resources are needed to buffer data and accommodate the high II. Can you post your kernel area report before and after adding restrict?
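
For readers following along, here is a minimal sketch of what restrict promises to the compiler. It is written in plain C99 rather than as an OpenCL kernel so it can be compiled on a host; in a .cl file the same qualifier would appear on the kernel's pointer arguments (for example `__global const float * restrict a`). The function names `scale_add_plain`, `scale_add_restrict` and `restrict_demo` are hypothetical and are not taken from the original poster's kernels.

```c
#include <assert.h>
#include <stddef.h>

/* Plain pointers: the compiler must assume `out` may overlap `a` or `b`,
   so a store to out[i] may have to complete before later loads are issued. */
void scale_add_plain(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = 2.0f * a[i] + b[i];
}

/* restrict-qualified pointers: the caller promises that the three buffers
   do not overlap, so loads and stores from different iterations are free
   to be reordered or overlapped. */
void scale_add_restrict(const float *restrict a, const float *restrict b,
                        float *restrict out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = 2.0f * a[i] + b[i];
}

/* Hypothetical self-check: with non-overlapping buffers, both versions
   must produce identical results. Returns 0 on success. */
int restrict_demo(void)
{
    float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float b[4] = {0.5f, 0.5f, 0.5f, 0.5f};
    float r_plain[4];
    float r_restrict[4];

    scale_add_plain(a, b, r_plain, 4);
    scale_add_restrict(a, b, r_restrict, 4);

    for (size_t i = 0; i < 4; ++i)
        assert(r_plain[i] == r_restrict[i]);
    return 0;
}
```

The results are the same either way; restrict only changes what the compiler is allowed to assume. Passing overlapping buffers to the restrict version would be undefined behavior, so the qualifier should only be added where the buffers really are distinct.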

Altera_Forum
Honored Contributor II
88 Views

Please find below the resource utilization numbers with and without restrict.

 

With Restrict

Family : Arria 10
Device : 10AX115H3F34E2SG
Timing Models : Final
Logic utilization (in ALMs) : 258,918 / 427,200 ( 61 % )
Total registers : 456008
Total pins : 155 / 618 ( 25 % )
Total virtual pins : 0
Total block memory bits : 19,882,664 / 55,562,240 ( 36 % )
Total RAM Blocks : 2,312 / 2,713 ( 85 % )
Total DSP Blocks : 192 / 1,518 ( 13 % )
Total HSSI RX channels : 4 / 24 ( 17 % )
Total HSSI TX channels : 4 / 24 ( 17 % )
Total PLLs : 30 / 80 ( 38 % )

 

 

Without Restrict

Family : Arria 10
Device : 10AX115H3F34E2SG
Timing Models : Final
Logic utilization (in ALMs) : 249,039 / 427,200 ( 58 % )
Total registers : 453705
Total pins : 155 / 618 ( 25 % )
Total virtual pins : 0
Total block memory bits : 12,301,448 / 55,562,240 ( 22 % )
Total RAM Blocks : 1,755 / 2,713 ( 65 % )
Total DSP Blocks : 192 / 1,518 ( 13 % )
Total HSSI RX channels : 4 / 24 ( 17 % )
Total HSSI TX channels : 4 / 24 ( 17 % )
Total PLLs : 30 / 80 ( 38 % )

Altera_Forum
Honored Contributor II
88 Views

I was talking about the compiler's resource estimation report, which includes the loop analysis and the line-by-line estimated area usage. That is the report showing the estimated area usage going from 58% to 135%.

The post-place-and-route area utilization is not much different in your case anyway.
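
To put numbers on the comparison, the relative change per resource can be computed directly from the two fitter summaries posted earlier in the thread. This is only a sketch: `pct_change` and `check_deltas` are hypothetical helper names, and the figures are copied from the two reports.

```c
#include <assert.h>

/* Relative change, in percent, going from `before` to `after`. */
double pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}

/* Hypothetical check using the posted figures
   (without restrict -> with restrict). Returns 0 on success. */
int check_deltas(void)
{
    double alms      = pct_change(249039.0, 258918.0);     /* logic (ALMs)      */
    double ram_blks  = pct_change(1755.0, 2312.0);         /* RAM blocks        */
    double mem_bits  = pct_change(12301448.0, 19882664.0); /* block memory bits */

    assert(alms > 3.0 && alms < 5.0);          /* logic barely moves       */
    assert(ram_blks > 30.0 && ram_blks < 33.0);
    assert(mem_bits > 60.0 && mem_bits < 63.0);
    return 0;
}
```

On these figures, ALM usage rises by about 4%, while RAM blocks rise by about 32% and block memory bits by about 62%. Most of the difference is therefore in on-chip memory rather than logic, which may be consistent with the extra buffering mentioned in the first reply, although the area report would be needed to confirm that.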