
Work-group size and logic utilization

Altera_Forum

Hi, 

 

I am currently experimenting with a matrix-XOR kernel (similar to the Altera matrix-multiplication example, with the multiplication replaced by a bitwise exclusive-or). The loop in the kernel is fully unrolled. I find that the work-group size setting has a tremendous effect on the logic utilization report.

 

For example, if the work-group size is set to (64, 64, 1), the logic utilization shown in the report is 16%. When the work-group size is (128, 128, 1), the logic utilization is 46%, which is easy to understand, since more bitwise exclusive-or operations are done in the fully unrolled loop. However, when I change the work-group size to (80, 80, 1), the logic utilization increases to 123%, which I cannot understand.
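To put the three work-group sizes side by side, here is a minimal Python sketch of the arithmetic only. It does not model the compiler: the helper names (`is_power_of_two`, `next_power_of_two`, `describe`) are made up for illustration, and the idea that a non-power-of-two dimension such as 80 may be handled as if it were padded to the next power of two is an assumption, not something the utilization report confirms.

```python
def is_power_of_two(n: int) -> bool:
    """True when n is a positive power of two (1, 2, 4, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def describe(dim: int) -> dict:
    """Summarize a square (dim, dim, 1) work-group.

    'padded_dim' is only an illustration of the ASSUMED padding to a
    power of two; it is not taken from any compiler report.
    """
    return {
        "dim": dim,
        "work_items": dim * dim,             # work-items per work-group
        "power_of_two": is_power_of_two(dim),
        "padded_dim": next_power_of_two(dim),
    }


# The three sizes mentioned in the post.
for d in (64, 80, 128):
    print(describe(d))
```

Only (80, 80, 1) is not a power of two, and its next power of two is 128, so it is at least plausible that it costs as much addressing logic as the 128 case plus extra logic for the irregular index arithmetic. Checking the detailed area report for the kernel would be the way to confirm or rule this out.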

 

Can anyone offer suggestions or an explanation for this behavior? Does it mean the compiler prefers work-group sizes that are powers of 2?

 

Thanks.
Altera_Forum

My guess is that the optimizer fails to do a good job with a size of (80, 80). Can the problem be restructured to use powers of two? Have you tried implementing it as a single work-item kernel? Those tend to be more efficient, and the compiler's behavior is more predictable.

Altera_Forum

 


 

 

Thanks for the reply. What I actually want to know is whether it is OK to build a local memory (with the same dimensions as the work-group) whose size is not a power of two. For example, with SIMD set to 8 for the matrix-XOR kernel, a 128 * 128 local memory per work-group uses more than 100% of the memory blocks on the FPGA. So I want to know whether it is possible to use an 80 * 80 local memory while keeping SIMD at 8, so that more of the FPGA's memory blocks are used (but still less than 100%).
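As a rough back-of-the-envelope check of this idea, the Python sketch below compares the raw size of one local tile for the two dimensions. Several things here are assumptions rather than facts from this thread: 4-byte elements, a compiler that rounds each local buffer up to the next power of two in bytes, and the helper names themselves (`tile_bytes`, `rounded_tile_bytes`, `simd_divides`). It also ignores any replication of the memory that the compiler may add to serve the SIMD lanes and the unrolled loop, which can dominate the real block count.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def tile_bytes(dim: int, elem_bytes: int = 4) -> int:
    """Raw size of one dim x dim local tile. elem_bytes=4 is an ASSUMPTION."""
    return dim * dim * elem_bytes


def rounded_tile_bytes(dim: int, elem_bytes: int = 4) -> int:
    """Tile size if the buffer is rounded up to a power of two (ASSUMED behavior)."""
    return next_power_of_two(tile_bytes(dim, elem_bytes))


def simd_divides(dim: int, simd: int) -> bool:
    """Check that the SIMD width evenly divides the work-group dimension."""
    return dim % simd == 0


for d in (80, 128):
    print(d, tile_bytes(d), rounded_tile_bytes(d), simd_divides(d, 8))
```

Under these assumptions an 80 * 80 tile needs 25600 bytes (32768 after rounding), against 65536 bytes for 128 * 128, and 80 is evenly divisible by a SIMD width of 8. So the memory side of the idea looks feasible on paper; whether the logic utilization stays under 100% is a separate question that only a compile and its area report can answer.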