Intel® Quartus® Prime Software

The effect of conditional branch on performance

Altera_Forum

Hi all, 

 

According to the Altera AOCL optimization guide, a forwarding branch (if-then-else) does not negatively impact performance. I am trying to measure its actual effect on performance using a vectorAdd kernel. 

 

The two versions of the code are: 

 

version 1 (no branch): 

__kernel void 
__attribute__((reqd_work_group_size(256,1,1))) 
__attribute__((num_simd_work_items(4))) 
vectorAdd(__global const uint *x, 
          __global const uint *y, 
          __global uint *restrict z) 
{
    // get index of the work item 
    int index = get_global_id(0); 

    // add the vector elements 
    z[index] = x[index] + y[index]; 
}

 

version 2 (conditional branch): 

__kernel void 
__attribute__((reqd_work_group_size(256,1,1))) 
vectorAdd(__global const uint *x, 
          __global const uint *y, 
          __global uint *restrict z) 
{
    // get index of the work item within its work-group 
    int index = get_local_id(0); 
    int block_id = get_group_id(0); 

    // add the vector elements: only the first 64 work-items of each 
    // group are active, and each one handles 4 consecutive elements 
    if(index < 64) { 
        z[4*index+256*block_id] = x[4*index+256*block_id] + y[4*index+256*block_id]; 
        z[4*index+1+256*block_id] = x[4*index+1+256*block_id] + y[4*index+1+256*block_id]; 
        z[4*index+2+256*block_id] = x[4*index+2+256*block_id] + y[4*index+2+256*block_id]; 
        z[4*index+3+256*block_id] = x[4*index+3+256*block_id] + y[4*index+3+256*block_id]; 
    }
}

 

 

I manually disable three quarters of the work-items in each work-group by adding a condition; the remaining 64 work-items each process four elements. If a forwarding branch does not hurt performance, then these two versions should give similar results, right? 

But experiments show that version 1 is much faster than version 2 (about 5X). Am I misunderstanding what the optimization guide means by "forwarding branch will not impact the performance negatively"? Or is there something else in the code that leads to the different results? 
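To rule out a difference in what the two kernels compute, the index arithmetic can be emulated on the host in plain C. The following is only a sketch: the function names `emulate_v1` and `emulate_v2` are made up for illustration, the loops stand in for work-groups and work-items, and it assumes the vector length is a multiple of 256. It says nothing about how the hardware pipeline behaves; it only checks that both versions write the same elements and counts how many work-items pass the branch in version 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WG_SIZE 256 /* reqd_work_group_size(256,1,1) in both kernels */

/* Emulates version 1: every work-item adds exactly one element. */
static void emulate_v1(const uint32_t *x, const uint32_t *y,
                       uint32_t *z, size_t n)
{
    for (size_t gid = 0; gid < n; ++gid)
        z[gid] = x[gid] + y[gid];
}

/* Emulates version 2: the outer loop plays the role of get_group_id(0),
 * the inner loop the role of get_local_id(0).
 * Returns the number of work-items that pass the branch. */
static size_t emulate_v2(const uint32_t *x, const uint32_t *y,
                         uint32_t *z, size_t n)
{
    assert(n % WG_SIZE == 0); /* assumption: whole work-groups only */
    size_t active = 0;
    for (size_t block_id = 0; block_id < n / WG_SIZE; ++block_id) {
        for (size_t index = 0; index < WG_SIZE; ++index) {
            if (index < 64) {
                size_t base = 4 * index + WG_SIZE * block_id;
                for (size_t k = 0; k < 4; ++k)
                    z[base + k] = x[base + k] + y[base + k];
                ++active;
            }
        }
    }
    return active;
}
```

Running both functions on the same inputs and comparing the outputs element by element shows whether the two mappings agree. The return value of `emulate_v2` makes it visible that only a quarter of the launched work-items do any work, and version 2 also lacks the `num_simd_work_items(4)` attribute that version 1 has, so the two kernels differ in more than the branch.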

 

Thanks.
2 Replies
Altera_Forum

What happens if you also specify the num_simd_work_items for version 2? 

 

--- Quote Start ---  

 

__attribute__((num_simd_work_items(4))) 

--- Quote End ---  

Altera_Forum

Also, have you changed the NDRange in your host code?
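For reference, the global work size each version needs can be worked out from the kernels as posted. The sketch below is plain C with hypothetical helper names (`global_size_v1`, `global_size_v2`), and it assumes the vector length `n` is a multiple of 256. The result would be the value passed as `global_work_size` to `clEnqueueNDRangeKernel`.

```c
#include <assert.h>
#include <stddef.h>

#define WG_SIZE 256               /* reqd_work_group_size(256,1,1) */
#define ACTIVE_PER_GROUP 64       /* work-items passing "index < 64" */
#define ELEMS_PER_ACTIVE_ITEM 4   /* four additions inside the branch */

/* Version 1: one element per work-item, so n work-items in total. */
static size_t global_size_v1(size_t n)
{
    assert(n % WG_SIZE == 0); /* assumption: whole work-groups only */
    return n;
}

/* Version 2: each work-group covers 64 * 4 = 256 elements,
 * but still launches all 256 work-items. */
static size_t global_size_v2(size_t n)
{
    size_t elems_per_group = ACTIVE_PER_GROUP * ELEMS_PER_ACTIVE_ITEM;
    assert(n % elems_per_group == 0);
    return (n / elems_per_group) * WG_SIZE;
}
```

With the indexing as written in version 2, both helpers return the same value, so the host would launch the same number of work-items for either kernel even though only 64 per group are active in version 2.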
