[Kernel vectorization] Loads/Stores cannot be vectorized

Altera_Forum
Honored Contributor II
955 Views

Hi guys!

The compiler reports this warning for the kernel below:

Compiler Warning: Vectorized kernel contains loads/stores that cannot be vectorized. This might reduce performance.

Which loads/stores are causing this?

--- Quote Start ---

__attribute__((num_simd_work_items(16)))
__attribute__((reqd_work_group_size(32,1,1)))
__kernel void int_loop(__global const short * restrict a,
                       __constant const bool * restrict b,
                       __global int * restrict group_counters,
                       __local int * restrict local_counter,
                       const int base, const int base2)
{
    __private uint local_id = get_local_id(0);
    __private uint group_size = get_local_size(0);
    __private uint global_id = get_global_id(0);
    __private int g1 = a[base + global_id];
    __private int comparison;
    __private int wi_counter;

    comparison = (g1 == a[base2 + global_id]);

    wi_counter = comparison & b[global_id] & (g1 != 0);

    local_counter[local_id] = wi_counter;

    for(uint stride = group_size >> 1; stride > 0; stride = stride >> 1)
    {
        barrier(CLK_LOCAL_MEM_FENCE);

        if(local_id < stride)
            local_counter[local_id] += local_counter[local_id + stride];
    }

    if(local_id == 0)
        group_counters[get_group_id(0)] = local_counter[0];
}

--- Quote End ---

Altera_Forum
Honored Contributor II
99 Views

The likely cause is the following part:

if(local_id == 0) group_counters[get_group_id(0)] = local_counter[0];

SIMD vectorizes operations at the work-item level and coalesces accesses that are consecutive across work-items. This write is performed by only one work-item per work-group, so there are no neighboring accesses to combine with it, and it cannot be vectorized or coalesced.
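
To make that reasoning concrete, here is a toy host-side model in plain C (not OpenCL, and not the offline compiler's actual analysis). It assumes a simplified criterion: an access can be coalesced across the 16 SIMD lanes only if every lane is active and lane i touches index first + i. The names `lane_access`, `is_coalescable`, `model_global_load` and `model_group_store` are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIMD_WIDTH 16 /* mirrors num_simd_work_items(16) in the kernel */

/* One memory access as seen by a single SIMD lane (hypothetical model). */
typedef struct {
    bool active; /* does this lane perform the access? */
    long index;  /* element index the lane touches */
} lane_access;

/* Assumed criterion: all lanes active, and lane i touches index first + i. */
bool is_coalescable(const lane_access lanes[SIMD_WIDTH])
{
    assert(lanes != NULL);
    for (int i = 0; i < SIMD_WIDTH; ++i) {
        if (!lanes[i].active || lanes[i].index != lanes[0].index + i)
            return false;
    }
    return true;
}

/* Models: g1 = a[base + global_id];
 * Every lane is active and reads the element right after its neighbor's. */
void model_global_load(long base, long first_global_id,
                       lane_access out[SIMD_WIDTH])
{
    assert(out != NULL);
    for (int i = 0; i < SIMD_WIDTH; ++i) {
        out[i].active = true;
        out[i].index = base + first_global_id + i;
    }
}

/* Models: if(local_id == 0) group_counters[get_group_id(0)] = local_counter[0];
 * At most one lane (the one with local_id == 0) is active. */
void model_group_store(long group_id, long first_local_id,
                       lane_access out[SIMD_WIDTH])
{
    assert(out != NULL);
    for (int i = 0; i < SIMD_WIDTH; ++i) {
        out[i].active = (first_local_id + i == 0);
        out[i].index = group_id;
    }
}
```

Under this model the load from `a` passes the check for any starting global ID, while the guarded store to `group_counters` fails it for both 16-lane halves of the 32-item work-group: in the first half only lane 0 is active, and in the second half no lane is.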
Altera_Forum
Honored Contributor II
99 Views

 


 

 

Always right, mate :)

Thanks, HRZ