Hi guys!
The compiler gives this warning for the code below:

Compiler Warning: Vectorized kernel contains loads/stores that cannot be vectorized. This might reduce performance.

Which loads/stores are causing this?

--- Quote Start ---
__attribute__((num_simd_work_items(16)))
__attribute__((reqd_work_group_size(32,1,1)))
__kernel void int_loop(__global const short * restrict a,
                       __constant const bool * restrict b,
                       __global int * restrict group_counters,
                       __local int * restrict local_counter,
                       const int base,
                       const int base2)
{
    __private uint local_id = get_local_id(0);
    __private uint group_size = get_local_size(0);
    __private uint global_id = get_global_id(0);
    __private int g1 = a[base + global_id];
    __private int comparison;
    __private int wi_counter;

    comparison = (g1 == a[base2 + global_id]);
    wi_counter = comparison & b[global_id] & (g1 != 0);
    local_counter[local_id] = wi_counter;

    for(uint stride = group_size >> 1; stride > 0; stride = stride >> 1)
    {
        barrier(CLK_LOCAL_MEM_FENCE);
        if(local_id < stride)
            local_counter[local_id] += local_counter[local_id + stride];
    }

    if(local_id == 0)
        group_counters[get_group_id(0)] = local_counter[0];
}
--- Quote End ---
2 Replies
The likely cause is the following part:
if(local_id == 0)
    group_counters[get_group_id(0)] = local_counter[0];
SIMD vectorizes operations at work-item level, and coalesces accesses that are consecutive at this level. For this particular access, since the write is done only by one work-item per work-group, it cannot be vectorized/coalesced.
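To make the distinction concrete, here is a small host-side sketch in plain C that models one 16-wide SIMD bundle of work-items and checks whether an access could be issued as a single wide access. This is only a simplified illustration, not the compiler's actual analysis: the helper names (`is_full_consecutive`, `input_load_is_wide`, `group_store_is_wide`) are hypothetical, and the rule "all lanes active and indices consecutive" is an assumption made for the example.

```c
#include <stdbool.h>

#define SIMD_WIDTH 16 /* mirrors num_simd_work_items(16) in the kernel */

/* Hypothetical helper: true when every lane of the bundle is active and
   lane i touches index[0] + i, i.e. the bundle could be one wide access. */
static bool is_full_consecutive(const bool active[SIMD_WIDTH],
                                const int index[SIMD_WIDTH])
{
    for (int lane = 0; lane < SIMD_WIDTH; ++lane) {
        if (!active[lane])
            return false;
        if (index[lane] != index[0] + lane)
            return false;
    }
    return true;
}

/* Models the load a[base + global_id]: every work-item in the bundle
   performs it, and neighbouring work-items read neighbouring elements. */
static bool input_load_is_wide(int base, int first_global_id)
{
    bool active[SIMD_WIDTH];
    int index[SIMD_WIDTH];
    for (int lane = 0; lane < SIMD_WIDTH; ++lane) {
        active[lane] = true;
        index[lane] = base + first_global_id + lane;
    }
    return is_full_consecutive(active, index);
}

/* Models the store group_counters[get_group_id(0)] guarded by
   local_id == 0: at most one lane in the bundle is active, and all
   lanes would target the same element. */
static bool group_store_is_wide(int group_id, int first_local_id)
{
    bool active[SIMD_WIDTH];
    int index[SIMD_WIDTH];
    for (int lane = 0; lane < SIMD_WIDTH; ++lane) {
        int local_id = first_local_id + lane;
        active[lane] = (local_id == 0);
        index[lane] = group_id;
    }
    return is_full_consecutive(active, index);
}
```

Under this model, `input_load_is_wide(0, 0)` returns true, while `group_store_is_wide(0, 0)` returns false because only the work-item with `local_id == 0` performs the store.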
--- Quote Start ---
The likely cause is the following part:
if(local_id == 0)
    group_counters[get_group_id(0)] = local_counter[0];
SIMD vectorizes operations at work-item level, and coalesces accesses that are consecutive at this level. For this particular access, since the write is done only by one work-item per work-group, it cannot be vectorized/coalesced.
--- Quote End ---
Always right, mate :) Thanks, HRZ
