Does autovectorization work on the kernel below?
__kernel void vec_add(__global const float* in1, __global const float* in2, __global float* out)
{
    int i = get_global_id(0);
    int j = (i << 2);  /* each work-item handles 4 consecutive floats */
    out[j]   = in1[j]   + in2[j];
    out[j+1] = in1[j+1] + in2[j+1];
    out[j+2] = in1[j+2] + in2[j+2];
    out[j+3] = in1[j+3] + in2[j+3];
}
Thanks,
Jeffrey
1 Reply
