OpenCL vectorisation issue in oneAPI driver

OCLdev · 07-06-2024

(Cross-posting as requested from the toolkit forum.)

The Intel OpenCL CPU drivers (2024.2.0.980 on Windows and 2021.12.6.0.19_160000 on Linux) seem to have a vectorisation issue that produces corrupted output data. This simple kernel:

__kernel void test(__global float *f, __global float *r) {
    int i = get_global_id(0);

    r[i] = 0.0;
    if (f[i] == 1.0F) {
        r[i] = 1.0F+pow(1.0F, 1.0F);
    }
}

when run over a buffer of length 16 with f equal to:

1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1

results in the following values in r:

2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 3126, 0, 0, 0, 0, 3126

Note the unexpected 3126 values at indices 10 and 15 (counting from 0), where 2 was expected. Turning off vectorisation using

CL_CONFIG_CPU_VECTORIZER_MODE=1

results in the correct values:

2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 2
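For comparison, the expected output can be checked against a scalar reference in plain C that evaluates the kernel body once per element. This is only a sketch: the names reference_value and count_mismatches are made up for illustration, and it assumes the device results have already been read back into a host float array.

```c
#include <math.h>
#include <stddef.h>

/* Scalar equivalent of the kernel body for a single work-item.
 * (Hypothetical helper, not part of the attached MVE.) */
float reference_value(float fi) {
    float r = 0.0f;
    if (fi == 1.0f) {
        r = 1.0f + powf(1.0f, 1.0f);
    }
    return r;
}

/* Counts the elements where the device output differs from the
 * scalar reference. Exact comparison is used here because the
 * expected values (0 and 2) are exactly representable as floats. */
size_t count_mismatches(const float *f, const float *device_r, size_t n) {
    size_t mismatches = 0;
    for (size_t i = 0; i < n; ++i) {
        if (device_r[i] != reference_value(f[i])) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Passing the 16-element input together with the vectorised output listed earlier should flag the two 3126 entries, while the output obtained with vectorisation disabled should report no mismatches.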

The issue seems to be triggered by adding a value to the function's result (here, 1.0F + pow()). For example, this variant without the addition gives the expected result:

__kernel void test(__global float *f, __global float *r) {
    int i = get_global_id(0);

    r[i] = 0.0;
    if (f[i] == 1.0F) {
        r[i] = pow(1.0F, 1.0F);
    }
}

Code for an MVE is attached.