I am using clSetKernelArg to pass a struct containing cl_float3 values to the kernel.
On the host, cl_float3 is the same type as cl_float4, and according to the OpenCL specification it must be aligned to a 16-byte boundary.
This works fine with implementations from various other vendors, but not with Intel's, where the struct members end up misaligned.
Please fix the alignment to match the required alignment of the structure in the kernel argument.
Attached you can find an example application. The expected alignment values are the smallest values that work without vload/vstore.
using device: Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
device version: OpenCL 1.2 (Build 63463)
driver version: 1.2
result: 4, 4, 4, 8, 16
expected: 4, 16, 16, 4, 16
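To illustrate the layout I expect on the host side, here is a minimal sketch in plain C11. It is not the attached application: float3_like is a hypothetical stand-in for cl_float3 (16 bytes of storage, 16-byte aligned) so that the snippet builds without the OpenCL headers, and kernel_params with its member names is made up for illustration only.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>

/* Hypothetical stand-in for cl_float3: storage for four floats,
 * aligned to 16 bytes (assumption: mirrors the host-side cl_float4). */
typedef struct {
    _Alignas(16) float s[4];
} float3_like;

/* Hypothetical kernel-argument struct, NOT the one from the attached
 * example application. */
typedef struct {
    float       scale;     /* offset 0, followed by 12 bytes of padding */
    float3_like position;  /* must start on a 16-byte boundary */
    float3_like normal;    /* must start on a 16-byte boundary */
    int         id;        /* 4-byte aligned, followed by tail padding */
} kernel_params;

/* Compile-time checks of the layout the kernel side should also see. */
static_assert(sizeof(float3_like) == 16, "3-component vector occupies 16 bytes");
static_assert(_Alignof(float3_like) == 16, "3-component vector is 16-byte aligned");
static_assert(offsetof(kernel_params, position) % 16 == 0, "position misaligned");
static_assert(offsetof(kernel_params, normal) % 16 == 0, "normal misaligned");

/* Print the offset of each member and the total struct size. */
void print_layout(void)
{
    printf("scale:    offset %zu\n", offsetof(kernel_params, scale));
    printf("position: offset %zu\n", offsetof(kernel_params, position));
    printf("normal:   offset %zu\n", offsetof(kernel_params, normal));
    printf("id:       offset %zu\n", offsetof(kernel_params, id));
    printf("sizeof(kernel_params) = %zu\n", sizeof(kernel_params));
}
```

If the device compiler lays out the matching kernel-side struct with different offsets, the bytes passed through clSetKernelArg(kernel, index, sizeof(kernel_params), &params) are read from the wrong positions, which is the kind of mismatch I am seeing here.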