
LLess · ‎01-06-2013

Hi, I have a simple test program using an image3d with CL_UNSIGNED_INT8 data in it. When I use the following sampler my test program works fine.

sampler_t trilinear_sampler = CLK_FILTER_LINEAR | CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP_TO_EDGE;

But if I switch to :

sampler_t trilinear_sampler = CLK_FILTER_LINEAR | CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP;

then the sampling seems to be done using Nearest and not Linear.

I have attached the driver Info for my configuration below.

The same code works fine on the Nvidia card that I have.

Laurent

As a side note: I can't test it on the CPU because the CPU OpenCL implementation does not support CL_UNSIGNED_INT8 (which I can't understand... come on, how hard is it to provide a CPU implementation, even a non-optimal one, for all modes?). I could understand having a limited set of formats supported on the HD4000, but having even fewer on the CPU is not really an option. Intel should at least make the effort to support the same set of formats on the HD4000 and the CPU. And I am not asking for as many formats as Nvidia supports, because that would probably be too much to ask. See the lists below to understand how poor the Intel CPU implementation feels.

NVidia OpenCL implementation.

Supported 3D Image formats

CL_R CL_FLOAT
CL_R CL_HALF_FLOAT
CL_R CL_UNORM_INT8
CL_R CL_UNORM_INT16
CL_R CL_SNORM_INT16
CL_R CL_SIGNED_INT8
CL_R CL_SIGNED_INT16
CL_R CL_SIGNED_INT32
CL_R CL_UNSIGNED_INT8
CL_R CL_UNSIGNED_INT16
CL_R CL_UNSIGNED_INT32
CL_A CL_FLOAT
CL_A CL_HALF_FLOAT
CL_A CL_UNORM_INT8
CL_A CL_UNORM_INT16
CL_A CL_SNORM_INT16
CL_A CL_SIGNED_INT8
CL_A CL_SIGNED_INT16
CL_A CL_SIGNED_INT32
CL_A CL_UNSIGNED_INT8
CL_A CL_UNSIGNED_INT16
CL_A CL_UNSIGNED_INT32
CL_RG CL_FLOAT
CL_RG CL_HALF_FLOAT
CL_RG CL_UNORM_INT8
CL_RG CL_UNORM_INT16
CL_RG CL_SNORM_INT16
CL_RG CL_SIGNED_INT8
CL_RG CL_SIGNED_INT16
CL_RG CL_SIGNED_INT32
CL_RG CL_UNSIGNED_INT8
CL_RG CL_UNSIGNED_INT16
CL_RG CL_UNSIGNED_INT32
CL_RA CL_FLOAT
CL_RA CL_HALF_FLOAT
CL_RA CL_UNORM_INT8
CL_RA CL_UNORM_INT16
CL_RA CL_SNORM_INT16
CL_RA CL_SIGNED_INT8
CL_RA CL_SIGNED_INT16
CL_RA CL_SIGNED_INT32
CL_RA CL_UNSIGNED_INT8
CL_RA CL_UNSIGNED_INT16
CL_RA CL_UNSIGNED_INT32
CL_RGBA CL_FLOAT
CL_RGBA CL_HALF_FLOAT
CL_RGBA CL_UNORM_INT8
CL_RGBA CL_UNORM_INT16
CL_RGBA CL_SNORM_INT16
CL_RGBA CL_SIGNED_INT8
CL_RGBA CL_SIGNED_INT16
CL_RGBA CL_SIGNED_INT32
CL_RGBA CL_UNSIGNED_INT8
CL_RGBA CL_UNSIGNED_INT16
CL_RGBA CL_UNSIGNED_INT32
CL_BGRA CL_UNORM_INT8
CL_BGRA CL_SIGNED_INT8
CL_BGRA CL_UNSIGNED_INT8
CL_ARGB CL_UNORM_INT8
CL_ARGB CL_SIGNED_INT8
CL_ARGB CL_UNSIGNED_INT8
INTENSITY CL_FLOAT
INTENSITY CL_HALF_FLOAT
INTENSITY CL_UNORM_INT8
INTENSITY CL_UNORM_INT16
INTENSITY CL_SNORM_INT16
CL_LUMINANCE CL_FLOAT
CL_LUMINANCE CL_HALF_FLOAT
CL_LUMINANCE CL_UNORM_INT8
CL_LUMINANCE CL_UNORM_INT16
CL_LUMINANCE CL_SNORM_INT16

Intel CPU OpenCL Implementation

Supported 3D Image formats

CL_RGBA CL_UNORM_INT8
CL_RGBA CL_UNORM_INT16
CL_RGBA CL_SIGNED_INT8
CL_RGBA CL_SIGNED_INT16
CL_RGBA CL_SIGNED_INT32
CL_RGBA CL_UNSIGNED_INT8
CL_RGBA CL_UNSIGNED_INT16
CL_RGBA CL_UNSIGNED_INT32
CL_RGBA CL_HALF_FLOAT
CL_RGBA CL_FLOAT
CL_BGRA CL_UNORM_INT8
INTENSITY CL_FLOAT
CL_LUMINANCE CL_FLOAT

Intel HD4000 GPU OpenCL Implementation

Supported 3D Image formats

CL_RGBA CL_UNORM_INT8
CL_RGBA CL_UNORM_INT16
CL_RGBA CL_SIGNED_INT8
CL_RGBA CL_SIGNED_INT16
CL_RGBA CL_SIGNED_INT32
CL_RGBA CL_UNSIGNED_INT8
CL_RGBA CL_UNSIGNED_INT16
CL_RGBA CL_UNSIGNED_INT32
CL_RGBA CL_HALF_FLOAT
CL_RGBA CL_FLOAT
CL_BGRA CL_UNORM_INT8
CL_R CL_FLOAT
CL_R CL_UNORM_INT8
CL_R CL_UNORM_INT16
CL_R CL_SIGNED_INT8
CL_R CL_SIGNED_INT16
CL_R CL_SIGNED_INT32
CL_R CL_UNSIGNED_INT8
CL_R CL_UNSIGNED_INT16
CL_R CL_UNSIGNED_INT32
CL_R CL_HALF_FLOAT
CL_A CL_UNORM_INT8
CL_RG CL_UNORM_INT8
CL_RG CL_UNORM_INT16
CL_RG CL_SIGNED_INT8
CL_RG CL_SIGNED_INT16
CL_RG CL_SIGNED_INT32
CL_RG CL_UNSIGNED_INT8
CL_RG CL_UNSIGNED_INT16
CL_RG CL_UNSIGNED_INT32
CL_RG CL_HALF_FLOAT
CL_RG CL_FLOAT

Jeffrey_M_Intel1 · ‎01-09-2013

Would it be possible for you to post your simple test program? That would make it easier to ensure we're replicating exactly what you're seeing. Thanks! Jeff

LLess · ‎01-10-2013

Well it won't be possible due to IP issues but I might be able to write a shorter version with fake data that can probably show the same issue. I will look into that next week. Thanks. Laurent.

LLess · ‎01-15-2013

OK, my issue is the same as the one in the following topic, which has already been answered, but for 3D images.

http://software.intel.com/en-us/forums/topic/328510

Honestly, saying that you basically have to implement the filtering in the kernel because the OpenCL standard does not define the behavior is really poor work.

Especially when the actual output is not what a normal developer would expect.

"Texture coords -> Unnormalize if necessary -> Apply any necessary clamping -> Apply filtering" doesn't seem too tricky to specify, especially since clamping and filtering are unrelated...

Basically turning ON clamping and getting Nearest behavior when Linear Filtering is ON is unexpected to say the least...

If OpenCL wants to succeed as a standard, then there should be very few places where such undefined behavior is allowed. As a promoter member of the OpenCL group, Intel should try to address such issues in a future release.

Damn my company is also a member, I guess I should try to contact them too :)

Best Regards.

ARNON_P_Intel · ‎01-16-2013

Thanks for commenting on the need for image formats. It would be best if you could tell us exactly which image formats are missing for your implementation on both the Intel CPU and Intel HD Graphics, and we will see how we can best serve your needs.

Regards,

Arnon

LLess · ‎01-16-2013

Hi Arnon,

Well I actually have what I need on the HD4000 (for now).

But on the CPU side the single-channel formats are missing (I have not checked the latest 1.2 beta driver, though): the CL_R and CL_A formats that are present on the HD4000 are not there.

CL_R CL_FLOAT
CL_R CL_UNORM_INT8
CL_R CL_UNORM_INT16
CL_R CL_SIGNED_INT8
CL_R CL_SIGNED_INT16
CL_R CL_SIGNED_INT32
CL_R CL_UNSIGNED_INT8
CL_R CL_UNSIGNED_INT16
CL_R CL_UNSIGNED_INT32

These are probably the main ones. I am sure that there are cases where I would like a 3-channel format, but I can use CL_RGBA for that (and those formats are present on both the HD4000 and the CPU).

Thanks.

Laurent.

LLess · ‎01-16-2013

Actually ideally you want the CPU to implement anything that is available on the HD4000 and vice versa.

That way any kernel can be used in a mixed environment. At the moment, since the CPU and GPU implementations report different sets of supported texture formats, you can't validate a kernel on the CPU and then move it to the GPU, because it might simply fail to create the textures.
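One way to guard against this today is to compute the formats common to both devices up front and only create images in those. The sketch below is plain C over a stand-in `ImageFormat` struct that mirrors the two fields of `cl_image_format`. In real host code the two lists would come from `clGetSupportedImageFormats`, called once per device context. The enum values and function names here are hypothetical placeholders, not the real CL constants.

```c
#include <stddef.h>

/* Stand-in for cl_image_format (channel order + channel data type). */
typedef struct {
    unsigned int channel_order;
    unsigned int channel_data_type;
} ImageFormat;

/* Hypothetical placeholder values, NOT the real CL_* constants. */
enum { ORDER_R = 1, ORDER_RGBA = 2 };
enum { TYPE_UNSIGNED_INT8 = 1, TYPE_FLOAT = 2 };

/* Returns 1 if `wanted` appears in `list`, 0 otherwise. */
int format_in_list(const ImageFormat *list, size_t count, ImageFormat wanted)
{
    for (size_t i = 0; i < count; ++i) {
        if (list[i].channel_order == wanted.channel_order &&
            list[i].channel_data_type == wanted.channel_data_type)
            return 1;
    }
    return 0;
}

/* Writes the formats present in both lists to `out` (at most `out_cap`
   entries) and returns how many were written. */
size_t common_formats(const ImageFormat *a, size_t na,
                      const ImageFormat *b, size_t nb,
                      ImageFormat *out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < na && n < out_cap; ++i) {
        if (format_in_list(b, nb, a[i]))
            out[n++] = a[i];
    }
    return n;
}
```

With the lists posted earlier in the thread, CL_RGBA with CL_UNSIGNED_INT8 would land in the common set, while CL_R with CL_UNSIGNED_INT8 would not, since it appears only in the HD4000 list.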

Laurent

HD4000 image3d sampling issue.