Hi,
Today, taking a break from the HD4000, I decided to give the 2013 CPU SDK a try.
The latest version reports support for the format I needed, so I switched to the CPU device.
I am creating a volume with the 8-bit CL_R, CL_UNSIGNED_INT8 format.
Here is my code sampling the volume:
const sampler_t volume_trilinear_sampler = CLK_FILTER_LINEAR | CLK_NORMALIZED_COORDS_TRUE | CLK_ADDRESS_CLAMP_TO_EDGE;
inline float SampleRaw(__read_only image3d_t volume, float4 position)
{
    if ( position.x < 0 || position.y < 0 || position.z < 0 )
        return -1;
    if ( (int)position.x >= 1.0f || (int)position.y >= 1.0f || (int)position.z >= 1.0f )
        return -1;
    return read_imagef(volume,volume_trilinear_sampler,position).x;
}
Everything builds fine, but when I run the kernel it crashes.
There is no error when creating the context, textures or kernel. It just crashes.
The kernel compiles to the assembly code below, which seems to indicate that the read_imagef function pointer is NULL.
EAX = 00000000 EBX = 21450600 ECX = 00000000 EDX = 00000000 ESI = 00000000 EDI = 00000000 EIP = 70040710 ESP = 24F542C0 EBP = 24F544FC EFL = 00000000
70040641 C5 E2 59 5E 58 vmulss xmm3,xmm3,dword ptr [esi+58h]
70040646 C5 E2 58 D2 vaddss xmm2,xmm3,xmm2
7004064A C5 E0 57 DB vxorps xmm3,xmm3,xmm3
7004064E C5 F8 2E D9 vucomiss xmm3,xmm1 //Test if x < 0.0
70040652 0F 87 81 01 00 00 ja 700407D9
70040658 C5 F8 2E D8 vucomiss xmm3,xmm0 //Test if y < 0.0
7004065C 0F 87 77 01 00 00 ja 700407D9
70040662 C5 F8 2E DA vucomiss xmm3,xmm2 //Test if z < 0.0
70040666 0F 87 6D 01 00 00 ja 700407D9
7004066C C5 FA 2C C1 vcvttss2si eax,xmm1
70040670 C5 FA 2A D8 vcvtsi2ss xmm3,xmm0,eax
70040674 C5 F8 2E 1D 08 00 05 70 vucomiss xmm3,dword ptr ds:[70050008h] //Test if > 1.0
7004067C 0F 83 57 01 00 00 jae 700407D9
70040682 C5 FA 2C C0 vcvttss2si eax,xmm0
70040686 C5 FA 2A D8 vcvtsi2ss xmm3,xmm0,eax
7004068A C5 F8 2E 1D 08 00 05 70 vucomiss xmm3,dword ptr ds:[70050008h] //Test if > 1.0
70040692 0F 83 41 01 00 00 jae 700407D9
70040698 C5 FA 2C C2 vcvttss2si eax,xmm2
7004069C C5 FA 2A D8 vcvtsi2ss xmm3,xmm0,eax
700406A0 C5 F8 2E 1D 08 00 05 70 vucomiss xmm3,dword ptr ds:[70050008h] //Test if > 1.0
700406A8 0F 83 2B 01 00 00 jae 700407D9
700406AE C4 E3 71 21 C0 10 vinsertps xmm0,xmm1,xmm0,10h
700406B4 C4 E3 79 21 C2 20 vinsertps xmm0,xmm0,xmm2,20h
700406BA 8B 9C 24 F4 00 00 00 mov ebx,dword ptr [esp+0F4h]
700406C1 8B 73 14 mov esi,dword ptr [ebx+14h]
700406C4 8B BB 04 01 00 00 mov edi,dword ptr [ebx+104h]
700406CA 8D 84 24 E0 01 00 00 lea eax,[esp+1E0h]
700406D1 89 44 24 08 mov dword ptr [esp+8],eax
700406D5 8D 84 24 F0 01 00 00 lea eax,[esp+1F0h]
700406DC 89 44 24 04 mov dword ptr [esp+4],eax
700406E0 89 1C 24 mov dword ptr [esp],ebx
700406E3 C5 F8 54 05 60 00 05 70 vandps xmm0,xmm0,xmmword ptr ds:[70050060h]
700406EB FF 93 84 00 00 00 call dword ptr [ebx+84h]
700406F1 C5 F8 28 D0 vmovaps xmm2,xmm0
700406F5 C5 F8 28 8C 24 E0 01 00 00 vmovaps xmm1,xmmword ptr [esp+1E0h]
700406FE C5 F8 28 84 24 F0 01 00 00 vmovaps xmm0,xmmword ptr [esp+1F0h]
70040707 89 74 24 04 mov dword ptr [esp+4],esi
7004070B 89 1C 24 mov dword ptr [esp],ebx
7004070E FF D7 call edi
Cheers.
Laurent
Hi Laurent,
Can you also provide your host code - enough for us to reproduce the issue?
Thanks,
Raghu
Hi Raghu,
I think this is going to be tricky. I might be able to provide the kernel only, though, as long as I can send it in a PM.
Laurent.
Actually, using read_imagef with CL_UNSIGNED_INT8 results in undefined behavior (great) according to the OpenCL documentation.
I guess in this case it was a crash :)
Switching the texture to CL_UNORM_INT8 fixes the crash.
It is also the only way to get linear filtering, since CL_UNSIGNED_INT8 forces read_imageui, which doesn't support linear filtering as far as I can see in the OpenCL documentation.
I really feel that the read_image function definitions have been messed up badly by the OpenCL committee. There are tons of undefined behaviors and unsupported modes, which is not great at all for something that wants to be a standard.
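One practical consequence of the switch to CL_UNORM_INT8 is that read_imagef returns each texel normalized to the range [0.0, 1.0] rather than as a raw 0..255 value. The plain C sketch below only illustrates that mapping and the 1D linear blend that trilinear filtering is built from. The function names unorm8_to_float and lerp_unorm8 are made up for this example and are not part of OpenCL, and a real implementation may filter with different precision.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the float that read_imagef is expected to return
   for one CL_UNORM_INT8 texel. The stored byte is mapped linearly from
   [0, 255] to [0.0, 1.0]. */
static float unorm8_to_float(uint8_t texel)
{
    return (float)texel / 255.0f;
}

/* Hypothetical helper: linear blend of two neighbouring texels, the 1D
   building block of trilinear filtering. t is the fractional position
   between texel a (t = 0) and texel b (t = 1). */
static float lerp_unorm8(uint8_t a, uint8_t b, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    float fa = unorm8_to_float(a);
    float fb = unorm8_to_float(b);
    return fa + (fb - fa) * t;
}
```

For example, lerp_unorm8(0, 255, 0.5f) gives 0.5f. Multiplying a sampled value by 255.0f brings it back to the original 8-bit intensity scale, which matters if the rest of the kernel still expects raw values.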
Thank you Laurent. We will evaluate the value of this as a future feature request moving forward.
- Chuck