OpenCL* for CPU

Compiler fails when vectorizing loop with type conversion

overnite81
Beginner

Hi all,

I am having the following issue with the Intel CPU Runtime for OpenCL version 23.1.46319:

This code


    __kernel void logisticMap(__global const float *params,
                              __global const float *initialValues,
                              __global float *result,
                              __global unsigned int *bits,
                              const int numIters,
                              const int k)
    {
        int i = get_global_id(0);
        int j = get_global_id(1);
        int ti = get_local_id(0);

        int n = get_global_size(0);
        int nt = get_local_size(0);
        int nb = n / nt;

        const float kpow = pown(10.0f, k);
        float r, s;
        unsigned int w;
        r = initialValues[i];
        s = params[i];
        int cnt = 0;
        w = 0;
        // __attribute__((opencl_unroll_hint))
        for (int ic = 0; ic < numIters; ic++)
        {
            r = s * r * (1.0f - r);
            float t = r * kpow;
            float u = t - floor(t);
            unsigned int x = (unsigned int) step(0.5f, u);
            x ^= 1;
            w |= x << (ic % 32);
            if (ic % 32 == 31)
            {
                const int index = i * (numIters / 32) + cnt++;
                bits[index] = w;
                w = 0;
            }
        }
        result[i] = r;
    }


outputs

**Internal compiler error** Cannot select: 0x11c67196648: v4f32 = X86ISD::VRNDSCALE contract 0x11c67196568, TargetConstant:i32<1>
0x11c67196568: v4f32 = fmul 0x11c671969c8, 0x11c67195ed8
0x11c671969c8: v4f32,ch = CopyFromReg 0x11c5e7119a8, Register:v4f32 %12
0x11c67196108: v4f32 = Register %12
0x11c67195ed8: v4f32 = fmul 0x11c6719a1f8, 0x11c6719d878
0x11c6719a1f8: v4f32 = fmul 0x11c67195b58, 0x11c6719a498
0x11c67195b58: v4f32,ch = CopyFromReg 0x11c5e7119a8, Register:v4f32 %19
0x11c6719ddb8: v4f32 = Register %19
0x11c6719a498: v4f32,ch = CopyFromReg 0x11c5e7119a8, Register:v4f32 %20
0x11c6719db88: v4f32 = Register %20
0x11c6719d878: v4f32 = fsub 0x11c6719d9c8, 0x11c6719a498
0x11c6719d9c8: v4f32,ch = load<(load (s128) from constant-pool)> 0x11c5e7119a8, 0x11c67195e68, undef:i64
0x11c67195e68: i64 = X86ISD::Wrapper TargetConstantPool:i64<> 0
0x11c67196958: i64 = TargetConstantPool<> 0
0x11c6719daa8: i64 = undef
0x11c6719a498: v4f32,ch = CopyFromReg 0x11c5e7119a8, Register:v4f32 %20
0x11c6719db88: v4f32 = Register %20
0x11c67199fc8: i32 = TargetConstant<1>
In function: logisticMap

when I attempt to compile it.
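For anyone who wants to check the output once the kernel builds, here is a minimal scalar C sketch of what a single work-item computes. The names `logistic_map_ref`, `pow10_int` and `frac_part` are hypothetical and do not appear in the kernel above. `pow10_int` only approximates `pown(10.0f, k)` by repeated multiplication, and `frac_part` assumes `|t|` stays below 2^63. The sketch writes its packed words starting at `bits[0]`, whereas the kernel offsets them by `i * (numIters / 32)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pown(10.0f, k): repeated multiplication,
   so rounding may differ slightly from the OpenCL built-in. */
static float pow10_int(int k)
{
    float p = 1.0f;
    for (int n = (k < 0) ? -k : k; n > 0; n--)
        p *= 10.0f;
    return (k < 0) ? 1.0f / p : p;
}

/* Fractional part t - floor(t) without calling floor().
   Assumption: |t| < 2^63 so the integer cast is well defined. */
static float frac_part(float t)
{
    float whole = (float)(long long)t; /* truncates toward zero */
    if (whole > t)                     /* adjust for negative t */
        whole -= 1.0f;
    return t - whole;
}

/* Scalar reference for one work-item: s is the map parameter, r the
   initial value. Writes num_iters / 32 packed words to bits[] and
   returns the final r. Trailing iterations that do not fill a whole
   32-bit word are dropped, as in the kernel. */
float logistic_map_ref(float s, float r, int num_iters, int k, uint32_t *bits)
{
    assert(num_iters >= 0 && bits != NULL);

    const float kpow = pow10_int(k);
    uint32_t w = 0;
    int cnt = 0;

    for (int ic = 0; ic < num_iters; ic++) {
        r = s * r * (1.0f - r);
        float u = frac_part(r * kpow);
        /* step(0.5f, u) is 0 when u < 0.5 and 1 otherwise;
           the kernel then XORs with 1, so the bit is set when u < 0.5. */
        uint32_t x = (u < 0.5f) ? 1u : 0u;
        w |= x << (ic % 32);
        if (ic % 32 == 31) {
            bits[cnt++] = w;
            w = 0;
        }
    }
    return r;
}
```

Calling `logistic_map_ref(params[i], initialValues[i], numIters, k, out)` and comparing `out` against the kernel's slice of `bits` for work-item `i` would be one way to validate a build; small differences are possible because the floating-point evaluation is not guaranteed to match the device bit for bit.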

2 Replies
cw_intel
Moderator

Hi,

Can you provide the command you used to compile the kernel code?

Also, have you used other versions of the OpenCL CPU runtime before? If so, please tell us which version compiled this kernel successfully.


Thanks

cw_intel
Moderator

Hi,


We haven't heard back from you for a long time, so we assume you have found a solution on your own, and we will no longer respond to this thread. If you need further assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.


Thanks,

Chen