OpenCL* for CPU

OpenCL KernelBuilder 2013 crashes in processor graphics (HD4000)

Leslie_S_

Hi guys,

I tried to compile my own OpenCL kernel, but KernelBuilder (both the 32-bit and the 64-bit version) prints this log:

fc1 build 1 succeded.

fc1 build 2 succeeded.

Error: Internal Error.

While the compilation is in progress, the ioc32 process uses almost 2 GB of memory and ioc64 uses more than 5 GB.
The same program works fine on the CPU; it only fails on the GPU. Any pointers on where to start? Thanks very much. (I am using the OpenCL SDK 2013 on Windows 7 Pro with Intel HD driver version 9.18.10.3071.)

Best regards,
Leslie

Leslie_S_

And here is the sample code

Raghupathi_M_Intel

Thanks for the source. I am able to reproduce the internal error. I will take a look and get back to you.

Raghu

Raghupathi_M_Intel

Hi,

We debugged this further, and it appears that the program generates a branch spanning more than 2^15 instructions, which cannot be encoded on the current-generation architecture. A simple workaround is to rewrite the function miller_rabin_32 without early returns, like this:

<code>
bool miller_rabin_32(long n)
{
    // Single exit point: the result is set in the branches and returned once.
    bool result = false;
    if (n <= 1L)
        result = false;
    else if (n == 2L)
        result = true;
    else if (miller_rabin_pass_32( 2L, n) &&
             (n <= 7L  || miller_rabin_pass_32( 7L, n)) &&
             (n <= 61L || miller_rabin_pass_32(61L, n)))
        result = true;
    return result;
}
</code>
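
To check the rewritten kernel's results independently of the GPU compiler, a host-side reference in plain C can help. The sketch below is only an illustration: `pow_mod` and the body of `miller_rabin_pass_32` are assumptions, since the original kernel source is not shown in this thread, and the arithmetic assumes n is below 2^32. It keeps the same single-exit structure and the witnesses 2, 7 and 61, which are known to be sufficient for all n below 4,759,123,141.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: (base^exp) mod m.
   With m < 2^32 every intermediate product fits in 64 bits. */
uint64_t pow_mod(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

/* Assumed shape of one Miller-Rabin round with witness a, for n > 2.
   Written with a single exit, like the workaround above. */
bool miller_rabin_pass_32(uint64_t a, uint64_t n)
{
    uint64_t d = n - 1;
    int s = 0;
    while ((d & 1) == 0) {      /* factor n - 1 as d * 2^s with d odd */
        d >>= 1;
        ++s;
    }

    uint64_t x = pow_mod(a, d, n);
    bool pass = (x == 1 || x == n - 1);
    for (int r = 1; r < s && !pass; ++r) {
        x = (x * x) % n;
        pass = (x == n - 1);
    }
    return pass;
}

/* Same control flow as the suggested kernel function.
   int64_t mirrors the 64-bit long of OpenCL C. */
bool miller_rabin_32(int64_t n)
{
    bool result = false;
    if (n <= 1)
        result = false;
    else if (n == 2)
        result = true;
    else if (miller_rabin_pass_32(2, (uint64_t)n) &&
             (n <= 7  || miller_rabin_pass_32(7, (uint64_t)n)) &&
             (n <= 61 || miller_rabin_pass_32(61, (uint64_t)n)))
        result = true;
    return result;
}
```

Comparing `miller_rabin_32(n)` on the host against the kernel output for a range of inputs is one way to confirm that removing the early returns did not change the results.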

Hope this helps.

Leslie_S_

Unfortunately, that does not seem to solve it. I changed the function as you suggested, but I still get the internal error.

Raghupathi_M_Intel

Sorry for the delay in responding. It actually looks like a bug in KernelBuilder is responsible for the failure. You still need to remove the early returns and rewrite your kernel as I suggested earlier. Then build your program directly (don't use KernelBuilder). I tried this and it seems to work.

Raghu
