OpenCL* for CPU

LLVM bitcode and machine code


The current process to acquire an OpenCL binary looks something like this:
program = clCreateProgramWithSource(...)
clBuildProgram( program, ... )
clGetProgramInfo( program, CL_PROGRAM_BINARIES, ..., binary )

This binary is then stored somewhere and loaded at application runtime by calling clCreateProgramWithBinary.
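
The "stored somewhere and loaded at runtime" step can be sketched roughly as below. This is a minimal illustration in plain C, not anything from the Intel SDK: the helper names save_binary and load_binary are hypothetical, and it assumes the bytes were already obtained via clGetProgramInfo with CL_PROGRAM_BINARY_SIZES and CL_PROGRAM_BINARIES.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write `size` bytes of a program binary to `path`.
 * Returns 0 on success, -1 on any I/O failure. */
int save_binary(const char *path, const unsigned char *data, size_t size)
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(data, 1, size, f);
    if (fclose(f) != 0 || written != size) return -1;
    return 0;
}

/* Hypothetical helper: read a whole file into a malloc'd buffer.
 * On success returns the buffer (caller frees) and stores its length
 * in *size_out; returns NULL on failure. */
unsigned char *load_binary(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len < 0) { fclose(f); return NULL; }
    rewind(f);
    /* Allocate at least one byte so an empty file still yields non-NULL. */
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf) { fclose(f); return NULL; }
    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    if (got != (size_t)len) { free(buf); return NULL; }
    *size_out = (size_t)len;
    return buf;
}
```

At application startup, the buffer and length returned by load_binary would be passed to clCreateProgramWithBinary, followed by clBuildProgram. Whether that build is fast depends on what the implementation put in the binary, which is the issue raised below.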

However, this is far too slow: at runtime the LLVM compiler takes several minutes to generate machine code from the LLVM bitcode "binary". (I have a lot of kernels.)

Is there any way to get clCreateProgramWithSource, clGetProgramInfo, or any other method to output a real machine code binary from an OpenCL program?

I realize the OpenCL spec got it wrong when it says of clGetProgramInfo: "The bits returned can be an implementation-specific intermediate representation (a.k.a. IR) or device specific executable bits or both. The decision on which information is returned in the binary is up to the OpenCL implementation." Still, it seems like Intel didn't try to fix that mistake in its own implementation.

Since I'm pre-compiling and running on the same machine, it would seem reasonable to include machine code for the architectures that are actually available in addition to whatever intermediate representation Intel feels like using.  How often are people cross-compiling on disparate platforms?  Even so, the backup IR should work.  In fact, cross-compilation doesn't even make sense here because the target platform would need to have the LLVM bitcode compiler.
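
The "machine code plus backup IR" idea above could look something like the following sketch. It is purely illustrative: the image_t type, the select_image function, and the architecture strings are hypothetical and do not describe how Intel's runtime lays out its binaries.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical entry in a multi-image program binary. */
typedef struct {
    const char *arch;            /* target name; NULL marks the portable IR entry */
    const unsigned char *bytes;  /* image contents */
    size_t size;                 /* image length in bytes */
} image_t;

/* Prefer a native image matching host_arch; otherwise fall back to the
 * first IR entry. Returns NULL if neither exists. */
const image_t *select_image(const image_t *images, size_t count,
                            const char *host_arch)
{
    const image_t *fallback = NULL;
    for (size_t i = 0; i < count; ++i) {
        if (images[i].arch == NULL) {
            if (!fallback) fallback = &images[i];   /* remember the IR entry */
        } else if (strcmp(images[i].arch, host_arch) == 0) {
            return &images[i];                      /* exact native match */
        }
    }
    return fallback;
}
```

With this layout, loading on the machine that produced the binary skips the slow IR-to-machine-code step, while any other machine still gets the IR and pays the compile cost once.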


Hi James and thanks for your question!

The problem is well understood.

In our latest OpenCL runtime, the default kernel binary is the final x86 machine code (currently on the CPU device only). You can download the runtime at:

The matching OpenCL SDK is available here:

This new feature is documented in the user guide (search for "Final Kernel"):

I hope this helps. Please let me know otherwise.



This doesn't help... I should have given more context. I'm using Xeon Phi accelerators in an HPC environment, so based on this Supported Features Summary, the answer is no.

When can we expect real support?

I'm disappointed a support document like this has to exist.  There are far too many optional features in the OpenCL standard.


Thanks, James. I really appreciate your input.

We are working towards supporting kernel JIT save/load on Xeon Phi as well. The feature should be available in a future release.

regards, Arik