
James_R_ · ‎06-04-2014

The current process to acquire an OpenCL binary looks something like this:
program = clCreateProgramWithSource(...)
clBuildProgram( program, ... )
clGetProgramInfo( program, CL_PROGRAM_BINARIES, ..., binary )

This binary is then stored somewhere and loaded at application runtime by calling clCreateProgramWithBinary.
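The "stored somewhere and loaded at runtime" step can be sketched in plain C as below. This is only a minimal illustration of the file round trip, not anything Intel-specific: the helper names save_blob and load_blob are hypothetical, and the OpenCL calls appear only in comments to show where the bytes would come from and go to.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write `size` bytes of a program binary to `path`.
 * The bytes would come from clGetProgramInfo(program, CL_PROGRAM_BINARIES, ...)
 * after querying the length with CL_PROGRAM_BINARY_SIZES.
 * Returns 0 on success, -1 on failure. */
int save_blob(const char *path, const unsigned char *data, size_t size)
{
    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    size_t written = fwrite(data, 1, size, f);
    int close_failed = fclose(f);
    return (written == size && !close_failed) ? 0 : -1;
}

/* Hypothetical helper: read a whole file into a malloc'd buffer.
 * The buffer and *size_out would then be passed to
 * clCreateProgramWithBinary(...), followed by clBuildProgram(...).
 * Returns NULL on failure; the caller frees the result. */
unsigned char *load_blob(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len < 0 || fseek(f, 0, SEEK_SET) != 0) { fclose(f); return NULL; }

    /* Allocate at least one byte so an empty file still yields a valid pointer. */
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf) { fclose(f); return NULL; }

    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    if (got != (size_t)len) { free(buf); return NULL; }

    *size_out = (size_t)len;
    return buf;
}
```

Note that caching the bytes this way only saves compile time if the implementation returns device-specific executable bits; if it returns an intermediate representation, the build step after loading still has to generate machine code.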

However, this is far too slow, since the LLVM compiler takes several minutes at runtime to generate the machine code from the LLVM bitcode "binary". (I have a bunch of kernels.)

Is there any way to get clCreateProgramWithSource, clGetProgramInfo, or any other method to output a real machine code binary from an OpenCL program?

I realize the OpenCL spec got it wrong when clGetProgramInfo says "The bits returned can be an implementation-specific intermediate representation (a.k.a. IR) or device specific executable bits or both. The decision on which information is returned in the binary is up to the OpenCL implementation." but it seems like Intel didn't try to fix their mistake.

Since I'm pre-compiling and running on the same machine, it would seem reasonable to include machine code for the architectures that are actually available in addition to whatever intermediate representation Intel feels like using. How often are people cross-compiling on disparate platforms? Even so, the backup IR should work. In fact, cross-compilation doesn't even make sense here because the target platform would need to have the LLVM bitcode compiler.

Arik_N_Intel · ‎06-05-2014

Hi James and thanks for your question!

The problem is well understood.

In our latest OpenCL runtime, the default kernel binary is actually the final x86 code (currently only on the CPU). You can download the runtime at: https://software.intel.com/en-us/articles/opencl-drivers

The matching OpenCL SDK is available here: https://software.intel.com/en-us/vcsource/tools/opencl-sdk

This new feature is documented in the user guide (search for "Final Kernel"): https://software.intel.com/en-us/node/515196

I hope this helps. Please let me know otherwise.

Arik

James_R_ · ‎06-05-2014

This doesn't help... I should have given more context. I'm using Xeon Phi accelerators in an HPC environment. So based on this Supported Features Summary, the answer is NO.

When can we expect real support?

I'm disappointed a support document like this has to exist. There are far too many optional features in the OpenCL standard.

Arik_N_Intel · ‎06-11-2014

Thanks, James. I really appreciate your input.

We are working towards supporting kernel JIT save/load also on Xeon Phi. The feature should be available in a future release.

Regards, Arik

LLVM bitcode and machine code