JIT compilation + how to deploy

Sharpe__Brian · ‎05-19-2021

Hi Intel

Can we do runtime compilation of DPC++ code?
For example, as with CUDA/NVRTC, where you can compose a chunk of C++ code into a string at runtime and send it to something (e.g. a driver) for compilation and execution.

If this is possible, I'd like to know how to deploy such an application to a user's machine.
What would the user need to have installed on their machine to make it work?
I'm assuming just an OpenCL driver or something similar, and I'm hoping not much else. Ideally I could ship a few DPC++ DLLs with our product so that the user is not required to install anything else.

Thanks very much

RahulV_intel · ‎05-21-2021

Hi,

DPC++ supports both JIT as well as AOT compilation.

Kindly refer to the following links for more details:

https://software.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/programming-interface/compilation-flow-overview.html

https://link.springer.com/book/10.1007%2F978-1-4842-5574-2 (Chapter 13 of the DPC++ book).

JIT compilation doesn't require any additional drivers. However, make sure that you have the GPU drivers installed if you're targeting an Intel GPU. (https://dgpu-docs.intel.com/installation-guides/index.html)

Thanks,

Rahul

Sharpe__Brian · ‎05-23-2021

Hi Rahul

Thanks a lot for the reply, but unfortunately that has not answered my question.
I have perhaps confused things by asking for "JIT compilation".
A better way to describe what I'm after is "runtime compilation".

So from reading the documents you suggested, it seems there are two clear ways to perform compilation with DPC++ (regarding only C++ code which is intended to execute on the device).

1) Ahead-of-time compilation

The steps for device code are:
- at compile time the C++ source is compiled to SPIR-V, then to device code.
- at runtime the device code is executed.

2) Just-in-time compilation

Device code steps:
- at compile time the C++ source is compiled to SPIR-V.
- at runtime the SPIR-V code is compiled to device code, and executed.

What I'm after is a third mode, i.e.
3) Runtime compilation

Device code steps:
- nothing happens at compile time.
- at runtime the C++ source is compiled (somehow) to device code, then executed.

Examples of such systems:
- NVRTC (NVIDIA's runtime compiler)
- OpenGL/GLSL
- OpenCL
- etc...
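
To make the difference between the three modes concrete, here is a minimal sketch that records where each stage runs. All names (`Stage`, `CompilationMode`, `needs_source_compiler_on_user_machine`) are made up for illustration and are not part of DPC++ or SYCL. It also assumes that mode 3 still splits into a front end and a back end, which I have not confirmed.

```cpp
#include <string>

// Where a compilation stage takes place. Hypothetical type, for illustration.
enum class Stage { BuildTime, RunTime };

// One row per compilation mode described above.
struct CompilationMode {
    std::string name;
    Stage front_end;  // C++ source -> intermediate form (SPIR-V in modes 1 and 2)
    Stage back_end;   // intermediate form -> device code
};

const CompilationMode kAheadOfTime{"ahead-of-time", Stage::BuildTime, Stage::BuildTime};
const CompilationMode kJustInTime{"just-in-time", Stage::BuildTime, Stage::RunTime};
// Assumption: mode 3 runs both stages on the user's machine.
const CompilationMode kRuntime{"runtime-compilation", Stage::RunTime, Stage::RunTime};

// True when something able to parse C++ source must be present at runtime,
// which is the deployment concern raised in this thread.
bool needs_source_compiler_on_user_machine(const CompilationMode& mode) {
    return mode.front_end == Stage::RunTime;
}

// True when the device back end (typically part of the driver) runs at runtime.
bool needs_device_backend_at_runtime(const CompilationMode& mode) {
    return mode.back_end == Stage::RunTime;
}
```

Only the third mode needs a C++ front end on the user's machine, which is why the deployment question matters for it and not for the other two.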

Is this kind of model supported by DPC++ at all?
And if so, how? What would need to be installed on the user's machine to make this work?

Thank you very much
Brian Sharpe

Sharpe__Brian · ‎05-23-2021

Oh, and to be clear, the use case is executing shader graphs.
Imagine that the user can construct an arbitrary node graph at runtime, representing a large computation.
I would like to translate that node graph into C++ source code at runtime, and then execute it on the device.
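
To illustrate the translation step only, here is a minimal sketch of turning a tiny node graph into a C++ source string. The node types, the `Node` layout and `generate_source` are hypothetical and much simpler than a real shader graph. The sketch assumes the nodes are already topologically sorted, and it does not compile or run the generated string; that is the part I am asking about.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical node kinds, for illustration only.
enum class Op { Input, Constant, Add, Mul };

struct Node {
    Op op;
    int lhs = -1;        // index of the first operand node (Add/Mul only)
    int rhs = -1;        // index of the second operand node (Add/Mul only)
    float value = 0.0f;  // literal value (Constant only)
    std::string name;    // parameter name (Input only)
};

// Emits one local variable per node and returns the last node's value.
// Assumes every operand appears before the node that uses it.
std::string generate_source(const std::vector<Node>& graph, const std::string& fn_name) {
    assert(!graph.empty());
    std::ostringstream out;
    out << std::fixed;  // print 2.0f as "2.000000" so the "f" suffix stays valid

    // Function signature: one float parameter per Input node.
    out << "float " << fn_name << "(";
    bool first = true;
    for (const Node& n : graph) {
        if (n.op != Op::Input) continue;
        out << (first ? "" : ", ") << "float " << n.name;
        first = false;
    }
    out << ") {\n";

    // Function body: one statement per node, in graph order.
    for (std::size_t i = 0; i < graph.size(); ++i) {
        const Node& n = graph[i];
        out << "    float n" << i << " = ";
        if (n.op == Op::Input) {
            out << n.name;
        } else if (n.op == Op::Constant) {
            out << n.value << "f";
        } else {
            // Binary node: both operands must refer to earlier nodes.
            assert(n.lhs >= 0 && static_cast<std::size_t>(n.lhs) < i);
            assert(n.rhs >= 0 && static_cast<std::size_t>(n.rhs) < i);
            out << "n" << n.lhs << (n.op == Op::Add ? " + " : " * ") << "n" << n.rhs;
        }
        out << ";\n";
    }
    out << "    return n" << (graph.size() - 1) << ";\n}\n";
    return out.str();
}
```

For a three-node graph (an input `u`, a constant `2.0`, and a multiply of the two), this produces a small function that multiplies `u` by `2.000000f`. The resulting string is what would then need to be handed to a runtime compiler.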

Thank you

RahulV_intel · ‎05-27-2021

Hi,

Kindly refer to the "Online compiling and linking" section of the SYCL 2020 specification:

https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:bundles.compile-link

If you have a test case that doesn't comply with the specs above, kindly share it with us for further investigation.

Thanks,

Rahul

Sharpe__Brian · ‎06-01-2021

Hi Rahul

The "Online compiling and linking" feature looks great. It's exactly the style of compilation that I was looking for (i.e. #3 mentioned above).

Looking at the DPC++ source code and other DPC++ docs, I can see that only two languages are supported at the moment:

https://intel.github.io/llvm-docs/doxygen/online__compiler_8hpp_source.html
(cl::sycl::INTEL::source_language enum)

https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/OnlineCompilation/OnlineCompilation.asciidoc
- opencl_c, // OpenCL C language
- cm // Intel's C-for-Media language

Both are C-style languages.
So, getting back to the original question, it seems that DPC++ does not provide any way to perform online compilation of C++ code?
Am I correct in this conclusion? Do you know whether this might be planned for a future release?

thanks a lot!
Brian

Subarnarek_G_Intel · ‎06-07-2021

Hi Brian,

I will make a feature request to the developers, but I would like to know whether your scenario actually requires this urgently. If so, it would be easier for us to push for a faster implementation.

Regards,

Subarna

Sharpe__Brian · ‎06-09-2021

Hi Subarna

Yes, this is something we require very soon.
We are currently making key design decisions about how our product will function, and the absence of this feature could be a show-stopper for our use of DPC++. To be honest, it would be great to have the chance to speak to a developer directly at some point, so we could get their thoughts on this feature and on what a possible solution could look like once implemented.

thanks a lot
Brian Sharpe

Subarnarek_G_Intel · ‎06-14-2021

Hi Brian,

I am not sure whether you can speak with the developers directly, but I can bring all of your questions to the team and get them answered.

Regards,

Subarna

Sharpe__Brian · ‎06-14-2021

Hi Subarna

Sure, whatever you can do for us would be great.
So to summarize:

Our use-case:
- We write a GPU-based renderer that needs to evaluate user-created node graphs (i.e. shaders) at runtime
- Our codebase (as well as the upcoming DPC++-based Embree, I assume) is all in C++
- We generate C++ code from the node graph at runtime, and need to compile and execute it at runtime as well
- Our application comes as a complete package, and needs to be able to install and run on a user's machine with minimal install steps (i.e. be self-contained)

What we need from DPC++:
- The "Online compiling and linking" feature looks ideal
- We require it to support C++ (currently only C-style languages are supported)
- We require minimal installation steps in order to deploy to a user's machine (e.g. only our application and a graphics driver installed. We do not want to require the user to install LLVM/Clang on their computer, but some extra DLLs shipped with our application would be fine)

PS: If you need an example, NVIDIA CUDA already has this feature, which we are using just fine for NVIDIA GPUs. It's called NVRTC.

Do the developers foresee this kind of thing ever being part of DPC++?
And if so, what kind of timeframe do they think is likely? (e.g. 2021? 2022? never?)

thanks a lot
Brian Sharpe

Subarnarek_G_Intel · ‎07-14-2021

Hi Brian,

I have conveyed all your requirements to the engineering team and will update you with their response shortly.

Sharpe__Brian · ‎07-15-2021

Thanks very much Subarna

Daniel_D · ‎08-31-2021

I'm interested in the same functionality as well. Is there any update available yet?

Thanks.

Subarnarek_G_Intel · ‎04-25-2022

Hi Brian,

We have identified a potential gap in the implementation of runtime compilation. The feature request has been taken up by the engineering team. Customers will be informed as soon as the feature is out!

Regards,

Subarna

Sharpe__Brian · ‎04-25-2022

Awesome!
thanks!

Subarnarek_G_Intel · ‎04-25-2022

This issue has been resolved and we will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.