Intel® oneAPI Data Parallel C++
Support for Intel® oneAPI DPC++ Compiler, Intel® oneAPI DPC++ Library, Intel® DPC++ Compatibility Tool, and GDB*

Is it possible to mix JIT and AOT compilation for SYCL devices?

FantasticMrFox
Beginner

Question on SO here: https://stackoverflow.com/questions/75406213/mix-jit-and-aot-compilation-for-sycl-devices

 

I have a program with a variety of kernels. In production these kernels run on a GPU device and require JIT (just-in-time) compilation because we use specialisation constants. For testing we run on the CPU, but we would like AOT (ahead-of-time) compilation there to save time when running the tests.

So we have a very simple executable:

#include <sycl/sycl.hpp>

int main()
{
    auto device = sycl::device{sycl::gpu_selector_v}; // Note that we are selecting the GPU here!
    auto queue  = sycl::queue{device};

    queue
        .submit(
            [](sycl::handler& cgh)
            {
                sycl::stream out(1024, 256, cgh);
                cgh.parallel_for<class HELLO_WORLD>(
                    sycl::range<1>{5},
                    [=](sycl::id<1> id) { out << "Hello #" << id.get(0) << "\n"; }
                );
            }
        )
        .wait();

    return 0;
}

It is built through CMake with:

set(CMAKE_CXX_COMPILER "icpx")
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED TRUE)

set(SYCL_COMPILER_FLAGS "-fclang-abi-compat=7 -fsycl -sycl-std=2020 -fp-model=precise")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SYCL_COMPILER_FLAGS}")

set(SYCL_LINK_FLAGS "-fsycl ")
set(CMAKE_CXX_LINK_FLAGS "${CMAKE_CXX_LINK_FLAGS} ${SYCL_LINK_FLAGS}")

add_executable(mix_jit_aot 
    examples/mix_jit_aot.cpp
)

This compiles and runs just fine on the device:

[opencl:gpu:2] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics [0x9bc4] 3.0 [22.28.23726.1]

However, if we add AOT (Ahead of time) compilation for a different device, say a CPU:

    set(SYCL_AOT_COMPILE_FLAGS -fsycl-targets=spir64_x86_64)
    target_compile_options(mix_jit_aot PUBLIC
        ${SYCL_AOT_COMPILE_FLAGS}
    )
    
    set(SYCL_AOT_LINK_FLAGS ${SYCL_AOT_COMPILE_FLAGS} -Xsycl-target-backend=spir64_x86_64 "-march avx2")
    target_link_options(mix_jit_aot PUBLIC
        ${SYCL_AOT_LINK_FLAGS}
    )

It compiles, and it runs if I select the CPU as the device (i.e. auto device = sycl::device{sycl::cpu_selector_v};). However, if I select the GPU, it crashes with:

terminate called after throwing an instance of 'sycl::_V1::runtime_error'
  what():  Native API failed. Native API returns: -42 (PI_ERROR_INVALID_BINARY) -42 (PI_ERROR_INVALID_BINARY)
Aborted (core dumped)

Is it possible to compile AOT for a single device, but use JIT compilation for everything else?

SeshaP_Intel
Moderator

Hi,


Thank you for posting in Intel Communities.


>>Is it possible to compile AOT for a single device, but use JIT compilation for everything else?

You can try AOT compilation without hard-coding a specific device selector in the source code.


Could you please remove the sycl::gpu_selector_v in your source code and try the AOT compilation on different devices?

Please refer to the below link for more details.

https://www.intel.com/content/www/us/en/develop/documentation/oneapi-dpcpp-cpp-compiler-dev-guide-and-reference/top/compiler-reference/compiler-options/offload-openmp-and-parallel-processing-options/fsycl-targets.html


Thanks and Regards,

Pendyala Sesha Srinivas


FantasticMrFox
Beginner

Hi there,

 

It doesn't seem like you read my question at all. 

 

I know that I can use AOT compilation for the CPU. The issue is that if I specify AOT compilation for the CPU, then I cannot select the GPU without the program crashing.

 

The question is:

 

Is it possible to compile AOT for a single device, but use JIT compilation for everything else?

SeshaP_Intel
Moderator

Hi,

 

We tried with the "default_selector_v" API in your source code and we were able to perform AOT on the CPU and JIT compilation on the GPU.

Please refer to the below screenshot for more details.

[Screenshot: SeshaP_Intel_0-1676882842153.png]

If this doesn't resolve your issue, could you please confirm that you want to use gpu_selector_v in the code and want to target spir64_x86_64 CPU while performing AOT?

Please confirm if this is the case, so that we can investigate this issue.

 

Thanks and Regards,

Pendyala Sesha Srinivas

FantasticMrFox
Beginner

Thank you for the clarification.

 

Yes, in our case we need to select the GPU explicitly. Our situation is as follows:

1. The AOT compilation is used on a Jenkins testing cluster where we explicitly select the CPU. We rely on being able to run the tests on the CPU to check memory accesses and other behaviour more closely.

2. The JIT compilation is used on an embedded device where we explicitly select the GPU. If the GPU is not available then something is broken and we alert the user that the hardware is in trouble. 

 

In summary, we must be able to use gpu_selector_v.
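
One way to keep the two environments apart at build time is to make the AOT flags opt-in. The sketch below assumes a hypothetical CMake option named ENABLE_CPU_AOT (not part of the original project) that the Jenkins build would switch on. With it off, no -fsycl-targets flag is added and the compiler's default JIT target is used, as in the first build shown in the question.

```cmake
# Hypothetical option (the name is an assumption): ON for CPU test builds,
# OFF for the GPU production build.
option(ENABLE_CPU_AOT "Ahead-of-time compile SYCL kernels for x86-64 CPUs" OFF)

if(ENABLE_CPU_AOT)
    # Same AOT flags as in the question, applied only to test builds.
    set(SYCL_AOT_COMPILE_FLAGS -fsycl-targets=spir64_x86_64)
    target_compile_options(mix_jit_aot PUBLIC ${SYCL_AOT_COMPILE_FLAGS})
    target_link_options(mix_jit_aot PUBLIC
        ${SYCL_AOT_COMPILE_FLAGS}
        -Xsycl-target-backend=spir64_x86_64 "-march avx2"
    )
endif()
```

A test build would then be configured with -DENABLE_CPU_AOT=ON. This yields two separate binaries rather than one binary that does both, so it sidesteps the question rather than answering it.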

SeshaP_Intel
Moderator

Hi,


We were able to reproduce your issue. We have informed the development team about it.

We will get back to you soon.


Thanks and Regards,

Pendyala Sesha Srinivas


SeshaP_Intel
Moderator

Hi,

 

Thanks for your patience.

A binary compiled ahead of time for spir64_x86_64 cannot run on a GPU device.

 

You can use the command-line option below, which generates two kinds of binaries:

-fsycl-targets=spir64_x86_64,spir64

 

This option generates an AOT binary for CPU devices and a JIT-compilable (spir64) image for all kinds of devices.

When sycl::gpu_selector_v is used in the source code, the application finds at run time that the AOT binary does not match the selected GPU, so it uses only the JIT image for the GPU.
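
Applied to the CMake setup from the original question, this suggestion might look like the sketch below. It reuses the target name mix_jit_aot and the variable names from that question and only adds spir64 to the target list. The backend option is copied unchanged from the question, and the fragment as a whole is an untested assumption about how the flags combine, not a verified configuration.

```cmake
# Build two device images: an AOT image for x86-64 CPUs and a generic
# spir64 image that the runtime can JIT-compile for other devices (e.g. a GPU).
set(SYCL_AOT_COMPILE_FLAGS -fsycl-targets=spir64_x86_64,spir64)
target_compile_options(mix_jit_aot PUBLIC
    ${SYCL_AOT_COMPILE_FLAGS}
)

# The backend option is scoped to spir64_x86_64, so it should only affect
# the CPU AOT image (copied as-is from the question).
set(SYCL_AOT_LINK_FLAGS ${SYCL_AOT_COMPILE_FLAGS} -Xsycl-target-backend=spir64_x86_64 "-march avx2")
target_link_options(mix_jit_aot PUBLIC
    ${SYCL_AOT_LINK_FLAGS}
)
```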

 

Thanks and Regards,

Pendyala Sesha Srinivas

SeshaP_Intel
Moderator

Hi,


We haven't heard back from you. Could you please provide an update on your issue?


Thanks and Regards,

Pendyala Sesha Srinivas


FantasticMrFox
Beginner

I am still trying to figure out whether this is working. When I statically link a library with pre-compiled binaries, it still seems to select JIT on the CPU for the kernels in the static library.

 

I will need to experiment some more to let you know. 

 

SeshaP_Intel
Moderator

Hi,


Do you have any updates on this issue? Is there anything we can investigate from our end?


Thanks and Regards,

Pendyala Sesha Srinivas


SeshaP_Intel
Moderator

Hi,


We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question. 


Thanks and Regards,

Pendyala Sesha Srinivas

