Question 1

Question on SO here: https://stackoverflow.com/questions/75406213/mix-jit-and-aot-compilation-for-sycl-devices

I have a program with a variety of kernels. In production these kernels run on a GPU device and require JIT (just-in-time) compilation because we use specialisation constants. For testing we run on the CPU, but we would like AOT (ahead-of-time) compilation to save time when running the tests.

So we have a very simple executable:

#include <sycl/sycl.hpp>

int main()
{
    auto device = sycl::device{sycl::gpu_selector_v}; // Note that we are selecting the GPU here!
    auto queue  = sycl::queue{device};

    queue
        .submit(
            [](sycl::handler& cgh)
            {
                sycl::stream out(1024, 256, cgh);
                cgh.parallel_for<class HELLO_WORLD>(
                    sycl::range<1>{5},
                    [=](sycl::id<1> id) { out << "Hello #" << id.get(0) << "\n"; }
                );
            }
        )
        .wait();

    return 0;
}

It is built with CMake using:

set(CMAKE_CXX_COMPILER "icpx")
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED TRUE)

set(SYCL_COMPILER_FLAGS "-fclang-abi-compat=7 -fsycl -sycl-std=2020 -fp-model=precise")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SYCL_COMPILER_FLAGS}")

set(SYCL_LINK_FLAGS "-fsycl ")
set(CMAKE_CXX_LINK_FLAGS "${CMAKE_CXX_LINK_FLAGS} ${SYCL_LINK_FLAGS}")

add_executable(mix_jit_aot 
    examples/mix_jit_aot.cpp
)

This compiles and runs just fine on the device:

[opencl:gpu:2] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics [0x9bc4] 3.0 [22.28.23726.1]

However, if we add AOT (Ahead of time) compilation for a different device, say a CPU:

    set(SYCL_AOT_COMPILE_FLAGS -fsycl-targets=spir64_x86_64)
    target_compile_options(mix_jit_aot PUBLIC
        ${SYCL_AOT_COMPILE_FLAGS}
    )
    
    set(SYCL_AOT_LINK_FLAGS ${SYCL_AOT_COMPILE_FLAGS} -Xsycl-target-backend=spir64_x86_64 "-march avx2")
    target_link_options(mix_jit_aot PUBLIC
        ${SYCL_AOT_LINK_FLAGS}
    )

It compiles, and it runs if I set the device selection to the CPU (i.e. auto device = sycl::device{sycl::cpu_selector_v};). However, if I use the GPU, it crashes with:

terminate called after throwing an instance of 'sycl::_V1::runtime_error'
  what():  Native API failed. Native API returns: -42 (PI_ERROR_INVALID_BINARY) -42 (PI_ERROR_INVALID_BINARY)
Aborted (core dumped)

Is it possible to compile AOT for a single device, but use JIT compilation for everything else?

Answer 1

Hi,

Thank you for posting in Intel Communities.

>>Is it possible to compile AOT for a single device, but use JIT compilation for everything else?

You can try AOT compilation without hard-coding a specific device selector in the source code.

Could you please remove sycl::gpu_selector_v from your source code and try the AOT compilation on different devices?

Please refer to the link below for more details.

https://www.intel.com/content/www/us/en/develop/documentation/oneapi-dpcpp-cpp-compiler-dev-guide-and-reference/top/compiler-reference/compiler-options/offload-openmp-and-parallel-processing-options/fsycl-targets.html

Thanks and Regards,

Pendyala Sesha Srinivas

Answer 2

Hi there,

It doesn't seem like you read my question at all.

I know that I can use AOT compilation for the CPU. The issue is that if I specify AOT compilation for the CPU, then I cannot select the GPU without the code crashing.

The question is:

Is it possible to compile AOT for a single device, but use JIT compilation for everything else?

Answer 3

Hi,

We tried with the "default_selector_v" API in your source code and we were able to perform AOT on the CPU and JIT compilation on the GPU.

Please refer to the below screenshot for more details.

If this doesn't resolve your issue, could you please confirm that you want to use gpu_selector_v in the code and want to target spir64_x86_64 CPU while performing AOT?

Please confirm if this is the case, so that we can investigate this issue.

Thanks and Regards,

Pendyala Sesha Srinivas

Answer 4

Thank you for the clarification.

Yes, in our case we need to explicitly select the GPU. Our situation is as follows:

1. The AOT compilation is used on a Jenkins testing cluster where we explicitly select the CPU. We rely on being able to run the tests on the CPU to more closely check memory access and other things.

2. The JIT compilation is used on an embedded device where we explicitly select the GPU. If the GPU is not available then something is broken and we alert the user that the hardware is in trouble.

In summary, we must be able to use gpu_selector_v.

Answer 5

Hi,

We were able to reproduce your issue. We have informed the development team about it.

We will get back to you soon.

Thanks and Regards,

Pendyala Sesha Srinivas

Answer 6

Hi,

Thanks for your patience.

An AOT binary built for spir64_x86_64 cannot run on a GPU device.

You can use the command-line option below, which generates two kinds of device binaries:

-fsycl-targets=spir64_x86_64,spir64

This option generates an AOT binary for CPU devices and a JIT-compilable (spir64) binary for all kinds of devices.

When sycl::gpu_selector_v is used in the source code, the application detects at run time that the AOT binary does not match the selected GPU, so it uses JIT compilation for the GPU instead.
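
Applied to the CMake setup from the original question, the combined target list might look like the sketch below. It reuses the SYCL_AOT_COMPILE_FLAGS and SYCL_AOT_LINK_FLAGS variables and the mix_jit_aot target from the question, and it assumes the "-march avx2" backend option should still be passed only to the CPU AOT target. The result has not been confirmed in this thread.

```cmake
# Sketch only: build two device images for the same executable.
#   spir64_x86_64 -> AOT-compiled image for x86-64 CPUs
#   spir64        -> generic SPIR-V image, JIT-compiled at run time
set(SYCL_AOT_COMPILE_FLAGS -fsycl-targets=spir64_x86_64,spir64)
target_compile_options(mix_jit_aot PUBLIC
    ${SYCL_AOT_COMPILE_FLAGS}
)

# The backend option is scoped to the CPU AOT target, as in the question.
set(SYCL_AOT_LINK_FLAGS ${SYCL_AOT_COMPILE_FLAGS} -Xsycl-target-backend=spir64_x86_64 "-march avx2")
target_link_options(mix_jit_aot PUBLIC
    ${SYCL_AOT_LINK_FLAGS}
)
```

With a configuration like this, the source could keep sycl::gpu_selector_v in production and sycl::cpu_selector_v in tests. According to the explanation above, the runtime is expected to use the AOT image on a matching CPU and the spir64 image on any other device.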

Thanks and Regards,

Pendyala Sesha Srinivas

Answer 7

Hi,

We haven't heard back from you. Could you please provide an update on your issue?

Thanks and Regards,

Pendyala Sesha Srinivas

Answer 8

I am still trying to figure out whether this is working. When I statically link a library with pre-compiled binaries, it still seems to select JIT for the CPU for the kernels in the static library.

I will need to experiment some more before I can let you know.

Answer 9

Hi,

Do you have any updates on this issue? Is there anything we can investigate on our end?

Thanks and Regards,

Pendyala Sesha Srinivas

Answer 10

Hi,

We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.

Thanks and Regards,

Pendyala Sesha Srinivas

Is it possible to mix JIT and AOT compilation for SYCL devices?