I have a program with a variety of kernels. In production these kernels run on a GPU device and require JIT (just-in-time) compilation because we use specialisation constants. For testing we run on the CPU, but there we would like AOT (ahead-of-time) compilation to save time when running the tests.
So we have a very simple executable:
#include <sycl/sycl.hpp>

int main()
{
    auto device = sycl::device{sycl::gpu_selector_v}; // Note that we are selecting the GPU here!
    auto queue = sycl::queue{device};
    queue
        .submit(
            [](sycl::handler& cgh)
            {
                sycl::stream out(1024, 256, cgh);
                cgh.parallel_for<class HELLO_WORLD>(
                    sycl::range<1>{5},
                    [=](sycl::id<1> id) { out << "Hello #" << id.get(0) << "\n"; }
                );
            }
        )
        .wait();
    return 0;
}
It is built through CMake with:
set(CMAKE_CXX_COMPILER "icpx")
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED TRUE)
set(SYCL_COMPILER_FLAGS "-fclang-abi-compat=7 -fsycl -sycl-std=2020 -fp-model=precise")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SYCL_COMPILER_FLAGS}")
set(SYCL_LINK_FLAGS "-fsycl ")
set(CMAKE_CXX_LINK_FLAGS "${CMAKE_CXX_LINK_FLAGS} ${SYCL_LINK_FLAGS}")
add_executable(mix_jit_aot
    examples/mix_jit_aot.cpp
)
This compiles and runs just fine on the GPU device:
[opencl:gpu:2] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics [0x9bc4] 3.0 [22.28.23726.1]
However, if we add AOT compilation for a different device, say a CPU:
set(SYCL_AOT_COMPILE_FLAGS -fsycl-targets=spir64_x86_64)
target_compile_options(mix_jit_aot PUBLIC
    ${SYCL_AOT_COMPILE_FLAGS}
)
set(SYCL_AOT_LINK_FLAGS ${SYCL_AOT_COMPILE_FLAGS} -Xsycl-target-backend=spir64_x86_64 "-march avx2")
target_link_options(mix_jit_aot PUBLIC
    ${SYCL_AOT_LINK_FLAGS}
)
It compiles, and it runs if I change the device selection to the CPU (i.e. auto device = sycl::device{sycl::cpu_selector_v};). However, if I select the GPU, it crashes with:
terminate called after throwing an instance of 'sycl::_V1::runtime_error'
what(): Native API failed. Native API returns: -42 (PI_ERROR_INVALID_BINARY) -42 (PI_ERROR_INVALID_BINARY)
Aborted (core dumped)
Is it possible to compile AOT for a single device, but use JIT compilation for everything else?
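One approach that may be worth trying is to list a generic target next to the AOT target, so that the binary carries both a precompiled CPU image and a SPIR-V image that can still be JIT-compiled for the GPU. The fragment below is an untested sketch rather than a confirmed fix. It assumes that the installed icpx accepts a comma-separated -fsycl-targets list and that spir64 names the generic JIT target. It reuses the variable and target names from the CMake snippets above.

```cmake
# Sketch only: build an AOT image for the CPU and keep a generic SPIR-V image for JIT.
# Assumption: spir64 is the generic (JIT) target for this icpx version.
set(SYCL_AOT_COMPILE_FLAGS -fsycl-targets=spir64_x86_64,spir64)
target_compile_options(mix_jit_aot PUBLIC
    ${SYCL_AOT_COMPILE_FLAGS}
)

# The backend option stays scoped to the CPU target, as in the original link flags.
set(SYCL_AOT_LINK_FLAGS ${SYCL_AOT_COMPILE_FLAGS} -Xsycl-target-backend=spir64_x86_64 "-march avx2")
target_link_options(mix_jit_aot PUBLIC
    ${SYCL_AOT_LINK_FLAGS}
)
```

If the assumptions hold, the runtime should find the precompiled image when the CPU is selected and fall back to JIT-compiling the SPIR-V image when the GPU is selected, instead of rejecting the binary.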