I'm trying to write an OpenCL program using the device fission extension.
I'm using an Intel i3 M350 (4 Compute Units), but I'm not able to create sub devices:
[cpp]#define USE_CL_DEVICE_FISSION 1
The output from above:
Platform: Intel OpenCL
Version: OpenCL 1.1 LINUX
Device: Intel Core i3 CPU M 350 @ 2.27GHz
Profile: FULL_PROFILE
Driver: 1.1
ComputeUnits: 4
Device Fission: Available
Error: -1057
So the call fails with -1057 (CL_DEVICE_PARTITION_FAILED_EXT). I've been advised that the C++ wrapper might not be suitable for calling clCreateSubDevicesEXT, because it fails to pass a non-NULL argument for the ret_num_devices parameter (the count of how many sub devices are to be created).
So I wrote the same thing in C. The relevant part that creates the sub devices looks like this:
[cpp]const cl_device_partition_property_ext device_partition_props =
cl_device_id* subDevices = NULL;
cl_uint n = 4;
subDevices = (cl_device_id*)malloc(4 * sizeof(cl_device_id));
ret = clCreateSubDevicesEXT(devices, device_partition_props, 4, subDevices, &n);
Now I'm constantly seeing a
[appname]: symbol lookup error: [appname]: undefined symbol: clCreateSubDevicesEXT
What could be the reason? I've got the Intel OpenCL SDK 1.5 and the NVIDIA Toolkit 4.0 installed on my machine.
To debug this issue, can you please try linking your application to intelocl.dll instead of OpenCL.dll and see whether the problem disappears?
When I link against intelocl.so (and the subsequent dependencies), I can create the subdevices successfully. What does it mean? It's not working via ICD?
a) Re-installing the SDK, as it comes bundled with a late enough version of the ICD that should be aware of the device fission EXT extension.
b) Since that C++ wrapper worked for you, it probably uses clGetExtensionFunctionAddress under the hood, so you could mimic that in your C code.
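To illustrate option b), here is a minimal sketch of resolving the extension entry point at run time instead of linking against it. It assumes a POSIX system with `dlopen`; the helper name `lookup_cl_extension` and the typedef `get_ext_addr_fn` are made up for this example, and the function pointer type is spelled out by hand so the sketch builds without any OpenCL headers. In a program that already includes the OpenCL headers and links against libOpenCL.so, you would more likely call clGetExtensionFunctionAddress directly and cast the result to a function pointer type matching clCreateSubDevicesEXT.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Shape of clGetExtensionFunctionAddress, declared by hand so this sketch
   does not depend on any vendor's OpenCL headers (an assumption made for
   the example; real code would normally use CL/cl.h). */
typedef void *(*get_ext_addr_fn)(const char *func_name);

/* Hypothetical helper: returns the address of an extension entry point such
   as "clCreateSubDevicesEXT", or NULL if the library cannot be loaded, does
   not export clGetExtensionFunctionAddress, or does not know the extension. */
static void *lookup_cl_extension(const char *library, const char *ext_name)
{
    void *handle = dlopen(library, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen(%s) failed: %s\n", library, dlerror());
        return NULL;
    }

    /* POSIX idiom for storing a dlsym() result in a function pointer. */
    get_ext_addr_fn get_ext_addr;
    *(void **)(&get_ext_addr) = dlsym(handle, "clGetExtensionFunctionAddress");
    if (get_ext_addr == NULL) {
        fprintf(stderr, "%s does not export clGetExtensionFunctionAddress\n",
                library);
        dlclose(handle);
        return NULL;
    }

    /* The handle is deliberately left open: the returned pointer refers to
       code inside the library and must stay valid for the caller. */
    return get_ext_addr(ext_name);
}
```

Calling `lookup_cl_extension("libOpenCL.so", "clCreateSubDevicesEXT")` and checking the result for NULL before use would avoid the load-time "undefined symbol" error, since the executable then carries no direct reference to clCreateSubDevicesEXT. On older glibc versions this needs `-ldl` at link time.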
b) I can successfully use the C API to create the sub devices and then wrap the resulting sub devices back to cl::Device (C++ Wrapper).
But this only works when linking against intelocl.so. However, I need the ICD in order to have my GPU implementation running (as I want to have two contexts in my program, one for the GPU and one for the CPU).
Just for me to understand: every vendor provides their own ICD so that their implementation is addressable through one shared library, is that correct? But which vendor (when two platforms/implementations of OpenCL are installed) is responsible for actually putting a libOpenCL.so into /usr/lib (or whatever the OS-specific path is)?
However, it's possible that, for some reason, the version supplied by other SDKs is missing that entry point, which could cause the symptoms you're describing. Re-installing the SDK should overwrite whatever version of libOpenCL you have with the one bundled with the SDK, which certainly has those entry points.
When I link my project against that libOpenCL.so, it compiles. When I run ldd on it, I can see that it resolves to /usr/lib/nvidia-current/libOpenCL.so... When I point LD_LIBRARY_PATH at /usr/lib instead, I get this:
Error: API mismatch: the NVIDIA kernel module has version 295.20,
but this NVIDIA driver component has version 270.41.19. Please make
sure that the kernel module and all NVIDIA driver components
have the same version.
So is the Intel installation really installing libOpenCL to /usr/lib? I don't know. So this whole issue might be more of an NVIDIA problem, maybe? I'm afraid I'm missing some of the background knowledge here.
Btw, I can confirm that the behaviour of the ICD on other extensions is okay, as I can use the printf-extension within my project while linking to libOpenCL.so.