Device Fission Intel CPU with C++ Wrapper

R__Winkler · ‎03-01-2012

I'm trying to write an OpenCL program using the device fission extension.

I'm using an Intel i3 M350 (4 Compute Units), but I'm not able to create sub devices:

[cpp]
#define USE_CL_DEVICE_FISSION 1

#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
#include "CL/cl.hpp"

using namespace std;

int main(int argc, char* argv[])
{
    cl::Context context;
    std::vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);

    // Platform index 1 is the Intel platform on this machine.
    cl_context_properties properties[] = {
        CL_CONTEXT_PLATFORM, (cl_context_properties)(platforms[1])(), 0
    };
    context = cl::Context(CL_DEVICE_TYPE_CPU, properties);
    std::vector<cl::Device> devices = context.getInfo<CL_CONTEXT_DEVICES>();

    cout << "Platform:\t" << platforms[1].getInfo<CL_PLATFORM_NAME>() << endl;
    cout << "Version:\t" << platforms[1].getInfo<CL_PLATFORM_VERSION>() << endl;
    cout << "Device:\t\t" << devices[0].getInfo<CL_DEVICE_NAME>() << endl;
    cout << "Profile:\t" << devices[0].getInfo<CL_DEVICE_PROFILE>() << endl;
    cout << "Driver:\t\t" << devices[0].getInfo<CL_DRIVER_VERSION>() << endl;
    cout << "ComputeUnits:\t" << devices[0].getInfo<CL_DEVICE_MAX_COMPUTE_UNITS>() << endl;

    // Bail out if the device does not advertise the fission extension.
    if (devices[0].getInfo<CL_DEVICE_EXTENSIONS>().find("cl_ext_device_fission") == std::string::npos) {
        cout << "No device fission support!" << endl;
        exit(-1);
    } else {
        cout << "Device Fission: Available" << endl;
    }

    // Split the device into sub devices of one compute unit each.
    const cl_device_partition_property_ext subDeviceProperties[] = {
        CL_DEVICE_PARTITION_EQUALLY_EXT, 1,
        CL_PROPERTIES_LIST_END_EXT, 0
    };
    std::vector<cl::Device> subDevices;
    int err = devices[0].createSubDevices(subDeviceProperties, &subDevices);
    if (err != CL_SUCCESS) {
        cout << "\nError: " << err << endl;
    }
}
[/cpp]
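A side note on the extension check above: `find()` is a substring search, so it would also match a longer extension name that merely starts with `cl_ext_device_fission`. Below is a minimal sketch of an exact-token check, assuming the extensions string is a space-separated list as OpenCL reports it. The helper name `hasExtension` is made up for illustration and is not part of the C++ wrapper.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true only if `name` appears as a complete,
// whitespace-separated token in `extensions` (the string returned for
// CL_DEVICE_EXTENSIONS).
bool hasExtension(const std::string& extensions, const std::string& name) {
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        if (token == name) {
            return true;
        }
    }
    return false;
}
```

In the program above it would be called as `hasExtension(devices[0].getInfo<CL_DEVICE_EXTENSIONS>(), "cl_ext_device_fission")`.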

The output from above:

Platform:   Intel OpenCL
Version:    OpenCL 1.1 LINUX
Device:     Intel Core i3 CPU       M 350  @ 2.27GHz
Profile:    FULL_PROFILE
Driver:     1.1
ComputeUnits:   4
Device Fission: Available

Error: -1057

This results in a -1057 (CL_DEVICE_PARTITION_FAILED_EXT). I've been advised that the C++ wrapper might not be suitable for calling clCreateSubDevicesEXT, because it fails to pass a non-NULL parameter for ret_num_devices (the count of how many sub devices are to be created).

So I wrote the same thing in C. The relevant part that creates the sub devices looks like this:
[cpp]
const cl_device_partition_property_ext device_partition_props[] = {
    CL_DEVICE_PARTITION_EQUALLY_EXT, 1,
    CL_PROPERTIES_LIST_END_EXT, 0
};

cl_device_id* subDevices = NULL;
cl_uint n = 4;
subDevices = (cl_device_id*)malloc(4 * sizeof(cl_device_id));
ret = clCreateSubDevicesEXT(devices[0], device_partition_props, 4, subDevices, &n);
[/cpp]
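For reference, the hard-coded 4 follows from the partition property. As I read the cl_ext_device_fission specification, CL_DEVICE_PARTITION_EQUALLY_EXT with value N creates as many sub devices of N compute units each as fit, and any leftover compute units go unused. A small sketch of that arithmetic (the function name is made up for illustration and is not an OpenCL call):

```cpp
// Hypothetical helper: number of sub devices an "equal" partition is expected
// to yield for a device with `maxComputeUnits` compute units when each sub
// device gets `unitsPerSubDevice` of them. Leftover units are not used.
unsigned subDeviceCountEqually(unsigned maxComputeUnits, unsigned unitsPerSubDevice) {
    if (unitsPerSubDevice == 0 || unitsPerSubDevice > maxComputeUnits) {
        return 0;  // no valid partition
    }
    return maxComputeUnits / unitsPerSubDevice;
}
```

For the i3 M350 above, `subDeviceCountEqually(4, 1)` gives 4, which matches the size of the `subDevices` array.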

Now I'm constantly seeing a

[appname]: symbol lookup error: [appname]: undefined symbol: clCreateSubDevicesEXT

What could be the reason? I've got the Intel OpenCL SDK 1.5 and nvidia Toolkit 4.0 on my machine.

Doron_S_Intel · ‎03-01-2012

This is very strange. Am I correct in assuming you have more than one implementation of OpenCL installed on your machine?

To debug this issue, can you please try linking your application to intelocl.dll instead of OpenCL.dll and see whether the problem disappears?

Thanks,
Doron Singer

R__Winkler · ‎03-01-2012

Well I've got two platforms: Intel OpenCL 1.5 and nvidia's toolkit 4.0, both offering OpenCL 1.1. This is on a 64bit Linux using gcc/g++.

When I link against intelocl.so (and the subsequent dependencies), I can create the subdevices successfully. What does it mean? It's not working via ICD?

Doron_S_Intel · ‎03-01-2012

Right, so this sounds like an ICD problem. It's possible the one you have is out of date - the Intel OpenCL SDK does work with the ICD. So, you could try:
a) Re-installing the SDK, as it comes bundled with a late enough version of the ICD that should be aware of the device fission EXT extension.
b) Since that C++ wrapper worked for you, it probably uses clGetExtensionFunctionAddress under the hood, so you could mimic that in your C code.
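
Option b) could look roughly like the sketch below. To keep it compilable without the OpenCL headers, it loads the library with `dlopen` and declares the OpenCL 1.1 signature of `clGetExtensionFunctionAddress` locally. The helper name `resolveExtensionFunction` and the type alias are made up for illustration. In a program that already includes `CL/cl.h` and links against libOpenCL, the `dlopen` step is unnecessary and `clGetExtensionFunctionAddress("clCreateSubDevicesEXT")` can be called directly.

```cpp
#include <dlfcn.h>  // POSIX dlopen / dlsym / dlclose

// Signature of clGetExtensionFunctionAddress in OpenCL 1.1, declared locally
// so this sketch does not depend on the OpenCL headers.
using GetExtensionFunctionAddressFn = void* (*)(const char* funcName);

// Hypothetical helper: asks the OpenCL library at `libraryPath` for the
// address of the extension function `funcName`. Returns nullptr if the
// library cannot be loaded, if it lacks clGetExtensionFunctionAddress, or if
// the extension function is unknown to it. On success the library handle is
// deliberately left open, because the returned pointer is only valid while
// the library stays loaded.
void* resolveExtensionFunction(const char* libraryPath, const char* funcName) {
    void* library = dlopen(libraryPath, RTLD_NOW | RTLD_LOCAL);
    if (library == nullptr) {
        return nullptr;
    }

    auto getAddress = reinterpret_cast<GetExtensionFunctionAddressFn>(
        dlsym(library, "clGetExtensionFunctionAddress"));
    if (getAddress == nullptr) {
        dlclose(library);
        return nullptr;
    }
    return getAddress(funcName);
}
```

The result would then be cast to the `clCreateSubDevicesEXT_fn` type from `CL/cl_ext.h` and called through that pointer, so the binary no longer needs `clCreateSubDevicesEXT` as a link-time symbol.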

Good luck.

R__Winkler · ‎03-01-2012

a) I checked the installed Intel SDK version, it was 1.5-15294, as it is available right now for linux from the website.
b) I can successfully use the C API to create the sub devices and then wrap the resulting sub devices back to cl::Device (C++ Wrapper).

But this only works when linking against intelocl.so. However, I need the ICD in order to have my GPU implementation running (as I want to have two contexts in my program, one for the GPU, one for the CPU).

Just for me to understand: every vendor provides their own ICD so that it is addressable through one shared lib, is that correct? But which vendor (in the case of two platforms/implementations of OpenCL installed) is responsible for actually putting a libOpenCL.so into /usr/lib (or whatever the OS-specific path is)?

Doron_S_Intel · ‎03-01-2012

Sorry, I'll elaborate. Theoretically libOpenCL itself should be uniform. Every vendor distributes their own version, but it shouldn't matter.
However, it's possible for some reason the version supplied by other SDKs is missing that entry point, which could cause the symptoms you're describing. Re-installing the SDK should overwrite whatever version of libOpenCL you have with the one bundled with the SDK, which certainly has those entry points.

R__Winkler · ‎03-02-2012

Hm, ok, as I said, I reinstalled it, so theoretically the libOpenCL.so in /usr/lib should be the one from the Intel installation.

When I link my project against that libOpenCL.so, it compiles. When I inspect the binary with ldd, I can see that it resolves to /usr/lib/nvidia-current/libOpenCL.so... When I point LD_LIBRARY_PATH at /usr/lib instead, however, I get this:

Error: API mismatch: the NVIDIA kernel module has version 295.20,
but this NVIDIA driver component has version 270.41.19. Please make
sure that the kernel module and all NVIDIA driver components
have the same version.

So is the Intel installation really installing libOpenCL to /usr/lib? I don't know. So this whole issue might be more of an nvidia problem, maybe? I'm afraid I'm missing something of the background knowledge here.
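
One way to narrow this down is to probe each libOpenCL.so candidate for the missing entry point directly. The sketch below does that with `dlopen` and `dlsym`. The names `SymbolStatus` and `probeSymbol` are made up for illustration, and the check only tells whether the symbol is visible through that library, not whether the call would succeed.

```cpp
#include <dlfcn.h>  // POSIX dlopen / dlsym / dlclose

// Hypothetical result type for the probe below.
enum class SymbolStatus { LibraryNotLoadable, SymbolMissing, SymbolExported };

// Hypothetical helper: loads the shared library at `libraryPath` and reports
// whether `symbol` can be found through it.
SymbolStatus probeSymbol(const char* libraryPath, const char* symbol) {
    void* library = dlopen(libraryPath, RTLD_LAZY | RTLD_LOCAL);
    if (library == nullptr) {
        return SymbolStatus::LibraryNotLoadable;
    }
    const bool found = dlsym(library, symbol) != nullptr;
    dlclose(library);
    return found ? SymbolStatus::SymbolExported : SymbolStatus::SymbolMissing;
}
```

Calling `probeSymbol("/usr/lib/libOpenCL.so", "clCreateSubDevicesEXT")` and then the same for the copy under /usr/lib/nvidia-current would show which of the two, if either, exports the symbol.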

Btw, I can confirm that the behaviour of the ICD on other extensions is okay, as I can use the printf-extension within my project while linking to libOpenCL.so.

Jim_Vaughn · ‎03-02-2012

I agree that it is an NVIDIA problem, because Intel isn't going to know of two different driver models unless they come from the NVIDIA parts. Also, version 270.41 is REALLY old, about a year old, so I am sure there is something wrong. Early last year all kinds of things were messed up on Linux when using two different OpenCL implementations, so you clearly have some old stuff on your system. I'm not sure of the best way to clean up NVIDIA drivers on Linux, so I can't be much help there.