cl_ext_device_fission snippet

Vj_P
Hi,

I am currently trying to use the cl_ext_device_fission extension to benchmark the scalability of my OpenCL code. I use the Intel OpenCL SDK 1.1 (64-bit) for Linux.
[cpp]cl_device_id device_id_part[1];
const cl_device_partition_property_ext part_props[] = {
	CL_DEVICE_PARTITION_BY_COUNTS_EXT,
	1,
	CL_PARTITION_BY_COUNTS_LIST_END_EXT,
	CL_PROPERTIES_LIST_END_EXT };

CL_WRAPPER( clCreateSubDevicesEXT(
	device_id,	/* one device ID returned by clGetDeviceIDs() */
	part_props, 	/* partition scheme (1 partition of 1 compute unit) */
	1,		/* partition count */
	device_id_part,	/* new device ID */
	NULL) );[/cpp]
I get the following error: CL_DEVICE_PARTITION_FAILED_EXT.
Is there an example of the correct usage of this extension in the Intel SDK? How can I get more information about the error?

Regards

VJ
5 Replies
Doron_S_Intel
Hi VJ,

Your understanding of the spec is accurate. There's an issue (documented in the release notes) where the num_entries parameter of the fission API is ignored, so the NULL passed as the last argument to clCreateSubDevicesEXT should instead be a pointer to a variable holding the correct size. In your case, modifying the code as follows:
[cpp]cl_uint partition_count = 1;  /* passed by address as the num_devices argument */

CL_WRAPPER( clCreateSubDevicesEXT(
    device_id,       /* one device ID returned by clGetDeviceIDs() */
    part_props,      /* partition scheme (1 partition of 1 compute unit) */
    1,               /* partition count */
    device_id_part,  /* new device ID */
    &partition_count) );[/cpp]
Should eliminate the issue. In cases where the number of created sub-devices isn't known in advance (for example when partitioning EQUALLY or by NUMA), you would call clCreateSubDevicesEXT twice: the first time to query how many sub-devices will be created, and the second time to create them. In such a case, as long as the feature is in a preview state, you would again need to pass the pointer in the last argument to both calls.
We are, of course, always working to weed out such issues, and we plan to fix this one in a future release.
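
For readers sizing their arrays around the two-call pattern described above, here is a minimal sketch in plain C. It assumes the extension's documented behaviour that an EQUALLY partition creates as many sub-devices of N compute units as fit and leaves any remainder unused. The names `partition_prop`, `equal_partition_count` and `build_by_counts_props` are hypothetical helpers, not part of any SDK, and the fallback constants are stand-ins believed to match cl_ext.h so that the sketch compiles without the OpenCL headers; real code should include <CL/cl_ext.h> and use cl_device_partition_property_ext.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for cl_device_partition_property_ext (assumption: a 64-bit
   unsigned integer, as in cl_ext.h). */
typedef uint64_t partition_prop;

/* Fallback values, only used when the OpenCL headers are not included.
   They are believed to match cl_ext.h; check your installed headers. */
#ifndef CL_DEVICE_PARTITION_BY_COUNTS_EXT
#define CL_DEVICE_PARTITION_BY_COUNTS_EXT   0x4051
#define CL_PARTITION_BY_COUNTS_LIST_END_EXT 0
#define CL_PROPERTIES_LIST_END_EXT          0
#endif

/* Number of sub-devices an EQUALLY partition is expected to yield when each
   sub-device gets units_per_sub compute units. Leftover units are unused.
   Returns 0 for invalid input. */
unsigned equal_partition_count(unsigned max_units, unsigned units_per_sub)
{
    if (units_per_sub == 0 || units_per_sub > max_units)
        return 0;
    return max_units / units_per_sub;
}

/* Fills out[] with a BY_COUNTS property list describing n sub-devices,
   where counts[i] is the number of compute units of sub-device i.
   Returns the number of elements written, or 0 if cap is too small. */
size_t build_by_counts_props(const unsigned *counts, size_t n,
                             partition_prop *out, size_t cap)
{
    size_t needed = n + 3;  /* scheme + n counts + counts end + list end */
    size_t i;

    if (cap < needed)
        return 0;
    out[0] = CL_DEVICE_PARTITION_BY_COUNTS_EXT;
    for (i = 0; i < n; ++i)
        out[1 + i] = counts[i];
    out[1 + n] = CL_PARTITION_BY_COUNTS_LIST_END_EXT;
    out[2 + n] = CL_PROPERTIES_LIST_END_EXT;
    return needed;
}
```

For an 8-compute-unit CPU split EQUALLY into sub-devices of 4 units, `equal_partition_count(8, 4)` gives 2, which is the array size to expect from the first (query) call. The list produced by `build_by_counts_props` for a single count of 1 has the same layout as the `part_props` array in the original post.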

Thanks,
Doron
Vj_P
I applied your advice, but my code still does not work: another issue appears later in execution. I have attached the full source code of an example that fails.
I get the following output at runtime:
[bash]Device 0:  max compute units = 8
Device partition 0: max compute units = 4
Error -11 executing clBuildProgram(program, 0, NULL, NULL, NULL, NULL) on example.c:91 (code = -11)
Aborted
[/bash]
The second line shows that the compute device has been successfully partitioned. According to cl.h, error code -11 means CL_BUILD_PROGRAM_FAILURE.
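
Since both errors in this thread had to be decoded by hand, a small lookup helper can make such logs easier to read. The following is a minimal sketch: `cl_status_name` is a hypothetical helper, it covers only a handful of codes, and the numeric values for the fission errors are taken from cl_ext.h as far as I know, so verify them against your installed headers.

```c
#include <stddef.h>

/* Returns the symbolic name of a few OpenCL status codes, or NULL when the
   code is not in this (deliberately short) table. */
const char *cl_status_name(int code)
{
    switch (code) {
    case 0:     return "CL_SUCCESS";
    case -11:   return "CL_BUILD_PROGRAM_FAILURE";        /* cl.h */
    case -1057: return "CL_DEVICE_PARTITION_FAILED_EXT";  /* cl_ext.h */
    case -1058: return "CL_INVALID_PARTITION_COUNT_EXT";  /* cl_ext.h */
    case -1059: return "CL_INVALID_PARTITION_NAME_EXT";   /* cl_ext.h */
    default:    return NULL;
    }
}
```

An error-checking macro such as the CL_WRAPPER used in the posts above could print this name next to the numeric code. For CL_BUILD_PROGRAM_FAILURE in particular, the compiler output can usually be retrieved with clGetProgramBuildInfo and CL_PROGRAM_BUILD_LOG.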

VJ
Doron_S_Intel
I'll look into it. Thanks.
Vj_P
I found the problem. The OpenCL context has to be created from the sub-devices instead of the root device. I managed to run my example on Linux using the Intel toolkit 1.1 (64-bit) and the CPU backend of the AMD toolkit 2.4 (64-bit). However, there is still an issue with the printf extension and the Intel toolkit (a problem not related to device fission).

The source code can be found at the following address: http://kde.cs.tsukuba.ac.jp/~vjordan/wiki/index.php/OpenCL/ClExtDeviceFission

Thank you for your help.

VJ
Doron_S_Intel
Glad to hear the problem was solved. One comment I'd like to make: there is an unexpected property relating to contexts and device fission. In your example file, you create a context with clCreateContextFromType. According to the spec, such contexts do not reference sub-devices, regardless of whether those sub-devices were created before or after context creation.
Contexts created with clCreateContext don't have that limitation: they reference the devices that were provided at creation time, as well as any sub-devices of those devices that are created after context creation.
The current implementation of Device Fission doesn't support that last part, which means that with the Intel OpenCL SDK you must first create the sub-devices and then pass an explicit list of their IDs to clCreateContext.
Please let us know if you encounter any other issues or interesting observations when using the device fission extension.

Thanks,
Doron Singer