OpenCL* for CPU

Direct kernel execution feature request!

janez-makovsek
New Contributor I
Hi!

Please allow kernels to be called directly through a function pointer, without going through clEnqueueNDRangeKernel. This would enable the following:

1.) Zero overhead for thread start/stop, because the kernel runs on the current application thread. This makes it possible to accelerate "short" kernels that may need to run on a single thread.
2.) Allows custom threading to be implemented by the caller.
3.) Allows Intel to inject the latest high-performance instructions into any language via the OpenCL interface while keeping the DLL-style calling approach. People write performance-sensitive code once in OpenCL, and for the next generation of CPUs Intel simply releases a driver update.
4.) Makes the kernels debuggable with the full range of Intel debugging tools.
5.) Provides a method to properly debug any OpenCL code.

Thanks!
Atmapuri
Yariv_A_Intel2
Employee
Hello,

We share most of your requirements above and plan an OpenCL EXT extension that executes NDRange and task commands in a single-threaded fashion, on the host thread that issues the corresponding clEnqueue commands. This extension is planned for our next major release.


Thanks, --Yariv
Doron_S_Intel
Employee
Hello,

Please try out the "immediate command execution" extension included in our latest release, and let us know whether the feature is useful to you and whether it answers your needs.

Hoping to hear from you,
Doron
janez-makovsek
New Contributor I
Hello,

Thank you, Sir. This looks very promising. If I understand correctly, the extension needs to be specified in the OpenCL source?

Does calling the same kernel from multiple threads require a code rebuild?

Thanks!
Atmapuri
Doron_S_Intel
Employee
I'm not sure I fully understood your questions - please don't hesitate to ask again if I've missed the point.

To use the extension, create another command queue and pass the property CL_QUEUE_IMMEDIATE_EXECUTION_ENABLE_INTEL, which is defined in the cl_ext.h header file bundled with the SDK. Commands enqueued to that command queue execute directly on the enqueuing host thread, as described above.

To enqueue to this queue from multiple threads, I recommend first of all that you combine the mode above with CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE when you create the queue, so that threads don't block each other (in-order queue semantics would otherwise require each command to wait for the previous one). The cl_command_queue handle can then be shared freely between threads.

Thanks,
Doron