OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU
This forum covers OpenCL* for CPU only. OpenCL* for GPU questions can be asked in the GPU Compute Software forum. Intel® FPGA SDK for OpenCL™ questions can be asked in the FPGA Intel® High Level Design forum.

Example of an app submitting work directly to the GPU (bypassing the KMD)?


Is there example code for the use case where the user app wants to submit a workload directly to the (gen9) GPU (i.e., bypassing the kernel mode driver)?

2 Replies

Hello (name withheld),

Can you expand on the goal and define what you mean by 'submit'? 

You may be interested in the C for Media project.

You may also be interested in precompiling kernels before execution, either to an intermediate representation such as SPIR or to a target-specific executable. See the -x spir toggle here for an example.

Other than that, I'm not aware of any way of bypassing the kernel mode driver to access gen9.





Hi MichaelC,

I am referring to what I think is described by the patent here. By "submit" I mean enqueuing a context to a controller in the GPU directly from the user application or OpenCL runtime, rather than having the kernel mode driver submit on the application's behalf. I believe this is possible as long as the user application and the kernel mode driver set up some initial agreement. I think it is also alluded to in the code on this page, by the "ContextIndex" and "SubmissionByProxy" fields of the following struct:

// PURPOSE: To represent the context ID structure and execlist/submit queues
typedef struct UK_CONTEXT_ID_MAP_REC
            ULONG    ContextIndex          : KM_BIT_RANGE(  19,  0);  // NOTE: This can be index in the app context pool in direct submission case or LRCA itself in proxy submission case
            ULONG    SubmissionByProxy     : KM_BIT_RANGE(  20, 20);  // If KMD or other context submitted this context. This means, ContextID is LRCA[31:20]
            ULONG    Reserved              : KM_BIT_RANGE(  22, 21);  // Required by HW
            ULONG    SWCounter             : KM_BIT_RANGE(  28, 23);  // Used for tracking IOMMU group resubmits (or if submit by proxy is true, lower 6 bits QWIndex).
            ULONG    EngineId              : KM_BIT_RANGE(  31, 29);
        ULONG                        ContextIdDword;
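
To make the layout above concrete, here is a minimal sketch of how I read it. It assumes that KM_BIT_RANGE(hi, lo) denotes an inclusive bit range inside the single 32-bit ContextIdDword, which is only my interpretation of the excerpt (the five ranges add up to exactly 32 bits). The helper `bits`, the `decoded_context_id` type, and `decode_context_id` are hypothetical names for illustration and are not part of any Intel header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract bits [hi:lo] (inclusive) from a 32-bit word. */
static uint32_t bits(uint32_t word, unsigned hi, unsigned lo)
{
    assert(hi >= lo && hi < 32);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (word >> lo) & mask;
}

/* Hypothetical unpacked view of the fields quoted in the struct above. */
typedef struct {
    uint32_t context_index;       /* bits 19:0  (ContextIndex)      */
    uint32_t submission_by_proxy; /* bit  20    (SubmissionByProxy) */
    uint32_t reserved;            /* bits 22:21 (Reserved)          */
    uint32_t sw_counter;          /* bits 28:23 (SWCounter)         */
    uint32_t engine_id;           /* bits 31:29 (EngineId)          */
} decoded_context_id;

/* Split a raw context ID dword into its fields, assuming the ranges above. */
static decoded_context_id decode_context_id(uint32_t dword)
{
    decoded_context_id d;
    d.context_index       = bits(dword, 19, 0);
    d.submission_by_proxy = bits(dword, 20, 20);
    d.reserved            = bits(dword, 22, 21);
    d.sw_counter          = bits(dword, 28, 23);
    d.engine_id           = bits(dword, 31, 29);
    return d;
}
```

Under that reading, a dword of 0x42912345 would decode to EngineId 2, SWCounter 5, SubmissionByProxy 1 and ContextIndex 0x12345. Masks and shifts are used here instead of C bit-fields because bit-field ordering is implementation-defined, so this version does not depend on the compiler.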