OpenCL* for CPU

OCL VME extension version 2 (VME++)

wu__frank
Beginner

I know the VME built-in functions can be used by any GEN device that supports the cl_intel_device_side_avc_motion_estimation extension. This includes GEN8 and subsequent architectures.

Does this mean that at least a 5th Generation Intel Core CPU (or a later architecture) is required?

1 Solution
Jeffrey_M_Intel1
Employee

Here is how the processor and GPU generations align:

  • 4th Generation Core (Haswell) = Gen 7.5 GPU
  • 5th Generation Core (Broadwell) = Gen 8 GPU
  • 6th/7th Generation Core (Skylake/Kaby Lake) = Gen 9 GPU

This article has more info: https://software.intel.com/en-us/articles/driver-support-matrix-for-media-sdk-and-opencl.

Device side VME was recently added for Skylake.  If you don't need device-side kernel launch you can look at the older host-side VME extensions which are available for a wider range of Gen versions: https://software.intel.com/en-us/articles/intro-to-advanced-motion-estimation-extension-for-opencl.

More info on the new device-side launch VME is here: https://software.intel.com/en-us/articles/intro-ds-vme


wu__frank
Beginner

Dear Jeffrey,

Thanks a lot. Your answer made it much clearer to me how the Intel processor and GPU generations align. I have one more question, about VME 1.0 performance.

When I use VME 1.0 to obtain frame-based motion vector (MV) information, I need to copy the source and reference frames into GPU memory with clEnqueueWriteBuffer, then run the kernel, and finally read back the result with the CL_TRUE flag. I would like to know how the total time is split between data transfer and kernel execution. Thank you, Jeffrey.

Jeffrey_M_Intel1
Employee

The examples use this pattern:

  • Copy the first source image to tiled (GPU) memory
  • Loop over images
    • swap source and ref image
    • copy a new source image to GPU memory
    • copy 

There are faster ways to transfer the frames and buffers: frames could originate from GPU decode (for example, from Media SDK), and buffers could be mapped. The goal of the sample code is to show functionality, not necessarily best performance. CL_TRUE means the transfer is blocking, which should be enough for timing. You could add a clFinish after each frame just to be sure. The best way to do timings is, of course, to start the timer before the first frame is copied in and stop it when the last results are available to the CPU. That measurement includes all of the data transfer time and the compute time.
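The end-to-end timing described above can be sketched in plain C. This is only an illustration under stated assumptions: `time_all_frames`, `frame_fn`, and `count_frame` are hypothetical names that do not come from the samples, and the callback stands in for the real per-frame work (the clEnqueueWriteBuffer copy, the VME kernel launch, and the blocking read with CL_TRUE). The sketch uses the C11 `timespec_get` wall clock; a monotonic clock may be a better choice where one is available.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical per-frame callback. In a real program it would copy the
 * frame in, launch the VME kernel, and do a blocking read of the results. */
typedef void (*frame_fn)(int frame_index, void *user);

/* Milliseconds elapsed between two timestamps. */
static double elapsed_ms(const struct timespec *start,
                         const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) * 1e3
         + (double)(end->tv_nsec - start->tv_nsec) / 1e6;
}

/* Time the whole frame loop: the clock starts before the first frame is
 * processed and stops after the last frame's results are back on the CPU. */
double time_all_frames(int num_frames, frame_fn process_frame, void *user)
{
    struct timespec t0, t1;

    assert(process_frame != NULL);

    timespec_get(&t0, TIME_UTC);          /* start: before first copy-in */
    for (int i = 0; i < num_frames; ++i) {
        process_frame(i, user);
    }
    /* A real OpenCL version could call clFinish(queue) here to be sure
     * all enqueued work has completed before the clock is stopped. */
    timespec_get(&t1, TIME_UTC);          /* stop: last results available */

    return elapsed_ms(&t0, &t1);
}

/* Placeholder frame body (an assumption for illustration only):
 * it just counts how many frames were processed. */
void count_frame(int frame_index, void *user)
{
    (void)frame_index;
    ++*(int *)user;
}
```

Calling `time_all_frames(n, count_frame, &counter)` returns the total time in milliseconds for all n frames. Dividing by n gives an average per-frame cost that includes both transfers and execution, which is the combined figure discussed above rather than a separate breakdown of each part.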

wu__frank
Beginner

Dear Jeffrey,

Recently I wanted to use the cl_intel_device_side_avc_motion_estimation extension for some experiments, and you said this extension had been added for Skylake. But on my PC, when I call the OpenCL API function clGetDeviceInfo to query CL_DEVICE_EXTENSIONS, the returned list does not include cl_intel_device_side_avc_motion_estimation. Is my machine not new enough, and what can I do to use this extension?

The details of my PC are as follows:

CPU: Intel Core(TM) i5-6500

GPU: Intel HD Graphics 530

Jeffrey_M_Intel1
Employee

Your hardware is 6th Generation Core, which supports this feature. However, for now, the device-side VME extension is only available on Linux. The Windows driver updates are coming but are not available for download yet.

Thank you for pointing this out.  I have updated the article to make this clearer.
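For readers who want to make the same check in code, here is a minimal sketch of testing the extension list for a specific name. It assumes the space-separated string has already been retrieved with clGetDeviceInfo and CL_DEVICE_EXTENSIONS, as described in the question. The helper name `has_extension` is hypothetical, not part of the OpenCL API. It matches whole tokens only, so a shorter name that happens to be a substring of a longer extension name is not reported as present.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole space-separated token in
 * `ext_list`, and 0 otherwise. `ext_list` is expected to be the string
 * returned by clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, ...).
 * `has_extension` is a hypothetical helper, not an OpenCL function. */
int has_extension(const char *ext_list, const char *name)
{
    assert(name != NULL && name[0] != '\0');

    if (ext_list == NULL) {
        return 0;
    }

    size_t len = strlen(name);
    const char *p = ext_list;

    while ((p = strstr(p, name)) != NULL) {
        /* Accept the match only if it is bounded by spaces or by the
         * start/end of the string on both sides. */
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        int ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token) {
            return 1;
        }
        p += len;   /* keep searching after this partial match */
    }
    return 0;
}
```

With the extension string from the device in `exts`, `has_extension(exts, "cl_intel_device_side_avc_motion_estimation")` returns 0 on a driver that does not expose device-side VME. In that case an application could fall back to checking for the older host-side extensions mentioned earlier in the thread.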
