OpenCL* for CPU

How to install OpenCL CPU runtime 18.1 without removing Intel graphics drivers?

Andy_R
Beginner

We use the Intel CPU OpenCL runtime libraries widely throughout our firm, but we are now stuck with out-of-date Intel runtime drivers due to the irksome nature of the Intel installer for the latest available version, 18.1. Specifically, the installer requires that the machine's Intel HD Graphics drivers be uninstalled before it will proceed, and there appears to be no way around this. This is a huge problem.

 

Up to now, our IT department has rolled out updates by simply adding the latest Intel installer to the list of available software in our internal database, which can then be deployed and run automatically. However, in this case it is impossible due to this extremely difficult prerequisite of removing the graphics drivers.

 

Can anyone at Intel please advise how they expect large clients to roll out the 18.1 CPU OpenCL runtime drivers internally? It's not reasonable to expect the local graphics drivers to be uninstalled, and then later re-installed, as part of an automated roll-out to hundreds of machines. This is a huge headache.

 

Surely there must be some way of updating to the 18.1 OpenCL CPU runtime driver without removing the Intel graphics drivers?

 

This problem has us completely stuck, so any assistance with how we are meant to proceed would be greatly appreciated.

Michael_C_Intel1
Moderator

Hello AndyR,

Thanks for the interest and the feedback. Some followup questions... Can you describe the systems you're running on? What (edit) *CPU* SKU/CPU model and what OS? What system vendor? What system model? Are your systems used for OpenCL software development?

Some context on Windows:

  • The latest graphics driver packages contain both Intel® CPU Runtime for OpenCL™ Applications 18.1 (CPU) and Intel® Graphics Compute Runtime for OpenCL™ Driver (iGFX) implementations.
  • On Windows, the CPU Runtime will not install on systems with the graphics driver package installed because it should already be resident... (edit for clarity) Also, the graphics driver configures the system registry to expose both iGFX and CPU OpenCL platforms. The CPU Runtime installer configures just for the CPU Runtime.
  • The caveat is that vendors are allowed to adopt and repackage Intel reference drivers at their discretion. Support and/or functionality may be disabled by the vendor for graphics drivers from the Intel Download Center. For graphics driver packages, administrators should look to the vendor drivers first before trying Download Center drivers.
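
To see what is registered on a given machine, a small diagnostic along these lines may help. It is only a sketch: it assumes the runtime registers itself under the classic Khronos ICD loader key (HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenCL\Vendors), where each value name is a driver DLL path and a DWORD of 0 marks it as enabled. Newer DCH driver packages may register through the display adapter's driver key instead, so an empty list here is not conclusive. The function name is made up for illustration.

```python
import sys

# Classic Khronos ICD loader registration key on Windows (64-bit view).
# Assumption: the installed runtime registers here; DCH packages may not.
VENDORS_KEY = r"SOFTWARE\Khronos\OpenCL\Vendors"


def list_registered_icds():
    """Return (dll_path, enabled) pairs from the Vendors key, or [] if unavailable."""
    if sys.platform != "win32":
        return []  # the registry only exists on Windows
    import winreg

    try:
        key = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, VENDORS_KEY)
    except OSError:
        return []  # key absent: nothing registered the classic way
    with key:
        value_count = winreg.QueryInfoKey(key)[1]
        entries = []
        for i in range(value_count):
            name, data, _type = winreg.EnumValue(key, i)
            entries.append((name, data == 0))  # a DWORD of 0 marks the ICD as enabled
        return entries


for dll, enabled in list_registered_icds():
    print("enabled " if enabled else "disabled", dll)
```

Running this before and after a driver change shows whether the set of registered implementations moved, which is useful alongside clinfo.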

When you say:

expect the local graphics drivers to be uninstalled, and then later re-installed,

Why is the graphics driver reinstalled? Why is the CPU Runtime installer used on these systems?... (edit) Can you share your goals for OpenCL deployment on these systems?

On Windows* OS, the CPU Runtime installer is most useful and appropriate on systems without Intel® Graphics Technology.

 

-MichaelC

Andy_R
Beginner

Hi Michael

Thanks for the response.

Our typical target machine would be a standard Lenovo ThinkCentre M920s Tower, with an on-board Intel Core i7-8700 CPU @ 3.20GHz and Intel UHD Graphics 630, running a 64-bit Windows 10 Enterprise operating system.

I believe that it's the standard Intel Graphics driver that is distributed in the build for the on-board Intel UHD Graphics 630 that is causing the issue, as when the Intel CPU Runtime for OpenCL Applications 18.1 installer is run it immediately states that the Intel HD Graphics driver must be uninstalled.

We heavily use OpenCL for our internally developed pricing language as part of our own C++ analytics libraries, which are distributed as part of an Excel add-in build. When run from our Excel add-in, our scripts generate raw native OpenCL code, which the library then compiles, builds and executes on the CPU through the Intel CPU Runtime for OpenCL (and associated libraries). This is core to the functionality of the analytics libraries and has been working extremely well for quite some time now.

I have managed to get the Intel CPU Runtime for OpenCL Applications 18.1 to install on a test machine (as specified above), but only by first removing the Intel UHD Graphics 630 drivers. After this, I do need to re-install the graphics drivers, as they seem to be required to drive the multi-monitor setup we typically have: after uninstalling the graphics driver and installing the Intel CPU Runtime for OpenCL, the machine is restricted to a single monitor, indicating some functionality has been lost.

While the process described does seem to work, it's not something we can easily roll out across the whole firm, so I was hoping that you would be able to advise a less painful upgrade route for our Intel CPU Runtime for OpenCL Applications drivers, which are currently stuck way back on version 5.0.0.57 throughout our entire user base.

Note that our use of OpenCL functionality on the Intel CPU is very mature. It has been distributed and used heavily for around 5 or 6 years now, without any major issues. That is, until now, when we have hit this problem with the 18.1 installer.

Any advice you are able to give on how we might proceed would be very gratefully received.

Regards,

Andy

Michael_C_Intel1
Moderator

Hi AndyR,

Thanks for the detail. Sounds like a pretty cool, clever, and useful project. My post below is verbose, but I'm hoping it provides enough context to be useful.

Given the deployment details, it doesn't sound like the 18.1 standalone installer is appropriate for enabling that application on that platform. A speculative guess is that the application may need to refactor its platform and device picking process, and that you will likely want an updated iGFX driver package. Some guidance:

1) Orient toward the graphics driver and forgo the 18.1 standalone installer.

Intel® Core™ i7-8700 should align with the DCH Windows 10 OS drivers available on the Download Center. As of today there is a 20200116 release that is valid for your SKU. It contains both the CPU and iGFX OpenCL implementations. However, consider obtaining the driver directly from the vendor (in this case Lenovo), as A) there might be platform support implications tied to using the drivers Lenovo supplies, and B) it's possible Lenovo's system exposes devices in a way that would not allow the Download Center driver to work. This would force usage of the vendor's graphics driver package.

For your system it appears Lenovo's website has a package from 20191204 that repackages some of the Intel reference driver bits. This may be the best place to start. I'd expect any Intel® Core™ Processor vendor graphics driver after Spring 2019 to have DCH driver bits aligned with the driver branch indicated in the Download Center. The 18.1 standalone installer from Intel is not really oriented for use on this device.

Sidebar (if it were deployed): the 18.1 standalone installer would apply to a fresh system that has iGFX disabled in the system BIOS prior to Windows OS install. When Windows is installed, it would not put a Windows pre-included Intel Graphics driver on the system, as it would if iGFX were enabled in the BIOS. The 18.1 standalone installer should not be used by an end user on a Windows system with Intel® Graphics Technology enabled.

2) Connections

Some vendor BIOSes, and Windows itself, have been reported not to expose the Intel Graphics device when it is not connected to a display. I recommend checking that it is connected to a display and that the graphics device is enabled in the vendor BIOS.

3) Platform naming... mitigations...

OpenCL has iterated as a standard, and Intel implementations have changed in kind since the older releases (like 5.0.0.57). It's possible that the platform naming reported by newer OpenCL runtimes is incompatible with applications that hard-code old platform names. I recommend looking at how both OpenCL platform names and device names are exposed. If the application's detection code looks for a platform name or device name that has changed in newer implementations, or that differs between the CPU standalone and the CPU+iGFX combo package, this could necessitate an application refactor.

In my experience, clinfo is a good tool to see how platforms report themselves through the OpenCL ICD Loader library (OpenCL.dll). I recommend reviewing the platform names (and device names) exposed from the runtimes intended for platforms matched with your application. The platform name used in the application software could be a reason for incompatibility. In the interest of coverage, application software would typically need to validate against all supported implementations and deployments for said software.
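
For comparing many machines, the text that clinfo prints can be reduced to just the platform name, device name, device type and driver version. The following is a rough sketch rather than a supported tool: the function name is made up, the sample text is a trimmed excerpt of the listing in this thread, and the parsing assumes the two-column layout that clinfo prints by default.

```python
import re

# Trimmed excerpt of clinfo output, in the same layout as the full listing.
SAMPLE = """\
Number of platforms                               1
  Platform Name                                   Intel(R) OpenCL
  Platform Vendor                                 Intel(R) Corporation

  Platform Name                                   Intel(R) OpenCL
Number of devices                                 2
  Device Name                                     Intel(R) HD Graphics 520
  Driver Version                                  26.20.100.7584
  Device Type                                     GPU

  Device Name                                     Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
  Driver Version                                  7.6.0.0814
  Device Type                                     CPU

NULL platform behavior
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  Success (1)
    Platform Name                                 Intel(R) OpenCL
    Device Name                                   Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
"""


def parse_clinfo(text):
    """Return one dict per device: platform, name, device_type, driver_version."""
    devices = []
    platform = None
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("NULL platform behavior"):
            break  # the trailer repeats names and would create duplicates
        # Key and value are separated by a run of two or more spaces.
        parts = re.split(r"\s{2,}", stripped, maxsplit=1)
        if len(parts) != 2:
            continue
        key, value = parts
        if key == "Platform Name":
            platform = value
        elif key == "Device Name":
            current = {"platform": platform, "name": value}
            devices.append(current)
        elif current is not None and key in ("Device Type", "Driver Version"):
            current[key.lower().replace(" ", "_")] = value
    return devices


for dev in parse_clinfo(SAMPLE):
    print(dev["platform"], "|", dev["device_type"], "|", dev["name"], "|", dev["driver_version"])
```

On a real machine the text would come from running clinfo itself (assuming it is on the PATH) and capturing its standard output instead of using SAMPLE.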

Consider: any device could be exposed through its own separate platform and device identifiers, separate from other devices that may use the same apparent OpenCL implementation, separate from the same device with a different OpenCL implementation, and separate from the same device under a different host OS. This could change version to version with the same driver build.

In the Intel instance, naming for the platform doesn't change that often, but it has changed, and it differs across OSes as well as distributions. Application software needs to be flexible enough to adapt.
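
One way to stay flexible is to select on device type first and treat platform names only as a soft preference. The sketch below shows just the selection logic over plain records; it is not the application's code, and the function name and the second set of names are invented for illustration. In a C or C++ host program the same idea would roughly correspond to enumerating platforms with clGetPlatformIDs and asking each one for CL_DEVICE_TYPE_CPU devices with clGetDeviceIDs, rather than comparing against a fixed platform string.

```python
# Device records as they might be gathered at startup. The first list mirrors
# the clinfo listing in this thread; the second uses made-up names to stand in
# for a runtime that reports itself differently.
COMBINED_DRIVER = [
    {"platform": "Intel(R) OpenCL", "name": "Intel(R) HD Graphics 520", "device_type": "GPU"},
    {"platform": "Intel(R) OpenCL", "name": "Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz", "device_type": "CPU"},
]
RENAMED_RUNTIME = [  # hypothetical names, for illustration only
    {"platform": "Some Other Platform", "name": "Some GPU", "device_type": "GPU"},
    {"platform": "Example CPU Runtime (Intel)", "name": "Example CPU", "device_type": "CPU"},
]


def pick_cpu_device(devices, preferred_platforms=()):
    """Pick a CPU device by type; platform substrings are only a tie-breaker."""
    cpus = [d for d in devices if d["device_type"] == "CPU"]
    for wanted in preferred_platforms:
        for dev in cpus:
            # Case-insensitive substring match, so small renames do not break it.
            if wanted.lower() in dev["platform"].lower():
                return dev
    # No preference matched: fall back to any CPU device, or None if there is none.
    return cpus[0] if cpus else None


for label, devices in (("combined", COMBINED_DRIVER), ("renamed", RENAMED_RUNTIME)):
    chosen = pick_cpu_device(devices, preferred_platforms=("intel",))
    print(label, "->", chosen["name"] if chosen else "no CPU device")
```

Because the fallback ignores names entirely, a renamed platform degrades to "any CPU device" instead of failing outright, which is usually the safer behavior for a CPU-only workload.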

4) Default driver package

Windows 10 OS out of the box will deploy the Intel® Graphics Technology driver and thus the CPU+iGFX combo package. The challenge is that this would typically be an older implementation, as the OS itself is likely more static for driver repackaging than vendors or the Intel Download Center. Upgrading would be advised, although clinfo can be used here to get an expectation of how the platform can be interrogated.

5) On Versioning...

From where was 5.0.0.57 acquired? What version of the Intel® Graphics Technology driver is reported on the system before the Intel® Graphics Driver is uninstalled? And when it is reinstalled?

Vendor versioning may differ from the Intel versioning listed on the Download Center.

6) Example

For reference, here is clinfo output from a Skylake system with Intel Graphics Technology, captured today. The graphics driver's CPU and iGFX implementations are used. See Platform Name: Intel(R) OpenCL and Device Names: Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz and Intel(R) HD Graphics 520. These may not be the same names for older drivers, for Linux OS, for the older CPU standalone implementation, or for the current CPU standalone.

 

>clinfo
Number of platforms                               1
  Platform Name                                   Intel(R) OpenCL
  Platform Vendor                                 Intel(R) Corporation
  Platform Version                                OpenCL 2.1
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_image2d_from_buffer cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_spir
  Platform Host timer resolution                  100ns
  Platform Extensions function suffix             INTEL

  Platform Name                                   Intel(R) OpenCL
Number of devices                                 2
  Device Name                                     Intel(R) HD Graphics 520
  Device Vendor                                   Intel(R) Corporation
  Device Vendor ID                                0x8086
  Device Version                                  OpenCL 2.1 NEO
  Driver Version                                  26.20.100.7584
  Device OpenCL C Version                         OpenCL C 2.0
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               24
  Max clock frequency                             1000MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              32
  Max sub-groups per work group                   32
  Sub-group sizes (Intel)                         8, 16, 32
  Preferred / native vector sizes
    char                                                16 / 16
    short                                                8 / 8
    int                                                  4 / 4
    long                                                 1 / 1
    half                                                 8 / 8        (cl_khr_fp16)
    float                                                1 / 1
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              6821855232 (6.353GiB)
  Error Correction support                        No
  Max memory allocation                           3410927616 (3.177GiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   Yes
    Fine-grained system sharing                   No
    Atomics                                       Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics
    SVM                                           64 bytes
    Global                                        64 bytes
    Local                                         64 bytes
  Max size for global variable                    65536 (64KiB)
  Preferred total size of global vars             3410927616 (3.177GiB)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        524288 (512KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            213182976 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   4 bytes
    Pitch alignment for 2D image buffers          4 pixels
    Max 2D image size                             16384x16384 pixels
    Max planar YUV image size                     16384x16352 pixels
    Max 3D image size                             16384x16384x2048 pixels
    Max number of read image args                 128
    Max number of write image args                128
    Max number of read/write image args           128
  Max number of pipe args                         16
  Max active pipe reservations                    1
  Max pipe packet size                            1024
  Local memory type                               Local
  Local memory size                               65536 (64KiB)
  Max number of constant args                     8
  Max constant buffer size                        3410927616 (3.177GiB)
  Max size of kernel argument                     1024
  Queue properties (on host)
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Queue properties (on device)
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                131072 (128KiB)
    Max size                                      67108864 (64MiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    Yes
  Number of simultaneous interops (Intel)         1
  Simultaneous interops                           GL WGL D3D9 (KHR) D3D9 (INTEL) D3D9Ex (KHR) D3D9Ex (INTEL) DXVA (KHR) DXVA (INTEL) D3D10 D3D11
  Profiling timer resolution                      83ns
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Sub-group independent forward progress        Yes
    IL version                                    SPIR-V_1.2
    SPIR versions                                 1.2
  printf() buffer size                            4194304 (4MiB)
  Built-in kernels                                block_motion_estimate_intel;block_advanced_motion_estimate_check_intel;block_advanced_motion_estimate_bidirectional_check_intel;
  Motion Estimation accelerator version (Intel)   2
    Device-side AVC Motion Estimation version     1
      Supports texture sampler use                Yes
      Supports preemption                         No
  Device Extensions                               cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_depth_images cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_image2d_from_buffer cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_media_block_io cl_intel_driver_diagnostics cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_khr_fp64 cl_khr_subgroups cl_khr_il_program cl_intel_spirv_device_side_avc_motion_estimation cl_intel_spirv_media_block_io cl_intel_spirv_subgroups cl_khr_spirv_no_integer_wrap_decoration cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_intel_unified_shared_memory_preview cl_intel_planar_yuv cl_intel_packed_yuv cl_intel_motion_estimation cl_intel_device_side_avc_motion_estimation cl_intel_advanced_motion_estimation cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_gl_sharing cl_khr_gl_depth_images cl_khr_gl_event cl_khr_gl_msaa_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_intel_d3d11_nv12_media_sharing cl_intel_unified_sharing cl_intel_simultaneous_sharing

  Device Name                                     Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
  Device Vendor                                   Intel(R) Corporation
  Device Vendor ID                                0x8086
  Device Version                                  OpenCL 2.1 (Build 0)
  Driver Version                                  7.6.0.0814
  Device OpenCL C Version                         OpenCL C 2.0
  Device Type                                     CPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               4
  Max clock frequency                             2400MHz
  Device Partition                                (core)
    Max number of sub-devices                     4
    Supported partition types                     by counts, equally, by names (Intel)
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             8192x8192x8192
  Max work group size                             8192
  Preferred work group size multiple              128
  Max sub-groups per work group                   1
  Preferred / native vector sizes
    char                                                 1 / 32
    short                                                1 / 16
    int                                                  1 / 8
    long                                                 1 / 4
    half                                                 0 / 0        (n/a)
    float                                                1 / 8
    double                                               1 / 4        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              17054646272 (15.88GiB)
  Error Correction support                        No
  Max memory allocation                           4263661568 (3.971GiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   Yes
    Fine-grained system sharing                   Yes
    Atomics                                       Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics
    SVM                                           64 bytes
    Global                                        64 bytes
    Local                                         0 bytes
  Max size for global variable                    65536 (64KiB)
  Preferred total size of global vars             65536 (64KiB)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        262144 (256KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             480
    Max size for 1D images from buffer            266478848 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   64 bytes
    Pitch alignment for 2D image buffers          64 pixels
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 480
    Max number of write image args                480
    Max number of read/write image args           480
  Max number of pipe args                         16
  Max active pipe reservations                    65535
  Max pipe packet size                            1024
  Local memory type                               Global
  Local memory size                               32768 (32KiB)
  Max number of constant args                     480
  Max constant buffer size                        131072 (128KiB)
  Max size of kernel argument                     3840 (3.75KiB)
  Queue properties (on host)
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Local thread execution (Intel)                Yes
  Queue properties (on device)
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                4294967295 (4GiB)
    Max size                                      4294967295 (4GiB)
  Max queues on device                            4294967295
  Max events on device                            4294967295
  Prefer user sync for interop                    No
  Profiling timer resolution                      100ns
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            Yes
    Sub-group independent forward progress        No
    IL version                                    SPIR-V_1.0
    SPIR versions                                 1.2
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_3d_image_writes cl_intel_exec_by_local_thread cl_khr_spir cl_khr_fp64 cl_khr_image2d_from_buffer cl_intel_vec_len_hint

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [INTEL]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 Intel(R) OpenCL
    Device Name                                   Intel(R) HD Graphics 520
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  Success (1)
    Platform Name                                 Intel(R) OpenCL
    Device Name                                   Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Intel(R) OpenCL
    Device Name                                   Intel(R) HD Graphics 520
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  Invalid device type for platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (2)
    Platform Name                                 Intel(R) OpenCL
    Device Name                                   Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
    Device Name                                   Intel(R) HD Graphics 520

-MichaelC