- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
VTune does not recognize OpenCL SDK/MediaSDK. Which is the test that VTune does to locate Intel OpenCL SDK or Intel Media SDK?
What is wrong with my VTune? (I am using the 2018 version).
I have both installed.
If I go to Platform Analysis, GPU Hotspots it says: "Cannot collect GPU hardware metrics. Make sure the Intel OpenCL SDK or Intel Media SDK is installed.".
I am running amplxe-gui as root, and It does not matter what I select, the checkbox "Trace OpenCL and Intel Media SDK programs (Intel Graphics Driver only)" is always unchecked under Advanced Hotspots.
I am in an Arch Linux with vtsspp loaded correctly and Intel Core i5-6200U and HD Graphics 520.
I ran the self-checker script and this is the output (I attach the log):
$ sudo ./amplxe-self-checker.sh
Intel(R) VTune(TM) Amplifier Self Check Utility
Copyright (C) 2009-2017 Intel Corporation. All rights reserved.
Build Number: 525261
Instrumentation based analysis check
Example of analysis types: Hotspots, Concurrency, Locks and Waits
Collection: Ok
amplxe: Warning: Can't find 32-bit pin tool. 32-bit processes will not be profiled.
Finalization: Ok
Report: Fail
HW event-based analysis check (Intel driver)
Example of analysis types: Advanced Hotspots, HPC Performance Characterization, etc.
Collection: Ok
amplxe: Warning: To enable hardware event-base sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
Finalization: Ok
Report: Fail
HW event-based analysis check (Intel driver)
Example of analysis types: General Exploration
Collection: Ok
amplxe: Warning: To enable hardware event-base sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
Finalization: Ok
Report: Fail
HW event-based analysis with uncore events (Intel driver)
Example of analysis types: Memory Access
Collection: Ok
amplxe: Warning: To enable hardware event-base sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
Finalization: Ok
Report: Fail
HW event-based analysis with stacks (Intel driver)
Example of analysis types: Advanced Hotspots with Stacks, etc.
Collection: Ok
amplxe: Warning: To enable hardware event-base sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
Finalization: Ok
Report: Fail
The check observed a product failure on your system.
Review errors in the output above to fix a problem or contact Intel technical support.
Log location: /tmp/amplxe-tmp-root/self-checker-2017.12.11_13.52.17/log.txt
It fails in everyone, although If I run VTune it works, although it doesn't show anything related with OpenCL (that is what I need).
And here clinfo:
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.0
Platform Name: Intel(R) OpenCL
Platform Vendor: Intel(R) Corporation
Platform Extensions: cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_image2d_from_buffer cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_spir
Platform Name: Intel(R) OpenCL
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 8086h
Max compute units: 24
Max work items dimensions: 3
Max work items[0]: 256
Max work items[1]: 256
Max work items[2]: 256
Max work group size: 256
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 16
Native vector width short: 8
Native vector width int: 4
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 1000Mhz
Address bits: 64
Max memory allocation: 3296224870
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 128
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 16384
Max image 3D height: 16384
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 524288
Global memory size: 6592449741
Constant buffer size: 3296224870
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 65536
Max pipe arguments: 16
Max pipe active reservations: 1
Max pipe packet size: 1024
Max global variable size: 65536
Max global variable preferred total size: 3296224870
Max read/write image args: 128
Max on device events: 1024
Queue on device max size: 67108864
Max on device queues: 1
Queue on device preferred size: 131072
SVM capabilities:
Coarse grain buffer: Yes
Fine grain buffer: No
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 64
Preferred global atomic alignment: 64
Preferred local atomic alignment: 64
Kernel Preferred work group size multiple: 32
Error correction support: 0
Unified memory for Host and Device: 1
Profiling timer resolution: 83
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue on Host properties:
Out-of-Order: Yes
Profiling : Yes
Queue on Device properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 0x11b1110
Name: Intel(R) HD Graphics
Vendor: Intel(R) Corporation
Device OpenCL C version: OpenCL C 2.0
Driver version: r4.0.59481
Profile: FULL_PROFILE
Version: OpenCL 2.0
Extensions: cl_intel_accelerator cl_intel_advanced_motion_estimation cl_intel_device_side_avc_motion_estimation cl_intel_driver_diagnostics cl_intel_media_block_io cl_intel_motion_estimation cl_intel_planar_yuv cl_intel_packed_yuv cl_intel_required_subgroup_size cl_intel_subgroups cl_intel_subgroups_short cl_intel_va_api_media_sharing cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_fp16 cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_image2d_from_buffer cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_khr_spir cl_khr_subgroups
Device Type: CL_DEVICE_TYPE_CPU
Vendor ID: 8086h
Max compute units: 4
Max work items dimensions: 3
Max work items[0]: 8192
Max work items[1]: 8192
Max work items[2]: 8192
Max work group size: 8192
Preferred vector width char: 1
Preferred vector width short: 1
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 32
Native vector width short: 16
Native vector width int: 8
Native vector width long: 4
Native vector width float: 8
Native vector width double: 4
Max clock frequency: 2300Mhz
Address bits: 64
Max memory allocation: 2062761984
Image support: Yes
Max number of images read arguments: 480
Max number of images write arguments: 480
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 480
Max size of kernel argument: 3840
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: No
Round to +ve and infinity: No
IEEE754-2008 fused multiply-add: No
Cache type: Read/Write
Cache line size: 64
Cache size: 262144
Global memory size: 8251047936
Constant buffer size: 131072
Max number of constant args: 480
Local memory type: Global
Local memory size: 32768
Max pipe arguments: 16
Max pipe active reservations: 65535
Max pipe packet size: 1024
Max global variable size: 65536
Max global variable preferred total size: 65536
Max read/write image args: 480
Max on device events: 4294967295
Queue on device max size: 4294967295
Max on device queues: 4294967295
Queue on device preferred size: 4294967295
SVM capabilities:
Coarse grain buffer: Yes
Fine grain buffer: No
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 64
Preferred global atomic alignment: 64
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 128
Error correction support: 0
Unified memory for Host and Device: 1
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: Yes
Queue on Host properties:
Out-of-Order: Yes
Profiling : Yes
Queue on Device properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 0x11b1110
Name: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
Vendor: Intel(R) Corporation
Device OpenCL C version: OpenCL C 2.0
Driver version: 1.2.0.400
Profile: FULL_PROFILE
Version: OpenCL 2.0 (Build 400)
Extensions: cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_3d_image_writes cl_intel_exec_by_local_thread cl_khr_spir cl_khr_fp64 cl_khr_image2d_from_buffer
Link Copied
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page