Hi,
VTune does not recognize the Intel OpenCL SDK or the Intel Media SDK. What check does VTune perform to locate them?
What is wrong with my VTune installation? (I am using the 2018 version.)
I have both SDKs installed.
If I go to Platform Analysis > GPU Hotspots, it says: "Cannot collect GPU hardware metrics. Make sure the Intel OpenCL SDK or Intel Media SDK is installed."
I am running amplxe-gui as root, and no matter what I select, the checkbox "Trace OpenCL and Intel Media SDK programs (Intel Graphics Driver only)" is always unchecked under Advanced Hotspots.
I am on Arch Linux with vtsspp loaded correctly, on an Intel Core i5-6200U with HD Graphics 520.
I ran the self-checker script and this is the output (I attach the log):
$ sudo ./amplxe-self-checker.sh
Intel(R) VTune(TM) Amplifier Self Check Utility
Copyright (C) 2009-2017 Intel Corporation. All rights reserved.
Build Number: 525261

Instrumentation based analysis check
Example of analysis types: Hotspots, Concurrency, Locks and Waits
Collection: Ok
amplxe: Warning: Can't find 32-bit pin tool. 32-bit processes will not be profiled.
Finalization: Ok
Report: Fail

HW event-based analysis check (Intel driver)
Example of analysis types: Advanced Hotspots, HPC Performance Characterization, etc.
Collection: Ok
amplxe: Warning: To enable hardware event-base sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
Finalization: Ok
Report: Fail

HW event-based analysis check (Intel driver)
Example of analysis types: General Exploration
Collection: Ok
amplxe: Warning: To enable hardware event-base sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
Finalization: Ok
Report: Fail

HW event-based analysis with uncore events (Intel driver)
Example of analysis types: Memory Access
Collection: Ok
amplxe: Warning: To enable hardware event-base sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
Finalization: Ok
Report: Fail

HW event-based analysis with stacks (Intel driver)
Example of analysis types: Advanced Hotspots with Stacks, etc.
Collection: Ok
amplxe: Warning: To enable hardware event-base sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
Finalization: Ok
Report: Fail

The check observed a product failure on your system. Review errors in the output above to fix a problem or contact Intel technical support.
Log location: /tmp/amplxe-tmp-root/self-checker-2017.12.11_13.52.17/log.txt
Every check ends in a failure, although VTune itself works when I run it. It just does not show anything related to OpenCL, which is what I need.
And here is the clinfo output:
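To make it easier to see which stage fails, the self-checker output can be tallied per stage. This is only a rough sketch: it assumes the stage lines always look like `Collection: Ok` or `Report: Fail`, as in the output above. The names `SAMPLE`, `summarize`, and `failing_stages` are made up for this illustration and are not part of VTune.

```python
import re
from collections import Counter

# Shortened sample in the same shape as the self-checker output above.
SAMPLE = (
    "Instrumentation based analysis check "
    "Collection: Ok Finalization: Ok Report: Fail "
    "HW event-based analysis check (Intel driver) "
    "Collection: Ok Finalization: Ok Report: Fail"
)

# Matches stage lines such as "Collection: Ok" or "Report: Fail".
STAGE_RE = re.compile(r"\b(Collection|Finalization|Report):\s*(Ok|Fail)\b")


def summarize(text):
    """Count Ok/Fail results per stage across all checks in the text."""
    counts = {}
    for stage, status in STAGE_RE.findall(text):
        counts.setdefault(stage, Counter())[status] += 1
    return counts


def failing_stages(text):
    """Return the sorted names of stages that failed at least once."""
    return sorted(s for s, c in summarize(text).items() if c["Fail"] > 0)


if __name__ == "__main__":
    for stage, c in summarize(SAMPLE).items():
        print(f"{stage}: {c['Ok']} ok, {c['Fail']} failed")
    print("Failing stages:", failing_stages(SAMPLE))
```

On the output I pasted, this kind of tally would show that collection and finalization succeed in every check and only the report stage fails.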
Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.0
  Platform Name: Intel(R) OpenCL
  Platform Vendor: Intel(R) Corporation
  Platform Extensions: cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_image2d_from_buffer cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_spir

  Platform Name: Intel(R) OpenCL
Number of devices: 2

  Device Type: CL_DEVICE_TYPE_GPU
  Vendor ID: 8086h
  Max compute units: 24
  Max work items dimensions: 3
    Max work items[0]: 256
    Max work items[1]: 256
    Max work items[2]: 256
  Max work group size: 256
  Preferred vector width char: 16
  Preferred vector width short: 8
  Preferred vector width int: 4
  Preferred vector width long: 1
  Preferred vector width float: 1
  Preferred vector width double: 1
  Native vector width char: 16
  Native vector width short: 8
  Native vector width int: 4
  Native vector width long: 1
  Native vector width float: 1
  Native vector width double: 1
  Max clock frequency: 1000Mhz
  Address bits: 64
  Max memory allocation: 3296224870
  Image support: Yes
  Max number of images read arguments: 128
  Max number of images write arguments: 128
  Max image 2D width: 16384
  Max image 2D height: 16384
  Max image 3D width: 16384
  Max image 3D height: 16384
  Max image 3D depth: 2048
  Max samplers within kernel: 16
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 1024
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: Yes
  Cache type: Read/Write
  Cache line size: 64
  Cache size: 524288
  Global memory size: 6592449741
  Constant buffer size: 3296224870
  Max number of constant args: 8
  Local memory type: Scratchpad
  Local memory size: 65536
  Max pipe arguments: 16
  Max pipe active reservations: 1
  Max pipe packet size: 1024
  Max global variable size: 65536
  Max global variable preferred total size: 3296224870
  Max read/write image args: 128
  Max on device events: 1024
  Queue on device max size: 67108864
  Max on device queues: 1
  Queue on device preferred size: 131072
  SVM capabilities:
    Coarse grain buffer: Yes
    Fine grain buffer: No
    Fine grain system: No
    Atomics: No
  Preferred platform atomic alignment: 64
  Preferred global atomic alignment: 64
  Preferred local atomic alignment: 64
  Kernel Preferred work group size multiple: 32
  Error correction support: 0
  Unified memory for Host and Device: 1
  Profiling timer resolution: 83
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue on Host properties:
    Out-of-Order: Yes
    Profiling : Yes
  Queue on Device properties:
    Out-of-Order: Yes
    Profiling : Yes
  Platform ID: 0x11b1110
  Name: Intel(R) HD Graphics
  Vendor: Intel(R) Corporation
  Device OpenCL C version: OpenCL C 2.0
  Driver version: r4.0.59481
  Profile: FULL_PROFILE
  Version: OpenCL 2.0
  Extensions: cl_intel_accelerator cl_intel_advanced_motion_estimation cl_intel_device_side_avc_motion_estimation cl_intel_driver_diagnostics cl_intel_media_block_io cl_intel_motion_estimation cl_intel_planar_yuv cl_intel_packed_yuv cl_intel_required_subgroup_size cl_intel_subgroups cl_intel_subgroups_short cl_intel_va_api_media_sharing cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_fp16 cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_image2d_from_buffer cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_khr_spir cl_khr_subgroups

  Device Type: CL_DEVICE_TYPE_CPU
  Vendor ID: 8086h
  Max compute units: 4
  Max work items dimensions: 3
    Max work items[0]: 8192
    Max work items[1]: 8192
    Max work items[2]: 8192
  Max work group size: 8192
  Preferred vector width char: 1
  Preferred vector width short: 1
  Preferred vector width int: 1
  Preferred vector width long: 1
  Preferred vector width float: 1
  Preferred vector width double: 1
  Native vector width char: 32
  Native vector width short: 16
  Native vector width int: 8
  Native vector width long: 4
  Native vector width float: 8
  Native vector width double: 4
  Max clock frequency: 2300Mhz
  Address bits: 64
  Max memory allocation: 2062761984
  Image support: Yes
  Max number of images read arguments: 480
  Max number of images write arguments: 480
  Max image 2D width: 16384
  Max image 2D height: 16384
  Max image 3D width: 2048
  Max image 3D height: 2048
  Max image 3D depth: 2048
  Max samplers within kernel: 480
  Max size of kernel argument: 3840
  Alignment (bits) of base address: 1024
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: No
    Round to +ve and infinity: No
    IEEE754-2008 fused multiply-add: No
  Cache type: Read/Write
  Cache line size: 64
  Cache size: 262144
  Global memory size: 8251047936
  Constant buffer size: 131072
  Max number of constant args: 480
  Local memory type: Global
  Local memory size: 32768
  Max pipe arguments: 16
  Max pipe active reservations: 65535
  Max pipe packet size: 1024
  Max global variable size: 65536
  Max global variable preferred total size: 65536
  Max read/write image args: 480
  Max on device events: 4294967295
  Queue on device max size: 4294967295
  Max on device queues: 4294967295
  Queue on device preferred size: 4294967295
  SVM capabilities:
    Coarse grain buffer: Yes
    Fine grain buffer: No
    Fine grain system: No
    Atomics: No
  Preferred platform atomic alignment: 64
  Preferred global atomic alignment: 64
  Preferred local atomic alignment: 0
  Kernel Preferred work group size multiple: 128
  Error correction support: 0
  Unified memory for Host and Device: 1
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: Yes
  Queue on Host properties:
    Out-of-Order: Yes
    Profiling : Yes
  Queue on Device properties:
    Out-of-Order: Yes
    Profiling : Yes
  Platform ID: 0x11b1110
  Name: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
  Vendor: Intel(R) Corporation
  Device OpenCL C version: OpenCL C 2.0
  Driver version: 1.2.0.400
  Profile: FULL_PROFILE
  Version: OpenCL 2.0 (Build 400)
  Extensions: cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_3d_image_writes cl_intel_exec_by_local_thread cl_khr_spir cl_khr_fp64 cl_khr_image2d_from_buffer