I am experiencing repeated crashes with the Sync Benchmark application on an Elkhart Lake GPU (it works fine on CPU). Below are the details and steps to reproduce the issue, along with some debug logs.
I changed sync_benchmark to run for 100 seconds instead of the default 10 seconds, and with that change I consistently get the clWaitForEvents error.
Environment:
- OS: Ubuntu 22.04
- OpenVINO Version: 2024.2
- Compute Runtime/OpenCL Version: 24.22.29735.20
- Model Used: Person Detection Retail-0002
The Sync Benchmark application consistently hangs and eventually crashes when attempting to run inference on the specified model. The application is executed with the following command:
./sync_benchmark /home/test/intel/person-detection-retail-0002/FP16-INT8/person-detection-retail-0002.xml GPU
Kernel Messages (dmesg): Multiple entries of GPU hangs, for instance:
i915 xxxxx [drm] GPU HANG: ecode 11:1:8ed9fff3, in sync_benchmark [4524]
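For anyone collecting these entries across runs, a minimal sketch of a filter for the hang lines might look like the following. The regular expression is an assumption based only on the single line format quoted above; other kernel versions may print the message differently, and `find_gpu_hangs` is an illustrative name, not part of any tool.

```python
import re

# Pattern inferred from the one sample line in this post (an assumption):
#   i915 xxxxx [drm] GPU HANG: ecode 11:1:8ed9fff3, in sync_benchmark [4524]
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[0-9a-fA-F:]+), in (?P<process>\S+) \[(?P<pid>\d+)\]"
)


def find_gpu_hangs(log_text):
    """Return one dict per GPU HANG line found in the given kernel log text."""
    hangs = []
    for line in log_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            hangs.append(
                {
                    "ecode": match.group("ecode"),
                    "process": match.group("process"),
                    "pid": int(match.group("pid")),
                }
            )
    return hangs


if __name__ == "__main__":
    sample = "i915 xxxxx [drm] GPU HANG: ecode 11:1:8ed9fff3, in sync_benchmark [4524]"
    for hang in find_gpu_hangs(sample):
        print(hang)
```

The text passed in could come from `dmesg` output saved to a file; the function itself only does string matching and does not need root access.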
I have tried the following kernel parameters (note that the error occurs with and without them):
i915.enable_hangcheck=0
i915.request_timeout_ms=200000
intel_idle.max_cstate=1
i915.enable_dc=0
ahci.mobile_lpm_policy=1
i915.enable_psr2_sel_fetch=0
i915.enable_psr=0
vt.handoff=7
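To confirm that parameters like these actually reached the kernel, one option is to compare them against the kernel command line. The sketch below is only an illustration: it splits on whitespace, so it does not handle quoted values containing spaces, and the helper names are hypothetical.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def check_params(cmdline, wanted):
    """Return {name: True/False} telling whether each wanted value is active."""
    active = parse_cmdline(cmdline)
    return {name: active.get(name) == value for name, value in wanted.items()}


if __name__ == "__main__":
    # Sample string for illustration; on a live Linux system the real value
    # can be read from /proc/cmdline.
    sample = "quiet splash i915.enable_hangcheck=0 i915.request_timeout_ms=200000"
    wanted = {
        "i915.enable_hangcheck": "0",
        "i915.request_timeout_ms": "200000",
        "i915.enable_psr": "0",
    }
    for name, ok in check_params(sample, wanted).items():
        print(f"{name}: {'set' if ok else 'missing or different'}")
```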
Any insights or solutions to address these crashes would be highly appreciated. I have attached the relevant outputs and configurations for reference.
Hi PrateekSingh,
Thanks for reaching out.
Have you configured the GPU with OpenVINO™ before? The Intel® graphics driver must be properly configured on the system. For Ubuntu 22.04 LTS, you have to download and install the deb packages published here, and install the apt package ocl-icd-libopencl1, which provides the OpenCL ICD loader. You may refer to Configurations for Intel® Processor Graphics (GPU) with OpenVINO™.
Meanwhile, there is a similar issue in the discussion below that might help you. You may try the workaround shared by our team:
https://github.com/openvinotoolkit/openvino/issues/9884#issuecomment-1021137278
Regards,
Aznie
Thank you, Aznie, for your prompt response. However, I have already done what you suggested.
As noted in the Environment section of my question, I am using Compute Runtime/OpenCL version 24.22.29735.20 (the same one you suggested) and have configured the GPU following the steps here. I have also verified with intel_gpu_top that the GPU is being used.
I have also disabled the hang check, as mentioned in the last section of the question:
i915.enable_hangcheck=0
I have also tweaked a few other i915 parameters, but they have no effect. It is worth mentioning that enable_hangcheck and request_timeout_ms do delay the error, but it persists nonetheless. I have also tested other hardware (e.g. Apollo Lake), where I can run inference on the GPU over long durations. On Elkhart Lake, inference starts fine but crashes with a clWaitForEvents error after some time. The time to failure differs on every run, but it is more than 10 seconds, which is why I had to modify your sync benchmark app to run for 100 seconds to reproduce the crash.
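Since the time to failure varies per run, a time-boxed loop that records when the first error occurs can make runs easier to compare. This is a rough sketch, not the actual sync_benchmark source: `infer` is a stand-in for whatever blocking inference call is used, and `run_sync_loop` is a hypothetical name.

```python
import time


def run_sync_loop(infer, seconds_to_run):
    """Call infer() back to back until the time budget is spent.

    Returns (latencies_ms, error): error is None if the loop finished,
    otherwise the exception raised by infer().
    """
    latencies_ms = []
    now = time.perf_counter()
    deadline = now + seconds_to_run
    while now < deadline:
        try:
            infer()
        except Exception as exc:  # report the failure instead of crashing
            return latencies_ms, exc
        end = time.perf_counter()
        latencies_ms.append((end - now) * 1000.0)
        now = end
    return latencies_ms, None


if __name__ == "__main__":
    # Stand-in workload; a real test would pass a blocking inference call.
    latencies, error = run_sync_loop(lambda: time.sleep(0.01), seconds_to_run=0.1)
    elapsed_s = sum(latencies) / 1000.0
    print(f"{len(latencies)} iterations in {elapsed_s:.2f} s, error: {error!r}")
```

Summing the recorded latencies gives the approximate time at which the failing call started, which is the number that differs between runs here.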
Hi PrateekSingh,
I will escalate this to the engineering team and update you once the information is available.
Regards,
Aznie
Hi PrateekSingh,
The Sync_Benchmark tool is no longer supported by the latest version of OpenVINO. You can use Benchmark_App instead. Please try using Benchmark App with the latest driver and see if it works.
Regards,
Aznie
Hi Aznie,
I was able to reproduce the error using Benchmark App as well. This is the output:
./benchmark_app -d GPU -m ~/intel/person-detection-retail-0002/FP16-INT8/person-detection-retail-0002.xml -hint none -niter 10000 -nstreams 1
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.2.0-15519-5c0f38f83f6-releases/2024/2
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2024.2.0-15519-5c0f38f83f6-releases/2024/2
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 46.33 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ] data (node: data) : f32 / [N,C,H,W] / [1,3,544,992]
[ INFO ] im_info:0 , im_info (node: im_info) : f32 / [H,W] / [1,6]
[ INFO ] Network outputs:
[ INFO ] detection_out (node: detection_out) : f32 / [...] / [1,1,200,7]
[Step 5/11] Resizing model to match image sizes and given batch
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ] data (node: data) : u8 / [N,C,H,W] / [1,3,544,992]
[ INFO ] im_info:0 , im_info (node: im_info) : f32 / [H,W] / [1,6]
[ INFO ] Network outputs:
[ INFO ] detection_out (node: detection_out) : f32 / [...] / [1,1,200,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 1516.28 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: PVANet + R-FCN
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ] PERF_COUNT: NO
[ INFO ] ENABLE_CPU_PINNING: NO
[ INFO ] MODEL_PRIORITY: MEDIUM
[ INFO ] GPU_HOST_TASK_PRIORITY: MEDIUM
[ INFO ] GPU_QUEUE_PRIORITY: MEDIUM
[ INFO ] GPU_QUEUE_THROTTLE: MEDIUM
[ INFO ] GPU_ENABLE_LOOP_UNROLLING: YES
[ INFO ] GPU_DISABLE_WINOGRAD_CONVOLUTION: NO
[ INFO ] CACHE_DIR:
[ INFO ] CACHE_MODE: optimize_speed
[ INFO ] PERFORMANCE_HINT: LATENCY
[ INFO ] EXECUTION_MODE_HINT: PERFORMANCE
[ INFO ] COMPILATION_NUM_THREADS: 4
[ INFO ] NUM_STREAMS: 1
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] INFERENCE_PRECISION_HINT: f16
[ INFO ] DEVICE_ID: 0
[ INFO ] EXECUTION_DEVICES: GPU.0
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] data ([N,C,H,W], u8, [1,3,544,992], static): random (image/numpy array is expected)
[ INFO ] im_info ([H,W], f32, [1,6], static): random (binary data/numpy array is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests using 1 streams for GPU, limits: 10000 iterations)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 194.28 ms
[ ERROR ] Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_event.cpp:56:
[GPU] clWaitForEvents, error code: -14
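For reference, the numeric code in that message can be mapped to its symbolic name from the standard OpenCL header. In `CL/cl.h`, -14 is `CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST`, meaning one of the events being waited on finished with an error status. The sketch below covers only codes 0 through -14, and the message pattern is an assumption based on the single error line above.

```python
import re

# Symbolic names for a subset of OpenCL error codes, as defined in CL/cl.h.
CL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -7: "CL_PROFILING_INFO_NOT_AVAILABLE",
    -8: "CL_MEM_COPY_OVERLAP",
    -9: "CL_IMAGE_FORMAT_MISMATCH",
    -10: "CL_IMAGE_FORMAT_NOT_SUPPORTED",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -12: "CL_MAP_FAILURE",
    -13: "CL_MISALIGNED_SUB_BUFFER_OFFSET",
    -14: "CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST",
}


def describe_cl_error(code):
    """Return the symbolic name for an OpenCL error code, if known."""
    return CL_ERRORS.get(code, f"unknown OpenCL error ({code})")


def extract_error_code(message):
    """Pull the integer after 'error code:' out of a log message, else None."""
    match = re.search(r"error code:\s*(-?\d+)", message)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    line = "[GPU] clWaitForEvents, error code: -14"
    print(describe_cl_error(extract_error_code(line)))
```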
Hi PrateekSingh,
Thanks for sharing your results. We are checking on this and will get back to you soon.
Regards,
Aznie
Following up on this.
Hi PrateekSingh,
Sorry for the delay on this case; we are still validating the issue and working to find its root cause.
Which Elkhart Lake series are you currently using?
You can find out with the command 'lscpu'.
As far as we know, certain Elkhart Lake series might not be supported.
Hope to hear from you soon.
Thank you
Thanks for looking into this. This is the lscpu output:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 39 bits physical, 48 bits virtual
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 150
Model name: Intel(R) Celeron(R) J6412 @ 2.00GHz
Stepping: 1
Frequency boost: enabled
CPU MHz: 800.000
CPU max MHz: 2001.0000
CPU min MHz: 800.0000
BogoMIPS: 3993.60
Virtualisation: VT-x
L1d cache: 128 KiB
L1i cache: 128 KiB
L2 cache: 1.5 MiB
L3 cache: 4 MiB
NUMA node0 CPU(s): 0-3
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Mitigation; Clear CPU buffers; SMT disabled
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Vulnerable: No microcode
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand lahf_lm 3dnowprefetch cpuid_fault epb cat_l2 cdp_l2 ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust smep erms rdt_a rdseed smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect dtherm ida arat pln pts umip waitpkg gfni rdpid movdiri movdir64b md_clear flush_l1d arch_capabilities
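The fields that identify the processor series can be pulled out of this output with a small parser. The sketch below assumes the `Key: value` layout shown above, keeps the first occurrence of each key, and ignores lines without a colon; `parse_lscpu` is an illustrative name.

```python
def parse_lscpu(text):
    """Parse 'Key: value' lines from lscpu output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key and key not in info:
            info[key] = value.strip()
    return info


if __name__ == "__main__":
    sample = """Vendor ID: GenuineIntel
CPU family: 6
Model: 150
Model name: Intel(R) Celeron(R) J6412 @ 2.00GHz
Stepping: 1"""
    cpu = parse_lscpu(sample)
    for field in ("Model name", "CPU family", "Model", "Stepping"):
        print(f"{field}: {cpu.get(field)}")
```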
Thank you for the lscpu output. We will analyze it and get back to you.
I noticed you're not using the latest version of the compute runtime driver. Are you able to try it? https://github.com/intel/compute-runtime/releases/tag/24.35.30872.22
Yes, I will get back to you after trying this latest release. I hope the issue is fixed in this version.
However, I noticed that Elkhart Lake is marked as Legacy quality in the 24.35.30872.22 release, whereas in the previous release, 24.22.29735.20 (the one I have used), Elkhart Lake is marked as Production quality.
Understood, thank you for your feedback. If another version of the compute driver does not help, I may have to consult the developers about this case.
Thanks for your patience, Prateek. Developers are asking for clinfo results and output of hello_query_device in order to understand your case better. Could you run those commands and post the output here?
Hello Prateek, we haven't received a reply from you for 3 business days. If we don't receive the information we asked for within 4 more business days, we will have to close this case. Thank you for your understanding.
Hi,
This is the clinfo output. I'll add the device query output on Monday.
clinfo
Number of platforms 1
Platform Name Intel(R) OpenCL Graphics
Platform Vendor Intel(R) Corporation
Platform Version OpenCL 3.0
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_byte_addressable_store cl_khr_device_uuid cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_command_queue_families cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_driver_diagnostics cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_intel_subgroups_char cl_intel_subgroups_long cl_khr_il_program cl_intel_mem_force_host_memory cl_khr_subgroup_extended_types cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_ballot cl_khr_subgroup_non_uniform_arithmetic cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative cl_khr_subgroup_clustered_reduce cl_intel_device_attribute_query cl_khr_suggested_local_work_size cl_intel_split_work_group_barrier cl_ext_float_atomics cl_khr_external_memory cl_intel_planar_yuv cl_intel_packed_yuv cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_3d_image_writes cl_intel_media_block_io cl_intel_subgroup_local_block_io cl_khr_gl_sharing cl_khr_gl_depth_images cl_khr_gl_event cl_khr_gl_msaa_sharing cl_intel_va_api_media_sharing cl_intel_sharing_format_query cl_khr_pci_bus_info
Platform Extensions with Version cl_khr_byte_addressable_store 0x400000 (1.0.0)
cl_khr_device_uuid 0x400000 (1.0.0)
cl_khr_fp16 0x400000 (1.0.0)
cl_khr_global_int32_base_atomics 0x400000 (1.0.0)
cl_khr_global_int32_extended_atomics 0x400000 (1.0.0)
cl_khr_icd 0x400000 (1.0.0)
cl_khr_local_int32_base_atomics 0x400000 (1.0.0)
cl_khr_local_int32_extended_atomics 0x400000 (1.0.0)
cl_intel_command_queue_families 0x400000 (1.0.0)
cl_intel_subgroups 0x400000 (1.0.0)
cl_intel_required_subgroup_size 0x400000 (1.0.0)
cl_intel_subgroups_short 0x400000 (1.0.0)
cl_khr_spir 0x400000 (1.0.0)
cl_intel_accelerator 0x400000 (1.0.0)
cl_intel_driver_diagnostics 0x400000 (1.0.0)
cl_khr_priority_hints 0x400000 (1.0.0)
cl_khr_throttle_hints 0x400000 (1.0.0)
cl_khr_create_command_queue 0x400000 (1.0.0)
cl_intel_subgroups_char 0x400000 (1.0.0)
cl_intel_subgroups_long 0x400000 (1.0.0)
cl_khr_il_program 0x400000 (1.0.0)
cl_intel_mem_force_host_memory 0x400000 (1.0.0)
cl_khr_subgroup_extended_types 0x400000 (1.0.0)
cl_khr_subgroup_non_uniform_vote 0x400000 (1.0.0)
cl_khr_subgroup_ballot 0x400000 (1.0.0)
cl_khr_subgroup_non_uniform_arithmetic 0x400000 (1.0.0)
cl_khr_subgroup_shuffle 0x400000 (1.0.0)
cl_khr_subgroup_shuffle_relative 0x400000 (1.0.0)
cl_khr_subgroup_clustered_reduce 0x400000 (1.0.0)
cl_intel_device_attribute_query 0x400000 (1.0.0)
cl_khr_suggested_local_work_size 0x400000 (1.0.0)
cl_intel_split_work_group_barrier 0x400000 (1.0.0)
cl_ext_float_atomics 0x400000 (1.0.0)
cl_khr_external_memory 0x9001 (0.9.1)
cl_intel_planar_yuv 0x400000 (1.0.0)
cl_intel_packed_yuv 0x400000 (1.0.0)
cl_khr_image2d_from_buffer 0x400000 (1.0.0)
cl_khr_depth_images 0x400000 (1.0.0)
cl_khr_3d_image_writes 0x400000 (1.0.0)
cl_intel_media_block_io 0x400000 (1.0.0)
cl_intel_subgroup_local_block_io 0x400000 (1.0.0)
cl_khr_gl_sharing 0x400000 (1.0.0)
cl_khr_gl_depth_images 0x400000 (1.0.0)
cl_khr_gl_event 0x400000 (1.0.0)
cl_khr_gl_msaa_sharing 0x400000 (1.0.0)
cl_intel_va_api_media_sharing 0x400000 (1.0.0)
cl_intel_sharing_format_query 0x400000 (1.0.0)
cl_khr_pci_bus_info 0x400000 (1.0.0)
Platform Numeric Version 0xc00000 (3.0.0)
Platform Extensions function suffix INTEL
Platform Host timer resolution 1ns
Platform Name Intel(R) OpenCL Graphics
Number of devices 1
Device Name Intel(R) UHD Graphics
Device Vendor Intel(R) Corporation
Device Vendor ID 0x8086
Device Version OpenCL 3.0 NEO
Device UUID 86805545-0100-0000-0002-000000000000
Driver UUID 32342e32-322e-3239-3733-352e32300000
Valid Device LUID No
Device LUID e049-7c2dfe7f0000
Device Node Mask 0
Device Numeric Version 0xc00000 (3.0.0)
Driver Version 24.22.29735.20
Device OpenCL C Version OpenCL C 1.2
Device OpenCL C all versions OpenCL C 0x400000 (1.0.0)
OpenCL C 0x401000 (1.1.0)
OpenCL C 0x402000 (1.2.0)
OpenCL C 0xc00000 (3.0.0)
Device OpenCL C features __opencl_c_int64 0xc00000 (3.0.0)
__opencl_c_3d_image_writes 0xc00000 (3.0.0)
__opencl_c_images 0xc00000 (3.0.0)
__opencl_c_read_write_images 0xc00000 (3.0.0)
Latest comfornace test passed v2024-02-27-00
Device Type GPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 16
Max clock frequency 800MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Preferred work group size multiple (device) 32
Preferred work group size multiple (kernel) 32
Max sub-groups per work group 0
Sub-group sizes (Intel) 8, 16, 32
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 1 / 1
half 8 / 8 (cl_khr_fp16)
float 1 / 1
double 0 / 0 (n/a)
Half-precision Floating-point support (cl_khr_fp16)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 32, Little-Endian
Global memory size 4037267456 (3.76GiB)
Error Correction support No
Max memory allocation 2018633728 (1.88GiB)
Unified memory for Host and Device Yes
Shared Virtual Memory (SVM) capabilities (core)
Coarse-grained buffer sharing No
Fine-grained buffer sharing No
Fine-grained system sharing No
Atomics No
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Preferred alignment for atomics
SVM 64 bytes
Global 64 bytes
Local 64 bytes
Atomic memory capabilities relaxed, work-group scope
Atomic fence capabilities relaxed, acquire/release, work-group scope
Max size for global variable 0
Preferred total size of global vars 0
Global Memory cache type Read/Write
Global Memory cache size 1310720 (1.25MiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 126164608 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 4 bytes
Pitch alignment for 2D image buffers 4 pixels
Max 2D image size 16384x16384 pixels
Max planar YUV image size 16384x16352 pixels
Max 3D image size 16384x16384x2048 pixels
Max number of read image args 128
Max number of write image args 128
Max number of read/write image args 128
Pipe support No
Max number of pipe args 0
Max active pipe reservations 0
Max pipe packet size 0
Local memory type Local
Local memory size 65536 (64KiB)
Max number of constant args 8
Max constant buffer size 2018633728 (1.88GiB)
Generic address space support No
Max size of kernel argument 2048 (2KiB)
Queue properties (on host)
Out-of-order execution Yes
Profiling Yes
Device enqueue capabilities (n/a)
Queue properties (on device)
Out-of-order execution No
Profiling No
Preferred size 0
Max size 0
Max queues on device 0
Max events on device 0
Prefer user sync for interop Yes
Profiling timer resolution 52ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Non-uniform work-groups Yes
Work-group collective functions No
Sub-group independent forward progress No
IL version SPIR-V_1.3 SPIR-V_1.2 SPIR-V_1.1 SPIR-V_1.0
ILs with version SPIR-V 0x403000 (1.3.0)
SPIR-V 0x402000 (1.2.0)
SPIR-V 0x401000 (1.1.0)
SPIR-V 0x400000 (1.0.0)
SPIR versions 1.2
printf() buffer size 4194304 (4MiB)
Built-in kernels (n/a)
Built-in kernels with version (n/a)
Device Extensions cl_khr_byte_addressable_store cl_khr_device_uuid cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_command_queue_families cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_driver_diagnostics cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_intel_subgroups_char cl_intel_subgroups_long cl_khr_il_program cl_intel_mem_force_host_memory cl_khr_subgroup_extended_types cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_ballot cl_khr_subgroup_non_uniform_arithmetic cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative cl_khr_subgroup_clustered_reduce cl_intel_device_attribute_query cl_khr_suggested_local_work_size cl_intel_split_work_group_barrier cl_ext_float_atomics cl_khr_external_memory cl_intel_planar_yuv cl_intel_packed_yuv cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_3d_image_writes cl_intel_media_block_io cl_intel_subgroup_local_block_io cl_khr_gl_sharing cl_khr_gl_depth_images cl_khr_gl_event cl_khr_gl_msaa_sharing cl_intel_va_api_media_sharing cl_intel_sharing_format_query cl_khr_pci_bus_info
Device Extensions with Version cl_khr_byte_addressable_store 0x400000 (1.0.0)
cl_khr_device_uuid 0x400000 (1.0.0)
cl_khr_fp16 0x400000 (1.0.0)
cl_khr_global_int32_base_atomics 0x400000 (1.0.0)
cl_khr_global_int32_extended_atomics 0x400000 (1.0.0)
cl_khr_icd 0x400000 (1.0.0)
cl_khr_local_int32_base_atomics 0x400000 (1.0.0)
cl_khr_local_int32_extended_atomics 0x400000 (1.0.0)
cl_intel_command_queue_families 0x400000 (1.0.0)
cl_intel_subgroups 0x400000 (1.0.0)
cl_intel_required_subgroup_size 0x400000 (1.0.0)
cl_intel_subgroups_short 0x400000 (1.0.0)
cl_khr_spir 0x400000 (1.0.0)
cl_intel_accelerator 0x400000 (1.0.0)
cl_intel_driver_diagnostics 0x400000 (1.0.0)
cl_khr_priority_hints 0x400000 (1.0.0)
cl_khr_throttle_hints 0x400000 (1.0.0)
cl_khr_create_command_queue 0x400000 (1.0.0)
cl_intel_subgroups_char 0x400000 (1.0.0)
cl_intel_subgroups_long 0x400000 (1.0.0)
cl_khr_il_program 0x400000 (1.0.0)
cl_intel_mem_force_host_memory 0x400000 (1.0.0)
cl_khr_subgroup_extended_types 0x400000 (1.0.0)
cl_khr_subgroup_non_uniform_vote 0x400000 (1.0.0)
cl_khr_subgroup_ballot 0x400000 (1.0.0)
cl_khr_subgroup_non_uniform_arithmetic 0x400000 (1.0.0)
cl_khr_subgroup_shuffle 0x400000 (1.0.0)
cl_khr_subgroup_shuffle_relative 0x400000 (1.0.0)
cl_khr_subgroup_clustered_reduce 0x400000 (1.0.0)
cl_intel_device_attribute_query 0x400000 (1.0.0)
cl_khr_suggested_local_work_size 0x400000 (1.0.0)
cl_intel_split_work_group_barrier 0x400000 (1.0.0)
cl_ext_float_atomics 0x400000 (1.0.0)
cl_khr_external_memory 0x9001 (0.9.1)
cl_intel_planar_yuv 0x400000 (1.0.0)
cl_intel_packed_yuv 0x400000 (1.0.0)
cl_khr_image2d_from_buffer 0x400000 (1.0.0)
cl_khr_depth_images 0x400000 (1.0.0)
cl_khr_3d_image_writes 0x400000 (1.0.0)
cl_intel_media_block_io 0x400000 (1.0.0)
cl_intel_subgroup_local_block_io 0x400000 (1.0.0)
cl_khr_gl_sharing 0x400000 (1.0.0)
cl_khr_gl_depth_images 0x400000 (1.0.0)
cl_khr_gl_event 0x400000 (1.0.0)
cl_khr_gl_msaa_sharing 0x400000 (1.0.0)
cl_intel_va_api_media_sharing 0x400000 (1.0.0)
cl_intel_sharing_format_query 0x400000 (1.0.0)
cl_khr_pci_bus_info 0x400000 (1.0.0)
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) Intel(R) OpenCL Graphics
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [INTEL]
clCreateContext(NULL, ...) [default] Success [INTEL]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Intel(R) OpenCL Graphics
Device Name Intel(R) UHD Graphics
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Intel(R) OpenCL Graphics
Device Name Intel(R) UHD Graphics
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Intel(R) OpenCL Graphics
Device Name Intel(R) UHD Graphics
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.14
ICD loader Profile OpenCL 3.0
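To check which compute runtime a dump like this came from, the driver version line can be extracted and compared against another release number. This is a small sketch that assumes each clinfo line starts with the field name followed by whitespace, as in the output above; `clinfo_value` and `version_tuple` are hypothetical helper names.

```python
def clinfo_value(text, key):
    """Return the value on the first clinfo line that starts with `key`, else None."""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(key + " "):
            return stripped[len(key):].strip()
    return None


def version_tuple(version):
    """Turn a dotted version such as '24.22.29735.20' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    sample = "Device Name Intel(R) UHD Graphics\nDriver Version 24.22.29735.20\n"
    installed = clinfo_value(sample, "Driver Version")
    suggested = "24.35.30872.22"  # release suggested earlier in this thread
    older = version_tuple(installed) < version_tuple(suggested)
    print(f"installed {installed}, older than {suggested}: {older}")
```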
Thank you, I relayed the clinfo output to the developers.
Hello Prateek, could you add a hello_query_device output for better understanding of your issue? Thanks.
This is the output from device query:
./hello_query_device
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ]
Illegal instruction (core dumped)
The error log for reference:
tail -f /var/log/kern.log
Oct 9 17:21:12 elkhart kernel: [ 265.763224] audit: type=1400 audit(1728454872.822:67): apparmor="DENIED" operation="capable" profile="/usr/lib/snapd/snap-confine" pid=2032 comm="snap-confine" capability=12 capname="net_admin"
Oct 9 17:21:12 elkhart kernel: [ 265.763233] audit: type=1400 audit(1728454872.822:68): apparmor="DENIED" operation="capable" profile="/usr/lib/snapd/snap-confine" pid=2032 comm="snap-confine" capability=38 capname="perfmon"
Oct 9 17:21:12 elkhart kernel: [ 265.856954] audit: type=1400 audit(1728454872.914:69): apparmor="DENIED" operation="open" profile="snap-update-ns.firefox" name="/usr/local/share/" pid=2053 comm="5" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
Oct 9 17:21:14 elkhart kernel: [ 267.440401] audit: type=1107 audit(1728454874.498:70): pid=619 uid=103 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call" bus="system" path="/org/freedesktop/timedate1" interface="org.freedesktop.DBus.Properties" member="GetAll" mask="send" name=":1.104" pid=2032 label="snap.firefox.firefox" peer_pid=2147 peer_label="unconfined"
Oct 9 17:21:14 elkhart kernel: [ 267.440401] exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Oct 9 17:21:14 elkhart kernel: [ 267.446291] audit: type=1107 audit(1728454874.506:71): pid=619 uid=103 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call" bus="system" path="/org/freedesktop/timedate1" interface="org.freedesktop.DBus.Properties" member="GetAll" mask="send" name=":1.104" pid=2032 label="snap.firefox.firefox" peer_pid=2147 peer_label="unconfined"
Oct 9 17:21:14 elkhart kernel: [ 267.446291] exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Oct 9 17:23:51 elkhart kernel: [ 424.217606] audit: type=1107 audit(1728455031.281:72): pid=619 uid=103 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call" bus="system" path="/org/freedesktop/timedate1" interface="org.freedesktop.DBus.Properties" member="GetAll" mask="send" name=":1.122" pid=2032 label="snap.firefox.firefox" peer_pid=4039 peer_label="unconfined"
Oct 9 17:23:51 elkhart kernel: [ 424.217606] exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Oct 9 17:37:31 elkhart kernel: [ 1244.723409] traps: hello_query_dev[6474] trap invalid opcode ip:7f07388c0693 sp:7ffccea047f0 error:0 in libopenvino_intel_npu_plugin.so[7f073889f000+1e8000]
Oct 9 17:52:14 elkhart kernel: [ 2126.953072] i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
Oct 9 17:52:14 elkhart kernel: [ 2126.953096] i915 0000:00:02.0: [drm] benchmark_app[8047] context reset due to GPU hang
Oct 9 17:52:14 elkhart kernel: [ 2126.958102] i915 0000:00:02.0: [drm] GPU HANG: ecode 11:1:8ed9fff3, in benchmark_app [8047]
Oct 9 17:59:43 elkhart kernel: [ 2576.268271] traps: hello_query_dev[8614] trap invalid opcode ip:7f20d41d9693 sp:7ffdbe5a3170 error:0 in libopenvino_intel_npu_plugin.so[7f20d41b8000+1e8000]
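The two `traps:` lines in this log name the library that contained the faulting instruction (libopenvino_intel_npu_plugin.so), which suggests the hello_query_device crash may be a separate problem from the GPU hang. A sketch for extracting that information is below; the pattern is an assumption derived from the lines above and may not match other trap formats.

```python
import re

# Pattern inferred from the 'traps:' lines in this log (an assumption).
TRAP_RE = re.compile(
    r"traps: (?P<comm>[^\[\s]+)\[(?P<pid>\d+)\] trap (?P<trap>.+?) "
    r"ip:\S+ sp:\S+ error:\S+ in (?P<module>[^\[\s]+)\["
)


def find_traps(log_text):
    """Return (process, pid, trap type, module) for each matching traps line."""
    results = []
    for line in log_text.splitlines():
        match = TRAP_RE.search(line)
        if match:
            results.append(
                (
                    match.group("comm"),
                    int(match.group("pid")),
                    match.group("trap"),
                    match.group("module"),
                )
            )
    return results


if __name__ == "__main__":
    sample = (
        "Oct 9 17:37:31 elkhart kernel: [ 1244.723409] traps: hello_query_dev[6474] "
        "trap invalid opcode ip:7f07388c0693 sp:7ffccea047f0 error:0 in "
        "libopenvino_intel_npu_plugin.so[7f073889f000+1e8000]"
    )
    for comm, pid, trap, module in find_traps(sample):
        print(f"{comm} (pid {pid}): {trap} in {module}")
```

The process name is printed as the kernel logged it, so it appears truncated (hello_query_dev) rather than as the full binary name.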