Intel® Distribution of OpenVINO™ Toolkit

Sync benchmark app crashes on Elkhart Lake

PrateekSingh
Novice

I am experiencing repeated crashes with the Sync Benchmark application on an Elkhart Lake GPU (it works fine on CPU). Below are the details and steps to reproduce the issue, along with some debug logs.

I changed sync_benchmark's run duration from the default 10 seconds to 100 seconds, and with that change I consistently get the clWaitForEvents error.

Environment:

The Sync Benchmark application consistently hangs and eventually crashes when attempting to run inference on the specified model. The application is executed with the following command:

 

./sync_benchmark /home/test/intel/person-detection-retail-0002/FP16-INT8/person-detection-retail-0002.xml GPU

 

Kernel Messages (dmesg): Multiple entries of GPU hangs, for instance:

 

i915 xxxxx [drm] GPU HANG: ecode 11:1:8ed9fff3, in sync_benchmark [4524]
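To count how often these hangs occur across runs, a small parser over the dmesg text can help. This is a minimal sketch: the regular expression is inferred only from the single line quoted above, so it is an assumption that other kernel versions print the message in the same shape, and `parse_gpu_hangs` is a hypothetical helper name.

```python
import re

# Pattern inferred from the one log line quoted in this post (an assumption;
# the exact wording can differ between kernel versions).
GPU_HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[0-9a-fA-F:]+), in (?P<process>\S+) \[(?P<pid>\d+)\]"
)


def parse_gpu_hangs(dmesg_text):
    """Return one dict per GPU HANG line found in dmesg_text."""
    hangs = []
    for line in dmesg_text.splitlines():
        match = GPU_HANG_RE.search(line)
        if match:
            hangs.append(
                {
                    "ecode": match.group("ecode"),
                    "process": match.group("process"),
                    "pid": int(match.group("pid")),
                }
            )
    return hangs


if __name__ == "__main__":
    sample = "i915 xxxxx [drm] GPU HANG: ecode 11:1:8ed9fff3, in sync_benchmark [4524]"
    for hang in parse_gpu_hangs(sample):
        print(hang)
```

Feeding it the saved output of `dmesg` gives a list whose length is the number of hangs, which makes it easier to compare runs with and without a given kernel parameter.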

 

I have tried the following kernel parameters, mostly for i915 (note that the error occurs with and without these flags):

 

i915.enable_hangcheck=0 
i915.request_timeout_ms=200000 
intel_idle.max_cstate=1 
i915.enable_dc=0 
ahci.mobile_lpm_policy=1 
i915.enable_psr2_sel_fetch=0 
i915.enable_psr=0 
vt.handoff=7
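Before concluding that these flags have no effect, it can be worth confirming that they actually reached the running kernel. The sketch below compares the list above against the contents of `/proc/cmdline`. It assumes a simple whitespace-separated command line (quoted values containing spaces are not handled), and `parse_cmdline` and `unapplied` are hypothetical helper names.

```python
from pathlib import Path

# Values copied from the parameter list in this post.
EXPECTED = {
    "i915.enable_hangcheck": "0",
    "i915.request_timeout_ms": "200000",
    "intel_idle.max_cstate": "1",
    "i915.enable_dc": "0",
    "ahci.mobile_lpm_policy": "1",
    "i915.enable_psr2_sel_fetch": "0",
    "i915.enable_psr": "0",
    "vt.handoff": "7",
}


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def unapplied(cmdline, expected=EXPECTED):
    """Return the expected parameters that are absent or set to another value."""
    actual = parse_cmdline(cmdline)
    return {name: value for name, value in expected.items() if actual.get(name) != value}


if __name__ == "__main__":
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        for name, value in unapplied(proc_cmdline.read_text()).items():
            print(f"not applied: {name}={value}")
```

An empty result means every listed parameter is present on the booted command line with the expected value.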

 

Any insights or solutions to address these crashes would be highly appreciated. I have attached the relevant outputs and configurations for reference.

Aznie_Intel
Moderator

Hi PrateekSingh,

 

Thanks for reaching out.

 

Have you configured the GPU with OpenVINO™ before? The Intel® graphics driver must be properly configured on the system. For Ubuntu 22.04 LTS, you have to download and install the deb packages published here and install the apt package ocl-icd-libopencl1, which provides the OpenCL ICD loader. For details, refer to Configurations for Intel® Processor Graphics (GPU) with OpenVINO™.

 

Meanwhile, there is a similar issue in the discussion below that might help you. You may try the workaround shared by our team:

https://github.com/openvinotoolkit/openvino/issues/9884#issuecomment-1021137278

 

 

Regards,

Aznie


PrateekSingh
Novice

Thank you, Aznie, for your prompt response. However, I have already done what you suggested.

 

As noted in the Environment section of my question, I am using Compute Runtime/OpenCL version 24.22.29735.20 (the same one you suggested) and have configured the GPU following the steps here. I have also verified with intel_gpu_top that the GPU is being used.

I have also disabled the hangcheck parameter, as mentioned in the last section of the question:

 

i915.enable_hangcheck=0 

 

I have also tweaked a few other i915 parameters, but they have no effect. It is worth mentioning that enable_hangcheck and request_timeout_ms do delay the error, but it persists nonetheless. I have also tested other hardware (e.g. Apollo Lake), where I can run inference on the GPU over long durations. On Elkhart Lake, inference starts fine but crashes with clWaitForEvents after some time. The time to failure differs on every run, but it always happens after the first 10 seconds, which is why I had to modify your sync benchmark app to run for 100 seconds to reproduce the crash.
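The long-running reproduction described here can also be approximated from Python. The following is a sketch under assumptions, not the modified sample itself: `run_for` and `make_gpu_infer` are hypothetical helper names, the OpenVINO Python package is assumed to be installed for `make_gpu_infer`, and it is assumed that GPU plugin failures surface in Python as `RuntimeError`.

```python
import time


def run_for(duration_s, infer_once, clock=time.monotonic):
    """Call infer_once() back to back until duration_s seconds have elapsed.

    Returns (completed_iterations, error). error is None if the loop ran
    for the full duration without infer_once raising RuntimeError.
    """
    deadline = clock() + duration_s
    completed = 0
    while clock() < deadline:
        try:
            infer_once()
        except RuntimeError as exc:
            return completed, exc
        completed += 1
    return completed, None


def make_gpu_infer(model_path, device="GPU"):
    """Build a zero-argument callable that runs one synchronous inference."""
    # Imported lazily so that run_for can be used without OpenVINO installed.
    import openvino as ov

    core = ov.Core()
    compiled = core.compile_model(model_path, device)
    request = compiled.create_infer_request()
    # infer() with no arguments runs on the request's current input tensors.
    return request.infer
```

A call such as `run_for(100, make_gpu_infer("person-detection-retail-0002.xml"))` would then report how many iterations completed before the first error, which gives a rough time-to-failure figure to compare between driver versions.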

 

Aznie_Intel
Moderator

 

Hi PrateekSingh,

 

I will escalate this to the engineering team and update you once the information is available.

 

 

Regards,

Aznie


Aznie_Intel
Moderator

Hi PrateekSingh,

 

The Sync_Benchmark tool is no longer supported by the latest version of OpenVINO. You can use Benchmark_App instead. Please try using Benchmark App with the latest driver and see if it works.

 

 

Regards,

Aznie


PrateekSingh
Novice

Hi Aznie,

 

I was able to reproduce the error using Benchmark App as well. This is the output:

./benchmark_app -d GPU -m ~/intel/person-detection-retail-0002/FP16-INT8/person-detection-retail-0002.xml -hint none -niter 10000 -nstreams 1
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.2.0-15519-5c0f38f83f6-releases/2024/2
[ INFO ] 
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2024.2.0-15519-5c0f38f83f6-releases/2024/2
[ INFO ] 
[ INFO ] 
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 46.33 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ]     data (node: data) : f32 / [N,C,H,W] / [1,3,544,992]
[ INFO ]     im_info:0 , im_info (node: im_info) : f32 / [H,W] / [1,6]
[ INFO ] Network outputs:
[ INFO ]     detection_out (node: detection_out) : f32 / [...] / [1,1,200,7]
[Step 5/11] Resizing model to match image sizes and given batch
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ]     data (node: data) : u8 / [N,C,H,W] / [1,3,544,992]
[ INFO ]     im_info:0 , im_info (node: im_info) : f32 / [H,W] / [1,6]
[ INFO ] Network outputs:
[ INFO ]     detection_out (node: detection_out) : f32 / [...] / [1,1,200,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 1516.28 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: PVANet + R-FCN
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   PERF_COUNT: NO
[ INFO ]   ENABLE_CPU_PINNING: NO
[ INFO ]   MODEL_PRIORITY: MEDIUM
[ INFO ]   GPU_HOST_TASK_PRIORITY: MEDIUM
[ INFO ]   GPU_QUEUE_PRIORITY: MEDIUM
[ INFO ]   GPU_QUEUE_THROTTLE: MEDIUM
[ INFO ]   GPU_ENABLE_LOOP_UNROLLING: YES
[ INFO ]   GPU_DISABLE_WINOGRAD_CONVOLUTION: NO
[ INFO ]   CACHE_DIR: 
[ INFO ]   CACHE_MODE: optimize_speed
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: PERFORMANCE
[ INFO ]   COMPILATION_NUM_THREADS: 4
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   INFERENCE_PRECISION_HINT: f16
[ INFO ]   DEVICE_ID: 0
[ INFO ]   EXECUTION_DEVICES: GPU.0
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] data     ([N,C,H,W], u8, [1,3,544,992], static):	random (image/numpy array is expected)
[ INFO ] im_info  ([H,W], f32, [1,6], static):	random (binary data/numpy array is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests using 1 streams for GPU, limits: 10000 iterations)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 194.28 ms
[ ERROR ] Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_event.cpp:56:
[GPU] clWaitForEvents, error code: -14
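
For readers decoding the number in this message: in the Khronos `CL/cl.h` header, -14 is `CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST`, which `clWaitForEvents` returns when a command associated with one of the waited-on events ended with an error status. That would be consistent with the GPU HANG entries in dmesg, although that link is an inference and not something confirmed in this thread. Below is a minimal lookup sketch; the table is only a subset of the header, and `describe_opencl_error` is a hypothetical helper name.

```python
import re

# Subset of the status codes defined in the Khronos CL/cl.h header.
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -14: "CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST",
}

# Matches the "error code: <n>" suffix seen in the log line above.
ERROR_CODE_RE = re.compile(r"error code:\s*(-?\d+)")


def describe_opencl_error(message):
    """Return the symbolic name for the error code in message, or None."""
    match = ERROR_CODE_RE.search(message)
    if match is None:
        return None
    code = int(match.group(1))
    return OPENCL_ERRORS.get(code, f"unknown OpenCL error {code}")


if __name__ == "__main__":
    print(describe_opencl_error("[GPU] clWaitForEvents, error code: -14"))
```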
Aznie_Intel
Moderator

Hi PrateekSingh,

 

Thanks for sharing your results. We are checking on this and will get back to you soon.

 

 

Regards,

Aznie


PrateekSingh
Novice

Following up on this.

Hari_B_Intel
Moderator

Hi PrateekSingh,

Sorry for the delay on this case; we are still validating and working to find the root cause of the issue.

Which Elkhart Lake series are you currently using? You can find out by running the 'lscpu' command.

As far as we know, certain Elkhart Lake series might not be supported.


Hope to hear from you soon.


Thank you


PrateekSingh
Novice

Thanks for looking into this. This is the lscpu output:

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           150
Model name:                      Intel(R) Celeron(R) J6412 @ 2.00GHz
Stepping:                        1
Frequency boost:                 enabled
CPU MHz:                         800.000
CPU max MHz:                     2001.0000
CPU min MHz:                     800.0000
BogoMIPS:                        3993.60
Virtualisation:                  VT-x
L1d cache:                       128 KiB
L1i cache:                       128 KiB
L2 cache:                        1.5 MiB
L3 cache:                        4 MiB
NUMA node0 CPU(s):               0-3
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Mitigation; Clear CPU buffers; SMT disabled
Vulnerability Retbleed:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Vulnerable: No microcode
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand lahf_lm 3dnowprefetch cpuid_fault epb cat_l2 cdp_l2 ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust smep erms rdt_a rdseed smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect dtherm ida arat pln pts umip waitpkg gfni rdpid movdiri movdir64b md_clear flush_l1d arch_capabilities
Witold_Intel
Employee

Thank you for the lscpu output. We will analyze it and get back to you.


Witold_Intel
Employee

I noticed you're not using the latest version of the compute runtime driver. Are you able to try it? https://github.com/intel/compute-runtime/releases/tag/24.35.30872.22


PrateekSingh
Novice

Yes, I will get back to you after trying this latest release. I hope the issue is fixed in this version.

However, I noticed that Elkhart Lake is marked as Legacy quality in the 24.35.30872.22 release, whereas in the previous release, 24.22.29735.20 (the one I have been using), Elkhart Lake is marked as Production quality.

Witold_Intel
Employee

Understood, thank you for your feedback. If another version of the compute driver does not help, I may have to consult the developers about this case.


Witold_Intel
Employee

Thanks for your patience, Prateek. The developers are asking for the clinfo results and the output of hello_query_device in order to understand your case better. Could you run those commands and post the output here?


Witold_Intel
Employee

Hello Prateek, we haven't received a reply from you for 3 business days. If we don't receive the information we asked for within 4 more business days, we will have to close this case. Thank you for your understanding.


PrateekSingh
Novice

Hi,

 

Here is the clinfo output; I'll add the device query output on Monday.

clinfo 
Number of platforms                               1
  Platform Name                                   Intel(R) OpenCL Graphics
  Platform Vendor                                 Intel(R) Corporation
  Platform Version                                OpenCL 3.0 
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_byte_addressable_store cl_khr_device_uuid cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_command_queue_families cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_driver_diagnostics cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_intel_subgroups_char cl_intel_subgroups_long cl_khr_il_program cl_intel_mem_force_host_memory cl_khr_subgroup_extended_types cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_ballot cl_khr_subgroup_non_uniform_arithmetic cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative cl_khr_subgroup_clustered_reduce cl_intel_device_attribute_query cl_khr_suggested_local_work_size cl_intel_split_work_group_barrier cl_ext_float_atomics cl_khr_external_memory cl_intel_planar_yuv cl_intel_packed_yuv cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_3d_image_writes cl_intel_media_block_io cl_intel_subgroup_local_block_io cl_khr_gl_sharing cl_khr_gl_depth_images cl_khr_gl_event cl_khr_gl_msaa_sharing cl_intel_va_api_media_sharing cl_intel_sharing_format_query cl_khr_pci_bus_info 
  Platform Extensions with Version                cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_device_uuid                                               0x400000 (1.0.0)
                                                  cl_khr_fp16                                                      0x400000 (1.0.0)
                                                  cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_icd                                                       0x400000 (1.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_intel_command_queue_families                                  0x400000 (1.0.0)
                                                  cl_intel_subgroups                                               0x400000 (1.0.0)
                                                  cl_intel_required_subgroup_size                                  0x400000 (1.0.0)
                                                  cl_intel_subgroups_short                                         0x400000 (1.0.0)
                                                  cl_khr_spir                                                      0x400000 (1.0.0)
                                                  cl_intel_accelerator                                             0x400000 (1.0.0)
                                                  cl_intel_driver_diagnostics                                      0x400000 (1.0.0)
                                                  cl_khr_priority_hints                                            0x400000 (1.0.0)
                                                  cl_khr_throttle_hints                                            0x400000 (1.0.0)
                                                  cl_khr_create_command_queue                                      0x400000 (1.0.0)
                                                  cl_intel_subgroups_char                                          0x400000 (1.0.0)
                                                  cl_intel_subgroups_long                                          0x400000 (1.0.0)
                                                  cl_khr_il_program                                                0x400000 (1.0.0)
                                                  cl_intel_mem_force_host_memory                                   0x400000 (1.0.0)
                                                  cl_khr_subgroup_extended_types                                   0x400000 (1.0.0)
                                                  cl_khr_subgroup_non_uniform_vote                                 0x400000 (1.0.0)
                                                  cl_khr_subgroup_ballot                                           0x400000 (1.0.0)
                                                  cl_khr_subgroup_non_uniform_arithmetic                           0x400000 (1.0.0)
                                                  cl_khr_subgroup_shuffle                                          0x400000 (1.0.0)
                                                  cl_khr_subgroup_shuffle_relative                                 0x400000 (1.0.0)
                                                  cl_khr_subgroup_clustered_reduce                                 0x400000 (1.0.0)
                                                  cl_intel_device_attribute_query                                  0x400000 (1.0.0)
                                                  cl_khr_suggested_local_work_size                                 0x400000 (1.0.0)
                                                  cl_intel_split_work_group_barrier                                0x400000 (1.0.0)
                                                  cl_ext_float_atomics                                             0x400000 (1.0.0)
                                                  cl_khr_external_memory                                             0x9001 (0.9.1)
                                                  cl_intel_planar_yuv                                              0x400000 (1.0.0)
                                                  cl_intel_packed_yuv                                              0x400000 (1.0.0)
                                                  cl_khr_image2d_from_buffer                                       0x400000 (1.0.0)
                                                  cl_khr_depth_images                                              0x400000 (1.0.0)
                                                  cl_khr_3d_image_writes                                           0x400000 (1.0.0)
                                                  cl_intel_media_block_io                                          0x400000 (1.0.0)
                                                  cl_intel_subgroup_local_block_io                                 0x400000 (1.0.0)
                                                  cl_khr_gl_sharing                                                0x400000 (1.0.0)
                                                  cl_khr_gl_depth_images                                           0x400000 (1.0.0)
                                                  cl_khr_gl_event                                                  0x400000 (1.0.0)
                                                  cl_khr_gl_msaa_sharing                                           0x400000 (1.0.0)
                                                  cl_intel_va_api_media_sharing                                    0x400000 (1.0.0)
                                                  cl_intel_sharing_format_query                                    0x400000 (1.0.0)
                                                  cl_khr_pci_bus_info                                              0x400000 (1.0.0)
  Platform Numeric Version                        0xc00000 (3.0.0)
  Platform Extensions function suffix             INTEL
  Platform Host timer resolution                  1ns

  Platform Name                                   Intel(R) OpenCL Graphics
Number of devices                                 1
  Device Name                                     Intel(R) UHD Graphics
  Device Vendor                                   Intel(R) Corporation
  Device Vendor ID                                0x8086
  Device Version                                  OpenCL 3.0 NEO 
  Device UUID                                     86805545-0100-0000-0002-000000000000
  Driver UUID                                     32342e32-322e-3239-3733-352e32300000
  Valid Device LUID                               No
  Device LUID                                     e049-7c2dfe7f0000
  Device Node Mask                                0
  Device Numeric Version                          0xc00000 (3.0.0)
  Driver Version                                  24.22.29735.20
  Device OpenCL C Version                         OpenCL C 1.2 
  Device OpenCL C all versions                    OpenCL C                                                         0x400000 (1.0.0)
                                                  OpenCL C                                                         0x401000 (1.1.0)
                                                  OpenCL C                                                         0x402000 (1.2.0)
                                                  OpenCL C                                                         0xc00000 (3.0.0)
  Device OpenCL C features                        __opencl_c_int64                                                 0xc00000 (3.0.0)
                                                  __opencl_c_3d_image_writes                                       0xc00000 (3.0.0)
                                                  __opencl_c_images                                                0xc00000 (3.0.0)
                                                  __opencl_c_read_write_images                                     0xc00000 (3.0.0)
  Latest comfornace test passed                   v2024-02-27-00
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               16
  Max clock frequency                             800MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple (device)     32
  Preferred work group size multiple (kernel)     32
  Max sub-groups per work group                   0
  Sub-group sizes (Intel)                         8, 16, 32
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 1 / 1       
    half                                                 8 / 8        (cl_khr_fp16)
    float                                                1 / 1       
    double                                               0 / 0        (n/a)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (n/a)
  Address bits                                    32, Little-Endian
  Global memory size                              4037267456 (3.76GiB)
  Error Correction support                        No
  Max memory allocation                           2018633728 (1.88GiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 No
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics                 
    SVM                                           64 bytes
    Global                                        64 bytes
    Local                                         64 bytes
  Atomic memory capabilities                      relaxed, work-group scope
  Atomic fence capabilities                       relaxed, acquire/release, work-group scope
  Max size for global variable                    0
  Preferred total size of global vars             0
  Global Memory cache type                        Read/Write
  Global Memory cache size                        1310720 (1.25MiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            126164608 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   4 bytes
    Pitch alignment for 2D image buffers          4 pixels
    Max 2D image size                             16384x16384 pixels
    Max planar YUV image size                     16384x16352 pixels
    Max 3D image size                             16384x16384x2048 pixels
    Max number of read image args                 128
    Max number of write image args                128
    Max number of read/write image args           128
  Pipe support                                    No
  Max number of pipe args                         0
  Max active pipe reservations                    0
  Max pipe packet size                            0
  Local memory type                               Local
  Local memory size                               65536 (64KiB)
  Max number of constant args                     8
  Max constant buffer size                        2018633728 (1.88GiB)
  Generic address space support                   No
  Max size of kernel argument                     2048 (2KiB)
  Queue properties (on host)                      
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Device enqueue capabilities                     (n/a)
  Queue properties (on device)                    
    Out-of-order execution                        No
    Profiling                                     No
    Preferred size                                0
    Max size                                      0
  Max queues on device                            0
  Max events on device                            0
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      52ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Non-uniform work-groups                       Yes
    Work-group collective functions               No
    Sub-group independent forward progress        No
    IL version                                    SPIR-V_1.3 SPIR-V_1.2 SPIR-V_1.1 SPIR-V_1.0 
    ILs with version                              SPIR-V                                                           0x403000 (1.3.0)
                                                  SPIR-V                                                           0x402000 (1.2.0)
                                                  SPIR-V                                                           0x401000 (1.1.0)
                                                  SPIR-V                                                           0x400000 (1.0.0)
    SPIR versions                                 1.2 
  printf() buffer size                            4194304 (4MiB)
  Built-in kernels                                (n/a)
  Built-in kernels with version                   (n/a)
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_device_uuid cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_command_queue_families cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_driver_diagnostics cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_intel_subgroups_char cl_intel_subgroups_long cl_khr_il_program cl_intel_mem_force_host_memory cl_khr_subgroup_extended_types cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_ballot cl_khr_subgroup_non_uniform_arithmetic cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative cl_khr_subgroup_clustered_reduce cl_intel_device_attribute_query cl_khr_suggested_local_work_size cl_intel_split_work_group_barrier cl_ext_float_atomics cl_khr_external_memory cl_intel_planar_yuv cl_intel_packed_yuv cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_3d_image_writes cl_intel_media_block_io cl_intel_subgroup_local_block_io cl_khr_gl_sharing cl_khr_gl_depth_images cl_khr_gl_event cl_khr_gl_msaa_sharing cl_intel_va_api_media_sharing cl_intel_sharing_format_query cl_khr_pci_bus_info 
  Device Extensions with Version                  cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_device_uuid                                               0x400000 (1.0.0)
                                                  cl_khr_fp16                                                      0x400000 (1.0.0)
                                                  cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_icd                                                       0x400000 (1.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_intel_command_queue_families                                  0x400000 (1.0.0)
                                                  cl_intel_subgroups                                               0x400000 (1.0.0)
                                                  cl_intel_required_subgroup_size                                  0x400000 (1.0.0)
                                                  cl_intel_subgroups_short                                         0x400000 (1.0.0)
                                                  cl_khr_spir                                                      0x400000 (1.0.0)
                                                  cl_intel_accelerator                                             0x400000 (1.0.0)
                                                  cl_intel_driver_diagnostics                                      0x400000 (1.0.0)
                                                  cl_khr_priority_hints                                            0x400000 (1.0.0)
                                                  cl_khr_throttle_hints                                            0x400000 (1.0.0)
                                                  cl_khr_create_command_queue                                      0x400000 (1.0.0)
                                                  cl_intel_subgroups_char                                          0x400000 (1.0.0)
                                                  cl_intel_subgroups_long                                          0x400000 (1.0.0)
                                                  cl_khr_il_program                                                0x400000 (1.0.0)
                                                  cl_intel_mem_force_host_memory                                   0x400000 (1.0.0)
                                                  cl_khr_subgroup_extended_types                                   0x400000 (1.0.0)
                                                  cl_khr_subgroup_non_uniform_vote                                 0x400000 (1.0.0)
                                                  cl_khr_subgroup_ballot                                           0x400000 (1.0.0)
                                                  cl_khr_subgroup_non_uniform_arithmetic                           0x400000 (1.0.0)
                                                  cl_khr_subgroup_shuffle                                          0x400000 (1.0.0)
                                                  cl_khr_subgroup_shuffle_relative                                 0x400000 (1.0.0)
                                                  cl_khr_subgroup_clustered_reduce                                 0x400000 (1.0.0)
                                                  cl_intel_device_attribute_query                                  0x400000 (1.0.0)
                                                  cl_khr_suggested_local_work_size                                 0x400000 (1.0.0)
                                                  cl_intel_split_work_group_barrier                                0x400000 (1.0.0)
                                                  cl_ext_float_atomics                                             0x400000 (1.0.0)
                                                  cl_khr_external_memory                                             0x9001 (0.9.1)
                                                  cl_intel_planar_yuv                                              0x400000 (1.0.0)
                                                  cl_intel_packed_yuv                                              0x400000 (1.0.0)
                                                  cl_khr_image2d_from_buffer                                       0x400000 (1.0.0)
                                                  cl_khr_depth_images                                              0x400000 (1.0.0)
                                                  cl_khr_3d_image_writes                                           0x400000 (1.0.0)
                                                  cl_intel_media_block_io                                          0x400000 (1.0.0)
                                                  cl_intel_subgroup_local_block_io                                 0x400000 (1.0.0)
                                                  cl_khr_gl_sharing                                                0x400000 (1.0.0)
                                                  cl_khr_gl_depth_images                                           0x400000 (1.0.0)
                                                  cl_khr_gl_event                                                  0x400000 (1.0.0)
                                                  cl_khr_gl_msaa_sharing                                           0x400000 (1.0.0)
                                                  cl_intel_va_api_media_sharing                                    0x400000 (1.0.0)
                                                  cl_intel_sharing_format_query                                    0x400000 (1.0.0)
                                                  cl_khr_pci_bus_info                                              0x400000 (1.0.0)

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Intel(R) OpenCL Graphics
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [INTEL]
  clCreateContext(NULL, ...) [default]            Success [INTEL]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 Intel(R) OpenCL Graphics
    Device Name                                   Intel(R) UHD Graphics
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Intel(R) OpenCL Graphics
    Device Name                                   Intel(R) UHD Graphics
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Intel(R) OpenCL Graphics
    Device Name                                   Intel(R) UHD Graphics

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.14
  ICD loader Profile                              OpenCL 3.0
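
For readers comparing the hexadecimal values in the "Device Extensions with Version" list with the dotted versions beside them: OpenCL 3.0 packs a version into a single 32-bit integer with a 10-bit major, 10-bit minor and 12-bit patch field. The short Python sketch below decodes the two distinct values that appear in the clinfo output above. The function name `decode_cl_version` is made up for illustration and is not part of clinfo or OpenVINO.

```python
# Bit widths of a packed cl_version value as defined by OpenCL 3.0.
MAJOR_BITS, MINOR_BITS, PATCH_BITS = 10, 10, 12


def decode_cl_version(packed: int) -> str:
    """Turn a packed cl_version integer into 'major.minor.patch'."""
    major = packed >> (MINOR_BITS + PATCH_BITS)
    minor = (packed >> PATCH_BITS) & ((1 << MINOR_BITS) - 1)
    patch = packed & ((1 << PATCH_BITS) - 1)
    return f"{major}.{minor}.{patch}"


# The two distinct values listed in the clinfo output above.
print(decode_cl_version(0x400000))  # 1.0.0
print(decode_cl_version(0x9001))    # 0.9.1 (cl_khr_external_memory)
```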

 

Witold_Intel
Employee
1,973 Views

Thank you. I have relayed the clinfo output to the developers.


Witold_Intel
Employee
1,869 Views

Hello Prateek, could you also post the output of hello_query_device so we can better understand the issue? Thanks.


PrateekSingh
Novice
1,729 Views

This is the output from hello_query_device:

./hello_query_device 
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ] 
Illegal instruction (core dumped)

The error log for reference:

tail -f /var/log/kern.log 
Oct  9 17:21:12 elkhart kernel: [  265.763224] audit: type=1400 audit(1728454872.822:67): apparmor="DENIED" operation="capable" profile="/usr/lib/snapd/snap-confine" pid=2032 comm="snap-confine" capability=12  capname="net_admin"
Oct  9 17:21:12 elkhart kernel: [  265.763233] audit: type=1400 audit(1728454872.822:68): apparmor="DENIED" operation="capable" profile="/usr/lib/snapd/snap-confine" pid=2032 comm="snap-confine" capability=38  capname="perfmon"
Oct  9 17:21:12 elkhart kernel: [  265.856954] audit: type=1400 audit(1728454872.914:69): apparmor="DENIED" operation="open" profile="snap-update-ns.firefox" name="/usr/local/share/" pid=2053 comm="5" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
Oct  9 17:21:14 elkhart kernel: [  267.440401] audit: type=1107 audit(1728454874.498:70): pid=619 uid=103 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call"  bus="system" path="/org/freedesktop/timedate1" interface="org.freedesktop.DBus.Properties" member="GetAll" mask="send" name=":1.104" pid=2032 label="snap.firefox.firefox" peer_pid=2147 peer_label="unconfined"
Oct  9 17:21:14 elkhart kernel: [  267.440401]  exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Oct  9 17:21:14 elkhart kernel: [  267.446291] audit: type=1107 audit(1728454874.506:71): pid=619 uid=103 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call"  bus="system" path="/org/freedesktop/timedate1" interface="org.freedesktop.DBus.Properties" member="GetAll" mask="send" name=":1.104" pid=2032 label="snap.firefox.firefox" peer_pid=2147 peer_label="unconfined"
Oct  9 17:21:14 elkhart kernel: [  267.446291]  exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Oct  9 17:23:51 elkhart kernel: [  424.217606] audit: type=1107 audit(1728455031.281:72): pid=619 uid=103 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call"  bus="system" path="/org/freedesktop/timedate1" interface="org.freedesktop.DBus.Properties" member="GetAll" mask="send" name=":1.122" pid=2032 label="snap.firefox.firefox" peer_pid=4039 peer_label="unconfined"
Oct  9 17:23:51 elkhart kernel: [  424.217606]  exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Oct  9 17:37:31 elkhart kernel: [ 1244.723409] traps: hello_query_dev[6474] trap invalid opcode ip:7f07388c0693 sp:7ffccea047f0 error:0 in libopenvino_intel_npu_plugin.so[7f073889f000+1e8000]
Oct  9 17:52:14 elkhart kernel: [ 2126.953072] i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
Oct  9 17:52:14 elkhart kernel: [ 2126.953096] i915 0000:00:02.0: [drm] benchmark_app[8047] context reset due to GPU hang
Oct  9 17:52:14 elkhart kernel: [ 2126.958102] i915 0000:00:02.0: [drm] GPU HANG: ecode 11:1:8ed9fff3, in benchmark_app [8047]
Oct  9 17:59:43 elkhart kernel: [ 2576.268271] traps: hello_query_dev[8614] trap invalid opcode ip:7f20d41d9693 sp:7ffdbe5a3170 error:0 in libopenvino_intel_npu_plugin.so[7f20d41b8000+1e8000]

 
