Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Sync Benchmark app crashes on Elkhart Lake

PrateekSingh
Novice
11,643 Views

I am experiencing repeated crashes with the Sync Benchmark application on an Elkhart Lake GPU (it works fine on CPU). Below are the details and steps to reproduce the issue, along with some debug logs.

I changed sync_benchmark to run for 100 seconds instead of its default duration of 10 seconds, and I hit the clWaitForEvents error consistently.

Environment: see the attached outputs and configuration files for details.

The Sync Benchmark application consistently hangs and eventually crashes when attempting to run inference on the specified model. The application is executed with the following command:

 

./sync_benchmark /home/test/intel/person-detection-retail-0002/FP16-INT8/person-detection-retail-0002.xml GPU

 

Kernel messages (dmesg) contain multiple GPU hang entries, for instance:

 

i915 xxxxx [drm] GPU HANG: ecode 11:1:8ed9fff3, in sync_benchmark [4524]

 

I have tried the following i915 kernel parameters (note that the error occurs both with and without these flags):

 

i915.enable_hangcheck=0 
i915.request_timeout_ms=200000 
intel_idle.max_cstate=1 
i915.enable_dc=0 
ahci.mobile_lpm_policy=1 
i915.enable_psr2_sel_fetch=0 
i915.enable_psr=0 
vt.handoff=7
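For reference, these are boot-time parameters. Assuming they are applied through GRUB (an assumption about this setup; the bootloader is not stated above), the relevant line in /etc/default/grub would look like this sketch:

```
# /etc/default/grub (sketch; any other defaults on this line are omitted here)
GRUB_CMDLINE_LINUX_DEFAULT="i915.enable_hangcheck=0 i915.request_timeout_ms=200000 intel_idle.max_cstate=1 i915.enable_dc=0 ahci.mobile_lpm_policy=1 i915.enable_psr2_sel_fetch=0 i915.enable_psr=0 vt.handoff=7"
```

After editing, run `sudo update-grub` and reboot; the active parameters can be verified with `cat /proc/cmdline`.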

 

Any insights or solutions to address these crashes would be highly appreciated. I have attached the relevant outputs and configurations for reference.

0 Kudos
43 Replies
PrateekSingh
Novice
3,779 Views

@Witold_Intel wrote:

Hello Prateek, could you add a hello_query_device output for better understanding of your issue? Thanks.



I retried this with the latest OpenVINO version (openvino-2024.4.0) and compute runtime driver (https://github.com/intel/compute-runtime/releases/tag/24.35.30872.22), but the error persists:

 

 

./benchmark_app -d GPU -m ~/intel/person-detection-retail-0002/FP16-INT8/person-detection-retail-0002.xml -hint none -niter 10000 -nstreams 1
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ] 
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ] 
[ INFO ] 
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 52.17 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Network inputs:
[ INFO ]     data (node: data) : f32 / [N,C,H,W] / [1,3,544,992]
[ INFO ]     im_info:0 , im_info (node: im_info) : f32 / [H,W] / [1,6]
[ INFO ] Network outputs:
[ INFO ]     detection_out (node: detection_out) : f32 / [...] / [1,1,200,7]
[Step 5/11] Resizing model to match image sizes and given batch
[Step 6/11] Configuring input of the model
[ INFO ] Model batch size: 1
[ INFO ] Network inputs:
[ INFO ]     data (node: data) : u8 / [N,C,H,W] / [1,3,544,992]
[ INFO ]     im_info:0 , im_info (node: im_info) : f32 / [H,W] / [1,6]
[ INFO ] Network outputs:
[ INFO ]     detection_out (node: detection_out) : f32 / [...] / [1,1,200,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 1405.16 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: PVANet + R-FCN
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   PERF_COUNT: NO
[ INFO ]   ENABLE_CPU_PINNING: NO
[ INFO ]   MODEL_PRIORITY: MEDIUM
[ INFO ]   GPU_HOST_TASK_PRIORITY: MEDIUM
[ INFO ]   GPU_QUEUE_PRIORITY: MEDIUM
[ INFO ]   GPU_QUEUE_THROTTLE: MEDIUM
[ INFO ]   GPU_ENABLE_LOOP_UNROLLING: YES
[ INFO ]   GPU_DISABLE_WINOGRAD_CONVOLUTION: NO
[ INFO ]   CACHE_DIR: 
[ INFO ]   CACHE_MODE: optimize_speed
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: PERFORMANCE
[ INFO ]   COMPILATION_NUM_THREADS: 4
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   INFERENCE_PRECISION_HINT: f16
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 0
[ INFO ]   DEVICE_ID: 0
[ INFO ]   EXECUTION_DEVICES: GPU.0
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] data     ([N,C,H,W], u8, [1,3,544,992], static):	random (image/numpy array is expected)
[ INFO ] im_info  ([H,W], f32, [1,6], static):	random (binary data/numpy array is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests using 1 streams for GPU, limits: 10000 iterations)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 167.01 ms
[ ERROR ] Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_event.cpp:56:
[GPU] clWaitForEvents, error code: -14
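(Side note for readers hitting this: error code -14 in the OpenCL headers is CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST, meaning an event being waited on completed abnormally, which is consistent with the driver-side context reset in dmesg below. A small Python lookup sketch; the table is a hand-picked subset of the codes in CL/cl.h that commonly appear around GPU hangs:)

```python
# Subset of OpenCL error codes (values from CL/cl.h) that commonly show up
# when the GPU driver resets a context mid-inference.
CL_ERRORS = {
    -5:  "CL_OUT_OF_RESOURCES",
    -6:  "CL_OUT_OF_HOST_MEMORY",
    -14: "CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST",
    -36: "CL_INVALID_COMMAND_QUEUE",
}

def decode_cl_error(code: int) -> str:
    """Return the symbolic name for an OpenCL error code, if known."""
    return CL_ERRORS.get(code, f"unknown OpenCL error ({code})")

print(decode_cl_error(-14))  # CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST
```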

 

 

This is the device query output:

 

 

test@elkhart:~/openvino_cpp_samples_build/intel64/Release$ ./hello_query_device 
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ] 
Illegal instruction (core dumped)

 

 

 The error log:

 

 

Oct  9 17:21:14 elkhart kernel: [  267.440401] audit: type=1107 audit(1728454874.498:70): pid=619 uid=103 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call"  bus="system" path="/org/freedesktop/timedate1" interface="org.freedesktop.DBus.Properties" member="GetAll" mask="send" name=":1.104" pid=2032 label="snap.firefox.firefox" peer_pid=2147 peer_label="unconfined"
Oct  9 17:21:14 elkhart kernel: [  267.440401]  exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Oct  9 17:21:14 elkhart kernel: [  267.446291] audit: type=1107 audit(1728454874.506:71): pid=619 uid=103 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call"  bus="system" path="/org/freedesktop/timedate1" interface="org.freedesktop.DBus.Properties" member="GetAll" mask="send" name=":1.104" pid=2032 label="snap.firefox.firefox" peer_pid=2147 peer_label="unconfined"
Oct  9 17:21:14 elkhart kernel: [  267.446291]  exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Oct  9 17:23:51 elkhart kernel: [  424.217606] audit: type=1107 audit(1728455031.281:72): pid=619 uid=103 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call"  bus="system" path="/org/freedesktop/timedate1" interface="org.freedesktop.DBus.Properties" member="GetAll" mask="send" name=":1.122" pid=2032 label="snap.firefox.firefox" peer_pid=4039 peer_label="unconfined"
Oct  9 17:23:51 elkhart kernel: [  424.217606]  exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Oct  9 17:37:31 elkhart kernel: [ 1244.723409] traps: hello_query_dev[6474] trap invalid opcode ip:7f07388c0693 sp:7ffccea047f0 error:0 in libopenvino_intel_npu_plugin.so[7f073889f000+1e8000]
Oct  9 17:52:14 elkhart kernel: [ 2126.953072] i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
Oct  9 17:52:14 elkhart kernel: [ 2126.953096] i915 0000:00:02.0: [drm] benchmark_app[8047] context reset due to GPU hang
Oct  9 17:52:14 elkhart kernel: [ 2126.958102] i915 0000:00:02.0: [drm] GPU HANG: ecode 11:1:8ed9fff3, in benchmark_app [8047]
Oct  9 17:59:43 elkhart kernel: [ 2576.268271] traps: hello_query_dev[8614] trap invalid opcode ip:7f20d41d9693 sp:7ffdbe5a3170 error:0 in libopenvino_intel_npu_plugin.so[7f20d41b8000+1e8000]
Oct  9 18:00:06 elkhart kernel: [ 2599.234876] i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
Oct  9 18:00:06 elkhart kernel: [ 2599.234902] i915 0000:00:02.0: [drm] benchmark_app[8583] context reset due to GPU hang
Oct  9 18:00:06 elkhart kernel: [ 2599.239341] i915 0000:00:02.0: [drm] GPU HANG: ecode 11:1:85dcfffa, in benchmark_app [8583]
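(Logs like the above can be scanned for the hang signature with a small filter; the regex below is an assumption matched to the i915 message format shown in this thread:)

```python
import re

# Extract i915 GPU-hang events (ecode, offending process, pid) from dmesg/syslog text.
HANG_RE = re.compile(r"GPU HANG: ecode ([0-9a-f:]+), in (\S+) \[(\d+)\]")

def find_gpu_hangs(log_text: str):
    """Return (ecode, process, pid) tuples for each 'GPU HANG' line found."""
    return [(m.group(1), m.group(2), int(m.group(3)))
            for m in HANG_RE.finditer(log_text)]

sample = "i915 0000:00:02.0: [drm] GPU HANG: ecode 11:1:8ed9fff3, in benchmark_app [8047]"
print(find_gpu_hangs(sample))  # [('11:1:8ed9fff3', 'benchmark_app', 8047)]
```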

 

 

 

0 Kudos
Witold_Intel
Employee
3,823 Views

Hi Prateek, thank you for the update. I will relay this output to the developers.


0 Kudos
Witold_Intel
Employee
3,774 Views

Hi Prateek, I don't have any suggestions from the developers yet. I will come back to you as soon as I know more.


0 Kudos
Witold_Intel
Employee
3,732 Views

I reminded the developers about your issue. Please have some more patience.


0 Kudos
Witold_Intel
Employee
3,619 Views

Just a note that the developers haven't responded to this issue yet; I will send them another reminder. Thank you for your patience.


0 Kudos
Witold_Intel
Employee
3,548 Views

Hi Prateek, would you be willing to share the person-detection model with us? This way we would be able to reproduce this issue on our in-house Elkhart Lake machine.


0 Kudos
PrateekSingh
Novice
3,534 Views

I have used the pretrained model described here: https://docs.openvino.ai/2024/omz_models_model_person_detection_retail_0002.html  without any modification.

0 Kudos
Witold_Intel
Employee
3,482 Views

Hi Prateek, thanks for the link. Unfortunately I am getting a "404 Page not found" error when trying it. Do you have another link, or can you attach the model files? Thanks in advance.


0 Kudos
Witold_Intel
Employee
3,423 Views

Thank you for the files. I will attempt to reproduce this issue and get back to you.


0 Kudos
Witold_Intel
Employee
2,914 Views

Sorry for the long waiting time. I will prioritize the reproduction of your case and get back to you ASAP.


0 Kudos
PrateekSingh
Novice
2,439 Views
0 Kudos
Witold_Intel
Employee
2,341 Views

Hi Prateek, I reproduced this benchmark on our Elkhart Lake machine and it worked on both CPU and GPU. Here's the GPU result:


pse@aisw-nuci7-tgl-1165g7-m:~$ ./openvino_2023.0/openvino_env/bin/benchmark_app -m /home/pse/FP16-INT8/person-detection-retail-0002.xml GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
usage: benchmark_app [-h [HELP]] [-i PATHS_TO_INPUT [PATHS_TO_INPUT ...]] -m PATH_TO_MODEL [-d TARGET_DEVICE] [-hint {throughput,tput,cumulative_throughput,ctput,latency,none}] [-niter NUMBER_ITERATIONS] [-t TIME] [-b BATCH_SIZE]
           [-shape SHAPE] [-data_shape DATA_SHAPE] [-layout LAYOUT] [-extensions EXTENSIONS] [-c PATH_TO_CLDNN_CONFIG] [-cdir CACHE_DIR] [-lfile [LOAD_FROM_FILE]] [-api {sync,async}] [-nireq NUMBER_INFER_REQUESTS]
           [-nstreams NUMBER_STREAMS] [-inference_only [INFERENCE_ONLY]] [-infer_precision INFER_PRECISION] [-ip {bool,f16,f32,f64,i8,i16,i32,i64,u8,u16,u32,u64}] [-op {bool,f16,f32,f64,i8,i16,i32,i64,u8,u16,u32,u64}]
           [-iop INPUT_OUTPUT_PRECISION] [--mean_values [R,G,B]] [--scale_values [R,G,B]] [-nthreads NUMBER_THREADS] [-pin {YES,NO,NUMA,HYBRID_AWARE}] [-latency_percentile LATENCY_PERCENTILE]
           [-report_type {no_counters,average_counters,detailed_counters}] [-report_folder REPORT_FOLDER] [-json_stats [JSON_STATS]] [-pc [PERF_COUNTS]] [-pcsort {no_sort,sort,simple_sort}] [-pcseq [PCSEQ]]
           [-exec_graph_path EXEC_GRAPH_PATH] [-dump_config DUMP_CONFIG] [-load_config LOAD_CONFIG]
benchmark_app: error: unrecognized arguments: GPU
pse@aisw-nuci7-tgl-1165g7-m:~$ ./openvino_2023.0/openvino_env/bin/benchmark_app -m /home/pse/FP16-INT8/person-detection-retail-0002.xml -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2023.0.1-11005-fa1c41994f3-releases/2023/0
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2023.0.1-11005-fa1c41994f3-releases/2023/0
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to PerformanceMode.THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 95.45 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]   data (node: data) : f32 / [N,C,H,W] / [1,3,544,992]
[ INFO ]   im_info:0 , im_info (node: im_info) : f32 / [H,W] / [1,6]
[ INFO ] Model outputs:
[ INFO ]   detection_out (node: detection_out) : f32 / [...] / [1,1,200,7]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]   data (node: data) : u8 / [N,C,H,W] / [1,3,544,992]
[ INFO ]   im_info:0 , im_info (node: im_info) : f32 / [H,W] / [1,6]
[ INFO ] Model outputs:
[ INFO ]   detection_out (node: detection_out) : f32 / [...] / [1,1,200,7]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 6254.41 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]  NETWORK_NAME: PVANet + R-FCN
[ INFO ]  OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4
[ INFO ]  PERF_COUNT: NO
[ INFO ]  MODEL_PRIORITY: Priority.MEDIUM
[ INFO ]  GPU_HOST_TASK_PRIORITY: Priority.MEDIUM
[ INFO ]  GPU_QUEUE_PRIORITY: Priority.MEDIUM
[ INFO ]  GPU_QUEUE_THROTTLE: Priority.MEDIUM
[ INFO ]  GPU_ENABLE_LOOP_UNROLLING: True
[ INFO ]  CACHE_DIR:
[ INFO ]  PERFORMANCE_HINT: PerformanceMode.THROUGHPUT
[ INFO ]  EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]  COMPILATION_NUM_THREADS: 8
[ INFO ]  NUM_STREAMS: 2
[ INFO ]  PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]  INFERENCE_PRECISION_HINT: <Type: 'float16'>
[ INFO ]  DEVICE_ID: 0
[ INFO ]  EXECUTION_DEVICES: ['GPU.0']
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'data'!. This input will be filled with random values!
[ WARNING ] No input files were given for input 'im_info'!. This input will be filled with random values!
[ INFO ] Fill input 'data' with random values
[ INFO ] Fill input 'im_info' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 60000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 23.19 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['GPU.0']
[ INFO ] Count:      3416 iterations
[ INFO ] Duration:     60147.38 ms
[ INFO ] Latency:
[ INFO ]  Median:    58.61 ms
[ INFO ]  Average:    70.25 ms
[ INFO ]  Min:      19.10 ms
[ INFO ]  Max:      150.39 ms
[ INFO ] Throughput:  56.79 FPS


In this case I would suggest it's a GPU setup issue; would you agree?


0 Kudos
Witold_Intel
Employee
2,290 Views

Hello Prateek,


Have you been able to read my previous message? I suspect there may be a problem with your GPU setup (e.g. a driver issue). Are you able to check it with programs other than OpenVINO?


PrateekSingh
Novice
2,258 Views

Hi Witold,

 

Thanks for looking into this. I have checked the driver/GPU and it seems to work fine with other applications, e.g. video streaming and load testing.

 

One key difference is that I ran the benchmark tool for much longer. I have observed that this issue appears when the GPU is under stress for a prolonged period of time; that is when the Inference Engine throws the OpenCL error.

 

Could you try the experiment again with the following command? Note that I'm running the benchmark app for 10000 iterations:

 

./benchmark_app -d GPU -m ~/intel/person-detection-retail-0002/FP16-INT8/person-detection-retail-0002.xml -hint none -niter 10000 -nstreams 1

 

You can also increase -nstreams to raise the GPU load; I have observed that this leads to faster crashes.
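To avoid watching the terminal for hours, the hang can be caught automatically by scanning the kernel log between runs for the i915 "GPU HANG" signature shown earlier in this thread. A minimal sketch (the helper name `find_gpu_hangs` and the regex are my own, not part of OpenVINO or i915 tooling):

```python
import re
import subprocess

# Signature the i915 driver prints when it resets the GPU, e.g.:
#   i915 0000:00:02.0 [drm] GPU HANG: ecode 11:1:8ed9fff3, in sync_benchmark [4524]
HANG_RE = re.compile(r"GPU HANG: ecode ([0-9a-f:]+), in (\S+) \[(\d+)\]")

def find_gpu_hangs(dmesg_text):
    """Return (ecode, process, pid) tuples for every GPU HANG line in the text."""
    return HANG_RE.findall(dmesg_text)

def live_hangs():
    """Scan the live kernel log (reading dmesg may require root)."""
    out = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    return find_gpu_hangs(out)

if __name__ == "__main__":
    sample = "i915 0000:00:02.0 [drm] GPU HANG: ecode 11:1:8ed9fff3, in sync_benchmark [4524]"
    print(find_gpu_hangs(sample))  # [('11:1:8ed9fff3', 'sync_benchmark', '4524')]
```

A wrapper script could call `live_hangs()` after each benchmark iteration batch and stop as soon as the list is non-empty, preserving the exact ecode for the bug report.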

Luis_at_Intel
Moderator
2,206 Views

Hi Prateek, I apologize for the delay in our response. I suspect the error you receive comes from a misconfiguration of the GPU driver. Do you know which driver version you are using? You can check the version by executing the following commands in a terminal window:

 

$ export gfx_version=$(dpkg-query --showformat='${Version}' --show intel-opencl-icd)

$ echo $gfx_version

 

Wanted to share that I've just tried this model with benchmark_app (same as my peer Witold) and the issue could not be observed with the latest OpenVINO release, 2024.6. Please upgrade to OpenVINO v2024.6 and make sure you have the latest compute-runtime (GPU driver).

 

Kindly note I used https://github.com/intel/compute-runtime/releases/tag/24.35.30872.22 (I had to install ocl-icd-libopencl1 as a prerequisite) and this worked for me. Please give it a go on your side and let us know if the issue is resolved. Hope this helps.

Also, can you share how long you run the benchmark_app before the issue begins to appear, and whether any other applications are utilizing the GPU concurrently with benchmark_app?

 

 

CPU: Intel(R) Celeron(R) J6412 @ 2.00GHz

 

$ benchmark_app -m ~/intel/person-detection-retail-0002/FP16-INT8/person-detection-retail-0002.xml -d GPU -t 5 -api sync

[Step 1/11] Parsing and validating input arguments

[ INFO ] Parsing input parameters

[Step 2/11] Loading OpenVINO Runtime

[ INFO ] OpenVINO:

[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6

[ INFO ]

[ INFO ] Device info:

[ INFO ] GPU

[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6

[ INFO ]

[ INFO ]

[Step 3/11] Setting device configuration

[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to PerformanceMode.LATENCY.

[Step 4/11] Reading model files

[ INFO ] Loading model files

[ INFO ] Read model took 51.27 ms

[ INFO ] Original model I/O parameters:

[ INFO ] Model inputs:

[ INFO ]   data (node: data) : f32 / [N,C,H,W] / [1,3,544,992]

[ INFO ]   im_info , im_info:0 (node: im_info) : f32 / [H,W] / [1,6]

[ INFO ] Model outputs:

[ INFO ]   detection_out (node: detection_out) : f32 / [...] / [1,1,200,7]

[Step 5/11] Resizing model to match image sizes and given batch

[ INFO ] Model batch size: 1

[Step 6/11] Configuring input of the model

[ INFO ] Model inputs:

[ INFO ]   data (node: data) : u8 / [N,C,H,W] / [1,3,544,992]

[ INFO ]   im_info , im_info:0 (node: im_info) : f32 / [H,W] / [1,6]

[ INFO ] Model outputs:

[ INFO ]   detection_out (node: detection_out) : f32 / [...] / [1,1,200,7]

[Step 7/11] Loading the model to the device

[ INFO ] Compile model took 1224.07 ms

[Step 8/11] Querying optimal runtime parameters

[ INFO ] Model:

[ INFO ]  NETWORK_NAME: PVANet + R-FCN

[ INFO ]  OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1

[ INFO ]  PERF_COUNT: False

[ INFO ]  ENABLE_CPU_PINNING: False

[ INFO ]  MODEL_PRIORITY: Priority.MEDIUM

[ INFO ]  GPU_HOST_TASK_PRIORITY: Priority.MEDIUM

[ INFO ]  GPU_QUEUE_PRIORITY: Priority.MEDIUM

[ INFO ]  GPU_QUEUE_THROTTLE: Priority.MEDIUM

[ INFO ]  GPU_ENABLE_LOOP_UNROLLING: True

[ INFO ]  GPU_DISABLE_WINOGRAD_CONVOLUTION: False

[ INFO ]  CACHE_DIR:

[ INFO ]  CACHE_MODE: CacheMode.OPTIMIZE_SPEED

[ INFO ]  PERFORMANCE_HINT: PerformanceMode.LATENCY

[ INFO ]  EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE

[ INFO ]  COMPILATION_NUM_THREADS: 4

[ INFO ]  NUM_STREAMS: 1

[ INFO ]  PERFORMANCE_HINT_NUM_REQUESTS: 0

[ INFO ]  INFERENCE_PRECISION_HINT: <Type: 'float16'>

[ INFO ]  DYNAMIC_QUANTIZATION_GROUP_SIZE: 32

[ INFO ]  ACTIVATIONS_SCALE_FACTOR: 0.0

[ INFO ]  DEVICE_ID: 0

[ INFO ]  EXECUTION_DEVICES: ['GPU.0']

[Step 9/11] Creating infer requests and preparing input tensors

[ WARNING ] No input files were given for input 'data'!. This input will be filled with random values!

[ WARNING ] No input files were given for input 'im_info'!. This input will be filled with random values!

[ INFO ] Fill input 'data' with random values

[ INFO ] Fill input 'im_info' with random values

[Step 10/11] Measuring performance (Start inference synchronously, limits: 5000 ms duration)

[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).

[ INFO ] First inference took 167.02 ms

[Step 11/11] Dumping statistics report

[ INFO ] Execution Devices:['GPU.0']

[ INFO ] Count:      35 iterations

[ INFO ] Duration:     5070.34 ms

[ INFO ] Latency:

[ INFO ]  Median:    144.74 ms

[ INFO ]  Average:    144.72 ms

[ INFO ]  Min:      144.21 ms

[ INFO ]  Max:      145.21 ms

[ INFO ] Throughput:  6.90 FPS
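When comparing runs like the ones in this thread, copy-pasting summary numbers gets tedious. A rough sketch of a parser over benchmark_app's Step 11 summary lines (the helper name and regexes are my own, not part of OpenVINO):

```python
import re

# Patterns for the summary lines benchmark_app prints in Step 11.
_PATTERNS = {
    "count": r"Count:\s+(\d+)\s+iterations",
    "duration_ms": r"Duration:\s+([\d.]+)\s+ms",
    "median_ms": r"Median:\s+([\d.]+)\s+ms",
    "average_ms": r"Average:\s+([\d.]+)\s+ms",
    "fps": r"Throughput:\s+([\d.]+)\s+FPS",
}

def parse_benchmark_stats(log_text):
    """Extract the Step 11 summary numbers from a benchmark_app log."""
    stats = {}
    for key, pattern in _PATTERNS.items():
        m = re.search(pattern, log_text)
        if m:
            stats[key] = float(m.group(1))
    return stats
```

Feeding each run's console output through this yields dictionaries that are easy to diff across driver or OpenVINO versions.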

 

Luis_at_Intel
Moderator
2,183 Views

Quick update: I just ran the benchmark_app in throughput mode (nstreams = 2, nireqs = 4) for 100k iterations and it completed successfully without error. The test took just under 4 hours. At this load the GPU remains 100% utilized; I will try running a longer test and see if the issue occurs.


$ benchmark_app -m intel/person-detection-retail-0002/FP16-INT8/person-detection-retail-0002.xml -d GPU -niter 100000 -hint throughput

[Step 1/11] Parsing and validating input arguments

[ INFO ] Parsing input parameters

[Step 2/11] Loading OpenVINO Runtime

[ INFO ] OpenVINO:

[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6

[ INFO ]

[ INFO ] Device info:

[ INFO ] GPU

[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6

[ INFO ]

[ INFO ]

[...]

[ INFO ] Compile model took 1343.81 ms

[Step 8/11] Querying optimal runtime parameters

[ INFO ] Model:

[ INFO ] NETWORK_NAME: PVANet + R-FCN

[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4

[ INFO ] PERF_COUNT: False

[ INFO ] ENABLE_CPU_PINNING: False

[ INFO ] MODEL_PRIORITY: Priority.MEDIUM

[ INFO ] GPU_HOST_TASK_PRIORITY: Priority.MEDIUM

[ INFO ] GPU_QUEUE_PRIORITY: Priority.MEDIUM

[ INFO ] GPU_QUEUE_THROTTLE: Priority.MEDIUM

[ INFO ] GPU_ENABLE_LOOP_UNROLLING: True

[ INFO ] GPU_DISABLE_WINOGRAD_CONVOLUTION: False

[ INFO ] CACHE_DIR:

[ INFO ] CACHE_MODE: CacheMode.OPTIMIZE_SPEED

[ INFO ] PERFORMANCE_HINT: PerformanceMode.THROUGHPUT

[ INFO ] EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE

[ INFO ] COMPILATION_NUM_THREADS: 4

[ INFO ] NUM_STREAMS: 2

[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0

[ INFO ] INFERENCE_PRECISION_HINT: <Type: 'float16'>

[ INFO ] DYNAMIC_QUANTIZATION_GROUP_SIZE: 32

[ INFO ] ACTIVATIONS_SCALE_FACTOR: 0.0

[ INFO ] DEVICE_ID: 0

[ INFO ] EXECUTION_DEVICES: ['GPU.0']

[Step 9/11] Creating infer requests and preparing input tensors

[ WARNING ] No input files were given for input 'data'!. This input will be filled with random values!

[ WARNING ] No input files were given for input 'im_info'!. This input will be filled with random values!

[ INFO ] Fill input 'data' with random values

[ INFO ] Fill input 'im_info' with random values

[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 100000 iterations)

[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).

[ INFO ] First inference took 195.63 ms

[Step 11/11] Dumping statistics report

[ INFO ] Execution Devices:['GPU.0']

[ INFO ] Count: 100000 iterations

[ INFO ] Duration: 13812053.41 ms

[ INFO ] Latency:

[ INFO ] Median: 552.45 ms

[ INFO ] Average: 552.15 ms

[ INFO ] Min: 283.54 ms

[ INFO ] Max: 644.07 ms

[ INFO ] Throughput: 7.24 FPS
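As a quick sanity check on the summary above, throughput is just the iteration count divided by wall-clock duration, and the reported Duration works out to about 3.8 hours of continuous GPU load:

```python
# Figures taken from the Step 11 report above.
count = 100_000              # iterations
duration_ms = 13_812_053.41  # total wall-clock duration

fps = count / (duration_ms / 1000.0)   # iterations per second
hours = duration_ms / 3_600_000.0      # ms -> hours

print(f"{fps:.2f} FPS over {hours:.2f} h")  # 7.24 FPS over 3.84 h
```

The computed 7.24 FPS matches the reported Throughput line, so the run's bookkeeping is internally consistent.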

