I've tried to reproduce the examples from https://docs.openvino.ai/2023.0/openvino_inference_engine_samples_benchmark_app_README.html#examples-of-running-the-tool in the Intel Developer Cloud Beta (Scheduled access - Intel® Max Series GPU (PVC) on 4th Gen Intel® Xeon® processors - 1100 series (4x)). The benchmark works for the CPU but fails for the GPU.
I'm using the provided openvino conda environment.
$ conda activate openvino
$ omz_downloader --name asl-recognition-0004 --precisions FP16 --output_dir omz_models
$ benchmark_app -m omz_models/intel/asl-recognition-0004/FP16/asl-recognition-0004.xml -d CPU -hint latency -t 10
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2023.0.0-10926-b4452d56304-releases/2023/0
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2023.0.0-10926-b4452d56304-releases/2023/0
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 16.29 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ] input (node: input) : f32 / [N,C,D,H,W] / [1,3,16,224,224]
[ INFO ] Model outputs:
[ INFO ] output (node: output) : f32 / [...] / [1,100]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ] input (node: input) : f32 / [N,C,D,H,W] / [1,3,16,224,224]
[ INFO ] Model outputs:
[ INFO ] output (node: output) : f32 / [...] / [1,100]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 209.04 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: torch-jit-export
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 2
[ INFO ] NUM_STREAMS: 2
[ INFO ] AFFINITY: Affinity.CORE
[ INFO ] INFERENCE_NUM_THREADS: 112
[ INFO ] PERF_COUNT: False
[ INFO ] INFERENCE_PRECISION_HINT: <Type: 'bfloat16'>
[ INFO ] PERFORMANCE_HINT: PerformanceMode.LATENCY
[ INFO ] EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] ENABLE_CPU_PINNING: True
[ INFO ] SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ] ENABLE_HYPER_THREADING: False
[ INFO ] EXECUTION_DEVICES: ['CPU']
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'input'!. This input will be filled with random values!
[ INFO ] Fill input 'input' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 2 inference requests, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 10.10 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count: 4732 iterations
[ INFO ] Duration: 10006.80 ms
[ INFO ] Latency:
[ INFO ] Median: 4.12 ms
[ INFO ] Average: 4.19 ms
[ INFO ] Min: 3.69 ms
[ INFO ] Max: 19.70 ms
[ INFO ] Throughput: 472.88 FPS
$ benchmark_app -m omz_models/intel/asl-recognition-0004/FP16/asl-recognition-0004.xml -d GPU -hint throughput
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2023.0.0-10926-b4452d56304-releases/2023/0
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Build ................................. 2023.0.0-10926-b4452d56304-releases/2023/0
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 18.23 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ] input (node: input) : f32 / [N,C,D,H,W] / [1,3,16,224,224]
[ INFO ] Model outputs:
[ INFO ] output (node: output) : f32 / [...] / [1,100]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ] input (node: input) : f32 / [N,C,D,H,W] / [1,3,16,224,224]
[ INFO ] Model outputs:
[ INFO ] output (node: output) : f32 / [...] / [1,100]
[Step 7/11] Loading the model to the device
[ ERROR ] Check 'false' failed at src/inference/src/core.cpp:114:
Check 'false' failed at src/plugins/intel_gpu/src/plugin/program.cpp:384:
GPU program build failed!
Check 'false' failed at src/plugins/intel_gpu/src/graph/include/primitive_type_base.h:58:
[GPU] Can't choose implementation for convolution:Multiply_8591 node (type=convolution)
[GPU] Original name: Multiply_8591
[GPU] Original type: Convolution
[GPU] Reason: Unsupported onednn dnnl::memory::desc find_format. ndims: 5, inner_nblks: 2, inner_blks: (blk 16, idx 0) (blk 2, idx 1) , strides_order : 0 1 2 3 4 , strides_order : 0 2 1 3 4 , stride_value : 576 288 288 96 32
Traceback (most recent call last):
File "/home/common/miniconda3/envs/openvino/lib/python3.10/site-packages/openvino/tools/benchmark/main.py", line 408, in main
compiled_model = benchmark.core.compile_model(model, benchmark.device, device_config)
File "/home/common/miniconda3/envs/openvino/lib/python3.10/site-packages/openvino/runtime/ie_api.py", line 398, in compile_model
super().compile_model(model, device_name, {} if config is None else config),
RuntimeError: Check 'false' failed at src/inference/src/core.cpp:114:
Check 'false' failed at src/plugins/intel_gpu/src/plugin/program.cpp:384:
GPU program build failed!
Check 'false' failed at src/plugins/intel_gpu/src/graph/include/primitive_type_base.h:58:
[GPU] Can't choose implementation for convolution:Multiply_8591 node (type=convolution)
[GPU] Original name: Multiply_8591
[GPU] Original type: Convolution
[GPU] Reason: Unsupported onednn dnnl::memory::desc find_format. ndims: 5, inner_nblks: 2, inner_blks: (blk 16, idx 0) (blk 2, idx 1) , strides_order : 0 1 2 3 4 , strides_order : 0 2 1 3 4 , stride_value : 576 288 288 96 32
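The failing node is a 3-D convolution (5-D tensors), and the message says the GPU plugin's oneDNN backend could not pick a kernel for that blocked memory layout. One thing worth trying before anything else is forcing f32 execution precision on the GPU, since the plugin defaults to f16 and the failure happens during kernel selection. This is a hedged sketch only, not a confirmed fix; it assumes the same model path as above and guards against the GPU (or OpenVINO itself) being absent:

```python
# Sketch: retry compiling the model on GPU with the inference precision
# forced to f32 (the GPU plugin defaults to f16 for FP16 IRs).
# Assumptions: OpenVINO 2023.0 Python API, model path from the post above.
try:
    from openvino.runtime import Core
    core = Core()
    if any(d.startswith("GPU") for d in core.available_devices):
        model = core.read_model(
            "omz_models/intel/asl-recognition-0004/FP16/asl-recognition-0004.xml"
        )
        # String-keyed config works across 2023.x; equivalent to
        # benchmark_app's -infer_precision f32 flag.
        compiled = core.compile_model(
            model, "GPU", {"INFERENCE_PRECISION_HINT": "f32"}
        )
        print("Compiled on GPU with f32 precision hint")
    else:
        compiled = None
        print("No GPU device visible to OpenVINO")
except ImportError:
    compiled = None
    print("openvino is not installed in this environment")
```

The equivalent benchmark_app invocation would add `-infer_precision f32` to the GPU command line. Whether this avoids the oneDNN layout error on this particular hardware is untested here.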
Hi Fsant,
Thanks for reaching out.
I'm able to run benchmark_app inference with asl-recognition-0004 on the GPU plugin.
Your issue may be due to an unconfigured GPU on the system. Please configure the GPU plugin by following the Configurations for Intel® Processor Graphics (GPU) with OpenVINO™ guide for your operating system.
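A quick way to confirm whether the GPU is visible to the runtime at all (a sketch; it assumes the same openvino conda environment is active) is to list the runtime's available devices. If "GPU" is missing, the compile failure points at driver/plugin configuration rather than at the model:

```python
# Sketch: check which devices the OpenVINO runtime can see.
try:
    from openvino.runtime import Core  # OpenVINO 2023.0 Python API
    devices = Core().available_devices
except ImportError:
    devices = []  # openvino is not installed in this environment

print("Available devices:", devices)
gpu_visible = any(d.startswith("GPU") for d in devices)
print("GPU visible:", gpu_visible)
```

On a correctly configured system the list should contain "GPU" (or "GPU.0", "GPU.1", ... on multi-GPU nodes) alongside "CPU".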
Regards,
Aznie
Hi Aznie,
OK, thanks. Since the system is in the Intel Dev Cloud, I've sent a support request to them.
Best regards,
Fernando
Hi Fsant,
Thank you for your question. If you need any additional information from Intel, please submit a new question as this thread is no longer being monitored.
Regards,
Aznie