We are running YOLOv3 model inference using OpenVINO, and see 100% CPU usage when using `-d GPU`. This is reproducible with the official object_detection_demo_yolov3_async example code.
Checking the per-layer performance yields this:
performance counts:
detector/darknet-53/Conv/C... EXECUTED layerType: Convolution realTime: 1949 cpu: 7 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12948 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 4964 cpu: 6 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12923 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_2... EXECUTED layerType: Convolution realTime: 1059 cpu: 6 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12921 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_3... EXECUTED layerType: Convolution realTime: 4818 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12891 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add EXECUTED layerType: Eltwise realTime: 7393 cpu: 6 execType: generic_eltwise_ref
detector/darknet-53/Conv_4... EXECUTED layerType: Convolution realTime: 4794 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12936 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_5... EXECUTED layerType: Convolution realTime: 620 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12915 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_6... EXECUTED layerType: Convolution realTime: 4848 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12951 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_1 EXECUTED layerType: Eltwise realTime: 905 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_7... EXECUTED layerType: Convolution realTime: 600 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12907 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_8... EXECUTED layerType: Convolution realTime: 4767 cpu: 6 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_ NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_2 EXECUTED layerType: Eltwise realTime: 3557 cpu: 6 execType: generic_eltwise_ref
detector/darknet-53/Conv_9... EXECUTED layerType: Convolution realTime: 4914 cpu: 6 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12894 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 571 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12906 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 4776 cpu: 6 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12896 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_3 EXECUTED layerType: Eltwise realTime: 375 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 583 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12913 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 4788 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12918 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_4 EXECUTED layerType: Eltwise realTime: 337 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 575 cpu: 6 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12884 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 4857 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12933 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_5 EXECUTED layerType: Eltwise realTime: 341 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 584 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12909 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 4734 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12902 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_6 EXECUTED layerType: Eltwise realTime: 335 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 604 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12934 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 4812 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12927 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_7 EXECUTED layerType: Eltwise realTime: 337 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_2... EXECUTED layerType: Convolution realTime: 601 cpu: 6 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12901 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_2... EXECUTED layerType: Convolution realTime: 4837 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12929 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_8 EXECUTED layerType: Eltwise realTime: 336 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_2... EXECUTED layerType: Convolution realTime: 595 cpu: 6 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12881 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_2... EXECUTED layerType: Convolution realTime: 4834 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12920 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_9 EXECUTED layerType: Eltwise realTime: 342 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_2... EXECUTED layerType: Convolution realTime: 568 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12882 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_2... EXECUTED layerType: Convolution realTime: 4752 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12945 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_10 EXECUTED layerType: Eltwise realTime: 1859 cpu: 5 execType: generic_eltwise_ref
detector/darknet-53/Conv_2... EXECUTED layerType: Convolution realTime: 5298 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12911 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_2... EXECUTED layerType: Convolution realTime: 692 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12899 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_2... EXECUTED layerType: Convolution realTime: 5227 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12903 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_11 EXECUTED layerType: Eltwise realTime: 125 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_2... EXECUTED layerType: Convolution realTime: 676 cpu: 6 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12912 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_3... EXECUTED layerType: Convolution realTime: 5349 cpu: 6 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12939 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_12 EXECUTED layerType: Eltwise realTime: 128 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_3... EXECUTED layerType: Convolution realTime: 678 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12885 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_3... EXECUTED layerType: Convolution realTime: 5248 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12922 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_13 EXECUTED layerType: Eltwise realTime: 124 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_3... EXECUTED layerType: Convolution realTime: 678 cpu: 6 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12950 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_3... EXECUTED layerType: Convolution realTime: 5221 cpu: 6 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12914 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_14 EXECUTED layerType: Eltwise realTime: 123 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_3... EXECUTED layerType: Convolution realTime: 672 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12916 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_3... EXECUTED layerType: Convolution realTime: 5285 cpu: 6 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12910 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_15 EXECUTED layerType: Eltwise realTime: 139 cpu: 11 execType: eltwise_simple_vload8
detector/darknet-53/Conv_3... EXECUTED layerType: Convolution realTime: 675 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12942 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_3... EXECUTED layerType: Convolution realTime: 5212 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12935 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_16 EXECUTED layerType: Eltwise realTime: 137 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_3... EXECUTED layerType: Convolution realTime: 674 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12898 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_4... EXECUTED layerType: Convolution realTime: 5194 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12925 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_17 EXECUTED layerType: Eltwise realTime: 130 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_4... EXECUTED layerType: Convolution realTime: 674 cpu: 6 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12886 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_4... EXECUTED layerType: Convolution realTime: 5285 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12892 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_18 EXECUTED layerType: Eltwise realTime: 928 cpu: 5 execType: generic_eltwise_ref
detector/darknet-53/Conv_4... EXECUTED layerType: Convolution realTime: 5513 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12941 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_4... EXECUTED layerType: Convolution realTime: 772 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12897 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_4... EXECUTED layerType: Convolution realTime: 5244 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12930 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_19 EXECUTED layerType: Eltwise realTime: 77 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_4... EXECUTED layerType: Convolution realTime: 769 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12887 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_4... EXECUTED layerType: Convolution realTime: 5469 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12943 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_20 EXECUTED layerType: Eltwise realTime: 70 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_4... EXECUTED layerType: Convolution realTime: 791 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12928 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_4... EXECUTED layerType: Convolution realTime: 5227 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12931 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_21 EXECUTED layerType: Eltwise realTime: 71 cpu: 5 execType: eltwise_simple_vload8
detector/darknet-53/Conv_5... EXECUTED layerType: Convolution realTime: 767 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12890 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/Conv_5... EXECUTED layerType: Convolution realTime: 5211 cpu: 9 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12895 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/darknet-53/add_22 EXECUTED layerType: Eltwise realTime: 71 cpu: 5 execType: eltwise_simple_vload8
detector/yolo-v3/Conv/Conv2D EXECUTED layerType: Convolution realTime: 835 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12905 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_1/Co... EXECUTED layerType: Convolution realTime: 5198 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12938 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_2/Co... EXECUTED layerType: Convolution realTime: 790 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12944 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_3/Co... EXECUTED layerType: Convolution realTime: 5229 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12888 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_4/Co... EXECUTED layerType: Convolution realTime: 770 cpu: 8 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12917 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_5/Co... EXECUTED layerType: Convolution realTime: 5323 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
detector/yolo-v3/Conv_7/Co... EXECUTED layerType: Convolution realTime: 254 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12900 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
LeakyReLU_12940 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_6/Co... EXECUTED layerType: Convolution realTime: 117 cpu: 7 execType: convolution_gpu_bfyx_os_iyx_osv16
detector/yolo-v3/ResizeNea... EXECUTED layerType: Resample realTime: 73 cpu: 5 execType: undef
detector/yolo-v3/concat_3 EXECUTED layerType: Concat realTime: 602 cpu: 5 execType: concatenation_gpu_ref
detector/yolo-v3/Conv_6/Bi... EXECUTED layerType: RegionYolo realTime: 89 cpu: 6 execType: region_yolo_gpu_ref
detector/yolo-v3/Conv_8/Co... EXECUTED layerType: Convolution realTime: 997 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12946 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_9/Co... EXECUTED layerType: Convolution realTime: 5297 cpu: 6 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12937 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_10/C... EXECUTED layerType: Convolution realTime: 678 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12919 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_11/C... EXECUTED layerType: Convolution realTime: 5213 cpu: 7 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12949 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_12/C... EXECUTED layerType: Convolution realTime: 706 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12883 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_13/C... EXECUTED layerType: Convolution realTime: 5305 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
detector/yolo-v3/Conv_15/C... EXECUTED layerType: Convolution realTime: 223 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
LeakyReLU_12932 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
LeakyReLU_12947 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_14/C... EXECUTED layerType: Convolution realTime: 112 cpu: 6 execType: convolution_gpu_bfyx_os_iyx_osv16
detector/yolo-v3/ResizeNea... EXECUTED layerType: Resample realTime: 119 cpu: 10 execType: undef
detector/yolo-v3/concat_7 EXECUTED layerType: Concat realTime: 1164 cpu: 5 execType: concatenation_gpu_ref
detector/yolo-v3/Conv_14/B... EXECUTED layerType: RegionYolo realTime: 77 cpu: 5 execType: region_yolo_gpu_ref
detector/yolo-v3/Conv_16/C... EXECUTED layerType: Convolution realTime: 891 cpu: 6 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12908 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_17/C... EXECUTED layerType: Convolution realTime: 4747 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12926 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_18/C... EXECUTED layerType: Convolution realTime: 567 cpu: 6 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12924 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_19/C... EXECUTED layerType: Convolution realTime: 4743 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12889 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_20/C... EXECUTED layerType: Convolution realTime: 560 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12893 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_21/C... EXECUTED layerType: Convolution realTime: 5016 cpu: 5 execType: convolution_gpu_bfyx_gemm_like
LeakyReLU_12904 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef
detector/yolo-v3/Conv_22/C... EXECUTED layerType: Convolution realTime: 231 cpu: 5 execType: convolution_gpu_bfyx_os_iyx_osv16
detector/yolo-v3/Conv_22/B... EXECUTED layerType: RegionYolo realTime: 97 cpu: 5 execType: region_yolo_gpu_ref
Total time: 233168 microseconds
[ INFO ] Execution successful
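A dump like this is easier to reason about once it is totalled. The following is a minimal Python sketch, not part of the demo, that sums the `realTime` and `cpu` columns over the executed layers. The helper name `summarize` and the regular expression are illustrative assumptions, and both columns are assumed to be in microseconds, as the "Total time" line suggests. The sample is just the first three records from the output above.

```python
import re

# One record: <name> <STATUS> layerType: <type> realTime: <n> cpu: <n> execType: <kernel>
# (pattern written for the dump format shown in this thread; an assumption, not an official spec)
RECORD = re.compile(
    r"(?P<name>\S+)\s+(?P<status>[A-Z_]+)\s+layerType:\s+(?P<type>\S+)\s+"
    r"realTime:\s+(?P<real>\d+)\s+cpu:\s+(?P<cpu>\d+)\s+execType:\s+(?P<exec>\S+)"
)


def summarize(dump):
    """Total the realTime and cpu counters over layers marked EXECUTED."""
    totals = {"layers": 0, "real_us": 0, "cpu_us": 0}
    for m in RECORD.finditer(dump):
        if m["status"] != "EXECUTED":
            continue  # NOT_RUN layers were fused away and report zeros
        totals["layers"] += 1
        totals["real_us"] += int(m["real"])
        totals["cpu_us"] += int(m["cpu"])
    return totals


sample = (
    "detector/darknet-53/Conv/C... EXECUTED layerType: Convolution realTime: 1949 cpu: 7 "
    "execType: convolution_gpu_bfyx_gemm_like "
    "LeakyReLU_12948 NOT_RUN layerType: ReLU realTime: 0 cpu: 0 execType: undef "
    "detector/darknet-53/Conv_1... EXECUTED layerType: Convolution realTime: 4964 cpu: 6 "
    "execType: convolution_gpu_bfyx_gemm_like"
)
totals = summarize(sample)
print(totals)
```

In the output above, the per-layer `cpu` values are all between 5 and 11 while `realTime` runs from tens to thousands, so the per-layer counters by themselves do not seem to account for a fully loaded CPU core.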
We profiled the application, and a lot of the CPU time appears to be spent in the OpenCL library.
We then tried running on an NCS2 (-d MYRIAD), and it works great, with lower CPU usage (~20%).
It almost seems that there is some sort of a busy-wait in the GPU case. Any suggestions would be highly appreciated.
OpenVINO toolkit version: 2019 R1. Also reproducible on R3.
Hi Mohammed,
Thanks for reaching out. Could you share a link to the model you are using and the Model Optimizer command used to convert it to IR format? I would like to run some tests on my end. Also, could you share some details of the hardware you are testing on?
You mentioned the issue is also seen on the OpenVINO toolkit 2019 R3. Have you tested on 2019 R3.1?
Regards,
Jesus
IR and models sent via PM.
These were generated with instructions from here: https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_tf_specific_Convert_YOLO_From_Tensorflow.html
Exact commands:
# Convert Darknet weights to TensorFlow weights
python3 convert_weights_pb.py --class_names barcode.names --data_format NHWC --weights_file barcode.weights

# Convert frozen TensorFlow weights to Inference Engine IR
mo_tf.py \
  --input_model frozen_darknet_yolov3_model.pb \
  --data_type FP16 \
  --batch 1 \
  --tensorflow_use_custom_operations_config yolo_v3.json
If you need inference test data for the above model, I can privately email/message it to you.
We tried running on a bunch of platforms, including i7-7600U and Xeon E3-1505M v5. Reproducible on both.
Thanks for the help! We currently are using an NCS 2 to avoid high CPU load, but would love to go back to the integrated graphics once this issue is solved. I'll give it a try on R3.1 now.
Regards,
Kabir
Hi Mohammed,
I was able to reproduce the issue with the information you provided. I am reaching out to the development team for additional input.
Regards,
Jesus
Hi Mohammed,
Thank you for your patience. I got some feedback from the development team.
This is a known issue with the Intel GPU. If CPU usage is a problem for your application and you can accept some additional wait time, the issue can be mitigated by setting the GPU plugin config key KEY_CLDNN_PLUGIN_THROTTLE to a low value of 1. This causes the driver polling thread to periodically sleep and be preempted, which removes most of the overhead.
The plugin configuration parameters need to be set before calling the Inference Engine's LoadNetwork.
#include <cldnn/cldnn_config.hpp>
ie.SetConfig({ { CLDNNConfigParams::KEY_CLDNN_PLUGIN_THROTTLE, "1" } });
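To illustrate why a polling thread that sleeps uses so much less CPU than one that spins, here is a small self-contained Python sketch. It is only a conceptual model and not the driver's or the plugin's actual implementation. The names `wait_for_completion` and `simulate`, the 0.2 s "job", and the 5 ms sleep are all hypothetical values chosen for the illustration.

```python
import threading
import time


def wait_for_completion(done, sleep_s=0.0):
    """Poll `done` until it is set and return the number of polls.

    sleep_s == 0 models a busy-wait that never yields the CPU;
    sleep_s > 0 models a throttled poller that sleeps between checks.
    """
    polls = 0
    while not done.is_set():
        polls += 1
        if sleep_s > 0:
            time.sleep(sleep_s)
    return polls


def simulate(job_s, sleep_s):
    """Start a fake job that finishes after job_s seconds, then poll for it."""
    done = threading.Event()
    timer = threading.Timer(job_s, done.set)  # stands in for the GPU finishing
    timer.start()
    cpu_start = time.process_time()  # CPU time consumed by this process
    polls = wait_for_completion(done, sleep_s)
    cpu_used = time.process_time() - cpu_start
    timer.join()
    return polls, cpu_used


busy_polls, busy_cpu = simulate(job_s=0.2, sleep_s=0.0)
throttled_polls, throttled_cpu = simulate(job_s=0.2, sleep_s=0.005)
print(f"busy-wait: {busy_polls} polls, {busy_cpu:.3f} s CPU")
print(f"throttled: {throttled_polls} polls, {throttled_cpu:.3f} s CPU")
```

The trade-off matches the one described above: the sleeping poller can notice completion up to one sleep interval late, which is the extra wait time, in exchange for leaving the CPU mostly idle.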
Regards,
Jesus
Thanks, that resolved it!
It would be great if the documentation could be updated with far greater emphasis on this pitfall.
Regards,
Kabir
Hi Kabir,
Thank you for confirming, glad it's working for you!
Regards,
Jesus