Model Optimizer fails on custom yolov3 model

Kulkarni__Mayur · ‎01-20-2020

I'm trying to convert a custom yolov3 model that has only `1 class`, trained with AlexeyAB's version of Darknet. I successfully converted the weights to a pb file, but when converting from pb to IR the Model Optimizer fails with the following error:

[ ERROR ]  Cannot infer shapes or values for node "detector/yolo-v3/meshgrid_1/mul_1/YoloRegion".
[ ERROR ]  index 2 is out of bounds for axis 0 with size 2
[ ERROR ]
[ ERROR ]  It can happen due to bug in custom shape infer function <function RegionYoloOp.regionyolo_infer at 0x1a48a91dd0>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ]  Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ]  index 2 is out of bounds for axis 0 with size 2

Next, I tried to debug the `/opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py` file to check the dimensions in the `RegionYoloOp` class and found that the dimensions for my model during shape inference in `regionyolo_infer` are

[38 38]

This looks odd: the shape has only two dimensions. For comparison, I ran the same command on the official yolov3 weights file (after converting to pb) and the dimensions were:

[  1  76  76 255]

[  1  38  38 255]

[  1  19  19 255]

These look good, and that conversion succeeds. IMO there is a dimension mismatch somewhere. I trained my original model at `608x608`, which I specified in the command:

python /opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py \
--input_model YOLO.pb \
--tensorflow_use_custom_operations_config /opt/intel/openvino/deployment_tools/model_optimizer/extensions/front/tf/yolo_v3.json \
--input_shape \[1,608,608,3\]
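
For reference, a minimal sketch of why a rank-2 shape such as `[38 38]` would trigger this kind of error. It assumes the shape inference reads batch, height, and width from a 4-D NHWC shape, and that the three YOLOv3 heads downsample by 32, 16, and 8. The helper names are hypothetical and are not Model Optimizer APIs.

```python
# Hypothetical helpers for illustration; not part of Model Optimizer.
YOLOV3_STRIDES = (32, 16, 8)  # assumed downsampling factors of the three heads


def expected_grids(input_size):
    """Return the grid side length of each YOLOv3 head for a square input."""
    return [input_size // stride for stride in YOLOV3_STRIDES]


def read_nhwc(shape):
    """Read (batch, height, width, channels) from a shape, checking rank first."""
    if len(shape) != 4:
        raise ValueError(
            f"expected a 4-D NHWC shape, got rank {len(shape)}: {list(shape)}")
    batch, height, width, channels = shape
    return batch, height, width, channels


print(expected_grids(608))          # [19, 38, 76]
print(read_nhwc([1, 38, 38, 255]))  # (1, 38, 38, 255)

try:
    read_nhwc([38, 38])  # the shape reported for the custom model
except ValueError as err:
    print(err)
```

Indexing position 2 of a two-element NumPy array produces exactly the message "index 2 is out of bounds for axis 0 with size 2", so a 2-D tensor reaching the RegionYolo node would be consistent with the error above.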

Furthermore, I also changed the `/opt/intel/openvino/deployment_tools/model_optimizer/extensions/front/tf/yolo_v3.json` to have only 1 class:

[
  {
    "id": "TFYOLOV3",
    "match_kind": "general",
    "custom_attributes": {
      "classes": 1,
      "anchors": [10, 13, 16, 30, 33, 23, 30, 61, 62, 45, 59, 119, 116, 90, 156, 198, 373, 326],
      "coords": 4,
      "num": 9,
      "masks":[[6, 7, 8], [3, 4, 5], [0, 1, 2]],
      "entry_points": ["detector/yolo-v3/Reshape", "detector/yolo-v3/Reshape_4", "detector/yolo-v3/Reshape_8"]
    }
  }
]
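
As a sanity check on this config, the channel count of each detection head can be derived from `classes`, `coords`, and the number of anchors per head. The sketch below assumes the usual YOLOv3 layout of `anchors_per_head * (classes + coords + 1)` channels; the function name is made up for illustration.

```python
def head_channels(classes, coords=4, anchors_per_head=3):
    """Channels per YOLOv3 head: coords + objectness + class scores, per anchor."""
    return anchors_per_head * (classes + coords + 1)


print(head_channels(80))  # 255, matching [1 76 76 255] for the official 80-class weights
print(head_channels(1))   # 18
```

Under that assumption, a correctly wired 1-class model at 608x608 would be expected to show shapes such as `[1 76 76 18]` at this point, not a 2-D shape.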

I'm curious if I've missed anything. Is it happening because I'm using AlexeyAB's version of Darknet? I don't think it should matter as they both use the same format unless I'm missing something.

PS: I'm attaching my PB file as a ZIP for your reference.

JesusE_Intel · ‎01-22-2020

Hi Mayur,

We have tested the YOLOv3 model from the official darknet repository, but I am not sure what changes were made in the AlexeyAB version. Do you know which base model (weights file) you used for training with AlexeyAB's version? Could you share your weights file, cfg file, and a sample image? I can start a private message if you would like to send them privately.

Regards,

Jesus

pang__pih · ‎04-24-2020

Hi Jesus,

I am in a very similar situation (just trained with 2 classes, and image resolution 416x416).

I can send my weights, cfg, and image if you reach out to me privately (pihlung.pang@gmail.com).

Cheers

Pih Lung

python3 /opt/intel/openvino_2019.3.376/deployment_tools/model_optimizer/mo_tf.py --input_model ./asl_2class_1600_416x416.pb --input_shape=[1,416,416,3] --data_type=FP16 --tensorflow_use_custom_operations_config ./yolo_v3_2019.json --log_level=DEBUG

[ 2020-04-24 11:06:47,351 ] [ DEBUG ] [ infer:130 ]  Partial infer for detector/yolo-v3/meshgrid_1/mul_1
[ 2020-04-24 11:06:47,351 ] [ DEBUG ] [ infer:131 ]  Op: Mul
[ 2020-04-24 11:06:47,351 ] [ DEBUG ] [ infer:142 ]  Inputs:
[ 2020-04-24 11:06:47,351 ] [ DEBUG ] [ infer:32 ]  input[0]: shape = [26  1], value = [[ 0.]
 [ 1.]
 [ 2.]
 [ 3.]
 [ 4.]
 [ 5.]
 [ 6.]
 [ 7.]
 [ 8.]
 [ 9.]
 [10.]
 [11.]
 [12.]
 [13.]...
[ 2020-04-24 11:06:47,356 ] [ DEBUG ] [ infer:32 ]  input[1]: shape = [26 26], value = [[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
  1. 1.]
 [1. 1. 1. 1. ...
[ 2020-04-24 11:06:47,356 ] [ DEBUG ] [ infer:144 ]  Outputs:
[ 2020-04-24 11:06:47,361 ] [ DEBUG ] [ infer:32 ]  output[0]: shape = [26 26], value = [[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
   0.  0.  0.  0.  0.  ...
[ 2020-04-24 11:06:47,361 ] [ DEBUG ] [ infer:129 ]  --------------------
[ 2020-04-24 11:06:47,361 ] [ DEBUG ] [ infer:130 ]  Partial infer for detector/yolo-v3/meshgrid_1/mul_1/YoloRegion
[ 2020-04-24 11:06:47,361 ] [ DEBUG ] [ infer:131 ]  Op: RegionYolo
[ ERROR ]  Cannot infer shapes or values for node "detector/yolo-v3/meshgrid_1/mul_1/YoloRegion".
[ ERROR ]  index 2 is out of bounds for axis 0 with size 2
[ ERROR ]  
[ ERROR ]  It can happen due to bug in custom shape infer function <function RegionYoloOp.regionyolo_infer at 0x7f2252f2ddd0>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ 2020-04-24 11:06:47,362 ] [ DEBUG ] [ infer:196 ]  Node "detector/yolo-v3/meshgrid_1/mul_1/YoloRegion" attributes: {'precision': 'FP32', 'kind': 'op', 'type': 'RegionYolo', 'op': 'RegionYolo', 'in_ports_count': 1, 'out_ports_count': 1, 'infer': <function RegionYoloOp.regionyolo_infer at 0x7f2252f2ddd0>, 'name': 'detector/yolo-v3/meshgrid_1/mul_1/YoloRegion', 'axis': 1, 'end_axis': 1, 'do_softmax': 0, 'classes': 2, 'anchors': [10, 13, 16, 30, 33, 23, 30, 61, 62, 45, 59, 119, 116, 90, 156, 198, 373, 326], 'coords': 4, 'num': 9, 'entry_points': ['detector/yolo-v3/Reshape', 'detector/yolo-v3/Reshape_4', 'detector/yolo-v3/Reshape_8'], 'mask': [0, 1, 2], 'dim_attrs': ['batch_dims', 'spatial_dims', 'channel_dims'], 'shape_attrs': ['shape', 'window', 'stride', 'output_shape', 'pad'], 'IE': [('layer', [('id', <function Op.substitute_ie_attrs.<locals>.<lambda> at 0x7f224bb690e0>), 'name', 'precision', 'type'], [('data', ['coords', 'classes', 'num', 'axis', 'end_axis', 'do_softmax', ('anchors', <function RegionYoloOp.backend_attrs.<locals>.<lambda> at 0x7f224bb69710>), ('mask', <function RegionYoloOp.backend_attrs.<locals>.<lambda> at 0x7f224bb69050>)], []), '@ports', '@consts'])], '_in_ports': {0: {}}, '_out_ports': {0: {}}, 'is_output_reachable': True, 'is_undead': False, 'is_const_producer': False, 'is_partial_inferred': False}
[ ERROR ]  index 2 is out of bounds for axis 0 with size 2
Stopped shape/value propagation at "detector/yolo-v3/meshgrid_1/mul_1/YoloRegion" node.
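
To find which shape feeds the failing node, the DEBUG log can be scanned for each `Partial infer for ...` entry together with the shapes logged under it. A rough sketch that relies only on the line formats visible in the log above; the regular expressions and the function name are assumptions for illustration.

```python
import re

NODE_RE = re.compile(r"Partial infer for (\S+)")
SHAPE_RE = re.compile(r"(input|output)\[(\d+)\]: shape = \[([\d\s]+)\]")


def shapes_by_node(log_text):
    """Map each node name to the (kind, index, shape) entries logged for it."""
    result = {}
    current = None
    for line in log_text.splitlines():
        node = NODE_RE.search(line)
        if node:
            current = node.group(1)
            result.setdefault(current, [])
            continue
        shape = SHAPE_RE.search(line)
        if shape and current is not None:
            kind, index, dims = shape.groups()
            result[current].append((kind, int(index), [int(d) for d in dims.split()]))
    return result


# Abbreviated lines copied from the log above (timestamps and values trimmed).
sample = """\
[ DEBUG ] [ infer:130 ]  Partial infer for detector/yolo-v3/meshgrid_1/mul_1
[ DEBUG ] [ infer:32 ]  input[0]: shape = [26  1], value = [[ 0.]
[ DEBUG ] [ infer:32 ]  input[1]: shape = [26 26], value = [[1. 1.
[ DEBUG ] [ infer:32 ]  output[0]: shape = [26 26], value = [[ 0.  0.
[ DEBUG ] [ infer:130 ]  Partial infer for detector/yolo-v3/meshgrid_1/mul_1/YoloRegion
"""

for name, entries in shapes_by_node(sample).items():
    print(name, entries)
```

In this log the node just before `YoloRegion` outputs a 2-D `[26 26]` tensor, which would be consistent with the "index 2 is out of bounds" error if the RegionYolo shape inference expects a 4-D input.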

pang__pih · ‎04-24-2020

Which base model (weights file) am I using?

My training environment was based on https://github.com/pjreddie/darknet

and I am/was using "darknet53.conv.74" as start weights.

Command to begin training:

./darknet -i 0 detector train /home/plp/yolo-v3/darknet/ASL_PT/asl_pt.data /home/plp/yolo-v3/darknet/ASL_PT/asl_2class.cfg /home/plp/yolo-v3/darknet/darknet53.conv.74 > /home/plp/yolo-v3/darknet/ASL_PT/train.log

Top part of my asl_2class.cfg file:
[net]
# Testing
# subdivisions=1
batch=16
#batch=1
# Training
subdivisions=16
width=416
height=416
#width=448
#height=448
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.001
burn_in=400
max_batches=4000
policy=steps
steps=1800
scales=.1


...
tail part


[yolo]
mask = 0,1,2
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=2
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1
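
Since the `custom_attributes` in `yolo_v3.json` mirror the `[yolo]` sections of the cfg, they can be derived from the cfg instead of being edited by hand. A minimal sketch, assuming the cfg format shown above (repeated section headers, `key=value` lines, `#` comments). `parse_yolo_sections` and `custom_attributes` are hypothetical helpers, and `coords` is fixed at 4 as in the JSON quoted earlier in the thread.

```python
def _ints(text):
    """Parse a comma-separated list of integers, ignoring extra spaces."""
    return [int(value) for value in text.split(",")]


def parse_yolo_sections(cfg_text):
    """Collect the key/value pairs of every [yolo] section, in file order."""
    sections, current = [], None
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("["):
            current = {} if line == "[yolo]" else None
            if current is not None:
                sections.append(current)
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return sections


def custom_attributes(cfg_text):
    """Build the classes/anchors/num/masks fields from the [yolo] sections."""
    yolo = parse_yolo_sections(cfg_text)
    return {
        "classes": int(yolo[0]["classes"]),
        "anchors": _ints(yolo[0]["anchors"]),
        "coords": 4,  # assumption: same value as in the JSON quoted in this thread
        "num": int(yolo[0]["num"]),
        "masks": [_ints(section["mask"]) for section in yolo],
    }


cfg = """
[yolo]
mask = 0,1,2
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=2
num=9
"""
print(custom_attributes(cfg))
```

The `entry_points` are graph node names, so they cannot be recovered from the cfg and still have to be checked against the converted pb file.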

Pise__Abhijeet · ‎05-29-2020

Hi Jesus,

I am trying to compare the performance of two different architectures, Faster R-CNN Inception V2 and YOLO v3, on my custom dataset with 3 classes.

I could convert the pre-trained models to OpenVINO format; however, I am facing several challenges with the custom-trained ones for both networks.

Please check the FRCNN issue - https://software.intel.com/en-us/forums/intel-distribution-of-openvino-toolkit/topic/809407#comment-1959675

For YOLO V3, I am getting the below error:

[ 2020-05-29 16:19:59,321 ] [ DEBUG ] [ infer:128 ] Partial infer for detector/yolo-v3/meshgrid_1/mul_1/YoloRegion
[ 2020-05-29 16:19:59,321 ] [ DEBUG ] [ infer:129 ] Op: RegionYolo
[ ERROR ] Cannot infer shapes or values for node "detector/yolo-v3/meshgrid_1/mul_1/YoloRegion".
[ ERROR ] 'mask'
[ ERROR ]
[ ERROR ] It can happen due to bug in custom shape infer function <function RegionYoloOp.regionyolo_infer at 0x00000265CECFCE18>.
[ ERROR ] Or because the node inputs have incorrect values/shapes.
[ ERROR ] Or because input shapes are incorrect (embedded to the model or passed via --input_shape).

Though I am specifying the right input shape.

I wish to deploy the two models on Movidius and I am running out of time. I would be grateful if you could help.
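
The bare `'mask'` in this log is how Python prints a `KeyError`, so one possibility (a guess, not confirmed anywhere in this thread) is that the config file in use lacks a key the converter looks up. Below is a small sketch that checks a `yolo_v3.json`-style config for internal consistency. The required key set is taken from the JSON quoted earlier in the thread, the sample config is invented, and `check_yolo_config` is a hypothetical helper, not a Model Optimizer function.

```python
import json

# Keys present in the yolo_v3.json quoted earlier in this thread.
REQUIRED = ("classes", "anchors", "coords", "num", "masks", "entry_points")


def check_yolo_config(text):
    """Return a list of problems found in a yolo_v3.json-style config string."""
    attrs = json.loads(text)[0]["custom_attributes"]
    problems = [f"missing key: {key}" for key in REQUIRED if key not in attrs]
    if problems:
        return problems
    if len(attrs["anchors"]) != 2 * attrs["num"]:
        problems.append("anchors must hold 2 values (w, h) per anchor")
    if len(attrs["masks"]) != len(attrs["entry_points"]):
        problems.append("need one mask per entry point")
    if any(i >= attrs["num"] for mask in attrs["masks"] for i in mask):
        problems.append("mask index exceeds num")
    return problems


# Invented example: a config that carries "mask" where "masks" is expected.
config = """[{"id": "TFYOLOV3", "match_kind": "general", "custom_attributes": {
  "classes": 3,
  "anchors": [10, 13, 16, 30, 33, 23, 30, 61, 62, 45, 59, 119, 116, 90, 156, 198, 373, 326],
  "coords": 4, "num": 9,
  "mask": [0, 1, 2],
  "entry_points": ["detector/yolo-v3/Reshape", "detector/yolo-v3/Reshape_4", "detector/yolo-v3/Reshape_8"]}}]"""
print(check_yolo_config(config))  # ['missing key: masks']
```

Which key names a given Model Optimizer release expects may differ, so the `REQUIRED` tuple should be compared against the `yolo_v3.json` shipped with the installed version.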

Gilles__Brandon · ‎08-22-2020

Hi everyone!

Did anyone ever get to the bottom of this? I'm seeing the exact same problem from the model here:

https://github.com/PINTO0309/OpenVINO-YoloV3/blob/master/pbmodels/download_tiny-yolov3.sh

Progress: [....... ] 35.71% done
Cannot infer shapes or values for node "detector/yolo-v3-tiny/Conv/Conv2D".
index 3 is out of bounds for axis 0 with size 3
It can happen due to bug in custom shape infer function <function Convolution.infer at 0x7fcef42e76a8>.
Or because the node inputs have incorrect values/shapes.
Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
Run Model Optimizer with --log_level=DEBUG for more information.
[ ANALYSIS INFO ] Your model looks like YOLOv3 Model.
To generate the IR, provide TensorFlow YOLOv3 Model to the Model Optimizer with the following parameters:
--input_model <path_to_model>/yolo_v3.pb
--batch 1
--tensorflow_use_custom_operations_config <OPENVINO_INSTALL_DIR>/deployment_tools/model_optimizer/extensions/front/tf/yolo_v3.json
Detailed information about conversion of this model can be fount at
https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_tf_specific_Convert_YOLO_From_Tensorflow.html
Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "detector/yolo-v3-tiny/Conv/Conv2D" node.
For more information please refer to Model Optimizer FAQ (https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html), question #38.

Thanks,

Brandon
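
For anyone scripting these conversions, here is a sketch that assembles the `mo_tf.py` command line from flags that appear in this thread (`--input_model`, `--input_shape`, `--tensorflow_use_custom_operations_config`, `--data_type`, `--log_level`). The function name and its defaults are made up; it only builds and prints the command and does not run it.

```python
import shlex


def build_mo_command(mo_script, model, config, size, data_type="FP16", debug=False):
    """Assemble the mo_tf.py argument list for a square NHWC input of the given size."""
    cmd = [
        "python3", mo_script,
        "--input_model", model,
        "--input_shape", f"[1,{size},{size},3]",
        "--tensorflow_use_custom_operations_config", config,
        "--data_type", data_type,
    ]
    if debug:
        cmd.append("--log_level=DEBUG")
    return cmd


cmd = build_mo_command(
    "/opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py",
    "YOLO.pb", "yolo_v3.json", 608, debug=True)

# Quote each part so the printed line can be pasted into a shell as-is.
print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the arguments in a list (for example, to hand to `subprocess.run`) sidesteps shell escaping of the square brackets in `--input_shape`.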