Model Optimizer: Cannot infer shapes or values for node issue

Ashdown__Lucinda · ‎03-25-2019

Hello,

I am trying to optimize my ResNet50 model from AWS, but I keep getting the following error when trying to produce the IR. For the input shape I used the height and width in pixels (685, 1024) of my original images, which were put through AWS Ground Truth. I am not sure whether this is correct.

 ~/intel/computer_vision_sdk_2018.5.455/deployment_tools/model_optimizer $ python3 mo_mxnet.py --input_model /home/luci/incubator-mxnet-master/example/ssd/model/deploy_ssd_resnet50_300-0000.params --input_symbol /home/luci/incubator-mxnet-master/example/ssd/model/deploy_ssd_resnet50_300-symbol.json --input_shape [1,3,685,1024]
Model Optimizer arguments:
Common parameters:
        - Path to the Input Model:      /home/luci/incubator-mxnet-master/example/ssd/model/deploy_ssd_resnet50_300-0000.params
        - Path for generated IR:        /home/luci/intel/computer_vision_sdk_2018.5.455/deployment_tools/model_optimizer/.
        - IR output name:       deploy_ssd_resnet50_300-0000
        - Log level:    ERROR
        - Batch:        Not specified, inherited from the model
        - Input layers:         Not specified, inherited from the model
        - Output layers:        Not specified, inherited from the model
        - Input shapes:         [1,3,685,1024]
        - Mean values:  Not specified
        - Scale values:         Not specified
        - Scale factor:         Not specified
        - Precision of IR:      FP32
        - Enable fusing:        True
        - Enable grouped convolutions fusing:   True
        - Move mean values to preprocess section:       False
        - Reverse input channels:       False
MXNet specific parameters:
        - Deploy-ready symbol file:     /home/luci/incubator-mxnet-master/example/ssd/model/deploy_ssd_resnet50_300-symbol.json
        - Enable MXNet loader for models trained with MXNet version lower than 1.0.0:   False
        - Prefix name for args.nd and argx.nd files:    None
        - Pretrained model to be merged with the .nd files:     None
        - Enable saving built parameters file from .nd files:   False
Model Optimizer version:        1.5.12.49d067a0
[ ERROR ]  Size of weights 9216 does not match kernel shape: [ 84 128   3   3]
    Possible reason is wrong channel number in input shape

[ ERROR ]  Cannot infer shapes or values for node "multi_feat_5_conv_3x3_relu_cls_pred_conv".
[ ERROR ]  Cannot reshape weights to kernel shape
[ ERROR ]
[ ERROR ]  It can happen due to bug in custom shape infer function <function Convolution.infer at 0x7f7d31a36d08>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ]  Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ]  Stopped shape/value propagation at "multi_feat_5_conv_3x3_relu_cls_pred_conv" node.
 For more information please refer to Model Optimizer FAQ (<INSTALL_DIR>/deployment_tools/documentation/docs/MO_FAQ.html), question #38.

In addition, I get this error when running on a different device:

Python 2.7.12 (default, Dec  4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mo
>>> import mxnet
>>> input_height = 685
>>> input_width = 1024
>>> error, model_path = mo.optimize("deploy_ssd_resnet50_300", input_width, input_height ,"mx")
DEBUG:mo:DLDT command: python3 /opt/awscam/intel/deeplearning_deploymenttoolkit/deployment_tools/model_optimizer/mo_mxnet.py --input_model /opt/awscam/artifacts/deploy_ssd_resnet50_300-0000.params --data_type FP16 --scale 1 --model_name deploy_ssd_resnet50_300 --output_dir /opt/awscam/artifacts --reverse_input_channels  --input_shape [1,3,685,1024]
Model Optimizer arguments
        Batch:  1
        Precision of IR:        FP16
        Enable fusing:  True
        Enable gfusing:         True
        Names of input layers:  inherited from the model
        Path to the Input Model:        /opt/awscam/artifacts/deploy_ssd_resnet50_300-0000.params
        Input shapes:   [1,3,685,1024]
        Log level:      ERROR
        Mean values:    ()
        IR output name:         deploy_ssd_resnet50_300
        Names of output layers:         inherited from the model
        Path for generated IR:  /opt/awscam/artifacts
        Reverse input channels:         True
        Scale factor:   1.0
        Scale values:   ()
        Version:        0.3.31.d8b314f6
        Prefix name for args.nd and argx.nd files:
        Name of pretrained model which will be merged with .nd files:
ERROR:mo:[ WARNING ]
Detected not satisfied dependencies:
        mxnet: installed: 1.3.0, required: 1.0.0

Please install required versions of components or use install_prerequisites script
/opt/awscam/intel/deeplearning_deploymenttoolkit/deployment_tools/model_optimizer/install_prerequisites/install_prerequisites_mxnet.sh
Note that install_prerequisites scripts may install additional components.
/usr/local/lib/python3.5/dist-packages/mxnet/module/base_module.py:55: UserWarning: You created Module with Module(..., label_names=['softmax_label']) but input with name 'softmax_label' is not found in symbol.list_arguments(). Did you mean one of:
        data
  warnings.warn(msg)
[ ERROR ]  Cannot infer shapes or values for node "multi_feat_3_conv_3x3_relu_cls_pred_conv".
[ ERROR ]  cannot reshape array of size 27648 into shape (126,256,3,3)
[ ERROR ]
[ ERROR ]  It can happen due to bug in custom shape infer function <function mxnet_conv2d_infer at 0x7f3dae1cff28>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ]  Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ]  Stopped shape/value propagation at "multi_feat_3_conv_3x3_relu_cls_pred_conv" node. For more information please refer to Model Optimizer FAQ.

Are there any thoughts on how to get this working? All help is much appreciated.

Shubha_R_Intel · ‎03-25-2019

Dear Lucinda:

Can you kindly run with --log_level DEBUG and post the relevant ERROR part here?

Thanks for using OpenVINO!

Shubha

Ashdown__Lucinda · ‎03-26-2019

Hi Shubha,

Please see the additional information below:

This is the error generated from the first attempt above:

'dilation': array([1, 1, 1, 1]), 'output_spatial_shape': None, 'output_shape': None, 'stride': array([1, 1, 1, 1]), 'group': 1, 'output': 84, 'kernel_spatial': array([3, 3]), 'input_feature_channel': 1, 'output_feature_channel': 0, 'kernel_spatial_idx': None, 'reshape_kernel': True, 'spatial_dims': None, 'channel_dims': array([1]), 'batch_dims': array([0]), 'layout': 'NCHW', 'dim_attrs': ['spatial_dims', 'batch_dims', 'axis', 'channel_dims'], 'shape_attrs': ['output_shape', 'shape', 'pad', 'stride', 'window'], 'IE': [('layer', [('id', <function Op.substitute_ie_attrs.<locals>.<lambda> at 0x7fb5dc978268>), 'name', 'precision', 'type'], [('data', ['auto_pad', 'group', ('strides', <function Convolution.backend_attrs.<locals>.<lambda> at 0x7fb5dc978378>), ('dilations', <function Convolution.backend_attrs.<locals>.<lambda> at 0x7fb5dc9782f0>), ('kernel', <function Convolution.backend_attrs.<locals>.<lambda> at 0x7fb5dc978488>), ('pads_begin', <function Convolution.backend_attrs.<locals>.<lambda> at 0x7fb5dc978400>), ('pads_end', <function Convolution.backend_attrs.<locals>.<lambda> at 0x7fb5dc978598>), 'output'], []), '@ports', '@consts'])], 'is_output_reachable': True, 'is_undead': False, 'is_const_producer': False, 'is_output': False, 'is_partial_inferred': False}
[ ERROR ]  Stopped shape/value propagation at "multi_feat_5_conv_3x3_relu_cls_pred_conv" node.
 For more information please refer to Model Optimizer FAQ (<INSTALL_DIR>/deployment_tools/documentation/docs/MO_FAQ.html), question #38.
[ 2019-03-26 08:18:57,792 ] [ DEBUG ] [ main:331 ]  Traceback (most recent call last):
  File "/home/luci/intel/computer_vision_sdk_2018.5.455/deployment_tools/model_optimizer/mo/middle/passes/infer.py", line 153, in partial_infer
    node.infer(node)
  File "/home/luci/intel/computer_vision_sdk_2018.5.455/deployment_tools/model_optimizer/mo/ops/convolution.py", line 140, in infer
    raise Error("Cannot reshape weights to kernel shape")
mo.utils.error.Error: Cannot reshape weights to kernel shape

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/luci/intel/computer_vision_sdk_2018.5.455/deployment_tools/model_optimizer/mo/main.py", line 325, in main
    return driver(argv)
  File "/home/luci/intel/computer_vision_sdk_2018.5.455/deployment_tools/model_optimizer/mo/main.py", line 287, in driver
    mean_scale_values=mean_scale)
  File "/home/luci/intel/computer_vision_sdk_2018.5.455/deployment_tools/model_optimizer/mo/pipeline/mx.py", line 162, in driver
    graph = partial_infer(graph)
  File "/home/luci/intel/computer_vision_sdk_2018.5.455/deployment_tools/model_optimizer/mo/middle/passes/infer.py", line 217, in partial_infer
    refer_to_faq_msg(38)) from err
mo.utils.error.Error: Stopped shape/value propagation at "multi_feat_5_conv_3x3_relu_cls_pred_conv" node.
 For more information please refer to Model Optimizer FAQ (<INSTALL_DIR>/deployment_tools/documentation/docs/MO_FAQ.html), question #38.

Below is the debug output for the second attempt shown above (the AWS device, which has this pre-installed):

[ ERROR ] [ main:227 ]  Stopped shape/value propagation at "multi_feat_3_conv_3x3_relu_cls_pred_conv" node. For more information please refer to Model Optimizer FAQ, question #38.
[ 2019-03-25 11:02:19,032 ] [ DEBUG ] [ main:228 ]  Traceback (most recent call last):
  File "/opt/awscam/intel/deeplearning_deploymenttoolkit/deployment_tools/model_optimizer/mo/middle/passes/infer.py", line 73, in partial_infer
    node.infer(node)
  File "/opt/awscam/intel/deeplearning_deploymenttoolkit/deployment_tools/model_optimizer/mo/front/common/partial_infer/convolution.py", line 133, in mxnet_conv2d_infer
    weights.value.shape = weights.shape
ValueError: cannot reshape array of size 27648 into shape (126,256,3,3)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/awscam/intel/deeplearning_deploymenttoolkit/deployment_tools/model_optimizer/mo/main.py", line 222, in main
    return driver(argv)
  File "/opt/awscam/intel/deeplearning_deploymenttoolkit/deployment_tools/model_optimizer/mo/main.py", line 208, in driver
    mean_scale_values=mean_scale)
  File "/opt/awscam/intel/deeplearning_deploymenttoolkit/deployment_tools/model_optimizer/mo/pipeline/mx.py", line 106, in driver
    graph = partial_infer(graph)
  File "/opt/awscam/intel/deeplearning_deploymenttoolkit/deployment_tools/model_optimizer/mo/middle/passes/infer.py", line 123, in partial_infer
    'For more information please refer to Model Optimizer FAQ, question #38.') from err
mo.utils.error.Error: Stopped shape/value propagation at "multi_feat_3_conv_3x3_relu_cls_pred_conv" node. For more information please refer to Model Optimizer FAQ, question #38.

Please let me know if any more information is required. Thank you!

Shubha_R_Intel · ‎03-26-2019

Dear Lucinda:

I have sent you a PM so that you can send me your model privately. It's difficult for me to tell what went wrong just by looking at your logs. Also, can you please tell me the exact MO command you used?

Thanks for using OpenVINO!

Shubha

Shubha_R_Intel · ‎03-26-2019

Dear Lucinda,

As per your instruction, the command I ran was as follows:

python "c:\Program Files (x86)\IntelSWTools\openvino\deployment_tools\model_optimizer\mo_mxnet.py" --input_model C:\Users\sdramani\Downloads\compmxnet13\deploy_ssd_resnet50_300-0000.params --input_symbol C:\Users\sdramani\Downloads\compmxnet13\deploy_ssd_resnet50_300-symbol.json --input_shape [1,3,685,1024] --log_level DEBUG

Here is what I found. The problematic node is this one from your deploy_ssd_resnet50_300-symbol.json file:
{
"op": "Convolution",
"name": "multi_feat_5_conv_3x3_relu_cls_pred_conv",
"attrs": {
"kernel": "(3, 3)",
"num_filter": "84",
"pad": "(1, 1)",
"stride": "(1, 1)"
},
"inputs": [[480, 0, 0], [481, 0, 0], [482, 0, 0]]
},

The above layer has a kernel shape of [84, 128, 3, 3], but the incoming weights have a shape of [8, 128, 3, 3]. Notice that 8*128*3*3 = 9216 exactly.

The problem is that 84*128*3*3 = 96768, not 9216.

[ ERROR ] Size of weights 9216 does not match kernel shape: [ 84 128 3 3]
Possible reason is wrong channel number in input shape
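
The arithmetic behind both error messages can be checked with a few lines of Python. This is only a minimal sketch: the helper names `weight_count` and `implied_out_channels` are made up for illustration and are not part of Model Optimizer, and `math.prod` assumes Python 3.8 or newer.

```python
from math import prod


def weight_count(kernel_shape):
    """Number of scalar weights a kernel of the given shape needs."""
    return prod(kernel_shape)


def implied_out_channels(stored_size, in_channels, kernel_h, kernel_w):
    """Output channels the stored weights can fill, or None if not a whole number."""
    per_filter = in_channels * kernel_h * kernel_w
    if stored_size % per_filter:
        return None
    return stored_size // per_filter


# First log: the symbol declares [84, 128, 3, 3] but only 9216 weights are stored.
print(weight_count((84, 128, 3, 3)))            # 96768
print(implied_out_channels(9216, 128, 3, 3))    # 8

# Second log: 27648 weights cannot be reshaped into (126, 256, 3, 3).
print(weight_count((126, 256, 3, 3)))           # 290304
print(implied_out_channels(27648, 256, 3, 3))   # 12
```

In both logs the stored weights hold far fewer filters than the symbol declares (8 versus 84, and 12 versus 126). That may mean the .params file and the symbol file do not describe the same network, but this reading is an assumption and is not confirmed anywhere in this thread.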

Shubha_R_Intel · ‎03-27-2019

[Responding to Lucinda's PM messages here, regarding --input_shape, what to use, etc...]

Dear Lucinda, I understand that finding out what to pass into --input_shape can be challenging. There is a well-known formula:

output_size = ((input_size + 2*padding - filter_size)/stride) + 1

Another form of this formula is W_out = ((W_in - F + 2P)/S) + 1

where W_in = Input Width and W_out = Output Width

F = Filter Width

P = Padding

S = Stride

For height, replace W with H.
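
The formula above can be sketched as a small Python function. The function name `conv_output_size` is hypothetical, and the use of floor division is an assumption about how non-integer results are rounded (the formula as written does not say). The 7x7, stride 2, padding 3 layer is only an illustrative example and is not taken from the model in this thread.

```python
def conv_output_size(input_size, filter_size, padding, stride):
    """Output size along one spatial dimension of a convolution.

    Implements ((input_size + 2*padding - filter_size) / stride) + 1,
    assuming the division is rounded down.
    """
    return (input_size + 2 * padding - filter_size) // stride + 1


# A 3x3 convolution with padding 1 and stride 1 (like the node quoted
# earlier in the thread) keeps the spatial size unchanged.
print(conv_output_size(685, filter_size=3, padding=1, stride=1))
print(conv_output_size(1024, filter_size=3, padding=1, stride=1))

# Illustrative only: a 7x7 convolution with padding 3 and stride 2
# roughly halves each dimension.
print(conv_output_size(685, filter_size=7, padding=3, stride=2))
print(conv_output_size(1024, filter_size=7, padding=3, stride=2))
```

Applying the function layer by layer, with each output used as the next layer's input size, gives the spatial sizes that --input_shape implies further down the network.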

Please go through your deploy_ssd_resnet50_300-symbol.json and calculate what the correct output sizes should be (each of which is in turn fed in as input to the next layer). The very first input you pass in via --input_shape is quite important. In --input_shape [1,3,685,1024], the 1 is the batch size, 3 is the number of channels, and 685 and 1024 are the dimensions of your image. Also remember that if there are multiple inputs in the model, --input_shape should contain a shape definition for each input, separated by commas, for example: [1,3,227,227],[2,4] for a model with two inputs with 4D and 2D shapes. In your case, however, it seems there is just one input to the model.

My hunch is that you will either have to change the json file or your --input_shape to get mo_mxnet.py to pass. You can also try --disable_resnet_optimization and see if that makes a difference. Do a python mo_mxnet.py --help and start experimenting with some of the other options.
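
Before changing the JSON file, it can help to look up the failing node and the names of its inputs. Below is a minimal sketch using only the standard `json` module. The helper `describe_node` is hypothetical, the small `demo_symbol` dictionary is invented so the example runs on its own, and the input names in it are placeholders, not the real names from your file. It assumes the layout seen in the node quoted earlier: a top-level "nodes" list where each entry in "inputs" starts with the index of another node.

```python
import json


def describe_node(symbol, node_name):
    """Return op, attrs and input node names for node_name, or None if absent."""
    nodes = symbol["nodes"]
    for node in nodes:
        if node["name"] == node_name:
            # Each input entry is a list whose first element is a node index.
            input_names = [nodes[entry[0]]["name"] for entry in node["inputs"]]
            return {
                "op": node["op"],
                "attrs": node.get("attrs", {}),
                "inputs": input_names,
            }
    return None


# Invented stand-in for a symbol file; names of the inputs are placeholders.
demo_symbol = json.loads("""
{"nodes": [
  {"op": "null", "name": "previous_layer_output", "inputs": []},
  {"op": "null", "name": "cls_pred_conv_weight", "inputs": []},
  {"op": "null", "name": "cls_pred_conv_bias", "inputs": []},
  {"op": "Convolution",
   "name": "multi_feat_5_conv_3x3_relu_cls_pred_conv",
   "attrs": {"kernel": "(3, 3)", "num_filter": "84",
             "pad": "(1, 1)", "stride": "(1, 1)"},
   "inputs": [[0, 0, 0], [1, 0, 0], [2, 0, 0]]}
]}
""")

info = describe_node(demo_symbol, "multi_feat_5_conv_3x3_relu_cls_pred_conv")
print(info["attrs"]["num_filter"])
print(info["inputs"])
```

To run it against the real file, replace `demo_symbol` with the result of `json.load` on deploy_ssd_resnet50_300-symbol.json. The `num_filter` value and the name of the weight input show which entry in the .params file needs to match.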

Hope it helps, and thanks for using OpenVINO!

Shubha