Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Converting Mask R-CNN to IR error

DanielSerna
Beginner
2,586 Views

Hello intel team,

I'm trying to convert a custom Mask R-CNN TensorFlow (Keras) model to the OpenVINO intermediate representation, specifically from this repository: https://github.com/matterport/Mask_RCNN.

I'm using OpenVINO version 2020.1.023.

As a first step, I retrained the model on another dataset and got the weights in .h5 format. In order to freeze it, I instantiated the model class:

model = modellib.MaskRCNN(mode="inference", config=config, model_dir=args.logs)

loaded the weights:

model.load_weights(weights_path, by_name=True)

then I used this function to freeze the graph:

import os
import tensorflow as tf
from keras import backend as K

def h5_to_pb(h5_model, output_dir, model_name, out_prefix="output_"):
    # Give each model output a predictable name via an identity node,
    # so the frozen graph has known output nodes.
    out_nodes = []
    for i in range(len(h5_model.outputs)):
        out_nodes.append(out_prefix + str(i + 1))
        tf.identity(h5_model.outputs[i], out_prefix + str(i + 1))
    sess = K.get_session()
    init_graph = sess.graph.as_graph_def()
    # Fold trained variables into constants so the graph can be serialized.
    main_graph = tf.compat.v1.graph_util.convert_variables_to_constants(sess, init_graph, out_nodes)
    with tf.gfile.GFile(os.path.join(output_dir, model_name), "wb") as filemodel:
        filemodel.write(main_graph.SerializeToString())
    print("pb model:", os.path.join(output_dir, model_name))

h5_to_pb(model.keras_model, output_dir='./', model_name='converted_model.pb')

K.clear_session()

Is that right so far, or is there any problem with this step?

So far we have generated the frozen model: converted_model.pb

Now I try to convert it to the IR with this command:

python3 /opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py --input_model ../../samples/eggs/converted_model.pb --input input_image,input_image_meta,input_anchors --input_shape [1,1024,1024,3],[1,14],[1,261888,4] --tensorflow_object_detection_api_pipeline_config ../../pipeline.config --transformations_config /opt/intel/openvino/deployment_tools/model_optimizer/extensions/front/tf/mask_rcnn_support_api_v1.14.json --log_level DEBUG 2>&1 | tee debug.txt

I have several questions in this step...

Does the 1.14 in "mask_rcnn_support_api_v1.14.json" correspond to the TensorFlow version?

In the JSON, what do I have to change to match the custom graph?

I got the pipeline from another Mask R-CNN repository, but I'm not sure what I have to change in the pipeline.config. I see there are some PATH_TO_BE_CONFIGURED placeholders, but I don't have any checkpoints (.ckpt), .pbtxt, or .record files.

I'm going to attach the debug.txt and the pipeline.config; the command above is the one I use for the conversion to IR, and the mask_rcnn_support_api_v1.14.json is the OpenVINO default (I haven't changed a thing). Could you help me find out what's wrong?

I tried to attach the .pb model, but it exceeded the attachment size limit. Is there any other way to send it to you, or is this info enough?

Thanks a lot in advance.

0 Kudos
17 Replies
Munesh_Intel
Moderator
2,531 Views

Hi Daniel,


Thanks for reaching out. mask_rcnn_support_api_v1.14.json is used for Mask R-CNN topologies trained manually using the TensorFlow Object Detection API version 1.14.0 up to 1.14.X inclusively.


More information is available at the following page:

https://docs.openvinotoolkit.org/2020.4/openvino_docs_MO_DG_prepare_model_convert_model_tf_specific_Convert_Object_Detection_API_Models.html#how_to_convert_a_model


Please share the trained model files so we can reproduce your issue (you can share a link from Google Drive). Additionally, please share more information about your model: whether it's an object detection or classification model, the layers used if it's a custom model, and your environment details (versions of OS, TensorFlow, Python, etc.).

 

Regards,

Munesh


0 Kudos
DanielSerna
Beginner
2,525 Views

Hi Munesh,

Thanks a lot for your response,

I'm using Ubuntu 18.04.4 LTS, Python 3.6.9, and TensorFlow 1.14.0 for the conversion (although I think the training was done with TF 1.13.1; I'm verifying this and will also try to retrain with 1.14).

Here is the converted model:

https://drive.google.com/file/d/1gL3UiZOynWuLxbnbUM8VO4NIECwFoMGB/view?usp=sharing

Here is the unconverted model in .h5:

https://drive.google.com/file/d/1k-sufMQ-jo-RZuXH-5dOpjC4Hfp4W6zM/view?usp=sharing

About the model: it's an object detection model for eggs with only 2 classes (egg or background). It was retrained from the https://github.com/matterport/Mask_RCNN repository, retraining only the head layers, so it's based on the COCO-trained weights.

Also, I tried to convert to IR some already frozen weights (.pb) from the model zoo with similar arguments, and it worked, so I suspect my problem is in the freezing of the custom model.

Thanks a lot; looking forward to your help.

0 Kudos
DanielSerna
Beginner
2,434 Views

Hi Munesh @Munesh_Intel ,

Any idea what is going on? Anything I could try or do differently?

Thank you !

0 Kudos
Munesh_Intel
Moderator
2,431 Views

Hi Daniel,


I was just about to reply to you.


We suspect that the issue lies in freezing the model. I suggest you try the following workarounds to freeze it. Please be informed that these methods have not been validated for Mask R-CNN model conversion in OpenVINO.


(1)  Convert the trained Keras model into a ready-for-inference TensorFlow model using the keras_to_tensorflow tool, available from the following GitHub link: https://github.com/amir-abdi/keras_to_tensorflow


This tool has been validated for the YOLOv3 model, and more information is available at the following link:

https://docs.openvinotoolkit.org/2020.4/omz_models_public_yolo_v3_tf_yolo_v3_tf.html
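For reference, the tool is typically invoked like this (the file names below are placeholders for your own model, not something we have validated for Mask R-CNN):

python3 keras_to_tensorflow.py --input_model="mask_rcnn_egg.h5" --output_model="mask_rcnn_egg.pb"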


(2)  Convert the Keras model into the ONNX format using the keras2onnx converter, available from the following GitHub link:

https://github.com/onnx/keras-onnx
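
A minimal sketch of how this converter is typically used (untested for your Mask R-CNN model; the model object and output file name are placeholders):

import onnx
import keras2onnx

# Convert the in-memory Keras model to an ONNX graph.
onnx_model = keras2onnx.convert_keras(model.keras_model, model.keras_model.name)

# Serialize the ONNX graph to disk.
onnx.save_model(onnx_model, "mask_rcnn_egg.onnx")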


Regards,

Munesh




0 Kudos
DanielSerna
Beginner
2,428 Views

Thanks a lot Munesh @Munesh_Intel ,

 

I will try and get back to you...

 

thanks.

0 Kudos
DanielSerna
Beginner
2,399 Views

Hi Munesh, @Munesh_Intel ,

 

I've tried all the methods you suggested but had many problems converting the custom model with them...

Also, I've tried to load and run inference with a frozen .pb model, and it worked as expected, so I don't think the problem is in the freezing of the model... I then tried to convert to IR this frozen model that I had verified was working.

I've tried just passing the --input and --input_shape arguments and get the following errors:

python3 /opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py --input_model ../../logs/mask_rcnn_egg.pb --input input_image,input_image_meta,input_anchors --input_shape [1,1024,1024,3],[1,14],[1,261888,4]

[ ERROR ] Cannot infer shapes or values for node "roi_align_classifier/Where_3".
[ ERROR ] Input 0 of node roi_align_classifier/Where_3 was passed int32 from roi_align_classifier/Equal_3_port_0_ie_placeholder:0 incompatible with expected bool.
[ ERROR ]
[ ERROR ] It can happen due to bug in custom shape infer function <function tf_native_tf_node_infer at 0x7fdcf6963f28>.
[ ERROR ] Or because the node inputs have incorrect values/shapes.
[ ERROR ] Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ] Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ] Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "roi_align_classifier/Where_3" node.

If I try with the other arguments (pipeline.config and mask_rcnn_support_api_v1.14.json), I get the same errors as before.

Please help me, since I want to run this on an Intel Neural Compute Stick; if I can't convert this, I'll have to investigate other options...

0 Kudos
Munesh_Intel
Moderator
2,385 Views

Hi Daniel,


You are getting this error because “Where” is not a supported TensorFlow operation in OpenVINO 2020.1. However, it is supported in OpenVINO version 2020.4.

You can find the list of all supported TensorFlow operations here:

https://docs.openvinotoolkit.org/2020.4/openvino_docs_MO_DG_prepare_model_Supported_Frameworks_Layers.html#tensorflow_supported_operations


As such, I suggest you try optimizing your model using OpenVINO 2020.4, which has the latest features and gives leading performance.


Regards,

Munesh


0 Kudos
DanielSerna
Beginner
2,375 Views

Hi Munesh @Munesh_Intel ,

 

thanks a lot for your reply, 

 

I updated OpenVINO to 2020.4 and ran the conversion again; now I'm getting this error...

python3 /opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py --input_model ../../logs/mask_rcnn_egg.pb --input input_image,input_image_meta,input_anchors --input_shape [1,1024,1024,3],[1,14],[1,261888,4]

[ ERROR ] Cannot infer shapes or values for node "roi_align_classifier/GatherNd_3".
[ ERROR ]
[ ERROR ]
[ ERROR ] It can happen due to bug in custom shape infer function <function GatherNd.infer at 0x7efc824a1378>.
[ ERROR ] Or because the node inputs have incorrect values/shapes.
[ ERROR ] Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ] Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ] Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "roi_align_classifier/GatherNd_3" node.

 

In the supported operations page you linked, I see:

GatherNd  Supported if it can be replaced with Gather

 

How could I do this? Maybe run with an older TF version? I will investigate how to do this while I wait for your reply. Thanks.

0 Kudos
DanielSerna
Beginner
2,372 Views

Hi again Munesh @Munesh_Intel,

I tried to replace GatherNd with tf.gather using this function:

def my_gather_nd(params, indices):
    # Emulate tf.gather_nd with tf.gather: flatten params, turn each
    # multi-dimensional index into a flat index, gather, then reshape back.
    idx_shape = tf.shape(indices)
    params_shape = tf.shape(params)
    idx_dims = idx_shape[-1]
    gather_shape = params_shape[idx_dims:]
    params_flat = tf.reshape(params, tf.concat([[-1], gather_shape], axis=0))
    # Row-major stride for each index dimension.
    axis_step = tf.cumprod(params_shape[:idx_dims], exclusive=True, reverse=True)
    indices_flat = tf.reduce_sum(indices * axis_step, axis=-1)
    result_flat = tf.gather(params_flat, indices_flat)
    return tf.reshape(result_flat, tf.concat([idx_shape[:-1], gather_shape], axis=0))

but when I try to freeze I get this error:

TypeError: Input 'y' of 'Mul' Op has type int32 that does not match type int64 of argument 'x'

which comes from this line of the new function:

indices_flat = tf.reduce_sum(indices * axis_step, axis=-1)

so I changed it to cast the values, like this:

indices_flat = tf.reduce_sum(tf.cast(indices, tf.int64) * tf.cast(axis_step, tf.int64), axis=-1)

(or the same but with tf.int32). That allowed me to freeze the model, but when I try to convert it to the intermediate representation with the same command, I get:

[ ERROR ] Cannot infer shapes or values for node "roi_align_classifier/Reshape_8".
[ ERROR ] Number of elements in input [4000 7 7 256] and output [1, 1000, 7, 7, 256] of reshape node roi_align_classifier/Reshape_8 mismatch
[ ERROR ]
[ ERROR ] It can happen due to bug in custom shape infer function <function Reshape.infer at 0x7f14acefcbf8>.
[ ERROR ] Or because the node inputs have incorrect values/shapes.
[ ERROR ] Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ] Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ] Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "roi_align_classifier/Reshape_8" node.

I see Reshape is supported, so what could be wrong...

0 Kudos
Munesh_Intel
Moderator
2,357 Views

Hi Daniel,


I've validated all four Mask_RCNN models that are available in the Open Model Zoo, and they are all working fine. Conversion parameters for these models are as follows:


--framework

--data_type

--output_dir

--model_name

--reverse_input_channels

--input_shape=[1,800,1365,3]

--input=image_tensor

--transformations_config

--tensorflow_object_detection_api_pipeline_config

--input_model=frozen_inference_graph.pb


I suggest you try converting by adding these parameters to your command.
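
For illustration, these could be assembled into a single command along the following lines (the data type, paths, output names, and the transformations config file are placeholders to adapt to your setup):

python3 /opt/intel/openvino/deployment_tools/model_optimizer/mo.py --framework=tf --data_type=FP16 --reverse_input_channels --input_shape=[1,800,1365,3] --input=image_tensor --input_model=frozen_inference_graph.pb --transformations_config=/opt/intel/openvino/deployment_tools/model_optimizer/extensions/front/tf/mask_rcnn_support_api_v1.14.json --tensorflow_object_detection_api_pipeline_config=pipeline.config --output_dir=./ir --model_name=mask_rcnn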


Regards,

Munesh






0 Kudos
Munesh_Intel
Moderator
2,314 Views

Hi Daniel,

It’s been a while since we last heard from you.  

If you need any additional information from Intel, please submit a new question.

 

Regards,

Munesh


0 Kudos
DanielSerna
Beginner
2,302 Views

Hi @Munesh_Intel,

 

First, thanks for all the support, and sorry I haven't responded; I kind of gave up and have been trying different options, since I felt stuck and need to make progress on this. If you tell me the exact command line to try, I'd be open to giving it a try; it's just that converting the Open Model Zoo models is different from converting a custom model trained on top of them.

Please let me know if we can pick this up again and see if we can finally solve it.

0 Kudos
DanielSerna
Beginner
2,253 Views

Hi @Munesh_Intel ,

 

Anything new we could try? Or maybe another support channel, even a paid one? We really want to make some progress on this.

 

Thank you !

0 Kudos
Munesh_Intel
Moderator
2,242 Views

Hi Daniel,

I suggest you try an alternative method as follows:

(1)   Convert the Keras model into the ONNX format using the keras2onnx converter, available from the following GitHub link:

https://github.com/onnx/keras-onnx


(2)   Subsequently, use the InferenceEngine::Core::ReadNetwork method to read ONNX models via the Inference Engine Core API.

(For your information, the Inference Engine has supported reading ONNX models via the Inference Engine Core API since the OpenVINO™ 2020.4 release.)


More information is available at the following page:

https://docs.openvinotoolkit.org/2020.4/classInferenceEngine_1_1Core.html#ac716dda382aefd09264b60ea40def3ef


Regards,

Munesh


0 Kudos
DanielSerna
Beginner
2,237 Views

Hi @Munesh_Intel,

 

Thanks for your response; it's been helpful for getting back on track. The problem with the ONNX method was that before I couldn't even convert to .onnx. Checking it again, I found a script in the nightly version of keras2onnx that helped convert Mask R-CNN to .onnx, and it worked!!

 

So I will try what you recommended about loading the ONNX model directly into the IE, but I also wanted to check whether I could now convert this new model to the IR. So I tried:

python3 /opt/intel/openvino/deployment_tools/model_optimizer/mo.py --input_model ./mrcnn.onnx --input input_image,input_image_meta,input_anchors --input_shape [1,1024,1024,3],[1,14],[1,261888,4]

but got this error. Maybe it's easier to fix than with the previous method we were trying, so while I try to load the .onnx directly into IE, I thought I would write to see if you have a solution for this one:

Model Optimizer arguments:
Common parameters:
- Path to the Input Model: /home/tensorbook/Documents/keras-onnx/applications/mask_rcnn/./mrcnn.onnx
- Path for generated IR: /home/tensorbook/Documents/keras-onnx/applications/mask_rcnn/.
- IR output name: mrcnn
- Log level: ERROR
- Batch: Not specified, inherited from the model
- Input layers: input_image,input_image_meta,input_anchors
- Output layers: Not specified, inherited from the model
- Input shapes: [1,1024,1024,3],[1,14],[1,261888,4]
- Mean values: Not specified
- Scale values: Not specified
- Scale factor: Not specified
- Precision of IR: FP32
- Enable fusing: True
- Enable grouped convolutions fusing: True
- Move mean values to preprocess section: False
- Reverse input channels: False
ONNX specific parameters:
Model Optimizer version:
[ ERROR ] Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.load.onnx.loader.ONNXLoader'>): Unexpected exception happened during extracting attributes for node Resize2.
Original exception message: ONNX Resize operation from opset 11 is not supported.

 

Also, is loading the ONNX model directly into IE only available in C++, or also in Python? And do you maybe have an example other than the documentation?

Thanks a lot for all the help I have received from you.

0 Kudos
Munesh_Intel
Moderator
2,220 Views

Hi Daniel,

 

Glad to hear about the progress you've made.

Regarding the error you are getting: the "Resize" operator is only supported in its opset-10 version in OpenVINO; the opset-11 version is not supported.

 

More information is available at the following page:

https://docs.openvinotoolkit.org/2020.4/openvino_docs_MO_DG_prepare_model_Supported_Frameworks_Layers.html#onnx_supported_operators

 

Thus, you can try implementing the "Resize" operator using its opset-10 version.

Alternatively, you can try converting the ONNX model using the ONNX Version Converter:

https://github.com/onnx/onnx/blob/master/docs/VersionConverter.md
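
A minimal sketch of invoking the converter (file names are placeholders, and whether "Resize" can actually be downgraded from opset 11 to opset 10 depends on the converter's support for that operator):

import onnx
from onnx import version_converter

# Load the exported model and request a conversion to opset 10.
model = onnx.load("mrcnn.onnx")
converted = version_converter.convert_version(model, 10)
onnx.save(converted, "mrcnn_opset10.onnx")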

 

And lastly, reading models directly from the ONNX format is only supported by the Inference Engine C++ API.

 

Regards,

Munesh

 

0 Kudos
Munesh_Intel
Moderator
2,175 Views

Hi Daniel,


We haven't heard from you for some time. If you need any additional information from Intel, please submit a new question, as this thread is no longer being monitored.


Regards,

Munesh


0 Kudos
Reply