I would like to run Model Optimizer on a TensorFlow model that contains a CTC layer.
I tried these two layers:
But in both cases I got the following output:
[ ERROR ] Cannot infer shapes or values for node "StatefulPartitionedCall/model_1/plate2_/CTCGreedyDecoder".
[ ERROR ] Batch dimensions of input tensors must be the same for StatefulPartitionedCall/model_1/plate2_/CTCGreedyDecoder node
[ ERROR ]
[ ERROR ] It can happen due to bug in custom shape infer function <function CTCGreedyDecoderSeqLenOp.infer at 0x000002892C9B9C18>.
[ ERROR ] Or because the node inputs have incorrect values/shapes.
[ ERROR ] Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ] Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ] Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "StatefulPartitionedCall/model_1/plate2_/CTCGreedyDecoder" node.
For more information please refer to Model Optimizer FAQ, question #38. (https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html?question=38#question-38)
I tried with and without an explicit input shape definition, with the same result.
Any idea how I can convert such a model with Model Optimizer?
Thanks in advance
To be specific, I'm using the following custom layer based on the CTC greedy decoder:
def call(self, y_pred):
    # y_pred is batch-major: [batch, time, classes]
    batch_size = tf.shape(y_pred)[0]
    # one sequence length per batch item, all equal to self.input_length
    input_lengths = tf.ones(shape=[batch_size]) * tf.cast(self.input_length, tf.float32)
    # ctc_greedy_decoder expects time-major input and returns (decoded, neg_sum_logits)
    plate = tf.nn.ctc_greedy_decoder(tf.transpose(y_pred, (1, 0, 2)),
                                     tf.cast(input_lengths, tf.int32),
                                     merge_repeated=True)
    # decoded is a single-element list holding one SparseTensor
    plate = tf.sparse.to_dense(plate[0][0], default_value=-1)
    plate = tf.reshape(plate, [batch_size, self.input_length])
    # must add 1, because non-character positions are filled in as -1
    plate = tf.cast(plate + 1, tf.float32, name=self.name+'text')
    # confidence: product over time of the per-step maximum score
    confs = tf.reduce_prod(tf.reduce_max(y_pred, axis=2), axis=1, name=self.name+'conf')
    confs = tf.reshape(confs, [batch_size, 1])
    return [confs, plate]
If possible, could you share your working model so that we can test it?
By the way, please note that OpenVINO supports a specific set of model topologies.
You may refer to this official documentation and see whether your topology is listed: https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_prepare_model_convert_model_Convert_Mode...
If it is not listed, then it is not supported.
This pre-trained model contains the CTC Greedy Decoder: https://docs.openvinotoolkit.org/latest/omz_models_model_text_recognition_0012.html
You can check it out and possibly use it as a template.
Thank you for your attention.
I'm not sure I'm allowed to share the model. If not, I will create a minimal dummy model that reproduces the issue and share that.
About the text_recognition model:
I'm a little confused. I saw this model earlier; however, its documentation says:
"The network output can be decoded by CTC Greedy Decoder"
So is the CTC Decoder included inside the network or is it applied after?
In TensorFlow, the CTCGreedyDecoder performs greedy (best-path) decoding on the logits given as input. It works best with our pre-trained model text-recognition-0012, which was developed with TensorFlow as its source framework: text-recognition-0012 - OpenVINO™ Toolkit (openvinotoolkit.org)
The CTC Greedy Decoder is a decoding tool, so it is applied after the network, to the network output itself.
For example, the Text Detection C++ Demo offers the -b parameter when used with the text-recognition-0012 model:
-b Optional. Bandwidth for CTC beam search decoder. The default value is 0, in this case, CTC greedy decoder will be used.
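To illustrate what "applied after the network" can look like, here is a minimal sketch of greedy (best-path) CTC decoding in plain NumPy. It is not the demo's implementation. It assumes the network output is time-major with shape [time, batch, classes] and that the blank label is the last class (TensorFlow's convention); the function name ctc_greedy_decode and its parameters are hypothetical, and the blank index should be checked against the model actually used.

```python
import numpy as np

def ctc_greedy_decode(logits, blank_index=None, merge_repeated=True):
    """Greedy CTC decoding of scores shaped [time, batch, classes].

    Assumption: the blank label is the last class unless blank_index is given.
    Returns one list of label indices per batch item.
    """
    logits = np.asarray(logits)
    if blank_index is None:
        blank_index = logits.shape[2] - 1
    # Best path: the highest-scoring class at every time step.
    best_path = logits.argmax(axis=2)  # shape [time, batch]
    decoded = []
    for b in range(best_path.shape[1]):
        previous = None
        labels = []
        for label in best_path[:, b]:
            label = int(label)
            is_repeat = merge_repeated and label == previous
            # Drop blanks, and drop repeats that are not separated by a blank.
            if label != blank_index and not is_repeat:
                labels.append(label)
            previous = label
        decoded.append(labels)
    return decoded

# Toy input: 5 time steps, batch of 1, 3 classes (class 2 acts as blank).
# The best path is 0, 0, blank, 0, 1.
toy_scores = np.eye(3)[[0, 0, 2, 0, 1]][:, None, :]
print(ctc_greedy_decode(toy_scores))  # [[0, 0, 1]]
```

With this approach the converted model only has to produce the per-step class scores, and the decoding runs in application code on the inference result.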