I would like to run model optimizer on a tensorflow model that contains a CTC layer.
I tried these two layers:
But in both cases I got the following output:
[ ERROR ]  Cannot infer shapes or values for node "StatefulPartitionedCall/model_1/plate2_/CTCGreedyDecoder".
[ ERROR ]  Batch dimensions of input tensors must be the same for StatefulPartitionedCall/model_1/plate2_/CTCGreedyDecoder node
[ ERROR ]
[ ERROR ]  It can happen due to bug in custom shape infer function <function CTCGreedyDecoderSeqLenOp.infer at 0x000002892C9B9C18>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ]  Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ]  Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "StatefulPartitionedCall/model_1/plate2_/CTCGreedyDecoder" node.
For more information please refer to Model Optimizer FAQ, question #38. (https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html?question=38#question-38)
I tried with and without an explicit input shape definition; same result.
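For context, a typical Model Optimizer invocation for a TensorFlow SavedModel looks roughly like this (the paths and shape values below are placeholders, not my actual ones):

```shell
# Convert a TensorFlow SavedModel to OpenVINO IR.
# Directory names and the input shape are placeholders for illustration.
mo --saved_model_dir ./saved_model \
   --input_shape "[1,32,128,3]" \
   --output_dir ./ir_output
```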
Any idea how I can run Model Optimizer on such a model?
Thanks in advance
To be specific, I'm using the following custom layer based on CTC:
def call(self, y_pred):
    batch_size = tf.shape(y_pred)
    input_shape = tf.shape(y_pred)
    input_lengths = tf.ones(shape=input_shape) * tf.cast(self.input_length, tf.float32)
    plate = tf.nn.ctc_greedy_decoder(tf.transpose(y_pred, (1, 0, 2)),
                                     tf.cast(input_lengths, tf.int32),
                                     merge_repeated=True)
    plate = tf.sparse.to_dense(plate, default_value=-1)
    plate = tf.reshape(plate, [batch_size, self.input_length])
    # must add 1, because ctc returns nonchars as -1
    plate = tf.cast(plate + 1, tf.float32, name=self.name + 'text')
    confs = tf.reduce_prod(tf.reduce_max(y_pred, axis=2), axis=1, name=self.name + 'conf')
    confs = tf.reshape(confs, [batch_size, 1])
    return [confs, plate]
If it's possible, can you share your working model so that we could test it out?
Btw, please note that there are a number of model topologies supported by OpenVINO.
You may refer to this official documentation and see whether your topology is listed: https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_prepare_model_convert_model_Convert_Mode...
If it's not, then it is not supported.
This pre-trained model contains the CTC Greedy Decoder: https://docs.openvinotoolkit.org/latest/omz_models_model_text_recognition_0012.html
You can check it out and probably take it as a template.
Thank you for your attention.
I'm not sure I'm allowed to share the model; if not, I will create a minimal dummy model to reproduce the issue and share that.
About the text_recognition model:
I'm a little confused. I saw this model earlier; however, it says:
"The network output can be decoded by CTC Greedy Decoder"
So is the CTC Decoder included inside the network or is it applied after?
The CTCGreedyDecoder performs greedy decoding on the input logits (best path) in TensorFlow, and it works best with our pre-trained model text-recognition-0012, as that model was developed using TensorFlow as the source framework - text-recognition-0012 - OpenVINO™ Toolkit (openvinotoolkit.org)
The CTC Greedy Decoder is a tool available for decoding purposes, so it is applied afterwards to the network output itself.
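To make "applied after" concrete, here is a minimal sketch of what a CTC greedy decode does to the raw network output: take the argmax at each timestep, merge repeated labels, and drop blanks. The class count and blank index below are made up for illustration, not taken from any specific model:

```python
def ctc_greedy_decode(logits, blank=0):
    """Greedy (best-path) CTC decode.

    logits: per-timestep score lists, shape [T][num_classes].
    Returns the decoded label sequence with repeats merged and blanks removed.
    """
    # Best-scoring class at each timestep
    best = [max(range(len(step)), key=lambda c: step[c]) for step in logits]
    decoded = []
    prev = None
    for c in best:
        if c != prev and c != blank:  # merge repeated labels, skip blank
            decoded.append(c)
        prev = c
    return decoded

# Toy example: 3 classes (0 = blank), 5 timesteps
logits = [
    [0.1, 0.8, 0.1],    # -> class 1
    [0.1, 0.7, 0.2],    # -> class 1 (repeat, merged)
    [0.9, 0.05, 0.05],  # -> blank
    [0.1, 0.2, 0.7],    # -> class 2
    [0.2, 0.1, 0.7],    # -> class 2 (repeat, merged)
]
print(ctc_greedy_decode(logits))  # [1, 2]
```

This is the same best-path decoding the demo performs on the recognition network's output when no beam search is requested.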
For example, when using the Text Detection C++ Demo with the -b parameter on the text-recognition-0012 model -
-b Optional. Bandwidth for CTC beam search decoder. The default value is 0, in this case, CTC greedy decoder will be used.
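For illustration, an invocation of the demo might look like this (the model and input paths are placeholders; check the demo's help output for the exact flags on your version):

```shell
# Placeholder paths; -b 0 selects the CTC greedy decoder (the default).
./text_detection_demo \
    -m_td text-detection-0003.xml \
    -m_tr text-recognition-0012.xml \
    -i input_image.jpg \
    -b 0
```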
Intel will no longer monitor this thread since we have provided a solution. If you need any additional information from Intel, please submit a new question.