Intel® Distribution of OpenVINO™ Toolkit
Community assistance for the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Has anyone tried fine-tuning a pre-trained model in TensorFlow and then using it on an NCS?

SPaul19
Innovator

I am trying the above for a VGG16 model, but there are unsupported operations that are preventing me from doing this. Here's the notebook where one can see how I am creating the model and the files needed for generating the NCS graph. Help would be appreciated. I am using v2 of the NCSDK, by the way.
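For context, here is a minimal sketch of how such a model can be exported for mvNCCompile (an illustration only, assuming standalone Keras on a TensorFlow 1.x backend; the TF_Model/tf_model path is taken from the command below):

import tensorflow as tf
from keras import backend as K

# ... build and fine-tune the Keras model here (see the notebook) ...

# Save the TF session backing the Keras model as checkpoint files
# (tf_model.meta, tf_model.index, tf_model.data-*) for mvNCCompile.
sess = K.get_session()
saver = tf.train.Saver()
saver.save(sess, "TF_Model/tf_model")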

Here's the full error stack:

 

sayak@sayak-VirtualBox:~/workspace/Fine-tune and then use on NCS for inference$ mvNCCompile TF_Model/tf_model.meta -in=input_1 -on=dense_2/Softmax
/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 88 from C header, got 96 from PyObject
  return f(*args, **kwds)
/usr/local/bin/ncsdk/Controllers/Parsers/TensorFlowParser/Convolution.py:47: SyntaxWarning: assertion is always true, perhaps remove parentheses?
  assert(False, "Layer type not supported by Convolution: " + obj.type)
mvNCCompile v02.00, Copyright @ Intel Corporation 2017

****** Info: No Weights provided. inferred path: TF_Model/tf_model.data-00000-of-00001******
TF_Model/tf_model.meta
2019-06-18 16:40:17.142256: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
[Error 4] Toolkit Error: Stage Type Not Supported: StopGradient

 

Shubha_R_Intel
Employee

Dear Sayak,

The NCSDK is no longer supported. Please upgrade your code to OpenVINO; all NCS functions are now handled within OpenVINO. In fact, we even recently open-sourced the VPU hardware plugin (https://github.com/opencv/dldt) in the open-source version of OpenVINO.

Hope it helps.

Thanks,

Shubha

SPaul19
Innovator

Hi Shubha. Thanks for your reply. Would you be able to link me to any tutorials or videos that demonstrate this refactoring?

SPaul19
Innovator

Hi Shubha,

Thank you very much for all your help. I have been playing with OpenVINO and the experience has been great so far. The documentation is very well written and explained, too. However, I did not find anything substantial on transfer learning specifically. Let me explain the entire flow I am following.

Problem statement: Build an image classification model to classify images from the CIFAR10 dataset.

I am using a VGG16 model that was trained on the ImageNet dataset, and I am fine-tuning it for my problem. I am using Keras for this, and then I am converting the model to TF graph files that are compatible with the NCS (not the NCS 2). Then I am following the instructions as specified over here; a rough sketch of my setup is shown below.
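For reference, the fine-tuning setup looks roughly like this (a minimal sketch, assuming standalone Keras; the 48x48 input size is inferred from the [-1 48 48 3] shape in the error below, and the 256-unit hidden layer is a hypothetical stand-in for whatever head the notebook actually uses):

from keras.applications import VGG16
from keras.layers import Dense, Flatten
from keras.models import Model

# Load VGG16 pre-trained on ImageNet, without its classifier head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(48, 48, 3))
for layer in base.layers:
    layer.trainable = False  # keep the convolutional base frozen

x = Flatten()(base.output)
x = Dense(256, activation="relu")(x)
out = Dense(10, activation="softmax")(x)  # CIFAR10 has 10 classes

model = Model(inputs=base.input, outputs=out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# ... train on CIFAR10, then export the TF graph files ...

I get the following error stack: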

 

mo_tf.py --input_meta_graph tf_model.meta --data_type FP16
Model Optimizer arguments:
Common parameters:
    - Path to the Input Model:     None
    - Path for generated IR:     /home/sayak/workspace/Fine_Tune_Infer_NCS/TF_Model/.
    - IR output name:     tf_model
    - Log level:     ERROR
    - Batch:     Not specified, inherited from the model
    - Input layers:     Not specified, inherited from the model
    - Output layers:     Not specified, inherited from the model
    - Input shapes:     Not specified, inherited from the model
    - Mean values:     Not specified
    - Scale values:     Not specified
    - Scale factor:     Not specified
    - Precision of IR:     FP16
    - Enable fusing:     True
    - Enable grouped convolutions fusing:     True
    - Move mean values to preprocess section:     False
    - Reverse input channels:     False
TensorFlow specific parameters:
    - Input model in text protobuf format:     False
    - Path to model dump for TensorBoard:     None
    - List of shared libraries with TensorFlow custom layers implementation:     None
    - Update the configuration file with input/output node names:     None
    - Use configuration file used to generate the model with Object Detection API:     None
    - Operations to offload:     None
    - Patterns to offload:     None
    - Use the config file:     None
Model Optimizer version:     2019.1.1-83-g28dfbfd
/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 88 from C header, got 96 from PyObject
  return f(*args, **kwds)
[ FRAMEWORK ERROR ]  Cannot load input model: Error while reading resource variable total from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/total/N10tensorflow3VarE does not exist.
     [[{{node total/Read/ReadVariableOp}} = ReadVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](total)]]

Caused by op 'total/Read/ReadVariableOp', defined at:
  File "/opt/intel/openvino_2019.1.144/deployment_tools/model_optimizer/mo_tf.py", line 31, in <module>
    sys.exit(main(get_tf_cli_parser(), 'tf'))
  File "/opt/intel/openvino_2019.1.144/deployment_tools/model_optimizer/mo/main.py", line 312, in main
    return driver(argv)
  File "/opt/intel/openvino_2019.1.144/deployment_tools/model_optimizer/mo/main.py", line 263, in driver
    is_binary=not argv.input_model_is_text)
  File "/opt/intel/openvino_2019.1.144/deployment_tools/model_optimizer/mo/pipeline/tf.py", line 81, in tf2nx
    saved_model_tags=argv.saved_model_tags)
  File "/opt/intel/openvino_2019.1.144/deployment_tools/model_optimizer/mo/front/tf/loader.py", line 217, in load_tf_graph_def
    restorer = tf.train.import_meta_graph(input_meta_graph_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1666, in import_meta_graph
    meta_graph_or_file, clear_devices, import_scope, **kwargs)[0]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1688, in _import_meta_graph_with_return_elements
    **kwargs))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/meta_graph.py", line 806, in import_scoped_meta_graph_with_return_elements
    return_elements=return_elements)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
    _ProcessNewOps(graph)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 234, in _ProcessNewOps
    for new_op in graph._add_new_tf_operations(compute_devices=False):  # pylint: disable=protected-access
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3438, in _add_new_tf_operations
    for c_op in c_api_util.new_tf_operations(self)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3438, in <listcomp>
    for c_op in c_api_util.new_tf_operations(self)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3297, in _create_op_from_tf_operation
    ret = Operation(c_op, self)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1768, in __init__
    self._traceback = tf_stack.extract_stack()

FailedPreconditionError (see above for traceback): Error while reading resource variable total from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/total/N10tensorflow3VarE does not exist.
     [[{{node total/Read/ReadVariableOp}} = ReadVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](total)]

 

In my exploration, I could not find any good material on this topic. The notebook I used to generate the TF files is linked in the question itself. Requesting your assistance.

Shubha_R_Intel
Employee

Dear Sayak Paul,

Did you freeze your TensorFlow model first? See the document below.

http://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_TensorFlow.html#freeze-the-tensorflow-model

Or did you pick a model from the list of supported, already-frozen TensorFlow models? VGG16 is definitely on that list.

Please report back here and I will help you.

Thanks,

Shubha

SPaul19
Innovator

Hi Shubha. I did that too. Here's the error stack now:

I am running this: mo_tf.py --input_model inference_graph.pb --data_type FP16

Output:
Model Optimizer arguments:
Common parameters:
    - Path to the Input Model:     /home/sayak/workspace/Fine_Tune_Infer_NCS/TF_Model/inference_graph.pb
    - Path for generated IR:     /home/sayak/workspace/Fine_Tune_Infer_NCS/TF_Model/.
    - IR output name:     inference_graph
    - Log level:     ERROR
    - Batch:     Not specified, inherited from the model
    - Input layers:     Not specified, inherited from the model
    - Output layers:     Not specified, inherited from the model
    - Input shapes:     Not specified, inherited from the model
    - Mean values:     Not specified
    - Scale values:     Not specified
    - Scale factor:     Not specified
    - Precision of IR:     FP16
    - Enable fusing:     True
    - Enable grouped convolutions fusing:     True
    - Move mean values to preprocess section:     False
    - Reverse input channels:     False
TensorFlow specific parameters:
    - Input model in text protobuf format:     False
    - Path to model dump for TensorBoard:     None
    - List of shared libraries with TensorFlow custom layers implementation:     None
    - Update the configuration file with input/output node names:     None
    - Use configuration file used to generate the model with Object Detection API:     None
    - Operations to offload:     None
    - Patterns to offload:     None
    - Use the config file:     None
Model Optimizer version:     2019.1.1-83-g28dfbfd
/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 88 from C header, got 96 from PyObject
  return f(*args, **kwds)
[ ERROR ]  Shape [-1 48 48  3] is not fully defined for output 0 of "input_1". Use --input_shape with positive integers to override model input shapes.
[ ERROR ]  Cannot infer shapes or values for node "input_1".
[ ERROR ]  Not all output shapes were inferred or fully defined for node "input_1". 
 For more information please refer to Model Optimizer FAQ (<INSTALL_DIR>/deployment_tools/documentation/docs/MO_FAQ.html), question #40. 
[ ERROR ]  
[ ERROR ]  It can happen due to bug in custom shape infer function <function tf_placeholder_ext.<locals>.<lambda> at 0x7f9fdcd23400>.
[ ERROR ]  Or because the node inputs have incorrect values/shapes.
[ ERROR ]  Or because input shapes are incorrect (embedded to the model or passed via --input_shape).
[ ERROR ]  Run Model Optimizer with --log_level=DEBUG for more information.
[ ERROR ]  Exception occurred during running replacer "REPLACEMENT_ID" (<class 'extensions.middle.PartialInfer.PartialInfer'>): Stopped shape/value propagation at "input_1" node. 
 For more information please refer to Model Optimizer FAQ (<INSTALL_DIR>/deployment_tools/documentation/docs/MO_FAQ.html), question #38.

Here's the updated notebook, which freezes the TF model as per the specified instructions.

SPaul19
Innovator

Hi Shubha,

I took a simple CNN (not a pretrained one) and tried it with OpenVINO to get the necessary files for inference. Even that fails, and here's the full error trace:

I am running this command as usual (I have tried with the --data_type FP16 flag as well, and the result is still the same):

mo_tf.py --input_meta_graph TF_Model/tf_model.meta

Trace: 
Model Optimizer arguments:
Common parameters:
    - Path to the Input Model:     None
    - Path for generated IR:     /home/sayak/workspace/NCS + TensorFlow/.
    - IR output name:     tf_model
    - Log level:     ERROR
    - Batch:     Not specified, inherited from the model
    - Input layers:     Not specified, inherited from the model
    - Output layers:     Not specified, inherited from the model
    - Input shapes:     Not specified, inherited from the model
    - Mean values:     Not specified
    - Scale values:     Not specified
    - Scale factor:     Not specified
    - Precision of IR:     FP32
    - Enable fusing:     True
    - Enable grouped convolutions fusing:     True
    - Move mean values to preprocess section:     False
    - Reverse input channels:     False
TensorFlow specific parameters:
    - Input model in text protobuf format:     False
    - Path to model dump for TensorBoard:     None
    - List of shared libraries with TensorFlow custom layers implementation:     None
    - Update the configuration file with input/output node names:     None
    - Use configuration file used to generate the model with Object Detection API:     None
    - Operations to offload:     None
    - Patterns to offload:     None
    - Use the config file:     None
Model Optimizer version:     2019.1.1-83-g28dfbfd
/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 88 from C header, got 96 from PyObject
  return f(*args, **kwds)
[ FRAMEWORK ERROR ]  Cannot load input model: Error while reading resource variable total from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/total/N10tensorflow3VarE does not exist.
     [[{{node total/Read/ReadVariableOp}} = ReadVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](total)]]

Caused by op 'total/Read/ReadVariableOp', defined at:
  File "/opt/intel/openvino_2019.1.144/deployment_tools/model_optimizer/mo_tf.py", line 31, in <module>
    sys.exit(main(get_tf_cli_parser(), 'tf'))
  File "/opt/intel/openvino_2019.1.144/deployment_tools/model_optimizer/mo/main.py", line 312, in main
    return driver(argv)
  File "/opt/intel/openvino_2019.1.144/deployment_tools/model_optimizer/mo/main.py", line 263, in driver
    is_binary=not argv.input_model_is_text)
  File "/opt/intel/openvino_2019.1.144/deployment_tools/model_optimizer/mo/pipeline/tf.py", line 81, in tf2nx
    saved_model_tags=argv.saved_model_tags)
  File "/opt/intel/openvino_2019.1.144/deployment_tools/model_optimizer/mo/front/tf/loader.py", line 217, in load_tf_graph_def
    restorer = tf.train.import_meta_graph(input_meta_graph_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1666, in import_meta_graph
    meta_graph_or_file, clear_devices, import_scope, **kwargs)[0]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1688, in _import_meta_graph_with_return_elements
    **kwargs))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/meta_graph.py", line 806, in import_scoped_meta_graph_with_return_elements
    return_elements=return_elements)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
    _ProcessNewOps(graph)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 234, in _ProcessNewOps
    for new_op in graph._add_new_tf_operations(compute_devices=False):  # pylint: disable=protected-access
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3438, in _add_new_tf_operations
    for c_op in c_api_util.new_tf_operations(self)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3438, in <listcomp>
    for c_op in c_api_util.new_tf_operations(self)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3297, in _create_op_from_tf_operation
    ret = Operation(c_op, self)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1768, in __init__
    self._traceback = tf_stack.extract_stack()

FailedPreconditionError (see above for traceback): Error while reading resource variable total from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/total/N10tensorflow3VarE does not exist.
     [[{{node total/Read/ReadVariableOp}} = ReadVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](total)]]

 

Here you can find the notebook which builds the model and the necessary files. As this is not a custom model, I think I am not supposed to freeze anything. Hence, I am following the instructions (for converting from a meta file) specified over here: http://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_TensorFlow.html#Convert_From_TF

Requesting you to please run the files; otherwise, you won't know whether my process is correct. The notebooks are well commented, so following them should not be an issue.

SPaul19
Innovator

Hello Shubha. I finally figured this out, and I have successfully employed my NCS to run inference with a fine-tuned TensorFlow model (not downloaded from the TF model zoo). Here is my workflow:

- I use Keras extensively; Keras is kind of my home ground. First, I fine-tuned a VGG16 network (pre-trained on ImageNet) on the CIFAR10 dataset.

- After training and testing the model, the next step was to create the .pb file, which is the main ingredient when employing OpenVINO to optimize a TF model. To get that file, I first needed to get hold of the native TensorFlow session behind my Keras model. Using that session, I generated the .pb file following the instructions on Intel's site and froze the model (see the freezing sketch after this list).

- The next step was to employ the mo_tf.py script to optimize the frozen TF model. This is where I was going wrong: I needed to specify the input_shape argument there. When you are using a TF file derived from Keras, the Model Optimizer may not be able to infer the shapes, which is why this has to be specified explicitly. As I was going to run this on my NCS, I specified the data_type as FP16.

- After mo_tf.py generated the .xml and .bin files, I could finally use the Inference Engine workflow to test inference (a sketch of this also follows below).
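To make the middle steps concrete, here is a minimal freezing sketch (assuming Keras on a TF 1.x backend; the dense_2/Softmax output node name is taken from my first post, and the file names are illustrative):

import tensorflow as tf
from keras import backend as K

# Freeze: fold the session's variables into constants so the graph
# is self-contained, then serialize it as a binary .pb file.
sess = K.get_session()
frozen = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), ["dense_2/Softmax"])
tf.train.write_graph(frozen, "TF_Model", "inference_graph.pb", as_text=False)

The Model Optimizer call that worked then looks like this (the [1,48,48,3] shape is my inference from the [-1 48 48 3] shape reported in the earlier error):

mo_tf.py --input_model TF_Model/inference_graph.pb --input_shape [1,48,48,3] --data_type FP16

And a sketch of the Inference Engine side (OpenVINO 2019 R1 Python API; the file names and the all-zeros NCHW dummy input are assumptions):

import numpy as np
from openvino.inference_engine import IECore, IENetwork

net = IENetwork(model="TF_Model/inference_graph.xml",
                weights="TF_Model/inference_graph.bin")
ie = IECore()
exec_net = ie.load_network(network=net, device_name="MYRIAD")  # the NCS

input_blob = next(iter(net.inputs))
out_blob = next(iter(net.outputs))
image = np.zeros((1, 3, 48, 48), dtype=np.float32)  # replace with a real image
result = exec_net.infer(inputs={input_blob: image})
print(result[out_blob])  # softmax probabilities for the 10 CIFAR10 classes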

 

I hope this helps the many developers who might face the same problem. I will shortly turn this into a full-fledged blog post :)

Shubha_R_Intel
Employee

Dear Sayak,

Thanks for reporting your success back to the OpenVINO community! We definitely appreciate it! And please share your blog post with the community when you are done.

I'm very glad that you have pieced everything together and got it working.

Shubha
