Intel® Distribution of OpenVINO™ Toolkit

Conversion of frozen TensorFlow Graph to Movidius Graph

idata
Employee

I'm getting the following errors when trying to convert a frozen model (.pb file) with the mvNCCompile command on Ubuntu 16.04, using TensorFlow 1.7 and NCSDK 2.04.

 

The first error is:

 

Executor failed to create kernel. Invalid argument: NodeDef mentions attr 'dilations' not in Op<name=Conv2D; signature=input:T, filter:T -> output:T; attr=T:type,allowed=[DT_HALF, DT_FLOAT]; attr=strides:list(int); attr=use_cudnn_on_gpu:bool,default=true; attr=padding:string,allowed=["SAME", "VALID"]; attr=data_format:string,default="NHWC",allowed=["NHWC", "NCHW"]>;

 

The model is a GAN trained on a GPU, then saved as a frozen model using a CPU install of TensorFlow.

 

Graph files are here:

 

https://drive.google.com/drive/folders/1_v-XhhclGhbrrfVGM7Q0JiQYDDmrO_aP?usp=sharing

 

Full stacktrace attached.

11 Replies
idata
Employee

I thought it could be due to GPU-specific instructions, so I trained the model on TF CPU instead and got the same errors.

 

CPU graph files:

 

https://drive.google.com/drive/folders/1JuOM7yh_9pxaM_kt2N4IFED0lwpH3kL4?usp=sharing

 

This is a link to the GAN code:

 

https://github.com/andrewginns/CycleGAN-Tensorflow-PyTorch
idata
Employee

Tried again with TF 1.6 CPU and Python 2.7 to train the network. Same error as before.

idata
Employee

I managed to fix the previous errors by adding some code to my freeze_graph.py to strip attributes:

 

    for node in output_graph_def.node:
        if node.op == 'RefSwitch':
            node.op = 'Switch'
            for index in xrange(len(node.input)):
                if 'moving_' in node.input[index]:
                    node.input[index] = node.input[index] + '/read'
        elif node.op == 'AssignSub':
            node.op = 'Sub'
            if 'use_locking' in node.attr:
                del node.attr['use_locking']
        if "dilations" in node.attr:
            del node.attr["dilations"]
        if "index_type" in node.attr:
            del node.attr["index_type"]
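The attribute-stripping loop above can be wrapped in a small helper so it can be tried on stand-in objects before touching a real graph. This is a minimal sketch, not the actual freeze_graph.py change: the name `patch_nodes` is hypothetical, and it assumes each node exposes `op`, `input` and `attr` the way `output_graph_def.node` entries do.

```python
from types import SimpleNamespace


def patch_nodes(nodes, attrs_to_strip=("dilations", "index_type")):
    """Rewrite training-only ops and drop the listed attributes.

    Returns how many attributes from attrs_to_strip were removed.
    """
    removed = 0
    for node in nodes:
        if node.op == "RefSwitch":
            node.op = "Switch"
            for i in range(len(node.input)):
                if "moving_" in node.input[i]:
                    node.input[i] = node.input[i] + "/read"
        elif node.op == "AssignSub":
            node.op = "Sub"
            if "use_locking" in node.attr:
                del node.attr["use_locking"]
        for name in attrs_to_strip:
            if name in node.attr:
                del node.attr[name]
                removed += 1
    return removed


# Stand-in node for illustration only (assumption); a real run would pass
# output_graph_def.node instead.
conv = SimpleNamespace(
    op="Conv2D",
    input=["x", "w"],
    attr={"dilations": [1, 1, 1, 1], "strides": [1, 1, 1, 1]},
)
print(patch_nodes([conv]), sorted(conv.attr))
```

Returning a count makes it easy to see whether anything was actually stripped from the graph being saved.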

 

However I'm now getting:

 

    if d.decorator_argspec is not None), _inspect.getargspec(target))
    [Error 5] Toolkit Error: Stage Details Not Supported: FusedBatchNorm inputs mean and variance are not defined. The graph is not created for inference.

 

I'm assuming that I need to convert the graph for inference using the TF Graph Transform Tool, as in this thread: https://ncsforum.movidius.com/discussion/590/indexerror-list-index-out-of-range-trying-to-compile-tf-model

 

Though I'm a little unclear how my mean and variance for the FusedBatchNorm should be defined.
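For context on that question: in inference mode a batch-norm layer does not compute statistics from the current batch. It normalises with the fixed mean and variance accumulated during training, which is presumably why the compiler expects both inputs to be defined. A minimal pure-Python sketch of that arithmetic for one channel follows. The function name and the epsilon default are illustrative assumptions, not values taken from the CycleGAN code.

```python
import math


def batch_norm_inference(x, mean, variance, scale=1.0, offset=0.0, epsilon=1e-3):
    """Normalise one channel's values with fixed (moving) statistics."""
    inv_std = 1.0 / math.sqrt(variance + epsilon)
    return [scale * (v - mean) * inv_std + offset for v in x]


# With mean 2, variance 1 and epsilon 0 the values are simply shifted by 2.
print(batch_norm_inference([1.0, 3.0], mean=2.0, variance=1.0, epsilon=0.0))
# → [-1.0, 1.0]
```

In practice this usually means the graph has to be built in inference mode before freezing, so that the moving statistics are wired in as inputs. Whether that applies to this particular model is an assumption.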

idata
Employee

Reverted to using the official freeze_graph instructions and transform_graph using bazel from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/README.md

 

Ubuntu 16.04, Python 2.7, TF 1.6 CPU, NCSDK 2.04

 

Added some code to the standard freeze_graph.py to try to account for the following errors

 

1) moving average error:

 

ValueError: graph_def is invalid at node 'a2b_generator/Conv/BatchNorm/AssignMovingAvg': Input tensor 'a2b_generator/Conv/BatchNorm/moving_mean:0' Cannot convert a tensor of type float32 to an input of type float32_ref.

 

2) dilation error:

 

NodeDef mentions attr 'dilations' not in Op<name=Conv2D; signature=input:T, filter:T -> output:T;

 

However, it seems like the dilation node.attr removal isn't working, because mvNCCompile still returns the original error from the first post. Neither the bazel version of freeze_graph nor my simple_freeze_graph works.

 

freeze_graph.py modifications:

 

    # Fix node name errors
    for node in output_graph_def.node:
        if node.op == 'RefSwitch':
            node.op = 'Switch'
            for index in xrange(len(node.input)):
                if 'moving_' in node.input[index]:
                    node.input[index] = node.input[index] + '/read'
        elif node.op == 'AssignSub':
            node.op = 'Sub'
            if 'use_locking' in node.attr:
                del node.attr['use_locking']
        if "index_type" in node.attr:
            del node.attr["index_type"]
        if "dilations" in node.attr:
            del node.attr["dilations"]
            print("Removed attr 'dilation'")
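Since the compile step still reports the attribute, one way to check whether the removal took effect is to scan the graph that is actually handed to mvNCCompile. The sketch below is hedged: `nodes_with_attr` is a hypothetical helper, and it only assumes that each node has a `name` and an `attr` mapping, as GraphDef nodes do.

```python
from types import SimpleNamespace


def nodes_with_attr(nodes, attr_name):
    """Return the names of nodes that still carry attr_name."""
    return [n.name for n in nodes if attr_name in n.attr]


# Stand-in nodes for illustration only (assumption). With a real graph,
# `nodes` would be the node list of the parsed /tmp/optimized_graph.pb.
nodes = [
    SimpleNamespace(name="a2b_generator/Conv/Conv2D", attr={"dilations": [1, 1, 1, 1]}),
    SimpleNamespace(name="a2b_generator/Tanh", attr={}),
]
print(nodes_with_attr(nodes, "dilations"))
```

An empty list for the final graph would suggest the attribute is being reintroduced, or rejected, somewhere other than the freeze step.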

 

My freeze_graph command is:

 

    bazel-bin/tensorflow/python/tools/freeze_graph \
      --input_graph=graph.pb \
      --input_checkpoint="Epoch_(0)_(100of962).ckpt" \
      --output_graph=/tmp/frozen_graph.pb \
      --output_node_names=a2b_generator/Tanh

 

My transform_graph command is:

 

    bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
      --in_graph=/tmp/frozen_graph.pb \
      --out_graph=/tmp/optimized_graph.pb \
      --inputs='Placeholder' \
      --outputs='a2b_generator/Tanh' \
      --transforms='
        strip_unused_nodes(type=float, shape="1,299,299,3")
        remove_nodes(op=Identity, op=CheckNumerics)
        fold_constants(ignore_errors=true)
        fold_batch_norms'
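The transform list is easy to mistype on one line, so here is a hedged sketch that assembles the same arguments in Python. The helper name `transform_graph_args` is hypothetical. The flags and transforms are the ones from the command above, and the shape is a parameter because the right value depends on the model's input.

```python
def transform_graph_args(in_graph, out_graph, inputs, outputs, shape):
    """Build the argument list for the transform_graph binary used above."""
    shape_str = ",".join(str(d) for d in shape)
    transforms = " ".join([
        'strip_unused_nodes(type=float, shape="%s")' % shape_str,
        "remove_nodes(op=Identity, op=CheckNumerics)",
        "fold_constants(ignore_errors=true)",
        "fold_batch_norms",
    ])
    return [
        "bazel-bin/tensorflow/tools/graph_transforms/transform_graph",
        "--in_graph=" + in_graph,
        "--out_graph=" + out_graph,
        "--inputs=" + inputs,
        "--outputs=" + outputs,
        "--transforms=" + transforms,
    ]


args = transform_graph_args(
    "/tmp/frozen_graph.pb", "/tmp/optimized_graph.pb",
    "Placeholder", "a2b_generator/Tanh", (1, 299, 299, 3),
)
print(args[-1])
```

The resulting list could be passed to `subprocess.check_call(args)` from the TensorFlow source root, which avoids shell quoting around the transforms string.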

 

My mvNCCompile command is:

 

mvNCCompile /tmp/optimized_graph.pb -in Placeholder -on a2b_generator/Tanh

 

All files here: https://drive.google.com/drive/folders/1QKptbWQPqS974bcSfTo_rFYAbuLkhLIt?usp=sharing

 

- graph.pb is the GraphDef proto
- frozen_graph is the output from freeze_graph
- optimised_graph is the output from transform_graph and the input to the mvNCCompile command
idata
Employee

@ginnsandrew At the moment, the NCSDK doesn't support Generative Adversarial Networks.

idata
Employee

@Tome_at_Intel For all intents and purposes, a GAN is just a way of training a convolutional network.

 

Is the error I'm getting specific to the use of a GAN? During inference the network should just look like a convolutional net. As far as I can tell, the error is due to a mismatch between the TF versions used in training and inference. Does the NCSDK use something other than Python 2.7 and TF 1.6?

idata
Employee

@Tome_at_Intel

 

So it turns out the previous error was caused by the use of a different TF version when freezing and transforming my graph file. Using TF 1.6 for the freeze_graph and transform_graph fixed it.

 

I now get a new error:

 

    mvNCCompile /tmp/optimized_graph.pb -in Placeholder -on a2b_generator/Tanh
    mvNCCompile v02.00, Copyright @ Intel Corporation 2017
    /usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py:871: DeprecationWarning: builtin type EagerTensor has no __module__ attribute
    /usr/local/lib/python3.5/dist-packages/tensorflow/python/util/tf_inspect.py:45: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
    /usr/local/lib/python3.5/dist-packages/tensorflow/python/util/tf_inspect.py:45: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
    shape: [1, 256, 256, 3]
    res.shape: (1, 256, 256, 3)
    TensorFlow output shape: (256, 256, 3)
    Traceback (most recent call last):
      File "/usr/local/bin/mvNCCompile", line 156, in <module>
        create_graph(args.network, args.inputnode, args.outputnode, args.outfile, args.nshaves, args.inputsize, args.weights, args.explicit_concat, args.ma2480, args.scheduler, args)
      File "/usr/local/bin/mvNCCompile", line 137, in create_graph
        load_ret = load_network(args, parser, myriad_config)
      File "/usr/local/bin/ncsdk/Controllers/Scheduler.py", line 95, in load_network
        network.optimize()
      File "/usr/local/bin/ncsdk/Models/Network.py", line 250, in optimize
        self.convert_network_input_to_yxz()
      File "/usr/local/bin/ncsdk/Models/Network.py", line 337, in convert_network_input_to_yxz
        if self.stageslist[0].op in [StageType.fully_connected_layer, StageType.convolution, StageType.max_pooling,
    IndexError: list index out of range
idata
Employee

@ginnsandrew Apologies, I meant that we don't have a GAN example for the NCSDK at the moment. For Python, the NCSDK can be used with Python 3.5 also. I am looking into your issue and I'll get back to you as soon as I find something. Thanks.

idata
Employee

@Tome_at_Intel Thanks. My latest files are here: https://drive.google.com/drive/folders/1U_sw-P-qYZ4ACtso5HqI0thcmbCmOa1H?usp=sharing

 

graph.pb - GraphDef proto

 

frozen_graph.pb - Output from freeze_graph

 

optimised_graph.pb - Output from transform_graph

 

Python 2.7.12, TF 1.6, Bazel 0.11.0, NCSDK 2.04, Ubuntu 16.04.4

 

Commands used:

 

    bazel-bin/tensorflow/python/tools/freeze_graph \
      --input_graph=graph.pb \
      --input_checkpoint="Epoch_(0)_(100of962).ckpt" \
      --output_graph=/tmp/frozen_graph.pb \
      --output_node_names=a2b_generator/Tanh

    bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
      --in_graph=/tmp/frozen_graph.pb \
      --out_graph=/tmp/optimized_graph.pb \
      --inputs='Placeholder' \
      --outputs='a2b_generator/Tanh' \
      --transforms='
        strip_unused_nodes(type=float, shape="1,299,299,3")
        remove_nodes(op=Identity, op=CheckNumerics)
        fold_constants(ignore_errors=true)
        fold_batch_norms'

    mvNCCompile /tmp/optimized_graph.pb -in Placeholder -on a2b_generator/Tanh
idata
Employee

@ginnsandrew Just wanted to give you an update. It looks like the NCSDK was not able to find any of the ops while parsing the graph file for the model. I'm not sure why, because I know for a fact that we do support some of these ops. While debugging, I printed the nodes from the stageslist list inside Network.py and it was empty, which is why you receive a list index out of range error. I found this strange because when I used a separate script to read and print the nodes from the model, they were all there.
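A check of that kind, reading the model and listing its nodes, might look like the sketch below. It is not the script used here. `count_ops` is a hypothetical helper that only assumes each node has an `op` field, and the loading step is described in a comment rather than executed.

```python
from collections import Counter
from types import SimpleNamespace


def count_ops(nodes):
    """Count how many nodes of each op type a graph contains."""
    return Counter(n.op for n in nodes)


# Stand-in nodes for illustration only (assumption). With TF 1.x, the real
# list would typically come from a parsed GraphDef, for example:
#   graph_def = tf.GraphDef()
#   graph_def.ParseFromString(open("/tmp/optimized_graph.pb", "rb").read())
#   nodes = graph_def.node
nodes = [
    SimpleNamespace(op="Placeholder"),
    SimpleNamespace(op="Conv2D"),
    SimpleNamespace(op="Conv2D"),
    SimpleNamespace(op="Tanh"),
]
for op, n in sorted(count_ops(nodes).items()):
    print(op, n)
```

Comparing this summary with the ops the NCSDK parser recognises can narrow down where the stage list ends up empty.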

idata
Employee

@Tome_at_Intel Thanks for looking into it, I really appreciate it. I actually think it was a problem with the way I was saving the graphs. For some reason the standard freeze_graph tools don't seem to work with graphs with BatchNorms in them (which mine have).

 

With my new files I no longer get the list index out of range error. Instead, my new graph, optimised_graph.pb, produces this error:

 

    mvNCCompile /media/sf_vBox/optimized_graph.pb -in inputA -on a2b_generator/output_image
    /usr/local/bin/ncsdk/Controllers/Parsers/TensorFlowParser/Convolution.py:44: SyntaxWarning: assertion is always true, perhaps remove parentheses?
      assert(False, "Layer type not supported by Convolution: " + obj.type)
    mvNCCompile v02.00, Copyright @ Intel Corporation 2017
    /usr/local/lib/python3.5/dist-packages/tensorflow/python/util/tf_inspect.py:45: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
    shape: [1, 256, 256, 3]
    Traceback (most recent call last):
      File "/usr/local/bin/mvNCCompile", line 169, in <module>
        create_graph(args.network, args.image, args.inputnode, args.outputnode, args.outfile, args.nshaves, args.inputsize, args.weights, args.explicit_concat, args.ma2480, args.scheduler, args.new_parser, args)
      File "/usr/local/bin/mvNCCompile", line 148, in create_graph
        load_ret = load_network(args, parser, myriad_config)
      File "/usr/local/bin/ncsdk/Controllers/Scheduler.py", line 100, in load_network
        parse_ret = parse_tensor(arguments, myriad_conf)
      File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 319, in parse_tensor
        item_shape = output_item.shape.as_list()
      File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_shape.py", line 820, in as_list
        raise ValueError("as_list() is not defined on an unknown TensorShape.")
    ValueError: as_list() is not defined on an unknown TensorShape.

 

The new files can be found here: https://github.com/andrewginns/CycleGAN-Tensorflow-PyTorch/releases/tag/tf1.7-py3.6.4

 

Instructions to reproduce what I'm doing here: https://github.com/andrewginns/MSc-Project
