Intel® Distribution of OpenVINO™ Toolkit
Community assistance for the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

KeyError: "The name 'input:0' refers to a Tensor which does not exist." when compile retrained model

idata
Employee

Hello, everyone. I retrained an Inception-v3 model to fine-tune its output layer to 11 classes. Originally it outputs 1001 classes, but in our case we only need to classify 11 classes, so we retrained it. The retrained model works well, but when I tried to compile it into a graph file an error occurred. Please help me.

 

For this model, I have a frozen .pb file and some checkpoint files such as model.ckpt.meta, model.ckpt.index, and so on.

 

In the terminal, I use this command:

 

(tensorflow) wxy@wxy-mipro:~/Documents/TensorFLow/retrained/ckpt$ sudo mvNCCompile model.ckpt.meta -in=input -is 299 299 -o inception-V3-retrained.graph

 

and it outputs:

 

mvNCCompile v02.00, Copyright @ Movidius Ltd 2016

 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py:15: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses

 

import imp

 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:766: DeprecationWarning: builtin type EagerTensor has no module attribute

 

EagerTensor = c_api.TFE_Py_InitEagerTensor(_EagerTensorBase)

 

/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6

 

return f(*args, **kwds)

 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/tf_inspect.py:45: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()

 

if d.decorator_argspec is not None), _inspect.getargspec(target))

 

Traceback (most recent call last):

 

File "/usr/local/bin/mvNCCompile", line 118, in

 

create_graph(args.network, args.inputnode, args.outputnode, args.outfile, args.nshaves, args.inputsize, args.weights)

 

File "/usr/local/bin/mvNCCompile", line 104, in create_graph

 

net = parse_tensor(args, myriad_config)

 

File "/usr/local/bin/ncsdk/Controllers/TensorFlowParser.py", line 237, in parse_tensor

 

inputTensor = graph.get_tensor_by_name(inputnode + ':0')

 

File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3207, in get_tensor_by_name

 

return self.as_graph_element(name, allow_tensor=True, allow_operation=False)

 

File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3035, in as_graph_element

 

return self._as_graph_element_locked(obj, allow_tensor, allow_operation)

 

File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3077, in _as_graph_element_locked

 

"graph." % (repr(name), repr(op_name)))

 

KeyError: "The name 'input:0' refers to a Tensor which does not exist. The operation, 'input', does not exist in the graph."
idata
Employee

@WuXinyang You can try using TensorFlow's summarize_graph tool to check for your input and output nodes. https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms. This should give you the input and output node names for your model and you can try again with the -in and -on options.
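If building summarize_graph is inconvenient, a rough Python equivalent (a sketch, assuming TensorFlow 1.x and that the frozen file is named output_graph.pb) is to load the GraphDef and list the Placeholder ops plus the ops that nothing else consumes:

import tensorflow as tf

# Sketch: list candidate input (Placeholder) and terminal ops of a frozen GraphDef.
graph_def = tf.GraphDef()
with tf.gfile.GFile('output_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

placeholders = [n.name for n in graph_def.node
                if n.op in ('Placeholder', 'PlaceholderWithDefault')]
print('Candidate inputs:', placeholders)

# Nodes that no other node consumes are candidate outputs.
consumed = {inp.split(':')[0].lstrip('^') for n in graph_def.node for inp in n.input}
print('Candidate outputs:', [n.name for n in graph_def.node if n.name not in consumed])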

idata
Employee

@Tome_at_Intel Hi, thanks for your advice. I tried summarize_graph and it outputs:

 

No inputs spotted.

 

No variables spotted.

 

Found 1 possible outputs: (name=final_result, op=Softmax)

 

Found 21842558 (21.84M) const parameters, 0 (0) variable parameters, and 99 control_edges

 

Op types used: 489 Const, 101 Identity, 99 CheckNumerics, 94 Relu, 94 BatchNormWithGlobalNormalization, 94 Conv2D, 11 Concat, 9 AvgPool, 5 MaxPool, 1 DecodeJpeg, 1 ExpandDims, 1 Cast, 1 MatMul, 1 Mul, 1 PlaceholderWithDefault, 1 Add, 1 Reshape, 1 ResizeBilinear, 1 Softmax, 1 Sub

 

So I still did not get the input node's name :neutral:

 

And when I tried the command with the output name:

 

(tensorflow) wxy@wxy-mipro:~/Documents/TensorFLow/retrained/ckpt$ sudo mvNCCompile model.ckpt.meta -in=input -on=final_result -is 299 299 -o inception-V3-retrained.graph

 

it gives the same KeyError. So now I think the problem must be with the input node?

idata
Employee

@WuXinyang I can try to help debug your issue if you can provide your files (pb, meta files). Thanks.

idata
Employee

@Tome_at_Intel Thanks!!! I uploaded it to my Google Drive; the share link is below:

 

https://drive.google.com/open?id=125pP5Nkmqf1eBxZnVfdMizAMfaKM2N4n

idata
Employee

@WuXinyang It looks like your input node's name is: input/BottleneckInputPlaceholder, so the entire command would be something like mvNCCompile model.ckpt.meta -is 299 299 -in=input/BottleneckInputPlaceholder -on=softmax -o inception-V3-retrained.graph .

 

However, it seems that we don't have support for the tf.range() operation yet. Looking at tf.range(), it appears to be a counterpart of the Python range function, so you should be able to replace the tf.range in your code with a constant. For example, tf.constant(list(range(4))).
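As a minimal illustration of that substitution (the bound 4 is just an example, not taken from this model):

import tensorflow as tf

# Unsupported by the NCSDK parser per the error reported in this thread:
# indices = tf.range(4)

# Equivalent constant, usable when the bounds are known at graph-construction time:
indices = tf.constant(list(range(4)), dtype=tf.int32)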

idata
Employee

@Tome_at_Intel Thanks for your advice! I trained this network by using and modifying this script offered by Google:

 

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py

 

I checked this script and no tf.range() is involved, so maybe some of the modules or packages it calls use tf.range()… So now I have to inspect every package and module it calls to find the tf.range(), right? That would be quite a huge amount of work :neutral: ..
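Rather than inspecting every package by hand, one shortcut (a sketch, assuming TensorFlow 1.x and the model.ckpt.meta file from this thread) is to load the graph and print any Range ops; the op names usually reveal which layer or helper created them:

import tensorflow as tf

# Load the graph stored in the .meta file (the one mvNCCompile was given).
tf.train.import_meta_graph('model.ckpt.meta')

# Print every Range op and its inputs.
for op in tf.get_default_graph().get_operations():
    if op.type == 'Range':
        print(op.name, '<-', [t.name for t in op.inputs])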

 

Btw, I use TensorFlow in my Conda environment with Python 2.7, and I see that your toolkit uses Python 3.6. Could that be the problem?

idata
Employee

@WuXinyang Regarding the Python issue, it should be okay because we do have support for Python 2.7 in our API now. Regarding the range issue, I'm not sure where your network is using tf.range(), but that is the issue I'm seeing on my side when I run the compile command I listed above. Can you confirm that you are getting the range error as well?

idata
Employee

@Tome_at_Intel Yeah, I got the same error as you. Sorry I did not mention it before.

 

The error I got in my side is:

 

[Error 5] Toolkit Error: Stage Details Not Supported: Range

idata
Employee

@Tome_at_Intel Hi, do you have any other suggestions for retraining an Inception-v3 model?

 

I previously retrained it based on the scripts offered by Google's TensorFlow Slim library. Since I was not that familiar with TensorFlow at the time, I chose to make some modifications to Slim's scripts and retrain it that way. I never tried to retrain it entirely from scratch with my own code.

 

I think maybe I can retrain it again in another way that avoids the tf.range() operation. Do you have any advice? Thanks a lot!
idata
Employee

@WuXinyang I think your latter plan may be the best plan of action. Let me know if you find success.

idata
Employee

@Tome_at_Intel Hi, recently I tried a script from this page https://movidius.github.io/ncsdk/TensorFlow.html, modified as follows:

 

import numpy as np
import tensorflow as tf
from tensorflow.contrib.slim.nets import inception

slim = tf.contrib.slim

def run(name, image_size, num_classes):
    with tf.Graph().as_default():
        image = tf.placeholder("float", [1, image_size, image_size, 3], name="input")
        with slim.arg_scope(inception.inception_v3_arg_scope()):
            logits, _ = inception.inception_v3(image, num_classes, is_training=False, spatial_squeeze=False)
        probabilities = tf.nn.softmax(logits)
        # init_fn = slim.assign_from_checkpoint_fn('inception_v1.ckpt', slim.get_model_variables('InceptionV1'))
        with tf.Session() as sess:
            saver = tf.train.Saver(tf.global_variables())
            sess.run(tf.local_variables_initializer())
            saver.restore(sess, '.' + '/ckpt/model.ckpt')
            saver.save(sess, '.' + '/inference')

run('inception-v3', 299, 11)

 

and then I got many errors like this:

 

….

 

W tensorflow/core/framework/op_kernel.cc:1192] Not found: Key InceptionV3/Mixed_7b/Branch_0/Conv2d_0a_1x1/weights not found in checkpoint

 

W tensorflow/core/framework/op_kernel.cc:1192] Not found: Key InceptionV3/Mixed_7b/Branch_2/Conv2d_0c_1x3/weights not found in checkpoint

 

W tensorflow/core/framework/op_kernel.cc:1192] Not found: Key InceptionV3/Mixed_7b/Branch_1/Conv2d_0a_1x1/weights not found in checkpoint

 

W tensorflow/core/framework/op_kernel.cc:1192] Not found: Key InceptionV3/Mixed_7b/Branch_1/Conv2d_0a_1x1/BatchNorm/beta not found in checkpoint

 

…..

 

It seems I need to make sure that every node of the retrained model matches what is in the checkpoint? Do you have any new solutions for this kind of problem? I noticed that many people in this forum are asking the same question as me: how to make a retrained, fine-tuned TensorFlow model run on the NCS? So I hope you can develop a method for this popular question.
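One way to see why the restore fails (a sketch, assuming TensorFlow 1.x and the checkpoint path from this thread) is to list the variable names actually stored in the checkpoint and compare them with the names the slim graph expects:

import tensorflow as tf

# Print every variable name and shape stored in the checkpoint; the restore
# only succeeds if these names match the variables in the rebuilt graph
# (e.g. InceptionV3/Mixed_7b/Branch_0/Conv2d_0a_1x1/weights).
reader = tf.train.NewCheckpointReader('./ckpt/model.ckpt')
for name, shape in reader.get_variable_to_shape_map().items():
    print(name, shape)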

idata
Employee

@Tome_at_Intel Btw, may I ask: in the network folder which I uploaded to Google Drive, is the output_graph.pb a frozen graph or a non-frozen one? Should I freeze it before compiling, or not?

idata
Employee

@WuXinyang Hi, ~~pb files are considered "frozen"~~. For the model you are referring to, can you give me more information? Is it a retrained version of the model with tf.range() removed? Or is it the same model and you are just trying to run a saver script on it?

 

Edit: Made a mistake. pb files can be frozen or unfrozen.
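A quick way to tell the two apart (a sketch, not an official tool; the file name is an assumption): a frozen GraphDef carries its weights as Const nodes and contains no Variable ops.

import collections
import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile('output_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Count op types; Variable / VariableV2 entries mean the graph is not frozen.
ops = collections.Counter(n.op for n in graph_def.node)
print(ops)
print('Frozen?', not (ops.get('Variable', 0) or ops.get('VariableV2', 0)))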

idata
Employee

@Tome_at_Intel

 

Hi, first of all, many thanks for your fast reply.

 

Second, this model is the same as the original one, i.e. without removing tf.range(); in fact it's kind of difficult for me to remove it :neutral: So it was just an attempt with a saver script. This saver script is from your website, where it is used for Inception-V1, so I thought maybe I could use it for my retrained Inception-V3.

 

Finally, and most importantly: with the original output_graph.pb, I just half-successfully compiled it with the following command:

 

(tensorflow) wxy@wxy-mipro:~/tensorflow/model$ sudo mvNCCompile output_graph.pb -s 12 -is 299 299 -in=input/BottleneckInputPlaceholder -on=final_result -o retrained.pb

 

I call it half-successful because the output is only a 45.9 KB file. This file can be loaded onto the NCS, but I really doubt whether it contains any useful information, since when I tried it with my inference code it did not seem to make any correct inferences.

 

Btw, although I can compile it now, note that I only changed -in and -on; I still did not remove tf.range().

 

Also, after this compile, some warnings are printed, as follows:

 

(tensorflow) wxy@wxy-mipro:~/tensorflow/model$ sudo mvNCCompile output_graph.pb -s 12 -is 299 299 -in=input/BottleneckInputPlaceholder -on=final_result -o retrained.pb

 

mvNCCompile v02.00, Copyright @ Movidius Ltd 2016

 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py:15: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses

 

import imp

 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:766: DeprecationWarning: builtin type EagerTensor has no module attribute

 

EagerTensor = c_api.TFE_Py_InitEagerTensor(_EagerTensorBase)

 

/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6

 

return f(*args, **kwds)

 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/tf_inspect.py:45: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()

 

if d.decorator_argspec is not None), _inspect.getargspec(target))

 

/usr/local/bin/ncsdk/Controllers/FileIO.py:52: UserWarning: You are using a large type. Consider reducing your data sizes for best performance

 

"Consider reducing your data sizes for best performance\033[0m")
idata
Employee

@Tome_at_Intel

 

Btw, I want to mention that my environment is Ubuntu 17.10 and I use TensorFlow in an Anaconda virtual environment with Python 2.7.
idata
Employee

@Tome_at_Intel Hi, I saw your edit today, and I want to say that the output_graph.pb I uploaded is a frozen one. I managed to use TensorFlow's graph_transform tools to convert the ckpt and meta files of my retrained network into a frozen_graph.pb, and this file is exactly the same as the output_graph.pb. I used mvNCCompile to compile the frozen_graph.pb and got the same 45.9 KB file.
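For reference, a minimal sketch of the kind of freezing step involved; this uses convert_variables_to_constants rather than the graph_transforms tool actually used here, and the paths are assumptions, while the final_result output name comes from summarize_graph earlier in the thread:

import tensorflow as tf
from tensorflow.python.framework import graph_util

with tf.Session() as sess:
    # Rebuild the graph from the .meta file and restore the trained weights.
    saver = tf.train.import_meta_graph('model.ckpt.meta')
    saver.restore(sess, './model.ckpt')

    # Convert all variables reachable from the output node into constants.
    frozen = graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ['final_result'])

    with tf.gfile.GFile('frozen_graph.pb', 'wb') as f:
        f.write(frozen.SerializeToString())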

idata
Employee

@WuXinyang Are you able to run an inference with the graph file you generated?

idata
Employee

@Tome_at_Intel Yes, I can load it onto the NCS and get an inference result, but the result is wrong. No matter which images I feed it, it always makes the same inference: carrot with 100% probability (I retrained this model for a fruits & vegetables classification problem, and carrot is one of the 11 classes). If I feed the same images into the network without the NCS, it outputs correct inferences.
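For comparison, this is roughly how such a graph is usually exercised with the NCSDK 1.x Python API (a sketch; the file names, preprocessing, and class handling are assumptions, not the exact inference code used here):

import numpy as np
import cv2
from mvnc import mvncapi as mvnc

# Open the first attached NCS device and load the compiled graph file.
devices = mvnc.EnumerateDevices()
device = mvnc.Device(devices[0])
device.OpenDevice()
with open('retrained.graph', 'rb') as f:
    graph = device.AllocateGraph(f.read())

# Inception-v3 style preprocessing: 299x299 RGB scaled to [-1, 1].
img = cv2.imread('test.jpg')
img = cv2.resize(img, (299, 299))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32)
img = img / 127.5 - 1.0

# Run one inference and report the top class index.
graph.LoadTensor(img.astype(np.float16), 'user object')
output, _ = graph.GetResult()
print('Predicted class index:', int(np.argmax(output)))

graph.DeallocateGraph()
device.CloseDevice()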

idata
Employee

@WuXinyang If your network has changed since the last time you linked it to me, I'd like to try out your network again and see if I can help you out.

idata
Employee

@Tome_at_Intel It's the same network I linked to you last time; there is no change. The only difference is that I now changed the compile command:

 

(tensorflow) wxy@wxy-mipro:~/tensorflow/model$ sudo mvNCCompile output_graph.pb -s 12 -is 299 299 -in=input/BottleneckInputPlaceholder -on=final_result -o retrained.pb

 

And it produced a 45.9 KB file, which surely is not right.
