Hi,
I'm trying to get a simple style transfer network working based on https://github.com/lengstrom/fast-style-transfer, but am having some issues that maybe you guys can help with. Here's the test code I'm using that builds the architecture, exports a checkpoint, and prints the in/out tensor names:
# Written against the TensorFlow 1.x API (tf.Session, tf.placeholder, keep_dims).
import tensorflow as tf
import os
import argparse

WEIGHTS_INIT_STDEV = .1

def build_net(image):
    conv1 = _conv_layer(image, 16, 9, 1)
    conv2 = _conv_layer(conv1, 32, 3, 2)
    conv3 = _conv_layer(conv2, 64, 3, 2)
    resid1 = _residual_block(conv3, 3)
    resid2 = _residual_block(resid1, 3)
    resid3 = _residual_block(resid2, 3)
    resid4 = _residual_block(resid3, 3)
    resid5 = _residual_block(resid4, 3)
    conv_t1 = _conv_tranpose_layer(resid5, 32, 3, 2)
    conv_t2 = _conv_tranpose_layer(conv_t1, 16, 3, 2)
    conv_t3 = _conv_layer(conv_t2, 3, 9, 1, relu=False)
    preds = tf.nn.tanh(conv_t3) * 150 + 127.5
    return preds

def _conv_layer(net, num_filters, filter_size, strides, relu=True):
    weights_init = _conv_init_vars(net, num_filters, filter_size)
    strides_shape = [1, strides, strides, 1]
    net = tf.nn.conv2d(net, weights_init, strides_shape, padding='SAME')
    net = _instance_norm(net)
    if relu:
        net = tf.nn.relu(net)
    return net

def _conv_tranpose_layer(net, num_filters, filter_size, strides):
    weights_init = _conv_init_vars(net, num_filters, filter_size, transpose=True)
    batch_size, rows, cols, in_channels = [i.value for i in net.get_shape()]
    new_rows, new_cols = int(rows * strides), int(cols * strides)
    new_shape = [batch_size, new_rows, new_cols, num_filters]
    tf_shape = tf.stack(new_shape)
    strides_shape = [1, strides, strides, 1]
    net = tf.nn.conv2d_transpose(net, weights_init, tf_shape, strides_shape, padding='SAME')
    net = _instance_norm(net)
    return tf.nn.relu(net)

def _residual_block(net, filter_size=3):
    tmp = _conv_layer(net, 64, filter_size, 1)
    return net + _conv_layer(tmp, 64, filter_size, 1, relu=False)

def reduce_var(x, axis=None, keepdims=False):
    """Variance of a tensor, alongside the specified axis."""
    m = tf.reduce_mean(x, axis=axis, keep_dims=True)
    devs_squared = tf.square(x - m)
    return tf.reduce_mean(devs_squared, axis=axis, keep_dims=keepdims)

def reduce_std(x, axis=None, keepdims=False):
    """Standard deviation of a tensor, alongside the specified axis."""
    return tf.sqrt(reduce_var(x, axis=axis, keepdims=keepdims))

def _instance_norm(net, train=True):
    batch, rows, cols, channels = [i.value for i in net.get_shape()]
    var_shape = [channels]
    # Per-image, per-channel statistics over the spatial axes (rows, cols).
    mu = tf.reduce_mean(net, axis=[1,2], keep_dims=True)
    sigma = reduce_std(net, axis=[1,2], keepdims=True)
    # mu, sigma_sq = tf.nn.moments(net, [1,2], keep_dims=True)
    shift = tf.Variable(tf.zeros(var_shape))
    scale = tf.Variable(tf.ones(var_shape))
    # epsilon = 1e-3
    # normalized = (net-mu)/(sigma_sq + epsilon)**(.5)
    normalized = (net-mu) / sigma
    return scale * normalized + shift

def _conv_init_vars(net, out_channels, filter_size, transpose=False):
    _, rows, cols, in_channels = [i.value for i in net.get_shape()]
    if not transpose:
        weights_shape = [filter_size, filter_size, in_channels, out_channels]
    else:
        weights_shape = [filter_size, filter_size, out_channels, in_channels]
    weights_init = tf.Variable(tf.truncated_normal(weights_shape, stddev=WEIGHTS_INIT_STDEV, seed=1), dtype=tf.float32)
    return weights_init

##################################################################

CHECKPOINT_DIR = 'ckpt'
OUT_DIR = 'ncs_out'

parser = argparse.ArgumentParser()
parser.add_argument('--checkpoint-dir', type=str, default=CHECKPOINT_DIR,
                    help='checkpoint in dir')
parser.add_argument('--out-dir', type=str, default=OUT_DIR,
                    help='dir to save checkpoint out')

def main():
    options = parser.parse_args()
    img_shape = (300, 300, 3)
    sess = tf.Session()
    batch_shape = (1,) + img_shape
    img_placeholder = tf.placeholder(tf.float32, shape=batch_shape,
                                     name='input')
    preds = build_net(img_placeholder)

    # saver = tf.train.Saver()
    # if os.path.isdir(options.checkpoint_dir):
    #     ckpt = tf.train.get_checkpoint_state(options.checkpoint_dir)
    #     if ckpt and ckpt.model_checkpoint_path:
    #         saver.restore(sess, ckpt.model_checkpoint_path)
    #     else:
    #         raise Exception("No checkpoint found...")
    # else:
    #     saver.restore(sess, checkpoint_dir)

    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver(tf.global_variables())
    saver.save(sess, os.path.join(options.out_dir, 'ncs'))

    print('Input tensor name:', img_placeholder.name)
    print('Output tensor name:', preds.name)

if __name__ == '__main__':
    main()

When I try to compile ncs_out/ncs.meta I run into two issues:
[Error 5] Toolkit Error: Stage Details Not Supported: Square
This appears to occur at line 1398 of TensorFlowParser.py because (get_input(strip_tensor_id(node.inputs[0].name), False) != 0) returns 0. That corresponds to the statement devs_squared = tf.square(x - m) in reduce_var in the code above, which is part of the Instance Normalization layer (it uses mean/std over the spatial axes to normalize, which is essential for making style transfer work with this net). I'm calculating mean/std this way because tf.nn.moments doesn't seem to be supported.
If I comment out the net = _instance_norm(net) calls in _conv_layer and _conv_tranpose_layer, I get a different error related to tf.nn.conv2d_transpose:
[Error 5] Toolkit Error: Stage Details Not Supported: Wrong deconvolution output shape.
Not sure what's going on there. Removing the residual layers also results in the same deconvolution error.
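For readers unfamiliar with the normalization being discussed, here is a minimal pure-Python sketch of the arithmetic that reduce_var and _instance_norm express in TensorFlow: per-channel mean and standard deviation taken over the spatial positions of a single image. The function name instance_norm and the sample values are illustrative assumptions, not code from the repository, and nothing here says which ops the NCSDK accepts.

```python
import math

def instance_norm(image):
    """Normalize one image given as rows x cols x channels nested lists.

    Statistics are computed per channel over the spatial positions only,
    which corresponds to axis=[1,2] on an NHWC tensor with batch size 1.
    Like the posted code, no epsilon is added, so a constant channel
    would divide by zero.
    """
    rows, cols, channels = len(image), len(image[0]), len(image[0][0])
    count = rows * cols
    out = [[[0.0] * channels for _ in range(cols)] for _ in range(rows)]
    for c in range(channels):
        values = [image[r][k][c] for r in range(rows) for k in range(cols)]
        mean = sum(values) / count
        # Variance as the mean of squared deviations; this is the step
        # that reduce_var writes as tf.square(x - m).
        var = sum((v - mean) ** 2 for v in values) / count
        std = math.sqrt(var)
        for r in range(rows):
            for k in range(cols):
                out[r][k][c] = (image[r][k][c] - mean) / std
    return out

# A 2x2 image with 2 channels (made-up values for illustration).
image = [[[1.0, 10.0], [2.0, 20.0]],
         [[3.0, 30.0], [4.0, 40.0]]]
normalized = instance_norm(image)
```

After normalization each channel has zero mean and unit variance over the image, so the two channels above end up identical even though one is ten times larger than the other.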
Any help would be very much appreciated. Thanks!
@eridgd Looks like you have an interesting project with your style transfer network. If you can provide a link to your model, it would be helpful for reproducing the issue and for debugging.
Thanks for the reply, @Tome_at_Intel.
Please see https://github.com/eridgd/fast-neural-style-ncs (it includes the NCS-compatible checkpoint in ncs_graph/). The procedure I'm using to generate this and attempt to compile is:
bash setup.sh -- Download the original trained model checkpoint.
python export-graph.py
mvNCCompile ncs_graph/ncs.meta -in="input" -on="add_21" -s 12 -o ncs-style.meta
@eridgd I was able to reproduce your issue. It seems that we don't quite have support for this model. At the moment, we support only Batch Normalization, not Instance Normalization. I think the NCSDK's square implementation does not account for Instance Normalization, and this is likely where the issue is. Thanks for reporting this and bringing it to our attention. We will make a note of this issue for consideration in future releases.
I appreciate your taking the time to look into this @Tome_at_Intel
I made a little progress by switching to (Fused) Batch Norm instead of Instance Norm (https://github.com/eridgd/fast-neural-style-ncs/blob/a71d73221fa48cbcbe06acdb0b5ed032c80a8658/src/transform.py#L24). This produces worse quality results but should now be using only NCS-compatible ops. A checkpoint for this is at https://github.com/eridgd/fast-neural-style-ncs/tree/master/bn-vgg16-style5
However, I ran into a new issue with the conv2d_transpose layers. When I compile with:
mvNCCompile bn-vgg16-style5/ncs.meta -in="input" -on="output" -s 12 -o ncs_bn.graph
It gives the error: [Error 5] Toolkit Error: Stage Details Not Supported: Wrong deconvolution output shape.
That error is thrown after calling get_deconv_padding() (line 75 of TensorFlowParser.py), which returns -1 for padx/pady. The calculated pads should be 1, e.g.:
pady = stride[1] * (input_shape[1] - 1) + kernel_shape[0] - output_shape[1]
= 2 * (60 - 1) + 3 - 120 == 1
But this is changed to -1 because it isn't even:
if (pady % 2 == 1):
    pady = -1
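The arithmetic quoted above can be checked with a small standalone sketch. The function names below are hypothetical and this is not the NCSDK source; it only reproduces the formula and the odd-value check as quoted in this post, applied along one axis.

```python
def deconv_padding(input_size, output_size, kernel_size, stride):
    """Total padding along one axis, using the formula quoted in the post."""
    return stride * (input_size - 1) + kernel_size - output_size

def checked_deconv_padding(input_size, output_size, kernel_size, stride):
    """Same formula, followed by the quoted check that turns odd values into -1."""
    pad = deconv_padding(input_size, output_size, kernel_size, stride)
    if pad % 2 == 1:
        # Odd totals are flagged with -1, which is what leads to the
        # "Wrong deconvolution output shape" error described above.
        return -1
    return pad

# Numbers from the post: 60 -> 120 with a 3x3 kernel and stride 2.
raw = deconv_padding(60, 120, 3, 2)              # 2 * 59 + 3 - 120 = 1
checked = checked_deconv_padding(60, 120, 3, 2)  # 1 is odd, so -1
```

With these numbers any stride-2, kernel-3 layer whose output is exactly twice its input yields a total of 1, so the check would reject every such layer in this network.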
I'm not sure what that odd-padding check is intended to do. To avoid this error, I overrode this section to return padx/pady = 1. That change allows compilation to finish and produce ncs_bn.graph. I then try to run this on the NCS with:
python webcam_ncs.py --graph ncs_bn.graph
That seems to load the graph onto the device, and running inference returns a (230400,) tensor that I reshape to (240, 320, 3). Regardless of the image I try, the output is garbled, like https://github.com/eridgd/fast-neural-style-ncs/blob/master/fast%20style_screenshot_04.03.2018.png
I tried changing the output node to different intermediate layers and compared the output to eval'ing the same node in TF. The conv2d layers (including the residual ones) have output similar to TF, as does the first conv2d_transpose layer. The output diverges at the second conv2d_transpose layer (node Relu_9, https://github.com/eridgd/fast-neural-style-ncs/blob/a71d73221fa48cbcbe06acdb0b5ed032c80a8658/src/transform.py#L15), and the outputs of the last layers contain many NaNs.
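The layer-by-layer comparison described above could be automated along the following lines. This is only a sketch: compare_outputs is a hypothetical helper, the tolerance is an arbitrary placeholder, and the sample values are made up rather than taken from the device.

```python
import math

def compare_outputs(reference, candidate, tolerance=1e-2):
    """Summarize how far `candidate` is from `reference`.

    Both arguments are flat lists of floats, e.g. the flattened output of
    the same node evaluated in TF (reference) and on the device (candidate).
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs differ in length")
    nan_count = sum(1 for v in candidate if math.isnan(v))
    # Only compare positions where both values are real numbers.
    diffs = [abs(r - c) for r, c in zip(reference, candidate)
             if not (math.isnan(r) or math.isnan(c))]
    max_diff = max(diffs) if diffs else None
    matches = nan_count == 0 and max_diff is not None and max_diff <= tolerance
    return {"nan_count": nan_count, "max_abs_diff": max_diff, "matches": matches}

# Made-up values standing in for one node's output.
tf_values = [0.10, 0.52, -1.30]
ncs_values = [0.11, float("nan"), -1.29]
report = compare_outputs(tf_values, ncs_values)
```

Running this for each candidate output node in graph order would point at the first layer where NaNs appear or the difference exceeds the tolerance.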
Any ideas what could be going on?
Does anyone have a suggestion for how to resolve this? Deconvolution layers appear to be supported by the latest NCSDK, but perhaps they're not fully implemented, since this architecture is a simple use of conv2d_transpose. I also searched the app zoo for an example that uses deconvolutions but was unable to find any.
I had a very interesting use in mind for this and also wanted to contribute to the app zoo with a style transfer example, but since I've received no feedback on the core issue with apparently broken deconvolutions I'm going to move on to better uses for my time.
I was seeing the same error: [Error 5] Toolkit Error: Stage Details Not Supported: Wrong deconvolution output shape.
After switching to padding='VALID', things work for me (I am using slim.conv2d_transpose). (??)
@Tome_at_Intel Have you tested the deconvolution that is officially said to be supported?
@hzc I tried using NCSDK v2.04.00.06 with this user's network and I no longer get the deconvolution error, but I did run into another issue with parsing the tensor (index out of range). It seems to fail at line 290 in TensorFlowParser.py when checking the tensor's index 0.
Edit: I will have to try again. Had an issue on my machine that could have affected my earlier test.
@Tome_at_Intel Any update on deconvolution? When I used bn after deconv, I got this error:
'Toolkit Error: Stage Details Not Supported: FusedBatchNorm must be preceded by convolution or fully connected layer'
@hzc I don't think there were any changes specifically to deconvolution, but please give NCSDK 2.04.00.06 a try and let me know if it resolves any of your issues. If problems remain, I can try to help debug them. Thanks.
@Tome_at_Intel I have tried NCSDK 2.04.00.06 and still get '[Error 5] Toolkit Error: Stage Details Not Supported: Wrong deconvolution output shape.' Also, will you support bn after deconvolution?
@hzc Can you provide your network for issue reproduction and debugging?
This seems to be related to this topic. I am also trying to do style transfer, but I had issues with padding (size error), instance normalization (StopGradient missing), and conv2d (for some reason, stride 2 is not accepted when profiling).
EDIT:
It looks like if I use deconv2d like this:
d1 = deconv2d(r9, options.gf_dim*2, 3, 2, padding='SAME', name='g_d1_dc')
it returns the output shape error, while if I use VALID it does not. Of course VALID is not what I want because it messes up my network. Any idea?
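To illustrate why switching to VALID changes the network, here is a small sketch of the output length of a transposed convolution along one axis under each padding mode. It assumes the convention commonly used by TensorFlow's layer wrappers (SAME gives input * stride; VALID gives input * stride + max(kernel - stride, 0)); the function name is hypothetical, and the sizes 60, 3 and 2 are taken from the earlier post in this thread rather than from this poster's model.

```python
def conv_transpose_output_length(input_length, kernel_size, stride, padding):
    """Output length along one spatial axis of a transposed convolution."""
    if padding == "SAME":
        return input_length * stride
    if padding == "VALID":
        # The kernel overhang is kept instead of being padded away.
        return input_length * stride + max(kernel_size - stride, 0)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# Kernel 3, stride 2, input length 60.
same_len = conv_transpose_output_length(60, 3, 2, "SAME")    # 120
valid_len = conv_transpose_output_length(60, 3, 2, "VALID")  # 121
```

The extra row and column under VALID is enough to break any later layer that expects an exact doubling of the spatial size. Plugging 121 into the padding formula quoted earlier in the thread gives 2 * 59 + 3 - 121 = 0, an even value, which would be consistent with VALID avoiding the error; that connection is an inference, not something the thread confirms.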
@matpalm Where did you apply those padding changes? When I make the change to the deconvolution stage, it falls over in optimize.py on:
assert _tensor_size(content_features_X) == _tensor_size(content_features_preds_pre)
Hi guys,
Any updates on this?
This could be a really nice demo for the stick :)