Style Transfer - Issues with Instance Normalization and Deconvolution (TF)

idata · ‎02-26-2018

Hi,

I'm trying to get a simple style transfer network working based on https://github.com/lengstrom/fast-style-transfer, but am having some issues that maybe you guys can help with. Here's the test code I'm using that builds the architecture, exports a checkpoint, and prints the in/out tensor names:

import tensorflow as tf
import os
import argparse

WEIGHTS_INIT_STDEV = .1

def build_net(image):
    conv1 = _conv_layer(image, 16, 9, 1)
    conv2 = _conv_layer(conv1, 32, 3, 2)
    conv3 = _conv_layer(conv2, 64, 3, 2)
    resid1 = _residual_block(conv3, 3)
    resid2 = _residual_block(resid1, 3)
    resid3 = _residual_block(resid2, 3)
    resid4 = _residual_block(resid3, 3)
    resid5 = _residual_block(resid4, 3)
    conv_t1 = _conv_tranpose_layer(resid5, 32, 3, 2)
    conv_t2 = _conv_tranpose_layer(conv_t1, 16, 3, 2)
    conv_t3 = _conv_layer(conv_t2, 3, 9, 1, relu=False)
    preds = tf.nn.tanh(conv_t3) * 150 + 127.5
    return preds

def _conv_layer(net, num_filters, filter_size, strides, relu=True):
    weights_init = _conv_init_vars(net, num_filters, filter_size)
    strides_shape = [1, strides, strides, 1]
    net = tf.nn.conv2d(net, weights_init, strides_shape, padding='SAME')
    net = _instance_norm(net)
    if relu:
        net = tf.nn.relu(net)

    return net

def _conv_tranpose_layer(net, num_filters, filter_size, strides):
    weights_init = _conv_init_vars(net, num_filters, filter_size, transpose=True)

    batch_size, rows, cols, in_channels = [i.value for i in net.get_shape()]
    new_rows, new_cols = int(rows * strides), int(cols * strides)

    new_shape = [batch_size, new_rows, new_cols, num_filters]
    tf_shape = tf.stack(new_shape)
    strides_shape = [1,strides,strides,1]

    net = tf.nn.conv2d_transpose(net, weights_init, tf_shape, strides_shape, padding='SAME')
    net = _instance_norm(net)
    return tf.nn.relu(net)

def _residual_block(net, filter_size=3):
    tmp = _conv_layer(net, 64, filter_size, 1)
    return net + _conv_layer(tmp, 64, filter_size, 1, relu=False)

def reduce_var(x, axis=None, keepdims=False):
    """Variance of a tensor, alongside the specified axis."""
    m = tf.reduce_mean(x, axis=axis, keep_dims=True)
    devs_squared = tf.square(x - m)
    return tf.reduce_mean(devs_squared, axis=axis, keep_dims=keepdims)

def reduce_std(x, axis=None, keepdims=False):
    """Standard deviation of a tensor, alongside the specified axis."""
    return tf.sqrt(reduce_var(x, axis=axis, keepdims=keepdims))

def _instance_norm(net, train=True):
    batch, rows, cols, channels = [i.value for i in net.get_shape()]
    var_shape = [channels]

    mu = tf.reduce_mean(net, axis=[1,2], keep_dims=True)
    sigma = reduce_std(net, axis=[1,2], keepdims=True)

    # mu, sigma_sq = tf.nn.moments(net, [1,2], keep_dims=True)
    shift = tf.Variable(tf.zeros(var_shape))
    scale = tf.Variable(tf.ones(var_shape))
    # epsilon = 1e-3
    # normalized = (net-mu)/(sigma_sq + epsilon)**(.5)
    normalized = (net-mu) / sigma
    return scale * normalized + shift

def _conv_init_vars(net, out_channels, filter_size, transpose=False):
    _, rows, cols, in_channels = [i.value for i in net.get_shape()]
    if not transpose:
        weights_shape = [filter_size, filter_size, in_channels, out_channels]
    else:
        weights_shape = [filter_size, filter_size, out_channels, in_channels]

    weights_init = tf.Variable(tf.truncated_normal(weights_shape, stddev=WEIGHTS_INIT_STDEV, seed=1), dtype=tf.float32)
    return weights_init


##################################################################
CHECKPOINT_DIR = 'ckpt'
OUT_DIR = 'ncs_out'
parser = argparse.ArgumentParser()
parser.add_argument('--checkpoint-dir', type=str, default=CHECKPOINT_DIR,
                    help='checkpoint in dir')
parser.add_argument('--out-dir', type=str, default=OUT_DIR,
                    help='dir to save checkpoint out')

def main():
    options = parser.parse_args()

    img_shape = (300, 300, 3)

    sess = tf.Session()
    batch_shape = (1,) + img_shape
    img_placeholder = tf.placeholder(tf.float32, shape=batch_shape,
                                     name='input')

    preds = build_net(img_placeholder)
    # saver = tf.train.Saver()

    # if os.path.isdir(options.checkpoint_dir):
    #     ckpt = tf.train.get_checkpoint_state(options.checkpoint_dir)
    #     if ckpt and ckpt.model_checkpoint_path:
    #         saver.restore(sess, ckpt.model_checkpoint_path)
    #     else:
    #         raise Exception("No checkpoint found...")
    # else:
    #     saver.restore(sess, checkpoint_dir)

    sess.run(tf.global_variables_initializer())

    saver = tf.train.Saver(tf.global_variables())
    saver.save(sess, os.path.join(options.out_dir, 'ncs'))

    print('Input tensor name:', img_placeholder.name)
    print('Output tensor name:', preds.name)

if __name__ == '__main__':
    main()

When I try to compile ncs_out/ncs.meta I run into two issues:

[Error 5] Toolkit Error: Stage Details Not Supported: Square

This appears to occur at line 1398 of TensorFlowParser.py because (get_input(strip_tensor_id(node.inputs[0].name), False) != 0) returns 0. That corresponds to line 53 in the above code which calculates devs_squared = tf.square(x - m) as part of the Instance Normalization layer (which uses mean/std over spatial axes to normalize, essential for making style transfer work with this net). I'm calculating mean/std this way because tf.nn.moments doesn't seem to be supported.
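For readers unfamiliar with the operation, the per-channel, per-image normalization that `_instance_norm` performs can be sketched in pure Python for a single image. This is only an illustration: the helper name `instance_norm` is hypothetical, and the `eps` term is an assumption borrowed from the commented-out `tf.nn.moments` variant in the code above (the active code divides by sigma with no epsilon).

```python
import math

def instance_norm(image, eps=1e-3):
    """Normalize one image given as image[row][col][channel] (nested lists).

    Each channel is normalized with its own mean and variance, computed
    over the spatial axes (rows and cols) only.
    `eps` is an assumption taken from the commented-out variant in the post.
    """
    rows, cols, channels = len(image), len(image[0]), len(image[0][0])
    n = rows * cols
    out = [[[0.0] * channels for _ in range(cols)] for _ in range(rows)]
    for c in range(channels):
        values = [image[r][k][c] for r in range(rows) for k in range(cols)]
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n
        inv_std = 1.0 / math.sqrt(var + eps)
        for r in range(rows):
            for k in range(cols):
                out[r][k][c] = (image[r][k][c] - mean) * inv_std
    return out

# 2x2 image with 2 channels; channel 1 is constant on purpose.
img = [[[1.0, 5.0], [2.0, 5.0]],
       [[3.0, 5.0], [4.0, 5.0]]]
normed = instance_norm(img)
```

Note that with `eps = 0` the constant channel would have a standard deviation of zero and the division would fail, so an epsilon-free version may be one source of invalid values on flat image regions.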

If I comment out the net = _instance_norm(net) instance norm layers on lines 26 & 43 I get a different error related to tf.nn.conv2d_transpose:

[Error 5] Toolkit Error: Stage Details Not Supported: Wrong deconvolution output shape.

Not sure what's going on there. Removing the residual layers also results in the same deconvolution error.

Any help would be very much appreciated. Thanks!

idata · ‎02-27-2018

@eridgd Looks like you have an interesting project with your style transfer network. If you can provide a link to your model, it would be helpful for reproducing and debugging the issue.

idata · ‎02-27-2018

Thanks for the reply, @Tome_at_Intel

Please see here https://github.com/eridgd/fast-neural-style-ncs which includes the NCS-compatible checkpoint in ncs_graph/. The procedure I'm using to generate this and attempt to compile is:

bash setup.sh -- Download the original trained model checkpoint.

python export-graph.py

mvNCCompile ncs_graph/ncs.meta -in="input" -on="add_21" -s 12 -o ncs-style.meta

idata · ‎03-02-2018

@eridgd I was able to reproduce your issue. It seems that we don't yet support this model. At the moment, we support only Batch Normalization, not Instance Normalization. I think the NCSDK's Square implementation does not account for Instance Normalization, and this is likely where the issue is. Thanks for reporting this and bringing it to our attention. We will make a note of this issue for consideration in future releases.

idata · ‎03-04-2018

I appreciate your taking the time to look into this @Tome_at_Intel

I made a little progress by switching to (Fused) Batch Norm instead of Instance Norm (https://github.com/eridgd/fast-neural-style-ncs/blob/a71d73221fa48cbcbe06acdb0b5ed032c80a8658/src/transform.py#L24). This produces worse quality results but should now be using only NCS-compatible ops. A checkpoint for this is at https://github.com/eridgd/fast-neural-style-ncs/tree/master/bn-vgg16-style5

However, I ran into a new issue with the conv2d_transpose layers. When I compile with:

mvNCCompile bn-vgg16-style5/ncs.meta -in="input" -on="output" -s 12 -o ncs_bn.graph

It gives the error: [Error 5] Toolkit Error: Stage Details Not Supported: Wrong deconvolution output shape.

That error is thrown after calling get_deconv_padding() (line 75 of TensorFlowParser.py) which returns -1 for padx/pady. The calculated pads should be 1, e.g.:

pady = stride[1] * (input_shape[1] - 1) + kernel_shape[0] - output_shape[1]

= 2 * (60 - 1) + 3 - 120 == 1

But this is changed to -1 because it isn't even:

if (pady % 2 == 1): pady = -1

Not sure what that's intended to do. To avoid this error, I overrode this section to return padx/pady = 1. That change allows it to finish compiling to ncs_bn.graph. I then try to run this on the NCS with:

python webcam_ncs.py --graph ncs_bn.graph

That seems to be able to load the graph onto the device, and running inference returns a (230400,) tensor that I reshape to (240, 320, 3). Regardless of the image I try the output is garbled, like https://github.com/eridgd/fast-neural-style-ncs/blob/master/fast%20style_screenshot_04.03.2018.png

I tried changing the output node to different intermediate layers and compared the output to eval'ing the same node in TF. The conv2d layers (including residual ones) have output similar to TF, as does the first conv2d_transpose layer. The output diverges at the second conv2d_transpose layer (node Relu_9 https://github.com/eridgd/fast-neural-style-ncs/blob/a71d73221fa48cbcbe06acdb0b5ed032c80a8658/src/transform.py#L15), and the outputs in the last layers have a bunch of nans.

Any ideas what could be going on?
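The layer-by-layer comparison described above can be sketched as a small helper. It assumes both the device result and the TensorFlow reference have already been flattened to plain Python lists of floats. `compare_outputs` and its tolerance are hypothetical names and values, not part of the NCSDK or the linked repo.

```python
import math

def compare_outputs(device_out, reference_out, atol=1e-2):
    """Summarize how far a flat device output is from a flat reference."""
    if len(device_out) != len(reference_out):
        raise ValueError("length mismatch: %d vs %d"
                         % (len(device_out), len(reference_out)))
    # Count NaNs separately so they do not hide the numeric error.
    nan_count = sum(1 for v in device_out if math.isnan(v))
    diffs = [abs(d - r) for d, r in zip(device_out, reference_out)
             if not math.isnan(d)]
    max_diff = max(diffs) if diffs else float("nan")
    return {
        "nan_count": nan_count,
        "max_abs_diff": max_diff,
        "close": nan_count == 0 and max_diff <= atol,
    }

# Sanity check on the reshape from the post: 240 * 320 * 3 elements.
assert 240 * 320 * 3 == 230400

report = compare_outputs([0.5, float("nan"), 1.0], [0.5, 0.25, 1.5])
print(report)
```

Running this per output node (first on the conv2d layers, then on each conv2d_transpose layer) gives one number per layer, which makes the first point of divergence easier to spot than eyeballing images.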

idata · ‎03-16-2018

Does anyone have a suggestion for how to resolve this? Deconvolution layers appear to be supported by the latest NCSDK, but perhaps they're not fully implemented, as this architecture is a simple use of conv2d_transpose. I also searched the app zoo for an example that uses deconvolutions but was unable to find any.

idata · ‎03-23-2018

I had a very interesting use in mind for this and also wanted to contribute to the app zoo with a style transfer example, but since I've received no feedback on the core issue with apparently broken deconvolutions I'm going to move on to better uses for my time.

idata · ‎03-25-2018

I was seeing the same error, [Error 5] Toolkit Error: Stage Details Not Supported: Wrong deconvolution output shape, but after switching to padding='VALID' things work for me (I'm using slim.conv2d_transpose). (??)
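One possible explanation, using only the padding formula and the odd-padding check quoted earlier in this thread: with 'SAME', a transposed convolution produces `input * stride` outputs, which leaves an odd total pad for a 3x3 kernel at stride 2, while with 'VALID' the output is one element larger and the pad comes out as 0. The sketch below reproduces that arithmetic. The function names are hypothetical, and it assumes the parity check is the only thing that triggers the error. It is not the NCSDK source.

```python
def deconv_output_size(input_size, kernel, stride, padding):
    """Output length of a transposed convolution along one axis."""
    if padding == "SAME":
        return input_size * stride
    if padding == "VALID":
        return input_size * stride + max(kernel - stride, 0)
    raise ValueError("unknown padding: %r" % padding)

def total_pad(input_size, kernel, stride, output_size):
    """Total padding, per the formula quoted earlier in the thread."""
    return stride * (input_size - 1) + kernel - output_size

def passes_parity_check(pad):
    """Mirror of the quoted check: an odd pad is replaced by -1 (rejected)."""
    return pad % 2 != 1

for padding in ("SAME", "VALID"):
    out = deconv_output_size(60, 3, 2, padding)
    pad = total_pad(60, 3, 2, out)
    print(padding, out, pad, passes_parity_check(pad))
# prints:
# SAME 120 1 False
# VALID 121 0 True
```

By the same arithmetic, a 4x4 kernel at stride 2 with 'SAME' would give an even pad of 2. Whether a graph compiled that way produces correct output on the device is untested here.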

idata · ‎05-09-2018

@Tome_at_Intel Have you tested deconvolution, which is officially said to be supported?

idata · ‎05-09-2018

@hzc I tried using NCSDK v2.04.00.06 with this user's network and I no longer get the deconvolution error, but I did run into another issue with parsing the tensor (index out of range). It seems to fail at line 290 in TensorFlowParser.py when checking the tensor's index 0.

Edit: I will have to try again. Had an issue on my machine that could have affected my earlier test.

idata · ‎05-24-2018

@Tome_at_Intel Any update on deconvolution? When I used batch norm after a deconvolution, I got this error:

'Toolkit Error: Stage Details Not Supported: FusedBatchNorm must be preceded by convolution or fully connected layer'

idata · ‎05-29-2018

@hzc I don't think there were any changes specifically to deconvolution, but please give NCSDK 2.04.00.06 a try and let me know if it resolves any of your issues. If problems remain, I can try to help debug them. Thanks.

idata · ‎05-30-2018

@Tome_at_Intel I have tried NCSDK 2.04.00.06 and still get '[Error 5] Toolkit Error: Stage Details Not Supported: Wrong deconvolution output shape.' Also, will you support batch norm after deconvolution?

idata · ‎05-30-2018

@hzc Can you provide your network for issue reproduction and debugging?

idata · ‎07-04-2018

This is somewhat related to this topic. I am also trying to do style transfer, but I had issues with padding (size error), instance normalization (StopGradient missing), and conv2d (for some reason, stride 2 is not accepted when profiling).

EDIT:

It looks like if I use deconv2d like this: d1 = deconv2d(r9, options.gf_dim*2, 3, 2, padding='SAME', name='g_d1_dc'), it returns the output shape error, while if I use VALID it does not. Of course, VALID is not what I want because it messes up my network. Any ideas?

idata · ‎07-09-2018

@matpalm where did you apply those padding changes? When I make the change to the deconvolution stage, it falls over on:

assert _tensor_size(content_features_X) == _tensor_size(content_features_preds_pre)

in optimize.py

idata · ‎07-28-2018

Hi Guys,

Any updates on this?

This could be a really nice demo for the stick :)