Intel® Distribution of OpenVINO™ Toolkit
Community assistance for the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.
6403 Discussions

Does NCSDK V2 support depthwise convolution?

idata
Employee
1,459 Views

I want to implement SSD MobileNet (not the one in ncappzoo). It uses depthwise convolution, and I saw the Release Notes declare that depthwise convolution is supported, but when compiling I encounter this error: Unknown layer type: DepthwiseConvolution. I use the SSD caffemodel to compile, with tensorflow = 1.3.0 and python = 3.5.2. Could you give me some advice, please?

0 Kudos
11 Replies
idata
Employee
1,084 Views

@curry_best Can you post a link to the model? You said you use an SSD caffemodel and TensorFlow 1.3.0? The NCSDK does support depthwise convolution, but only for 3x3 convolutions at the moment. If you are using SSD MobileNet for TensorFlow, the NCSDK doesn't support that yet.

0 Kudos
idata
Employee
1,084 Views

@Tome_at_Intel

 

I use a Caffe model. Sorry, I only posted part of the prototxt, but the other depthwise convolution layers are the same as the one posted. I also ran pip3 install tensorflow==1.7.0, and when I compile the model I get the error "Illegal instruction (core dumped)". How can I fix it? Please give some advice, thanks.

 

layer {
  name: "conv1/dw"
  type: "DepthwiseConvolution"
  bottom: "conv0"
  top: "conv1/dw"
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 2.0
    decay_mult: 0.0
  }
  convolution_param {
    num_output: 32
    pad: 1
    kernel_size: 3
    group: 32
    weight_filler {
      type: "msra"
    }
    bias_filler {
      type: "constant"
      value: 0.0
    }
  }
}
0 Kudos
idata
Employee
1,084 Views

@Tome_at_Intel

 

This is the traceback when compiling the SSD caffemodel.

 

/usr/lib/python3/dist-packages/scipy/stats/morestats.py:16: DeprecationWarning: Importing from numpy.testing.decorators is deprecated, import from numpy.testing instead.
  from numpy.testing.decorators import setastest
/usr/local/bin/ncsdk/Controllers/Parsers/TensorFlowParser/Convolution.py:44: SyntaxWarning: assertion is always true, perhaps remove parentheses?
  assert(False, "Layer type not supported by Convolution: " + obj.type)
mvNCCompile v02.00, Copyright @ Intel Corporation 2017
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0925 15:21:39.742235 4993 layer_factory.hpp:81] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: DepthwiseConvolution (known types: AbsVal, Accuracy, AnnotatedData, ArgMax, BNLL, BatchNorm, BatchReindex, Bias, Concat, ContrastiveLoss, Convolution, Crop, Data, Deconvolution, DetectionEvaluate, DetectionOutput, Dropout, DummyData, ELU, Eltwise, Embed, EuclideanLoss, Exp, Filter, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col, ImageData, InfogainLoss, InnerProduct, Input, LRN, LSTM, LSTMUnit, Log, MVN, MemoryData, MultiBoxLoss, MultinomialLogisticLoss, Normalize, PReLU, Parameter, Permute, Pooling, Power, PriorBox, Python, RNN, ReLU, Reduction, Reshape, SPP, Scale, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, SmoothL1Loss, Softmax, SoftmaxWithLoss, Split, TanH, Threshold, Tile, VideoData, WindowData)
*** Check failure stack trace: ***
Aborted (core dumped)
0 Kudos
idata
Employee
1,084 Views

@curry_best Ah, okay. I was not aware of the DepthwiseConvolution layer. It seems there is a special version of Caffe that implements a custom layer named "DepthwiseConvolution". The NCSDK doesn't support this version of Caffe at the moment.

 

Depthwise convolution can be done in SSD Caffe or regular Caffe using the Convolution layer with the convolution parameters "kernel_size" (or "kernel_h" and "kernel_w") and "group": http://caffe.berkeleyvision.org/tutorial/layers/convolution.html.
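To make this concrete, here is an untested sketch of how the conv1/dw layer from the prototxt earlier in this thread might be rewritten using the standard Convolution layer. Setting group equal to num_output is what makes the convolution depthwise; the param blocks and fillers from the original layer would carry over unchanged.

```
layer {
  name: "conv1/dw"
  type: "Convolution"      # plain Convolution instead of DepthwiseConvolution
  bottom: "conv0"
  top: "conv1/dw"
  convolution_param {
    num_output: 32
    pad: 1
    kernel_size: 3
    group: 32              # group == num_output -> one filter per input channel
  }
}
```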

0 Kudos
idata
Employee
1,084 Views

@Tome_at_Intel

 

Thank you! I realized that I can use group + convolution instead of Depthwise convolution.
0 Kudos
idata
Employee
1,084 Views

@Tome_at_Intel , so the "group" parameter of a caffe Convolution layer is not available on ncsdk 2?

 

@curry_best , can you explain the group + convolution approach a bit further? I admit that although I understand the grouping used by AlexNet, I'm a bit confused by the SSD Caffe version. If a convolutional layer has 256 input channels and 512 output channels, then with group=1 the parameter array size is (512,256,k,k), where k is the kernel size. The usage is pretty clear. But at the other extreme, if group=256, the parameter array size is (512,1,k,k). I'm not clear how that 4-d kernel is used. There are effectively 512 (k,k) filters, one for each output channel. Is each (k,k) filter replicated across the 256 input channels and then applied as in the case of group=1? But even if so… if group=2, the parameter array size is (512,2,k,k), in which case I'm really struggling to see how the filters are applied across the depth of the input.

0 Kudos
idata
Employee
1,084 Views

@mattroos The group parameter is available. You can see it in MobileNet SSD and AlexNet in our Caffe examples: https://github.com/movidius/ncappzoo/tree/master/caffe

0 Kudos
idata
Employee
1,084 Views

I now understand exactly what grouping is intended to do, thanks to the first answer here: https://stackoverflow.com/questions/40872914/caffe-what-does-the-group-param-mean. Notably for this thread, there is no true convolution across channels, just across the height and width dimensions.
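For anyone else following along, here is a rough NumPy sketch of grouped convolution as Caffe applies it (stride 1, no padding, names are mine). The key point is that the weight array has shape (C_out, C_in/group, k, k), so each output channel only sees the C_in/group input channels of its own group; for the 256-in/512-out example with group=2 the weights would be (512, 128, k, k).

```python
import numpy as np

def grouped_conv2d(x, w, groups):
    """Grouped 2D convolution, stride 1, no padding (illustrative sketch).

    x: input,   shape (C_in, H, W)
    w: weights, shape (C_out, C_in // groups, k, k)
    """
    c_in, h, width = x.shape
    c_out, c_per_group, k, _ = w.shape
    assert c_in % groups == 0 and c_out % groups == 0
    assert c_per_group == c_in // groups
    out_h, out_w = h - k + 1, width - k + 1
    in_per_g, out_per_g = c_in // groups, c_out // groups
    out = np.zeros((c_out, out_h, out_w))
    for g in range(groups):
        xs = x[g * in_per_g:(g + 1) * in_per_g]   # this group's input channels only
        for oc in range(out_per_g):
            f = w[g * out_per_g + oc]             # one filter: (in_per_g, k, k)
            for i in range(out_h):
                for j in range(out_w):
                    out[g * out_per_g + oc, i, j] = np.sum(
                        xs[:, i:i + k, j:j + k] * f)
    return out

# 8 input channels, 4 output channels, group=2:
# each filter spans only 8/2 = 4 input channels.
x = np.random.randn(8, 6, 6)
w = np.random.randn(4, 4, 3, 3)
y = grouped_conv2d(x, w, groups=2)
print(y.shape)  # (4, 4, 4)
```

Setting groups equal to both the input and output channel counts (weights of shape (C, 1, k, k)) gives the depthwise case: one independent (k,k) filter per channel.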

 

@Tome_at_Intel , thanks! However, I'm getting some unexpected results. For example, see the small network below, with two convolution layers. I use grouping in the second convolution layer. When group=1 or group=32 (the total number of output channels), I get an output that is the expected size: 32*(224/2/2)*(224/2/2) = 32*56*56 = 100352.

 

But when I set group=2, the output size is 100352/2 = 50176.

 

And when I set group=16, the output size is 100352/16 = 6272.

 

If I load the network into python on my development platform and look at the output blob size, using this code:

 

net = caffe.Net(filename_proto, filename_caffe, caffe.TEST)
for key in net.blobs.keys():
    print(key, net.blobs[key].data.flatten().shape)

 

the flattened output blob size is always 100352 as expected, regardless of what the group parameter is set to.
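For reference, that expected size can be sanity-checked with a few lines of arithmetic (a quick sketch; the geometry is taken from the prototxt below, and the group parameter never changes num_output in Caffe):

```python
# Two 3x3, pad-1, stride-2 convolutions on a 224x224 input:
# out = (in + 2*pad - k) // stride + 1, so 224 -> 112 -> 56.
side = 224
for _ in range(2):
    side = (side + 2 * 1 - 3) // 2 + 1

num_output = 32
print(num_output * side * side)  # 100352
```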

 

Any thoughts on this? Regarding MobileNet SSD, it always uses a number of groups equal to the number of outputs, so it may be working fine in that case, based on what I've observed. For AlexNet, maybe it's not doing exactly what is expected, since group=2, but it still manages to give relatively good results.

 

My small test network:

 

name: "tiny_ncsdk_dts"
input: "data"
input_shape {
  dim: 1
  dim: 1
  dim: 224
  dim: 224
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 32
    kernel_size: 3
    pad: 1
    stride: 2
    bias_term: false
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2"
  convolution_param {
    num_output: 32
    kernel_size: 3
    pad: 1
    stride: 2
    group: 1
    bias_term: false
    weight_filler {
      type: "xavier"
    }
  }
}
0 Kudos
idata
Employee
1,084 Views

@Tome_at_Intel , to investigate this issue even further, I added fully-connected and softmax layers to my small network to create an (untrained) 5-way classifier (see the very bottom of this post). When group is set to 1 or 32 in the 'conv2' layer, results with mvNCCheck look OK. But the NCS and the host machine give substantially different results if group is 2 or 16. See below. The discrepancy is even more evident if one looks at the output of the 'fc1' layer rather than the final 'prob' softmax layer.

 

mvNCCheck for group=1 (good result):

 

root@synapse:~/Code/DeepTextSpotter/models# mvNCCheck debug.prototxt -w debug.caffemodel -s 12
mvNCCheck v02.00, Copyright @ Intel Corporation 2017
/usr/local/bin/ncsdk/Controllers/FileIO.py:65: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
Blob generated
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (5,)
1) 1 0.28979
2) 2 0.26099
3) 3 0.22766
4) 0 0.11145
5) 4 0.11029
Expected: (5,)
1) 1 0.29224
2) 2 0.26099
3) 3 0.22668
4) 0 0.11041
5) 4 0.10962
------------------------------------------------------------
Obtained values
------------------------------------------------------------
Obtained Min Pixel Accuracy: 0.8354218676686287% (max allowed=2%), Pass
Obtained Average Pixel Accuracy: 0.35087717697024345% (max allowed=1%), Pass
Obtained Percentage of wrong values: 0.0% (max allowed=0%), Pass
Obtained Pixel-wise L2 error: 0.4446218576686774% (max allowed=1%), Pass
Obtained Global Sum Difference: 0.005126953125
------------------------------------------------------------

 

mvNCCheck for group=32 (good result):

 

Result: (5,)
1) 0 0.34985
2) 2 0.22205
3) 4 0.16357
4) 3 0.1554
5) 1 0.10889
Expected: (5,)
1) 0 0.34937
2) 2 0.22412
3) 4 0.16333
4) 3 0.1543
5) 1 0.10876
------------------------------------------------------------
Obtained values
------------------------------------------------------------
Obtained Min Pixel Accuracy: 0.5939902272075415% (max allowed=2%), Pass
Obtained Average Pixel Accuracy: 0.2306079724803567% (max allowed=1%), Pass
Obtained Percentage of wrong values: 0.0% (max allowed=0%), Pass
Obtained Pixel-wise L2 error: 0.3089824034861516% (max allowed=1%), Pass
Obtained Global Sum Difference: 0.0040283203125
------------------------------------------------------------

 

mvNCCheck for group=2 (bad result):

 

root@synapse:~/Code/DeepTextSpotter/models# mvNCCheck debug.prototxt -w debug.caffemodel -s 12
mvNCCheck v02.00, Copyright @ Intel Corporation 2017
/usr/local/bin/ncsdk/Controllers/FileIO.py:65: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
Blob generated
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (5,)
1) 2 0.20068
2) 4 0.20032
3) 3 0.19983
4) 0 0.19983
5) 1 0.19946
Expected: (5,)
1) 4 0.40283
2) 2 0.21973
3) 1 0.21045
4) 0 0.096375
5) 3 0.070801
------------------------------------------------------------
Obtained values
------------------------------------------------------------
Obtained Min Pixel Accuracy: 50.27272701263428% (max allowed=2%), Fail
Obtained Average Pixel Accuracy: 23.08788001537323% (max allowed=1%), Fail
Obtained Percentage of wrong values: 100.0% (max allowed=0%), Fail
Obtained Pixel-wise L2 error: 29.129463088678975% (max allowed=1%), Fail
Obtained Global Sum Difference: 0.46502685546875
------------------------------------------------------------

 

mvNCCheck for group=16 (bad result):

 

root@synapse:~/Code/DeepTextSpotter/models# mvNCCheck debug.prototxt -w debug.caffemodel -s 12
mvNCCheck v02.00, Copyright @ Intel Corporation 2017
/usr/local/bin/ncsdk/Controllers/FileIO.py:65: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
Blob generated
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (5,)
1) 2 0.20239
2) 3 0.20105
3) 0 0.19995
4) 4 0.19873
5) 1 0.19812
Expected: (5,)
1) 4 0.28027
2) 3 0.23462
3) 0 0.20471
4) 1 0.14758
5) 2 0.13281
------------------------------------------------------------
Obtained values
------------------------------------------------------------
Obtained Min Pixel Accuracy: 29.09407615661621% (max allowed=2%), Fail
Obtained Average Pixel Accuracy: 17.125436663627625% (max allowed=1%), Fail
Obtained Percentage of wrong values: 80.0% (max allowed=0%), Fail
Obtained Pixel-wise L2 error: 19.668538185011297% (max allowed=1%), Fail
Obtained Global Sum Difference: 0.239990234375
------------------------------------------------------------

 

mvNCCheck for group=32, output node before the final softmax (good result):

 

root@synapse:~/Code/DeepTextSpotter/models# mvNCCheck debug.prototxt -w debug.caffemodel -s 12 -on fc1
mvNCCheck v02.00, Copyright @ Intel Corporation 2017
/usr/local/bin/ncsdk/Controllers/FileIO.py:65: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
Blob generated
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (5,)
1) 2 1.1904
2) 4 0.2959
3) 1 -0.45923
4) 3 -1.2158
5) 0 -1.2461
Expected: (5,)
1) 2 1.1982
2) 4 0.28809
3) 1 -0.46411
4) 3 -1.2451
5) 0 -1.291

 

mvNCCheck for group=16, output node before the final softmax (very bad result):

 

root@synapse:~/Code/DeepTextSpotter/models# mvNCCheck debug.prototxt -w debug.caffemodel -s 12 -on fc1
mvNCCheck v02.00, Copyright @ Intel Corporation 2017
/usr/local/bin/ncsdk/Controllers/FileIO.py:65: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
Blob generated
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (5,)
1) 2 0.005909
2) 0 -0.00030708
3) 4 -0.00091028
4) 1 -0.0018187
5) 3 -0.0096436
Expected: (5,)
1) 2 -0.17053
2) 1 -0.2146
3) 4 -0.44727
4) 3 -0.65723
5) 0 -1.4619

 

Here is the full network:

 

name: "tiny_ncsdk_dts"
input: "data"
input_shape {
  dim: 1
  dim: 3
  dim: 224
  dim: 224
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 32
    kernel_size: 3
    pad: 1
    stride: 2
    bias_term: false
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2"
  convolution_param {
    num_output: 32
    kernel_size: 3
    pad: 1
    stride: 2
    group: 32
    bias_term: false
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "conv2"
  top: "fc1"
  inner_product_param {
    num_output: 5
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc1"
  top: "prob"
}
0 Kudos
idata
Employee
1,084 Views

@mattroos Thanks for reporting this. I was able to reproduce the issue you noticed, and we are looking into it. I'll let you know as soon as we find the root cause. Thanks again.

0 Kudos
idata
Employee
1,084 Views

@Tome_at_Intel, thank you.

0 Kudos
Reply