I want to compile the Xception network to run on the NCS. Unfortunately, the network seems too big to fit on the stick, so I got this error:
[Error 35] Setup Error: Not enough resources on Myriad to process this network.
After looking into where it happens, it turns out that every time the parser finds a separable convolution layer, it attempts to fuse the depthwise and pointwise convolution operations together. From some layer onward, the fused operation is too big to handle within the Myriad memory, so the error is thrown.
I found a hack, but I'm not sure of its consequences. In the file
/usr/local/bin/ncsdk/Models/NetworkStage.py
at line 394, I disabled the if condition so that it no longer fuses. The model now compiles nicely. However, its outputs are all NaNs. That is why I think my hack went terribly wrong.
If I enable fusing again and set the output layer somewhere in the middle of the network (so that memory is sufficient), the output is valid.
So my question is: is there any "safe hack" to disable layer fusion so that compiling a slightly larger network is not a problem? This would be totally fine with the NCS, just a bit slower, since it does a separate depthwise and then a pointwise convolution. And indeed, even if mvNCCompile parses the network successfully, layer fusion can create trouble later on, at runtime. In my case, when I enable fusion and set the output at the 10th layer, the network still compiles, but at runtime it reports Matmul scratch memory [204800] lower than required [239882]. So layer fusion is a double-edged sword.
The code snippet for layer fusion is the following (taken from the NCSDK):
if (stage.op == StageType.convolution and self.op == StageType.depthwise_convolution and
        stage.radixX == 1 and stage.radixY == 1 and self.postOp == StageType.none):
    print('Fusing depthconv and conv in', self.unprocessed_name, 'and', stage.unprocessed_name)
    # Create the weights for a convolution that does depthwise convolution [inCH, outCH, kH, kW]
    taps = np.zeros([self.inputDimZ, self.tapDimZ, self.radixY, self.radixX], np.float32)
    multiplier = int(self.tapDimZ / self.tapDimY)
    for y in range(self.radixY):
        for x in range(self.radixX):
            for c in range(self.tapDimY):
                for i in range(multiplier):
                    taps[c, c * multiplier + i, y, x] = self.taps[y, x, c, i]
    # Turn them to [kH, kW, inCH, outCH] in order to be able to use matmul
    taps = taps.transpose(2, 3, 0, 1)
    # Fuse the weights of the following 1x1 convolution into the just-created weights
    stage.taps = np.matmul(taps, stage.taps[0, 0])
    # Bring some data from the previous stage (self) to this one (stage), as we are saving this one.
    # Saving the previous node would be simpler, but unfortunately the parser keeps track
    # of the latest created node (stage), so we must keep it.
    stage.inputDimX = self.inputDimX
    stage.inputDimY = self.inputDimY
    stage.inputDimZ = self.inputDimZ
    stage.inputStrideX = self.inputStrideX
    stage.inputStrideY = self.inputStrideY
    stage.inputStrideZ = self.inputStrideZ
    stage.tapDimX = self.tapDimX
    stage.tapDimY = self.tapDimY
    stage.radixX = self.radixX
    stage.radixY = self.radixY
    stage.strideX = self.strideX
    stage.strideY = self.strideY
    stage.padStyle = self.padStyle
    stage.top = self.top
    stage.data = self.data
    stage.dataIndex = self.dataIndex
    stage.dataPointer = self.dataPointer
    # Remove self from the network and change references
    self.network.count = self.network.count - 1
    self.network.stageslist.remove(self)
    stage.top = self.top
    if self in self.network.head:
        stage.network.storageOrder = stage.storageOrder
        self.network.head.remove(self)
        self.network.head.append(stage)
    else:
        for parents in self.network.search_several(self.top):
            newtail = []
            for p in parents.tail:
                if p == self:
                    newtail.append(stage)
            parents.tail = newtail
    return
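For what it's worth, the weight-fusion trick itself is mathematically sound: a depthwise convolution followed by a 1x1 convolution is exactly one ordinary convolution with the fused taps. Here is a minimal plain-NumPy sketch checking that, with "valid" padding and stride 1; all shapes and names here are illustrative, not taken from the NCSDK.

```python
import numpy as np

rng = np.random.default_rng(0)
kH = kW = 3
Cin, mult, Cout = 4, 2, 5          # depth multiplier 2, analogous to tapDimZ / tapDimY
H = W = 6

dw = rng.standard_normal((kH, kW, Cin, mult)).astype(np.float32)  # depthwise taps [kH, kW, c, i]
pw = rng.standard_normal((Cin * mult, Cout)).astype(np.float32)   # 1x1 taps [inCH, outCH]
x = rng.standard_normal((H, W, Cin)).astype(np.float32)

def conv_valid(x, w):
    """Naive 2-D correlation; w is laid out as [kH, kW, inCH, outCH]."""
    kh, kw, _, co = w.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1, co), np.float32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # contract the patch's (kH, kW, inCH) axes against w's first three axes
            out[i, j] = np.tensordot(x[i:i + kh, j:j + kw, :], w, axes=3)
    return out

# Expand the depthwise taps into a sparse ordinary convolution, mirroring the
# taps[c, c*multiplier + i, y, x] loop in the snippet (already transposed here).
wd = np.zeros((kH, kW, Cin, Cin * mult), np.float32)
for c in range(Cin):
    for i in range(mult):
        wd[:, :, c, c * mult + i] = dw[:, :, c, i]

fused = np.matmul(wd, pw)            # the np.matmul(taps, stage.taps[0, 0]) step

y_separate = conv_valid(x, wd) @ pw  # depthwise, then pointwise
y_fused = conv_valid(x, fused)       # single fused convolution
print(np.allclose(y_separate, y_fused, rtol=1e-4, atol=1e-4))  # True
```

So the fusion produces identical results in exact arithmetic; the trouble on the Myriad is resource-related (the fused taps are much larger), not a math error in the fusion itself.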
- Tags:
- Neural Networks
@dpvo Hm. A possible workaround is to find where the convolutions are "too big" and reduce the dimensions of the input to that layer. Can you provide your model?
Thanks @Tome_at_Intel ! I have tried reducing the input image dimensions from 299 x 299 down to 227 x 227, but it still runs out of memory. I guess reducing them further will severely affect the model's performance. The thing is, if I disable the fusion of the depthwise and the pointwise convolution layers (by commenting out the if condition in /usr/local/bin/ncsdk/Models/NetworkStage.py at line 394 or thereabouts), then compilation goes fine. It's just that NaNs start to appear after several layers and then spread over all the higher layers.
You can find all the resources (output blobs saved as numpy txt files, plots, and NCS model file .graph) from the following Dropbox folder: https://www.dropbox.com/sh/xk1jyzte90ejnyv/AACVSrb-XK6rh6TGoHtbFvP3a?dl=0
In the following I show some of my efforts debugging this issue.
For a reference of Xception architecture, please refer to https://github.com/keras-team/keras/blob/master/keras/applications/xception.py
Without layer fusion, the first separable convolution layer, block2_sepconv1, is still "fine": there are no NaN values, and its output tensor is mostly identical to that of an Xception model with fusion (I was able to compile the first several layers with fusion, so there was no out-of-memory error). Plots comparing the output magnitudes are in the Dropbox folder.
But from the second separable convolution layer, block2_sepconv2, NaNs occur, although only in a small portion, about 2% of the total number of output entries of that layer (I masked out all NaN entries to be able to plot the histogram). At the same time, the absolute values of the output grow. In subsequent layers, the number of NaNs grows faster and the output magnitudes keep growing, until at block4_sepconv1 all of the output entries are NaN.
You can have a look at those output tensors, which I saved as numpy txt matrices in the same Dropbox folder. Actually, I have no idea how or why the NaNs start to appear. Any suggestion is appreciated!
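Since the magnitudes grow until everything is NaN, one thing worth checking is float16 overflow: any intermediate value beyond 65504 (the largest finite fp16 number) becomes inf, and inf arithmetic then produces NaNs. Below is a small sketch that scans saved layer blobs for the NaN fraction and the largest finite magnitude; the file names are hypothetical and would need to be adapted to the blobs in the Dropbox folder.

```python
import numpy as np

FP16_MAX = np.finfo(np.float16).max  # 65504.0

def nan_and_peak(blob):
    """Return (fraction of NaN entries, largest finite |value|) for one output blob."""
    nan_frac = float(np.isnan(blob).mean())
    finite = blob[np.isfinite(blob)]
    peak = float(np.abs(finite).max()) if finite.size else float("nan")
    return nan_frac, peak

# Hypothetical file names, one saved txt blob per layer
for name in ["block2_sepconv1", "block2_sepconv2", "block3_sepconv1"]:
    try:
        blob = np.loadtxt(name + ".txt", dtype=np.float32)
    except OSError:
        continue  # blob not present locally
    frac, peak = nan_and_peak(blob)
    flag = "  <-- beyond fp16 range" if peak > FP16_MAX else ""
    print(f"{name}: {100 * frac:.2f}% NaN, max |finite| = {peak:.1f}{flag}")
```

If the largest finite magnitudes approach or exceed 65504 in the layers right before the NaNs appear, overflow in half precision is a plausible explanation, independent of the fusion hack.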
Essentially, all of my problems can be posed as a single question: does the NCS support depthwise convolution under all circumstances, i.e. a depthwise convolution alone, with no pointwise convolution afterward (unlike in SeparableConv2D)? I read in the release notes that the operation is supported, but from what I have experienced, I am not sure, because if it were fully supported, disabling layer fusion should not cause any trouble.
It turns out that layer fusion may not be the source of the problem after all. I had a chance to fine-tune InceptionV3 on the same data. After compiling it to the NCS graph format, testing that graph also gives me NaN outputs at the end of the network. Note that my models work well in their original format. The NaN values appear from intermediate layers onward, while the values in the first few layers are still valid. My input size is 299 x 299, float16, normalized as img = img/127.5 - 1 (as usual for the InceptionV3 network). The input and output names for all the graphs are input and predictions/Softmax. @Tome_at_Intel could you please have a look at my model? https://www.dropbox.com/s/nweom7cpnsnqxyq/InceptionV3.graph?dl=0
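For reference, this is the normalization I apply on the host side, as a minimal sketch; the function name is mine, and resizing to 299 x 299 is assumed to happen beforehand (e.g. with PIL or OpenCV):

```python
import numpy as np

def preprocess(img_uint8):
    """Map an HxWx3 uint8 image to [-1, 1] and cast to float16 for the NCS."""
    x = img_uint8.astype(np.float32) / 127.5 - 1.0  # img/127.5 - 1, as described above
    return x.astype(np.float16)

# e.g. pixel value 0 maps to -1.0 and 255 maps to +1.0
```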
@dpvo I was able to run an inference with your provided graph file and received the same results you did (all nans). Can you do me a favor and try using mvNCCheck with your version of the InceptionV3 model and post the results back here? If you could also provide the actual model (meta file), that would be nice also.
@Tome_at_Intel I really appreciate your help. Below is the log of mvNCCheck, which I ran as mvNCCheck ../models/inceptionv3/InceptionV3_noBN.meta -s 12 -in input -on predictions/Softmax -is 299 299 -i example_classid_0.jpg -id 0 -S 127.5 -M 1 -cs 0,1,2
mvNCCheck v02.00, Copyright @ Movidius Ltd 2016
Layer conv2d_21/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_27/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_32/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_42/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_52/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_62/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_71/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_79/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_80/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_83/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_84/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_78/BiasAdd forced to im2col_v2, because its output is used in concat
Layer activation_85/Relu forced to im2col_v2, because its output is used in concat
Layer conv2d_88/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_89/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_92/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_93/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_87/BiasAdd forced to im2col_v2, because its output is used in concat
Layer activation_94/Relu forced to im2col_v2, because its output is used in concat
Layer activation_93/Relu forced to im2col_v2, because its output is used in concat
Layer activation_84/Relu forced to im2col_v2, because its output is used in concat
Layer conv2d_61/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_51/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_41/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_31/BiasAdd forced to im2col_v2, because its output is used in concat
Layer conv2d_20/BiasAdd forced to im2col_v2, because its output is used in concat
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (1, 1, 45)
1) 44 nan
2) 21 nan
3) 19 nan
4) 18 nan
5) 17 nan
Expected: (1, 45)
1) 5 1.0
2) 18 6.2048e-05
3) 29 1.0788e-05
4) 31 5.9605e-08
5) 44 0.0
------------------------------------------------------------
Obtained values
------------------------------------------------------------
Obtained Min Pixel Accuracy: nan% (max allowed=2%), Fail
Obtained Average Pixel Accuracy: nan% (max allowed=1%), Fail
Obtained Percentage of wrong values: 0.0% (max allowed=0%), Fail
Obtained Pixel-wise L2 error: nan% (max allowed=1%), Fail
Obtained Global Sum Difference: nan
------------------------------------------------------------
I will inbox you the link to download the inception meta files.