idata
Community Manager
291 Views

In semantic segmentation, all detection results become "NaN"

Hello.

I am trying semantic segmentation with UNet.

I succeeded in generating the model, but at execution time all the results come out as "NaN".

I cannot tell whether the problem is in the image preprocessing or in the model.

Could someone please help me?

 

DL Framework: TensorFlow

 

Input resolution: 128 x 128

 

Dataset: Pascal VOC 2012

 

My Model(CheckPoint and Graph): https://drive.google.com/file/d/1WFeY2VyFS7PKGPTI9oC9oxxVESu4ShZe/view?usp=sharing
5 Replies
idata
Community Manager
52 Views

Detailed Per Layer Profile

 

                                       Bandwidth     time
#   Name                        MFLOPs    (MB/s)     (ms)
=========================================================
0   conv2d/Relu                   56.6     276.3    3.070
1   conv2d_1/Relu               1208.0     667.4   27.076
2   max_pooling2d/MaxPool          1.0     979.8    2.041
3   conv2d_2/Relu                604.0     414.1   11.212
4   conv2d_3/Relu               1208.0     304.5   30.483
5   max_pooling2d_1/MaxPool        0.5     977.2    1.024
6   conv2d_4/Relu                604.0     198.5   14.188
7   conv2d_5/Relu               1208.0     172.7   32.592
8   max_pooling2d_2/MaxPool        0.3     952.9    0.525
9   conv2d_6/Relu                604.0     262.7   12.883
10  conv2d_7/Relu               1208.0     279.4   24.189
11  max_pooling2d_3/MaxPool        0.1     918.5    0.273
12  conv2d_8/Relu                604.0     700.1   13.684
13  conv2d_9/Relu               1208.0     703.9   27.196
14  conv2d_transpose/Relu          0.0     395.4   10.436
15  conv2d_10/Relu              2415.9     280.6   48.150
16  conv2d_11/Relu              1208.0     281.8   23.986
17  conv2d_transpose_1/Relu        0.0     225.1    5.555
18  conv2d_12/Relu              2415.9     203.5   55.309
19  conv2d_13/Relu              1208.0     194.9   28.888
20  conv2d_transpose_2/Relu        0.0     139.2    5.390
21  conv2d_14/Relu              2415.9     306.4   60.587
22  conv2d_15/Relu              1208.0     307.5   30.193
23  conv2d_transpose_3/Relu        0.0     147.0    7.230
24  conv2d_16/Relu              2415.9     646.4   55.913
25  conv2d_17/Relu              1208.0     668.3   27.041
26  output/BiasAdd                46.1    1230.8    1.627
---------------------------------------------------------
                    Total inference time           560.74
---------------------------------------------------------
idata
Community Manager
52 Views

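@PINTO It looks like the results exceed the fp16 value limit after the conv2d_5/Relu layer. The largest finite fp16 value is 65504; max_pooling2d_2/MaxPool still passes, but the next layer, conv2d_6/Relu, returns nan.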
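To make the fp16 limit concrete, here is a minimal NumPy sketch. It assumes the device arithmetic behaves like IEEE half precision, which is an assumption on my part and not something the logs prove. The two sample values are the top "Expected" numbers from the mvNCCheck runs in this thread, and `to_fp16` is just an illustrative helper.

```python
import warnings
import numpy as np

FP16_MAX = float(np.finfo(np.float16).max)  # largest finite half-precision value

def to_fp16(value):
    """Cast a float to IEEE half precision, silencing the overflow warning."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", RuntimeWarning)
        with np.errstate(over="ignore"):
            return np.float16(value)

ok = to_fp16(27224.69)         # top expected value at max_pooling2d_2/MaxPool
overflow = to_fp16(358967.16)  # top expected value at conv2d_6/Relu

print(FP16_MAX, np.isfinite(ok), np.isinf(overflow))

# Once an activation is inf, later arithmetic such as inf - inf yields NaN.
with np.errstate(invalid="ignore"):
    nan_value = overflow - overflow
print(np.isnan(nan_value))
```

Anything above the fp16 maximum becomes inf in the cast, and NaN can follow as soon as infinities are combined in the next layers.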

idata
Community Manager
52 Views

$ mvNCCheck deployfinal.ckpt.meta -s 12 -on max_pooling2d_2/MaxPool
/usr/lib/python3/dist-packages/scipy/stats/morestats.py:16: DeprecationWarning: Importing from numpy.testing.decorators is deprecated,
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (16, 16, 256)
1) 4582 27180.0
2) 7142 26780.0
3) 4838 26720.0
4) 7910 26420.0
5) 7398 26380.0
Expected: (16, 16, 256)
1) 4582 27224.69
2) 7142 26751.467
3) 4838 26726.771
4) 7398 26461.34
5) 7910 26399.03
------------------------------------------------------------
 Obtained values
------------------------------------------------------------
 Obtained Min Pixel Accuracy: 0.40614702738821507% (max allowed=2%), Pass
 Obtained Average Pixel Accuracy: 0.0062270752096083015% (max allowed=1%), Pass
 Obtained Percentage of wrong values: 0.0% (max allowed=0%), Pass
 Obtained Pixel-wise L2 error: 0.022377591814212915% (max allowed=1%), Pass
 Obtained Global Sum Difference: 111103.3046875
------------------------------------------------------------

 

and then in the next layer:

 

$ mvNCCheck deployfinal.ckpt.meta -s 12 -on conv2d_6/Relu
mvNCCheck v02.00, Copyright @ Intel Corporation 2017
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (16, 16, 512)
1) 117316 nan
2) 115780 nan
3) 115268 nan
4) 116450 nan
5) 117050 nan
Expected: (16, 16, 512)
1) 125676 358967.16
2) 126188 354048.88
3) 123628 351221.84
4) 130284 350524.97
5) 125164 348793.38
------------------------------------------------------------
 Obtained values
------------------------------------------------------------
 Obtained Min Pixel Accuracy: nan% (max allowed=2%), Fail
 Obtained Average Pixel Accuracy: nan% (max allowed=1%), Fail
 Obtained Percentage of wrong values: 7.704925537109375% (max allowed=0%), Fail
 Obtained Pixel-wise L2 error: nan% (max allowed=1%), Fail
 Obtained Global Sum Difference: nan
------------------------------------------------------------
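The layer-by-layer check above can be summarized as a small sketch: walk the layers in order and report the first one whose peak expected value no longer fits in fp16. The helper name `first_fp16_overflow` is hypothetical, and the peaks are copied by hand from the two mvNCCheck runs; in practice you would collect them from the TensorFlow reference output for every layer.

```python
import numpy as np

FP16_MAX = float(np.finfo(np.float16).max)  # 65504.0

def first_fp16_overflow(layer_peaks):
    """Return the first layer whose peak |activation| exceeds fp16 range, else None."""
    for name, peak in layer_peaks:
        if abs(peak) > FP16_MAX:
            return name
    return None

# Peak "Expected" values copied from the two mvNCCheck runs above.
peaks = [
    ("max_pooling2d_2/MaxPool", 27224.69),
    ("conv2d_6/Relu", 358967.16),
]

print(first_fp16_overflow(peaks))  # conv2d_6/Relu
```

A layer flagged this way is where the network needs smaller activations, for example through input scaling or normalization, before it can run in half precision.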
idata
Community Manager
52 Views

@Tome_at_Intel

 

Thank you for your kind answer.

I learned how to use "mvNCCheck" for the first time from you.

After that, I adjusted the input resolution, the filter size, and the number of classes.

However, even with those adjustments, the overflow still occurs.

I feel that the behavior of Deconv (conv2d_transpose) is strange.

On a separate note, if I run "mvNCCheck" several times on the same layer under the same conditions, it sometimes succeeds and sometimes fails.

The behavior does not seem to be stable.

I am about to give up on the conversion...

By the way, with pure TensorFlow I have succeeded in high-speed one-class segmentation.

 

Detailed Per Layer Profile
                                                                 Bandwidth     time
#   Name                                                  MFLOPs    (MB/s)     (ms)
===================================================================================
0   conv2d/Relu/batch_normalization/FusedBatchNorm           7.1    1002.5    0.842
1   conv2d_1/Relu/batch_normalization_1/FusedBatchNorm      18.9     943.6    2.386
2   max_pooling2d/MaxPool                                    0.1     380.1    0.658
3   conv2d_2/Relu/batch_normalization_2/FusedBatchNorm       9.4     750.9    0.752
4   conv2d_3/Relu/batch_normalization_3/FusedBatchNorm      18.9     914.7    1.235
5   max_pooling2d_1/MaxPool                                  0.1     489.8    0.255
6   conv2d_4/Relu/batch_normalization_4/FusedBatchNorm       9.4     476.2    0.610
7   conv2d_5/Relu/batch_normalization_5/FusedBatchNorm      18.9     729.9    0.795
8   max_pooling2d_2/MaxPool                                  0.0     521.5    0.120
9   conv2d_6/Relu/batch_normalization_6/FusedBatchNorm       9.4     426.3    0.415
10  conv2d_7/Relu/batch_normalization_7/FusedBatchNorm      18.9     485.8    0.726
11  max_pooling2d_3/MaxPool                                  0.0     481.7    0.065
12  conv2d_8/Relu                                            9.4     368.9    0.578
13  conv2d_9/Relu                                           18.9     529.9    0.800
14  conv2d_transpose/Relu                                    0.0     284.1    0.275
15  conv2d_10/Relu                                          37.7     545.3    1.291
16  conv2d_11/Relu                                          18.9     485.4    0.727
17  conv2d_transpose_1/Relu                                  0.0     178.9    0.262
18  conv2d_12/Relu                                          37.7     790.8    1.468
19  conv2d_13/Relu                                          18.9     744.1    0.780
20  conv2d_transpose_2/Relu                                  0.0     117.9    0.563
21  conv2d_14/Relu                                          37.7    1081.9    2.088
22  conv2d_15/Relu                                          18.9     948.7    1.191
23  conv2d_transpose_3/Relu                                  0.0      79.8    1.579
24  conv2d_16/Relu                                          37.7    1289.9    3.490
25  conv2d_17/Relu                                          18.9    1029.7    2.186
26  output/BiasAdd                                           0.8     471.2    0.531
-----------------------------------------------------------------------------------
                                              Total inference time            26.67
-----------------------------------------------------------------------------------

 

xxxx@ubuntu:~/git/segmentation_unet/model$ mvNCCheck deployfinal.ckpt.meta -s 12 -on conv2d_13/Relu
/usr/local/bin/ncsdk/Controllers/Parsers/TensorFlowParser/Convolution.py:44: SyntaxWarning: assertion is always true, perhaps remove parentheses?
  assert(False, "Layer type not supported by Convolution: " + obj.type)
mvNCCheck v02.00, Copyright @ Intel Corporation 2017
shape: [1, 128, 128, 3]
res.shape: (1, 32, 32, 32)
TensorFlow output shape: (32, 32, 32)
/usr/local/bin/ncsdk/Controllers/FileIO.py:65: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
Blob generated
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (32, 32, 32)
1) 27044 3.752
2) 26025 3.7402
3) 26020 3.6934
4) 25001 3.625
5) 2916 3.5957
Expected: (32, 32, 32)
1) 20912 53.0136
2) 21936 52.6455
3) 19952 52.3813
4) 19888 52.3808
5) 22000 51.612
------------------------------------------------------------
 Obtained values
------------------------------------------------------------
 Obtained Min Pixel Accuracy: 100.0% (max allowed=2%), Fail
 Obtained Average Pixel Accuracy: 10.501158237457275% (max allowed=1%), Fail
 Obtained Percentage of wrong values: 46.2738037109375% (max allowed=0%), Fail
 Obtained Pixel-wise L2 error: 19.24379799670513% (max allowed=1%), Fail
 Obtained Global Sum Difference: 182420.984375
------------------------------------------------------------

 

xxxx@ubuntu:~/git/segmentation_unet/model$ mvNCCheck deployfinal.ckpt.meta -s 12 -on conv2d_transpose_2/Relu
/usr/local/bin/ncsdk/Controllers/Parsers/TensorFlowParser/Convolution.py:44: SyntaxWarning: assertion is always true, perhaps remove parentheses?
  assert(False, "Layer type not supported by Convolution: " + obj.type)
mvNCCheck v02.00, Copyright @ Intel Corporation 2017
shape: [1, 128, 128, 3]
res.shape: (1, 64, 64, 16)
TensorFlow output shape: (64, 64, 16)
/usr/local/bin/ncsdk/Controllers/FileIO.py:65: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
Blob generated
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (64, 64, 16)
1) 65535 nan
2) 65534 nan
3) 21853 nan
4) 21852 nan
5) 21851 nan
Expected: (64, 64, 16)
1) 60884 47.6098
2) 60948 46.8669
3) 61012 45.4951
4) 60820 44.7391
5) 61076 44.1104
/usr/local/bin/ncsdk/Controllers/Metrics.py:75: RuntimeWarning: invalid value encountered in greater
------------------------------------------------------------
 Obtained values
------------------------------------------------------------
 Obtained Min Pixel Accuracy: nan% (max allowed=2%), Fail
 Obtained Average Pixel Accuracy: nan% (max allowed=1%), Fail
 Obtained Percentage of wrong values: 0.0% (max allowed=0%), Fail
 Obtained Pixel-wise L2 error: nan% (max allowed=1%), Fail
 Obtained Global Sum Difference: nan
------------------------------------------------------------
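One detail in this last log: "Percentage of wrong values" reads 0.0% even though the top results are all nan. A possible explanation is that the metric counts elements whose error is greater than some threshold, and any comparison against NaN evaluates to False. I have not read Metrics.py, so this is only a guess based on the "invalid value encountered in greater" warning. The `percent_wrong` helper and its threshold below are hypothetical, not the NCSDK implementation.

```python
import numpy as np

def percent_wrong(result, expected, threshold=0.02):
    """Hypothetical metric: share of elements whose relative error exceeds threshold."""
    with np.errstate(invalid="ignore"):
        rel_err = np.abs(result - expected) / np.abs(expected)
        wrong = np.greater(rel_err, threshold)  # NaN > threshold is False
    return 100.0 * np.count_nonzero(wrong) / wrong.size

# First three "Expected" values from the conv2d_transpose_2/Relu run above.
expected = np.array([47.6098, 46.8669, 45.4951], dtype=np.float32)
all_nan = np.full_like(expected, np.nan)

print(percent_wrong(all_nan, expected))   # NaN results are never counted as wrong
print(100.0 * np.isnan(all_nan).mean())   # explicit NaN share instead
```

If that guess is right, a 0.0% "wrong values" figure next to nan accuracies should be read as "could not be measured", and counting NaNs explicitly is the more reliable check.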
idata
Community Manager
52 Views