Tiny Yolov2 model not detecting properly on NCS

The model was trained in Darknet. It is based on Tiny YOLOv2; the only change to the architecture is that it detects a single class instead of the default 80. It has been tested with Darkflow and performs well there.

I first convert from Darknet to Tensorflow using Darkflow:

./flow --model cherry.cfg --load cherry_final.weights --savepb

This works and does not give me an error. It creates a .pb file and a .meta file. I next convert from Tensorflow to IR format using the model optimizer:

sudo python3 --input_model cherry.pb --tensorflow_use_custom_operations_config cherry.json --input_shape "[1,1280,704,3]" --data_type FP16

This also works fine and does not return an error. It creates a .bin file and a .xml file. My json contents are as follows:

    [
      {
        "id": "TFYOLO",
        "match_kind": "general",
        "custom_attributes": {
          "classes": 1,
          "coords": 4,
          "num": 5,
          "do_softmax": 1
        }
      }
    ]
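
As a sanity check on these attributes, the size of the region layer's output can be worked out by hand. The sketch below assumes the usual YOLOv2 conventions, which the post does not state explicitly: a total network stride of 32 and one (coords + 1 + classes) vector per anchor. It also reads the `--input_shape` value as NHWC, so 1280 is treated as the height; swap the two if your model is laid out the other way. The function name `region_output_shape` is made up for illustration.

```python
def region_output_shape(height, width, classes, coords, num, stride=32):
    """Return (anchors, values_per_anchor, grid_h, grid_w) for a YOLOv2 region layer.

    Assumes a total downsampling factor of `stride` (32 for Tiny YOLOv2).
    """
    if height % stride or width % stride:
        raise ValueError("input size must be a multiple of the stride")
    values_per_anchor = coords + 1 + classes  # box coords + objectness + class scores
    return num, values_per_anchor, height // stride, width // stride


# Values from the JSON above and the --input_shape "[1,1280,704,3]" argument.
shape = region_output_shape(height=1280, width=704, classes=1, coords=4, num=5)
print(shape)                                       # (5, 6, 40, 22)
print(shape[0] * shape[1] * shape[2] * shape[3])   # 26400 output values
```

Under these assumptions the output holds 5 anchors with 6 values each on a 40 x 22 grid, which is where the numbers in a reshape such as `(5, 6, 40, 22)` come from.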

Finally, I load my generated IR files and attempt to classify a sample image, using the following script:

from __future__ import division
import cv2 as cv
import numpy as np
import sys

# Load the model  
net = cv.dnn.readNet('cherry.xml', 'cherry.bin') 

# Specify target device 
net.setPreferableTarget(cv.dnn.DNN_TARGET_MYRIAD)

# Read an image
frame = cv.imread('test.png')

img = (frame.copy()).astype(np.float32)
img = np.divide(img, 255)
img = img[:, :, ::-1]
input_image = img.astype(np.uint8)

# Prepare input blob and perform an inference 
blob = cv.dnn.blobFromImage(input_image, size=(1280, 704), ddepth=cv.CV_8U) 
net.setInput(blob)
out = net.forward()

detection = out.reshape(5,6,40,22)
for i in range(6):
    for j in range(40):
        for k in range(22):
            confidence = float(detection[4,i,j,k]) 
            xmin = int((detection[0,i,j,k] - detection[2,i,j,k]/2) * frame.shape[0]) 
            ymin = int((detection[1,i,j,k] - detection[3,i,j,k]/2) * frame.shape[1]) 
            xmax = int((detection[0,i,j,k] + detection[2,i,j,k]/2) * frame.shape[0]) 
            ymax = int((detection[1,i,j,k] + detection[3,i,j,k]/2) * frame.shape[1])
            if confidence > 0.8:
                cv.rectangle(frame, (xmin, ymin), (xmax, ymax), color=(0, 255, 0))
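
A note on the box arithmetic in the loop above: in OpenCV, `frame.shape` is `(height, width, channels)`, so x coordinates scale with `frame.shape[1]` and y coordinates with `frame.shape[0]`. The following is a minimal NumPy-only sketch of the centre/size to corner conversion. It assumes the box values are already normalised to [0, 1]; raw region-layer outputs may need decoding first. `to_corners` is a hypothetical helper name, not part of OpenCV.

```python
import numpy as np


def to_corners(cx, cy, w, h, frame_shape):
    """Convert a normalised centre/size box to integer pixel corners.

    frame_shape is (height, width[, channels]), as returned by frame.shape.
    """
    frame_h, frame_w = frame_shape[:2]
    xmin = int((cx - w / 2) * frame_w)  # x scales with the image width
    ymin = int((cy - h / 2) * frame_h)  # y scales with the image height
    xmax = int((cx + w / 2) * frame_w)
    ymax = int((cy + h / 2) * frame_h)
    return xmin, ymin, xmax, ymax


# A box centred in a 480 x 640 (height x width) frame, covering half of each axis.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
print(to_corners(0.5, 0.5, 0.5, 0.5, frame.shape))  # (160, 120, 480, 360)
```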

Running the script on an NCS2 results in the following error:

Traceback (most recent call last):
  File "", line 27, in <module>
    out = net.forward()
cv2.error: OpenCV(4.0.1) /home/odroid/opencv-4.0.1/modules/dnn/src/op_inf_engine.cpp:555: error: 
(-215:Assertion failed) Failed to initialize Inference Engine backend: [VPU] Internal error: 
Output in 17-maxpool has incorrect width dimension. Expected: 21 or 21 Actual: 22 in function 'initPlugin'
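
The numbers in this message are consistent with a pooling layer that uses "same" padding. Assuming `17-maxpool` is Tiny YOLOv2's last pooling layer (2x2 window, stride 1) and that its input is 22 wide, which matches 704 / 32 but is not confirmed by the post, the standard output-size formulas give 21 without padding (whether rounding down or up) and 22 with TensorFlow-style "same" padding. A minimal sketch of that arithmetic, with illustrative function names:

```python
import math


def pool_out_valid(size, kernel, stride, ceil_mode=False):
    """Output length of a pooling layer with no padding."""
    steps = (size - kernel) / stride
    return (math.ceil(steps) if ceil_mode else math.floor(steps)) + 1


def pool_out_same(size, stride):
    """Output length with TensorFlow-style 'same' padding."""
    return math.ceil(size / stride)


width = 22  # assumed input width of the last pooling layer (704 / 32)
print(pool_out_valid(width, kernel=2, stride=1))                  # 21
print(pool_out_valid(width, kernel=2, stride=1, ceil_mode=True))  # 21
print(pool_out_same(width, stride=1))                             # 22
```

If that reading is right, "Expected: 21 or 21" would be the two unpadded results and "Actual: 22" the padded size recorded in the IR.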

If I run it on an NCS1, I don't get an error, but the detections it makes on the test image are not correct. The network should be able to identify people in the image. I have also tested with another image with only one person but I still get seemingly random results. Again this is a model that had no issues detecting people when tested in Darknet.
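
One thing that may be worth ruling out for the incorrect detections is the preprocessing in the script above: it scales the image to [0, 1] and then casts it to `uint8`, which truncates nearly every pixel to 0. Whether this is the cause on the NCS1 is an assumption, not something the post confirms. A small NumPy-only sketch of the effect:

```python
import numpy as np

# A stand-in for cv.imread output: 8-bit pixel values from 0 to 255.
frame = np.arange(256, dtype=np.uint8).reshape(16, 16)

scaled = frame.astype(np.float32) / 255   # values now lie in [0, 1]
as_uint8 = scaled.astype(np.uint8)        # float -> uint8 truncates towards zero

print(np.unique(as_uint8))                # [0 1]
print(int((as_uint8 == 0).sum()))         # 255 of the 256 pixels collapse to 0
```

If the network expects inputs in [0, 1], the float image would have to reach it without the `uint8` cast; if it expects 0 to 255, the division would be dropped instead. Which of the two applies depends on how the IR was generated.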

I tried running it on the CPU (cv.dnn.DNN_TARGET_CPU) but I get this error:

Traceback (most recent call last):
  File "", line 27, in <module>
    out = net.forward()
cv2.error: OpenCV(4.0.1) /home/odroid/opencv-4.0.1/modules/dnn/src/op_inf_engine.cpp:555: error: 
(-215:Assertion failed) in function 'initPlugin'
> Failed to initialize Inference Engine backend: Cannot find plugin to use :


Inference is being run on Ubuntu 16.04 with OpenVINO version 5.455. Any assistance you can provide would be highly appreciated. Thanks.

1 Reply

Dearest Jon3848, 

Instead of using YOLOv2, I encourage you to 1) download the latest OpenVINO release, 2019 R1, and 2) use YOLOv3. Intel has developed samples specifically for YOLOv3, in both Python and C++. I'm sure you are aware that YOLOv3 is an improvement over YOLOv2.

Thanks for using OpenVINO!
