Re: Resnet-10 and Resnet-50 both hang device - Help would be appreciated

idata · ‎08-01-2018

I'm working on a pervasive device that has support for the Caffe framework, and I'm attempting to use ResNet-10 from OpenCV to do face detection (https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector). The graph file compiles fine and the first pass even executes fine, but then the device hangs. The following is the trace I'm getting from the device:

[INFO] Get Tensor Descriptor Structures

[INFO] create input and output Fifos

I: [ 0] ncFifoCreate:2126 Init fifo

I: [ 0] ncFifoAllocate:2321 Creating fifo

[INFO] return input and output Fifos

[INFO] Sending Image to NCS device]

D: [ 0] ncFifoWriteElem:2592 No layout conversion is needed 0

D: [ 0] convertDataTypeAndLayout:170 src data type 1 dst data type 0

D: [ 0] convertDataTypeAndLayout:172 SRC: w 300 h 300 c 3 w_s 12 h_s 3600 c_s 4

D: [ 0] convertDataTypeAndLayout:174 DST: w 300 h 300 c 3 w_s 6 h_s 1800 c_s 2

D: [ 0] ncFifoWriteElem:2617 write count 0 num_elements 2 userparam 0x75a86418

[INFO Sendiing to NCS Queue]

I: [ 0] ncGraphQueueInference:3048 trigger start

I: [ 0] ncGraphQueueInference:3145 trigger end

D: [ 0] ncFifoReadElem:2686 No layout conversion is needed 0

D: [ 0] convertDataTypeAndLayout:170 src data type 0 dst data type 1

D: [ 0] convertDataTypeAndLayout:172 SRC: w 7 h 201 c 1 w_s 2 h_s 14 c_s 2

D: [ 0] convertDataTypeAndLayout:174 DST: w 7 h 201 c 1 w_s 4 h_s 28 c_s 4

D: [ 0] ncFifoReadElem:2714 num_elements 2 userparam 0x75a86558 output length 5628

[INFO] Image prdictions returned from NCS

1

[INFO] Sending Image to NCS device]

D: [ 0] ncFifoWriteElem:2592 No layout conversion is needed 0

D: [ 0] convertDataTypeAndLayout:170 src data type 1 dst data type 0

D: [ 0] convertDataTypeAndLayout:172 SRC: w 300 h 300 c 3 w_s 12 h_s 3600 c_s 4

D: [ 0] convertDataTypeAndLayout:174 DST: w 300 h 300 c 3 w_s 6 h_s 1800 c_s 2

D: [ 0] ncFifoWriteElem:2617 write count 1 num_elements 2 userparam 0x75a86418

[INFO Sendiing to NCS Queue]

I: [ 0] ncGraphQueueInference:3048 trigger start

E: [ 0] dispatcherEventReceive:200 dispatcherEventReceive() Read failed -1

I noticed in the SDK readme that this model has not been tested but ResNet-50 has, so I tried that model and got the same results. The trace from that run can be found below:

[INFO] Get Tensor Descriptor Structures

[INFO] create input and output Fifos

I: [ 0] ncFifoCreate:2126 Init fifo

I: [ 0] ncFifoAllocate:2321 Creating fifo

[INFO] return input and output Fifos

[INFO] Sending Image to NCS device]

D: [ 0] ncFifoWriteElem:2592 No layout conversion is needed 0

D: [ 0] convertDataTypeAndLayout:170 src data type 1 dst data type 0

D: [ 0] convertDataTypeAndLayout:172 SRC: w 224 h 224 c 3 w_s 12 h_s 2688 c_s 4

D: [ 0] convertDataTypeAndLayout:174 DST: w 224 h 224 c 3 w_s 6 h_s 1344 c_s 2

D: [ 0] ncFifoWriteElem:2617 write count 0 num_elements 2 userparam 0x75ac4468

[INFO Sendiing to NCS Queue]

I: [ 0] ncGraphQueueInference:3048 trigger start

I: [ 0] ncGraphQueueInference:3145 trigger end

D: [ 0] ncFifoReadElem:2686 No layout conversion is needed 1

D: [ 0] convertDataTypeAndLayout:170 src data type 0 dst data type 1

D: [ 0] convertDataTypeAndLayout:172 SRC: w 1 h 1 c 1000 w_s 2000 h_s 2000 c_s 2

D: [ 0] convertDataTypeAndLayout:174 DST: w 1 h 1 c 1000 w_s 4000 h_s 4000 c_s 4

D: [ 0] ncFifoReadElem:2714 num_elements 2 userparam 0x75ac45a8 output length 4000

[INFO] Image prdictions returned from NCS

0

[INFO] Sending Image to NCS device]

D: [ 0] ncFifoWriteElem:2592 No layout conversion is needed 0

D: [ 0] convertDataTypeAndLayout:170 src data type 1 dst data type 0

D: [ 0] convertDataTypeAndLayout:172 SRC: w 224 h 224 c 3 w_s 12 h_s 2688 c_s 4

D: [ 0] convertDataTypeAndLayout:174 DST: w 224 h 224 c 3 w_s 6 h_s 1344 c_s 2

D: [ 0] ncFifoWriteElem:2617 write count 1 num_elements 2 userparam 0x75ac4468

[INFO Sendiing to NCS Queue]

I: [ 0] ncGraphQueueInference:3048 trigger start

E: [ 0] dispatcherEventReceive:200 dispatcherEventReceive() Read failed -1

The device works fine with the other SSD models I have tried, so I don't think I have a hardware problem. The frustrating thing about this problem is that the first pass works fine. Any help would be greatly appreciated.
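
A side note for anyone reading the traces above: the stride values printed by convertDataTypeAndLayout are consistent with an interleaved (channel-minor) layout, where c_s is the element size in bytes, w_s = c * c_s, and h_s = w * w_s. That reading is inferred from the numbers in the log, not from the SDK source, and the helper name below is made up for illustration. A minimal sketch:

```python
def interleaved_strides(w, h, c, elem_size):
    """Return (w_s, h_s, c_s) byte strides for a channel-minor tensor.

    Assumption: the layout is inferred from the log values, not taken
    from the NCSDK source. h is accepted only to mirror the log fields.
    """
    c_s = elem_size   # step between channels of one pixel
    w_s = c * c_s     # step between neighbouring pixels in a row
    h_s = w * w_s     # step between rows
    return w_s, h_s, c_s


# 300x300x3 input: float32 on the host, float16 on the device
print(interleaved_strides(300, 300, 3, 4))  # (12, 3600, 4)
print(interleaved_strides(300, 300, 3, 2))  # (6, 1800, 2)
# 7x201x1 output read back as float32: 7 * 201 * 4 = 5628 bytes
print(interleaved_strides(7, 201, 1, 4), 7 * 201 * 4)
```

The numbers match both traces, so the type and layout conversion looks identical on the pass that succeeds and the pass that fails; the second pass only stops later, after "trigger start".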

idata · ‎08-01-2018

@sggriset Thanks for reporting this issue. I've also seen these hangs with the same networks and they usually happen during fifo.read_elem(). Can you provide your python code for testing and issue reproduction? Thanks.
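
While narrowing this down, it can help to find out which call actually blocks. Below is a rough sketch of a generic timeout wrapper in plain Python. It does not use any NCSDK call, and `call_with_timeout` is a hypothetical helper name. The worker thread cannot be killed, so a hung call stays hung in the background; the wrapper only lets the main thread notice and report it.

```python
import threading


def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn(*args, **kwargs) in a daemon thread and wait up to timeout_s.

    Raises TimeoutError if the call has not returned in time. The daemon
    flag keeps a stuck worker from blocking interpreter exit.
    """
    result = {}

    def worker():
        try:
            result["value"] = fn(*args, **kwargs)
        except Exception as exc:  # hand the error back to the caller
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        raise TimeoutError("call did not return within %.1f s" % timeout_s)
    if "error" in result:
        raise result["error"]
    return result["value"]
```

In a test script this could wrap the queue and read calls one at a time, for example `call_with_timeout(fifo.read_elem, 5.0)`, so that the log shows which step never comes back.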

idata · ‎08-02-2018

Thank you for your quick response. The following code is set up for the ResNet-10 SSD from OpenCV. If you adjust the dimensions, you can test ResNet-50 as well. I'm assuming you have the graph files, so here you go:

# import necessary libs
import usb.core  # from pyusb
import usb.util  # from pyusb
import mvnc.mvncapi as mvnc
from threading import Thread
import numpy as np
import time
import cv2
import os


VID = 0x03e7
PID = 0x2150

dim_prep = (300, 300)
graph_path = os.path.relpath("graphs/facecvgraph")
out_fifo = 'faceout'
in_fifo = 'facein'
graph_name = 'facegraph'


class WebcamVideoStream:
    def __init__(self, src = 0):
        # initialize the video camera stream and read the first frame
        # from the stream
        self.stream = cv2.VideoCapture(src)
        # camera returns a tuple grabbed = boolean value whether the frame was read 
        # frame = the video frame itself
        (self.grabbed, self.frame) = self.stream.read()
        # initialize the variable used to indicate if the thread should
        # be stopped
        self.stopped = False

    def start(self):
        # start the thread to read frames from the video stream
        Thread(target = self.update, args=()).start()
        return self

    def update(self):
        # keep looping infinitely until the thread is stopped
        while True:
            # if the thread indicator variable is set, stop the thread
            if self.stopped:
                return
            # otherwise read the next frame from the stream
            (self.grabbed, self.frame) = self.stream.read()

    def read(self):
        # return the frame most recently read
        return self.frame

    def stop(self):
        # indicate that the thread should be stopped
        self.stopped = True

def preprocess_image(input_image, dim):
    # preprocess the image
    preprocessed = cv2.resize(input_image, dim)
    preprocessed = preprocessed - 127.5
    preprocessed = preprocessed * 0.007843
    preprocessed = preprocessed.astype(np.float32)

    # return the image to the calling function
    return preprocessed

# Load graph files and return graph object
def load_graph(device, Graphpath, graph_name):
    # Read the compiled graph file from disk
    print(Graphpath)
    print("[INFO] loading the graph file into NCS memory")
    with open(Graphpath, mode='rb') as f:
        graph_buffer = f.read()

    print("[INFO] initialize graph object ")
    # Initialize a graph object
    graph = mvnc.Graph(graph_name)

    # Allocate the graph to the device
    print("[INFO] allocating the graph to the NCS")
    graph.allocate(device, graph_buffer)

    # return graph object
    return graph

def get_tensors(graph, device, input_name, output_name):
    print("[INFO] Get Tensor Descriptor Structures")
    input_descriptors = graph.get_option(mvnc.GraphOption.RO_INPUT_TENSOR_DESCRIPTORS)
    output_descriptors = graph.get_option(mvnc.GraphOption.RO_OUTPUT_TENSOR_DESCRIPTORS)

    print("[INFO] create input and output Fifos")
    input_fifo = mvnc.Fifo(input_name, mvnc.FifoType.HOST_WO)
    output_fifo = mvnc.Fifo(output_name, mvnc.FifoType.HOST_RO)
    input_fifo.allocate(device, input_descriptors[0], 2)
    output_fifo.allocate(device, output_descriptors[0], 2)

    print("[INFO] return input and output Fifos")
    return  input_fifo, output_fifo

def predict(image, graph, input_fifo, output_fifo):
    # send the (already preprocessed) image to the NCS and run a
    # forward pass to grab the network predictions
    print("[INFO] Sending Image to NCS device]")
    input_fifo.write_elem(image, None)
    print("[INFO Sendiing to NCS Queue]")
    graph.queue_inference(input_fifo, output_fifo)
    (output, _ ) = output_fifo.read_elem()
    print("[INFO] Image predictions returned from NCS")
    # grab the number of valid object predictions from the output
    num_valid_boxes = int(output[0])
    predictions = []

    for box_index in range(num_valid_boxes):
        # calculate the base index into our array so we can extract
        # bounding box information
        base_index = 7 + box_index * 7

        # boxes with non-finite (inf, nan, etc.) numbers must be
        # ignored
        if not np.all(np.isfinite(output[base_index:base_index + 7])):
            continue

        # extract the image width and height and clip the boxes to the
        # image size in case network returns boxes outside of the image
        # boundaries
        (h, w) = image.shape[:2]
        x1 = max(0, int(output[base_index + 3] * w))
        y1 = max(0, int(output[base_index + 4] * h))
        x2 = min(w, int(output[base_index + 5] * w))
        y2 = min(h, int(output[base_index + 6] * h))

        # grab the prediction class label, confidence,
        # and bounding box (x, y) - coordinates
        pred_class = int(output[base_index + 1])
        pred_conf = output[base_index + 2]
        pred_boxpts = ((x1, y1), (x2, y2))

        # create prediction tuple and append the prediction to the
        # predictions list
        prediction = (pred_class, pred_conf, pred_boxpts)
        predictions.append(prediction)
    print(len(predictions))
    # return the list of predictions to the calling function
    return predictions



def main():
    mob_device = usb.core.find(idVendor=VID, idProduct=PID)
    if not mob_device:
        print ("[INFO] Movidius Device Not Found")
        exit(1)
    print ("[INFO] Found Movidius Device")
    # set the logging level for the Movidius device
    mvnc.global_set_option(mvnc.GlobalOption.RW_LOG_LEVEL, 0)

    devices = mvnc.enumerate_devices()
    if(len(devices) < 1):
        print("[INFO] Error - no Movidius devices detected")
        exit(1)

    # get the first NCS device
    dev = mvnc.Device(devices[0])

    try:
        dev.open()
    except Exception as exc:
        print("[INFO] Error - Could not open NCS device:", exc)
        exit(1)
    # load graph file
    my_graph = load_graph(dev, graph_path, graph_name)

    # set up Fifos
    inputfifo, outputfifo = get_tensors(my_graph, dev, in_fifo, out_fifo)

    vs = WebcamVideoStream(src = 0).start()
    # let camera warm up
    time.sleep(2.0)
    while True:
        frame = vs.read()
        frame = preprocess_image(frame, dim_prep)
        # note: waitKey only sees key presses while an OpenCV window is open
        key = cv2.waitKey(1) & 0xFF
        if key == ord('q'):
            break
        predictions = predict(frame, my_graph, inputfifo, outputfifo)
        print(len(predictions))

    # release the camera thread and the NCS resources
    cv2.destroyAllWindows()
    vs.stop()
    inputfifo.destroy()
    outputfifo.destroy()
    my_graph.destroy()
    dev.close()
    dev.destroy()
    exit(0)

if __name__ == '__main__':
    main()
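
For testing the box parsing without the device attached, here is a small stand-alone sketch of the same logic as predict() above, written in plain Python so it runs on a list as well as on the array returned by read_elem(). The layout is only what the script above assumes (the field order is not confirmed by the SDK docs in this thread), and `parse_detections` and the synthetic buffer are made up for illustration.

```python
import math


def parse_detections(output, width, height):
    """Parse a flat SSD-style output buffer into (class, conf, box) tuples.

    Assumed layout (taken from the script above): output[0] is the number
    of valid boxes, and box i occupies the 7 values starting at 7 + 7 * i:
    [image_id, class, confidence, x1, y1, x2, y2], coordinates in 0..1.
    """
    detections = []
    for i in range(int(output[0])):
        base = 7 + 7 * i
        box = output[base:base + 7]
        # skip boxes that contain inf or nan
        if not all(math.isfinite(v) for v in box):
            continue
        # scale to pixels and clip to the image boundaries
        x1 = max(0, int(box[3] * width))
        y1 = max(0, int(box[4] * height))
        x2 = min(width, int(box[5] * width))
        y2 = min(height, int(box[6] * height))
        detections.append((int(box[1]), box[2], ((x1, y1), (x2, y2))))
    return detections


# synthetic buffer: the header claims two boxes, the second one is invalid
fake = [2, 0, 0, 0, 0, 0, 0,
        0, 1, 0.9, 0.25, 0.25, 0.5, 0.75,
        0, 1, float("nan"), 0.0, 0.0, 1.0, 1.0]
print(parse_detections(fake, 300, 300))
```

Note that the script scales the boxes by the shape of the preprocessed 300x300 frame, so the coordinates refer to the resized image rather than the original camera frame.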

idata · ‎08-03-2018

Tome_at_Intel

Any progress on reproducing the problem? Do you have what you need to reproduce it, or do you need me to supply the graph files and model?

Let me know. Again, thank you for your help.

idata · ‎08-03-2018

@sggriset If you could also provide the model file with the prototxt file, it would be helpful to me. Thanks again.

idata · ‎08-03-2018

Tome_at_Intel

You can find the Caffe model file at the following URL:

https://github.com/opencv/opencv_3rdparty/tree/dnn_samples_face_detector_20170830

You can find the prototxt file (deploy.prototxt) at the following URL: https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector

Let me know what else I can do for you. Thank you for your help and enjoy your weekend.

Steve

idata · ‎08-08-2018

Hello, I'm interested too, this model runs fine on OpenCV.

Thanks!

idata · ‎08-09-2018

Any update on this issue would be greatly appreciated

idata · ‎08-23-2018

I'm interested too. Any update on this issue would be greatly appreciated

idata · ‎08-23-2018

Still working on this issue. Hang tight guys.

idata · ‎08-27-2018

Tome_at_Intel

Any update?

idata · ‎08-29-2018

No updates to report as of right now. Still looking into the root cause.

idata · ‎09-17-2018

Hello @Tome_at_Intel, any update on this issue?

idata · ‎09-17-2018

@xzongbao We apologize that this issue is affecting our users. At the moment, however, we don't have any updates that we can share. We greatly appreciate your patience while we work on resolving this issue.

idata · ‎10-10-2018

@xzongbao @sggriset Please try running your models with the latest NCSDK v 2.08 and let me know if your issue is resolved. Thanks.

idata · ‎10-15-2018

Tome_at_Intel

Sorry for the slow response; I had to rebuild my systems to test the NCSDK v2.08 release. It worked fine, but it does require you to rebuild the graph file under NCSDK v2.08. Thank you to the Intel team for the fix.

Steve

idata · ‎11-09-2018

@Tome_at_Intel

I have upgraded to NCSDK v2.08. The ResNet-18 demo from the ncappzoo runs fine, but my ResNet-34 model now fails with a compilation error. It compiled fine before the upgrade.

Network Input tensors ['data#162']

Network Output tensors ['prob#323']

Traceback (most recent call last):

File "/usr/local/bin/mvNCCompile", line 206, in

create_graph(args.network, args.image, args.inputnode, args.outputnode, args.outfile, args.nshaves, args.inputsize, args.weights, args.explicit_concat, args.ma2480, args.scheduler, args.new_parser, args.cpp, args)

File "/usr/local/bin/mvNCCompile", line 185, in create_graph

load_ret = load_network(args, parser, myriad_config)

File "/usr/local/bin/ncsdk/Controllers/Scheduler.py", line 130, in load_network

net, graphFile, finalLayers = serializeNewFmt(parsedLayers, arguments, myriad_conf, input_data)

File "/usr/local/bin/ncsdk/Controllers/Parsers/Phases.py", line 508, in serializeNewFmt

adp.transform(parsedLayers) # TODO: DFS Transform?

File "/usr/local/bin/ncsdk/Controllers/Adaptor.py", line 200, in transform

self.net.attach(NetworkStageEmulator(l))

File "/usr/local/bin/ncsdk/Controllers/Adaptor.py", line 325, in __init__

self.specific_fields()

File "/usr/local/bin/ncsdk/Controllers/Adaptor.py", line 333, in specific_fields

self.definition.adapt_fields(self, _or)

File "/usr/local/bin/ncsdk/Models/StageDefinitions/Relu_Op.py", line 50, in adapt_fields

emulator.dataBUF = BufferEmulator(i.resolve())

File "/usr/local/bin/ncsdk/Controllers/Tensor.py", line 200, in resolve

rt = ResolvedTensor(self)

File "/usr/local/bin/ncsdk/Controllers/Tensor.py", line 310, in __init__

self.__data = np.zeros(tensor.shape)

MemoryError

Makefile:52: recipe for target 'compile' failed

make: *** [compile] Error 1
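
The traceback ends in `np.zeros(tensor.shape)`, which allocates float64 by default, so the request is 8 bytes per element. The tensor shape is not shown in the post, so the shape below is purely illustrative, and `zeros_nbytes` is a made-up helper name. A quick way to estimate the size of one such allocation:

```python
from functools import reduce
from operator import mul


def zeros_nbytes(shape, itemsize=8):
    """Bytes needed by np.zeros(shape) with the default float64 dtype."""
    return reduce(mul, shape, 1) * itemsize


# illustrative shape only; the real one is not in the traceback
shape = (1, 64, 112, 112)
print(zeros_nbytes(shape) / 2**20, "MiB")
```

Comparing that figure against the free RAM (plus swap) on the machine running mvNCCompile may show whether the compiler is simply running out of host memory.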