Intel® Distribution of OpenVINO™ Toolkit

Yolov5m inference using GVAINFERENCE element

shekarneo
Beginner
1,533 Views

I tried to use a yolov5m model as suggested by the DL Streamer YOLOv5 guide for custom models, and the accuracy was not good. Here is the thread I posted about it: Accuracy drop in dlstreamer yolov5 custom model

 

Now I have tried to get the raw tensor outputs from the yolov5m model and post-process them using the original NMS code from the Ultralytics repo. I am getting a lot of false bounding boxes, and the actual objects are not being detected.

 

Please find the code below.

 

from collections import defaultdict
from pathlib import Path
import time
import sys
import gi
from datetime import datetime
import logging
import cv2
import numpy as np
import copy

gi.require_version("Gst", "1.0")
gi.require_version("GstApp", "1.0")
gi.require_version("GstVideo", "1.0")
gi.require_version("GObject", "2.0")
from gi.repository import Gst, GLib, GstApp, GstVideo
from gstgva import VideoFrame, util
import uuid
from datetime import timedelta

import torch
# general.py is a local copy of utils/general.py from the Ultralytics YOLOv5 repo
from general import non_max_suppression, xywh2xyxy, xyxy2xywh, scale_boxes

Gst.init(sys.argv)

def bus_call(bus, message, loop):
    t = message.type
    if t == Gst.MessageType.EOS:
        sys.stdout.write("End-of-stream\n")
        loop.quit()
    elif t == Gst.MessageType.WARNING:
        err, debug = message.parse_warning()
        sys.stderr.write("Warning: %s: %s\n" % (err, debug))
    elif t == Gst.MessageType.ERROR:
        err, debug = message.parse_error()
        sys.stderr.write("Error: %s: %s\n" % (err, debug))
        loop.quit()
    return True


class FrameLogger:
    def __init__(self):
        # self.pipeline_string = "filesrc location=videos/occupancy-017.mp4 ! decodebin \
        #                         ! videoscale ! video/x-raw,width=640,height=640 \
        #                         ! videoconvert ! capsfilter caps=video/x-raw,format=BGR \
        #                         ! queue ! gvainference model=./models/yolov5m_openvino_model/yolov5m.xml device=CPU inference-interval=1 model_proc=./models/model_proc/yolo-v5_80-raw.json name=gvainference inference-region=full-frame \
        #                         ! queue ! gvatrack tracking-type=short-term-imageless \
        #                         ! queue ! gvawatermark name=gvawatermark \
        #                         ! queue ! videoconvert n-threads=4 ! fpsdisplaysink sync=false"
        self.pipeline_string = "filesrc location=videos/occupancy-017.mp4 ! decodebin \
                                ! videoscale ! video/x-raw,width=640,height=640 \
                                ! videoconvert ! capsfilter caps=video/x-raw,format=BGR \
                                ! queue ! gvainference model=./models/yolov5m_openvino_model/yolov5m.xml device=CPU inference-interval=1 model_proc=./models/model_proc/yolo-v5_80-raw.json name=gvainference reshape-height=640 reshape-width=640 inference-region=full-frame \
                                ! queue ! gvatrack tracking-type=short-term-imageless \
                                ! queue ! gvawatermark name=gvawatermark \
                                ! queue ! videoconvert n-threads=4 ! fpsdisplaysink sync=false"
        self.detected_frames = {}
        self.frame_number=0
        self.names = ["person", "head"]
        self.conf_thres = 0.4
        self.iou_thres = 0.4

    def frame_callback(self, frame):
        detected_ids = []
        self.frame_number+=1
        ts = frame._VideoFrame__buffer.pts
        with frame.data() as img:
            im = img.copy()
            if len(im.shape) == 3:
                im = im[None]  # expand for batch dim
            gn = torch.tensor(img.shape)[[1, 0, 1, 0]]  # normalization gain whwh
            for tensor in frame.tensors(): 
                if tensor.layer_name() == "output0":
                    _shape = tensor.dims()
                    print("_shape: ", _shape)
                    output = tensor.data().reshape(_shape)
                    output = torch.from_numpy(output).to("cpu")
                    output = non_max_suppression(output, self.conf_thres, self.iou_thres, classes=[0], agnostic=False, max_det=1000)
                    print("after nms shape: ", output[0].shape)
                    # Process predictions
                    for i, det in enumerate(output):  # per image
                        print("total detections: ",len(det))
                        if len(det):
                            # Rescale boxes from img_size to im0 size
                            # det[:, :4] = scale_boxes(im.shape[2:], det[:, :4], (img.shape)).round()
                            for *xyxy, conf, cls in reversed(det):
                                xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                                x, y, w, h = xywh
                                # print("gva xywh", xywh)
                                x1, y1, x2, y2 = xyxy
                                
                                try:
                                    abs_x = x * frame.video_info().width 
                                    abs_y = y * frame.video_info().height
                                    abs_w = w * frame.video_info().width 
                                    abs_h = h * frame.video_info().height

                                    top_left_x = abs_x - (abs_w / 2) 
                                    top_left_y = abs_y - (abs_h / 2) 
                                    bottom_right_x = abs_x + (abs_w / 2) 
                                    bottom_right_y = abs_y + (abs_h / 2)
                                    cv2.rectangle(img, (int(x1),int(y1)),\
                                        (int(x2),int(y2)), (0, 255, 0), 2)
                                    # label_id = int(cls.tolist())
                                    # roi = frame.add_region(int(top_left_x),int(top_left_y),int(abs_w),int(abs_h), self.names[label_id], conf.tolist(), normalized=False)
                                    # roi.detection()['label_id'] = label_id
                                except Exception as e:
                                    print(e)
                            
        return True


    def detect_pad_probe_callback(self, pad, info):
        with util.GST_PAD_PROBE_INFO_BUFFER(info) as buffer:
            caps = pad.get_current_caps()
            frame = VideoFrame(buffer, caps=caps)
            image_width = frame.video_info().width
            image_height = frame.video_info().height
            self.frame_callback(frame)
        return Gst.PadProbeReturn.OK
    
    def watermark_pad_probe_callback(self, pad, info):
        with util.GST_PAD_PROBE_INFO_BUFFER(info) as buffer:
            caps = pad.get_current_caps()
            frame = VideoFrame(buffer, caps=caps)
            image_width = frame.video_info().width
            image_height = frame.video_info().height
            self.frame_callback(frame)
        return Gst.PadProbeReturn.OK
    
    # def on_message(self, bus: Gst.Bus, message: Gst.Message, loop: GLib.MainLoop):
    #     mtype = message.type

    #     if mtype == Gst.MessageType.EOS:
    #         self.cv_plot()
    #         self.pipeline.set_state(Gst.State.NULL)


    def run(self):
        print("Run")
        pipeline = Gst.parse_launch(self.pipeline_string)
        loop = GLib.MainLoop()
        bus = pipeline.get_bus()
        bus.add_signal_watch()
        bus.connect("message", bus_call, loop)

        detect_pad = pipeline.get_by_name("gvainference")
        if detect_pad:
            pad = detect_pad.get_static_pad("src")
            pad.add_probe(Gst.PadProbeType.BUFFER, self.detect_pad_probe_callback)

        # gvawatermark = pipeline.get_by_name("gvawatermark")
        # if gvawatermark:
        #     pad = gvawatermark.get_static_pad("sink")
        #     pad.add_probe(Gst.PadProbeType.BUFFER, self.watermark_pad_probe_callback)
        # graypad = pipeline.get_by_name("gray")
        # if graypad:
        #     pad = graypad.get_static_pad("src")
        #     pad.add_probe(Gst.PadProbeType.BUFFER, self.pad_probe_callback)


        pipeline.set_state(Gst.State.PLAYING)
        try:
            print("loop")
            loop.run()
        except Exception as e:
            # Log the actual exception instead of a fixed string
            logging.error("Exception in main loop: %s", e)
            loop.quit()

        pipeline.set_state(Gst.State.NULL)

obj = FrameLogger()
obj.run()
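
For readers following the code above, the post-processing applied to the raw output0 tensor can be sketched in plain Python. This is a simplified illustration, not the Ultralytics implementation. It assumes each row is laid out as [cx, cy, w, h, objectness, class scores...] in model-input pixels, which is the usual YOLOv5 export layout. The helper names xywh_to_xyxy, iou, and decode_and_nms are hypothetical.

```python
# Sketch only. Assumes rows of [cx, cy, w, h, objectness, class scores...]
# in model-input pixels. All function names here are hypothetical.

def xywh_to_xyxy(cx, cy, w, h):
    """Convert a centre/size box to corner coordinates."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def decode_and_nms(rows, conf_thres=0.4, iou_thres=0.4):
    """Return [(box_xyxy, confidence, class_id), ...] after per-class greedy NMS."""
    candidates = []
    for row in rows:
        cx, cy, w, h, obj = row[:5]
        class_scores = row[5:]
        # Pick the best class for this row
        class_id = max(range(len(class_scores)), key=class_scores.__getitem__)
        # Final confidence = objectness * class score
        conf = obj * class_scores[class_id]
        if conf >= conf_thres:
            candidates.append((xywh_to_xyxy(cx, cy, w, h), conf, class_id))

    # Highest confidence first, then greedily drop overlapping boxes
    candidates.sort(key=lambda c: c[1], reverse=True)
    kept = []
    for box, conf, class_id in candidates:
        # Keep a box only if no higher-scoring kept box of the same class overlaps it
        if all(k[2] != class_id or iou(box, k[0]) <= iou_thres for k in kept):
            kept.append((box, conf, class_id))
    return kept
```

Running a small decode like this on a dumped tensor can help separate problems in the post-processing from problems in what the model is fed (color order, value scaling, resize), though it does not prove either on its own.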
8 Replies
Iffa_Intel
Moderator
1,481 Views

Hi,

 

Before proceeding further, may I know if there is a particular reason for using YOLOv5?

The latest version of YOLO is v8, which you may refer to here.

 

 

Cordially,

Iffa

shekarneo
Beginner
1,479 Views

Hi Iffa,

 

I am using a custom model with two classes, which was trained using YOLOv5.

I will definitely try YOLOv8, but for now I need to get it working with YOLOv5.

 

I also need to run this model (or a YOLOv8 model) in DL Streamer with either gvadetect or gvainference.

I am currently using gvapython, but with it I am not able to access the pipeline and some functions properly.

Iffa_Intel
Moderator
1,467 Views

If that's the case, could you share your model files, if possible, for us to check?



Cordially,

Iffa


shekarneo
Beginner
1,462 Views

Hi Iffa,

 

I am facing the same issue with both the custom yolov5m model and the pretrained 80-class model from the Ultralytics repo.

Please find the model and other files here  

Iffa_Intel
Moderator
1,407 Views

I checked your model using the benchmark app, just to see its general performance, and it seems okay.

 

[Image: Iffa_Intel_0-1707112641042.png]

 

From the OpenVINO perspective, it's recommended to use the 111-yolov5-quantization-migration notebook as a reference.

If you are planning to use custom code from Ultralytics, it's best to contact the experts at Ultralytics, which is here.

 

Cordially,

Iffa

 

shekarneo
Beginner
1,402 Views

Hi Iffa,

 

That is the problem: if I use the pretrained 80-class model with gvadetect and a model_proc file, it gives good results, but the same approach does not work with custom models. The issue is described here: Accuracy drop in dlstreamer yolov5 custom model

 

So I am trying gvainference instead.
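
One check that may be worth running when moving from the 80-class model to a custom one is whether the raw output layout matches the label list in use. The sketch below assumes the YOLOv5-style layout of [cx, cy, w, h, objectness, class scores...], so the last output dimension should equal 5 plus the number of classes. For a two-class model at 640x640 that would typically be a shape of (1, 25200, 7). The function name check_output_layout is hypothetical, and the contents of the model_proc file used in this thread are not shown, so whether it matches the custom model is an open question rather than a diagnosis.

```python
def check_output_layout(output_shape, labels):
    """Compare the last output dimension with 5 + len(labels).

    Assumes a YOLOv5-style row of [cx, cy, w, h, objectness, class scores...].
    Returns (matches, num_classes_in_model).
    """
    num_values = output_shape[-1]
    num_classes = num_values - 5  # 4 box values + 1 objectness value
    return num_classes == len(labels), num_classes


# Example: shape as printed by the "_shape" debug line, labels as in self.names
matches, num_classes = check_output_layout((1, 25200, 7), ["person", "head"])
print("labels match output layout:", matches, "| classes in model:", num_classes)
```

If the check fails, the label list or model_proc used for post-processing probably describes a different number of classes than the model was trained with.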

Iffa_Intel
Moderator
1,396 Views

Could you share:

  1. The steps you took to convert, optimize, and quantize the model
  2. Which OpenVINO sample code you are referring to that produces the issue
  3. The steps you took up to the point of the error



Cordially,

Iffa


Iffa_Intel
Moderator
1,282 Views

Hi,


Thank you for your question. If you need any additional information from Intel, please submit a new question, as Intel is no longer monitoring this thread.


Cordially,

Iffa

