Re: Yolov5m inference using GVAINFERENCE element

shekarneo · ‎02-01-2024

I tried to use a yolov5m model as suggested by the dlstreamer yolov5 guide for custom models, and the accuracy was not good. Here is the thread I posted about it: Accuracy drop in dlstreamer yolov5 custom model

I have now tried to get the raw tensor outputs from the yolov5m model and post-process them using the original NMS code from the ultralytics repo. I am getting a lot of false bounding boxes, and the actual objects are not detected.

Please find the code below:

from collections import defaultdict
from pathlib import Path
import time
import sys
import gi
from datetime import datetime
import logging
import cv2
import numpy as np
import copy

gi.require_version("Gst", "1.0")
gi.require_version("GstApp", "1.0")
gi.require_version("GstVideo", "1.0")
gi.require_version("GObject", "2.0")
from gi.repository import Gst, GLib, GstApp, GstVideo
from gstgva import VideoFrame, util
import uuid
from datetime import timedelta

import torch
from general import non_max_suppression, xywh2xyxy, xyxy2xywh, scale_boxes

Gst.init(sys.argv)

def bus_call(bus, message, loop):
    t = message.type
    if t == Gst.MessageType.EOS:
        sys.stdout.write("End-of-stream\n")
        loop.quit()
    elif t==Gst.MessageType.WARNING:
        err, debug = message.parse_warning()
        sys.stderr.write("Warning: %s: %s\n" % (err, debug))
    elif t == Gst.MessageType.ERROR:
        err, debug = message.parse_error()
        sys.stderr.write("Error: %s: %s\n" % (err, debug))
        loop.quit()
    return True


class FrameLogger:
    def __init__(self):
        # self.pipeline_string = "filesrc location=videos/occupancy-017.mp4 ! decodebin \
        #                         ! videoscale ! video/x-raw,width=640,height=640 \
        #                         ! videoconvert ! capsfilter caps=video/x-raw,format=BGR \
        #                         ! queue ! gvainference model=./models/yolov5m_openvino_model/yolov5m.xml device=CPU inference-interval=1 model_proc=./models/model_proc/yolo-v5_80-raw.json name=gvainference inference-region=full-frame \
        #                         ! queue ! gvatrack tracking-type=short-term-imageless \
        #                         ! queue ! gvawatermark name=gvawatermark \
        #                         ! queue ! videoconvert n-threads=4 ! fpsdisplaysink sync=false"
        self.pipeline_string = "filesrc location=videos/occupancy-017.mp4 ! decodebin \
                                ! videoscale ! video/x-raw,width=640,height=640 \
                                ! videoconvert ! capsfilter caps=video/x-raw,format=BGR \
                                ! queue ! gvainference model=./models/yolov5m_openvino_model/yolov5m.xml device=CPU inference-interval=1 model_proc=./models/model_proc/yolo-v5_80-raw.json name=gvainference reshape-height=640 reshape-width=640 inference-region=full-frame \
                                ! queue ! gvatrack tracking-type=short-term-imageless \
                                ! queue ! gvawatermark name=gvawatermark \
                                ! queue ! videoconvert n-threads=4 ! fpsdisplaysink sync=false"
        self.detected_frames = {}
        self.frame_number=0
        self.names = ["person", "head"]
        self.conf_thres = 0.4
        self.iou_thres = 0.4

    def frame_callback(self, frame):
        detected_ids = []
        self.frame_number+=1
        ts = frame._VideoFrame__buffer.pts
        with frame.data() as img:
            im = img.copy()
            if len(im.shape) == 3:
                im = im[None]  # expand for batch dim
            gn = torch.tensor(img.shape)[[1, 0, 1, 0]]  # normalization gain whwh
            for tensor in frame.tensors(): 
                if tensor.layer_name() == "output0":
                    _shape = tensor.dims()
                    print("_shape: ", _shape)
                    output = tensor.data().reshape(_shape)
                    output = torch.from_numpy(output).to("cpu")
                    output = non_max_suppression(output, self.conf_thres, self.iou_thres, classes=[0], agnostic=False, max_det=1000)
                    print("after nms shape: ", output[0].shape)
                    # Process predictions
                    for i, det in enumerate(output):  # per image
                        print("total detections: ",len(det))
                        if len(det):
                            # Rescale boxes from img_size to im0 size
                            # det[:, :4] = scale_boxes(im.shape[2:], det[:, :4], (img.shape)).round()
                            for *xyxy, conf, cls in reversed(det):
                                xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                                x, y, w, h = xywh
                                # print("gva xywh", xywh)
                                x1, y1, x2, y2 = xyxy
                                
                                try:
                                    abs_x = x * frame.video_info().width 
                                    abs_y = y * frame.video_info().height
                                    abs_w = w * frame.video_info().width 
                                    abs_h = h * frame.video_info().height

                                    top_left_x = abs_x - (abs_w / 2) 
                                    top_left_y = abs_y - (abs_h / 2) 
                                    bottom_right_x = abs_x + (abs_w / 2) 
                                    bottom_right_y = abs_y + (abs_h / 2)
                                    cv2.rectangle(img, (int(x1),int(y1)),\
                                        (int(x2),int(y2)), (0, 255, 0), 2)
                                    # label_id = int(cls.tolist())
                                    # roi = frame.add_region(int(top_left_x),int(top_left_y),int(abs_w),int(abs_h), self.names[label_id], conf.tolist(), normalized=False)
                                    # roi.detection()['label_id'] = label_id
                                except Exception as e:
                                    print(e)
                            
        return True


    def detect_pad_probe_callback(self, pad, info):
        with util.GST_PAD_PROBE_INFO_BUFFER(info) as buffer:
            caps = pad.get_current_caps()
            frame = VideoFrame(buffer, caps=caps)
            image_width = frame.video_info().width
            image_height = frame.video_info().height
            self.frame_callback(frame)
        return Gst.PadProbeReturn.OK
    
    def watermark_pad_probe_callback(self, pad, info):
        with util.GST_PAD_PROBE_INFO_BUFFER(info) as buffer:
            caps = pad.get_current_caps()
            frame = VideoFrame(buffer, caps=caps)
            image_width = frame.video_info().width
            image_height = frame.video_info().height
            self.frame_callback(frame)
        return Gst.PadProbeReturn.OK
    
    # def on_message(self, bus: Gst.Bus, message: Gst.Message, loop: GLib.MainLoop):
    #     mtype = message.type

    #     if mtype == Gst.MessageType.EOS:
    #         self.cv_plot()
    #         self.pipeline.set_state(Gst.State.NULL)


    def run(self):
        print("Run")
        pipeline = Gst.parse_launch(self.pipeline_string)
        loop = GLib.MainLoop()
        bus = pipeline.get_bus()
        bus.add_signal_watch()
        bus.connect("message", bus_call, loop)

        detect_pad = pipeline.get_by_name("gvainference")
        if detect_pad:
            pad = detect_pad.get_static_pad("src")
            pad.add_probe(Gst.PadProbeType.BUFFER, self.detect_pad_probe_callback)

        # gvawatermark = pipeline.get_by_name("gvawatermark")
        # if gvawatermark:
        #     pad = gvawatermark.get_static_pad("sink")
        #     pad.add_probe(Gst.PadProbeType.BUFFER, self.watermark_pad_probe_callback)
        # graypad = pipeline.get_by_name("gray")
        # if graypad:
        #     pad = graypad.get_static_pad("src")
        #     pad.add_probe(Gst.PadProbeType.BUFFER, self.pad_probe_callback)


        pipeline.set_state(Gst.State.PLAYING)
        try:
            print("loop")
            loop.run()
        except Exception:
            # Log the full traceback instead of a bare "Exception" string
            logging.exception("Main loop failed")
            loop.quit()

        pipeline.set_state(Gst.State.NULL)

obj = FrameLogger()
obj.run()
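
For comparison, here is a minimal NumPy-only sketch of how a raw YOLOv5 output is usually decoded before NMS. It is not the Ultralytics implementation, and the helper names (xywh_to_xyxy, iou_one_to_many, decode_and_nms) are made up for illustration. It assumes that output0 has the layout (N, 5 + num_classes) per image, with centre-format boxes in model-input pixels, an objectness score, and per-class scores that are already sigmoid-activated. If the exported model has a different layout, the results will not match.

```python
import numpy as np


def xywh_to_xyxy(boxes):
    """Convert (cx, cy, w, h) boxes to (x1, y1, x2, y2)."""
    out = np.empty_like(boxes)
    out[:, 0] = boxes[:, 0] - boxes[:, 2] / 2
    out[:, 1] = boxes[:, 1] - boxes[:, 3] / 2
    out[:, 2] = boxes[:, 0] + boxes[:, 2] / 2
    out[:, 3] = boxes[:, 1] + boxes[:, 3] / 2
    return out


def iou_one_to_many(box, boxes):
    """IoU between one xyxy box and an array of xyxy boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)


def decode_and_nms(pred, conf_thres=0.4, iou_thres=0.4):
    """pred: (N, 5 + num_classes), ASSUMED layout [cx, cy, w, h, obj, cls...].

    Returns an (M, 6) array of [x1, y1, x2, y2, score, class_id].
    """
    # Drop candidates with low objectness first
    pred = pred[pred[:, 4] > conf_thres]
    if pred.shape[0] == 0:
        return np.zeros((0, 6), dtype=np.float32)

    # Final score = objectness * class score; keep the best class per box
    scores_all = pred[:, 5:] * pred[:, 4:5]
    cls = scores_all.argmax(axis=1)
    score = scores_all.max(axis=1)
    keep = score > conf_thres
    boxes = xywh_to_xyxy(pred[keep, :4])
    score, cls = score[keep], cls[keep]

    # Greedy per-class NMS, highest score first
    order = score.argsort()[::-1]
    kept = []
    while order.size:
        i = order[0]
        kept.append(i)
        rest = order[1:]
        if rest.size == 0:
            break
        ious = iou_one_to_many(boxes[i], boxes[rest])
        suppressed = (ious > iou_thres) & (cls[rest] == cls[i])
        order = rest[~suppressed]

    kept = np.array(kept, dtype=int)
    return np.column_stack([boxes[kept], score[kept], cls[kept]])
```

In frame_callback this would be called on a single image, for example decode_and_nms(tensor.data().reshape(tensor.dims())[0]), again assuming the layout above. If the boxes from this sketch also look wrong, the problem is more likely in the input preprocessing (colour order, scaling, resize) than in the NMS step.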

Iffa_Intel · ‎02-01-2024

Hi,

before proceeding further, may I know if there's any particular reason for using YOLOv5?

The latest version of YOLO is v8, which you may refer to here.

Cordially,

Iffa

shekarneo · ‎02-01-2024

Hi Iffa,

I am using a custom model with two classes, which was trained using yolov5.

I will definitely try yolov8, but for now I need to make it work with yolov5.

I also need to run this model, or a yolov8 model, in dlstreamer with either gvadetect or gvainference.

I am currently using gvapython, but there I am not able to access the pipeline and some functions properly.

Iffa_Intel · ‎02-01-2024

If that's the case, could you share your model files for us to check? (if possible).

Cordially,

Iffa

shekarneo · ‎02-01-2024

Hi Iffa,

I am facing the same issue with both the yolov5m custom model and the pretrained 80-class model from the ultralytics repo.

Please find the model and other files here

Iffa_Intel · ‎02-04-2024

I checked your model using the benchmark app, just to see its general performance, and it seems okay.

From the OpenVINO perspective, it's recommended to use 111-yolov5-quantization-migration as a reference.

If you are planning to use custom code from Ultralytics, it's best to contact the experts at Ultralytics, which you can do here.

Cordially,

Iffa

shekarneo · ‎02-04-2024

Hi Iffa,

That is the problem: if I use the pretrained 80-class model with gvadetect and a model-proc file, it gives good results, but the same approach does not work with custom models. The issue is described here: Accuracy drop in dlstreamer yolov5 custom model

So I am trying gvainference instead.
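
One thing that may be worth ruling out when switching between the 80-class and the two-class model is a mismatch between the class count in the output tensor and the label list used for post-processing. The sketch below assumes the usual YOLOv5 export layout, where the last dimension of output0 is 5 + num_classes (85 for the 80-class model, 7 for a two-class model). The helper names are hypothetical and not part of dlstreamer.

```python
def num_classes_from_output(shape):
    """Infer the class count from a YOLOv5-style output shape (..., 5 + num_classes).

    ASSUMPTION: the last dimension is [cx, cy, w, h, obj, cls_0, ..., cls_n].
    """
    last = int(shape[-1])
    if last <= 5:
        raise ValueError(f"unexpected last dimension {last}")
    return last - 5


def check_labels(shape, labels):
    """Return (matches, num_classes) for an output shape and a label list."""
    nc = num_classes_from_output(shape)
    return nc == len(labels), nc


# Example: a two-class model at 640x640 would be expected to give (1, 25200, 7)
print(check_labels((1, 25200, 7), ["person", "head"]))
print(check_labels((1, 25200, 85), ["person", "head"]))
```

In the probe callback, tensor.dims() could be passed as the shape. A mismatch would suggest that the labels or the model-proc file belong to a different model than the one loaded.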

Iffa_Intel · ‎02-04-2024

Could you share:

The steps you took to convert, optimize, and quantize the model
The OpenVINO sample code you are referring to that produces the issue
The steps you took up to the point of error

Cordially,

Iffa

Iffa_Intel · ‎02-14-2024

Hi,

Thank you for your question. If you need any additional information from Intel, please submit a new question, as Intel is no longer monitoring this thread.

Cordially,

Iffa