I am using the person-detection-action-recognition-0005 pre-trained model from OpenVINO to detect people and recognize their actions.
From this documentation, I wrote a Python script to get the detections.
This is the script:
import cv2

def main():
    print(cv2.__file__)
    frame = cv2.imread('/home/naveen/Downloads/person.jpg')
    actionNet = cv2.dnn.readNet('person-detection-action-recognition-0005.bin',
                                'person-detection-action-recognition-0005.xml')
    actionBlob = cv2.dnn.blobFromImage(frame, size=(680, 400))
    actionNet.setInput(actionBlob)

    # detection output
    actionOut = actionNet.forward(['mbox_loc1/out/conv/flat',
                                   'mbox_main_conf/out/conv/flat/softmax/flat',
                                   'out/anchor1', 'out/anchor2',
                                   'out/anchor3', 'out/anchor4'])

    # this is the part where I don't know how to get the person bbox
    # and the action label for each person from actionOut
    for detection in actionOut.reshape(-1, 3):
        print('sitting ' + str(detection))
        print('standing ' + str(detection))
        print('raising hand ' + str(detection))

if __name__ == '__main__':
    main()
Now, I don't know how to get the bounding boxes and action labels from the output variable (actionOut). I am unable to find any documentation or blog post explaining this.
Does anyone have an idea or suggestion for how this can be done?
You can refer to this demo code to get an idea of how to deal with the output of this model.
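In short: mbox_loc1/out/conv/flat holds SSD-encoded box deltas (4 values per prior box), mbox_main_conf/out/conv/flat/softmax/flat holds (background, person) confidences (2 values per prior), and out/anchor1 through out/anchor4 are 3 x 25 x 43 score maps with one channel per action (sitting, standing, raising hand). The model documentation also lists a mbox/priorbox output carrying the prior boxes and variances needed to decode the deltas. Below is a minimal sketch of that post-processing, not a drop-in implementation: the (row, column, anchor) prior layout and the 0.4 confidence threshold are my assumptions, so verify them against the demo.

import cv2
import numpy as np

ACTIONS = ['sitting', 'standing', 'raising hand']  # channel order per the model docs
NUM_ANCHORS = 4
CONF_THRESHOLD = 0.4  # assumption: tune for your data

frame = cv2.imread('/home/naveen/Downloads/person.jpg')
frameH, frameW = frame.shape[:2]

net = cv2.dnn.readNet('person-detection-action-recognition-0005.bin',
                      'person-detection-action-recognition-0005.xml')
net.setInput(cv2.dnn.blobFromImage(frame, size=(680, 400)))
loc, conf, priors, a1, a2, a3, a4 = net.forward([
    'mbox_loc1/out/conv/flat',
    'mbox_main_conf/out/conv/flat/softmax/flat',
    'mbox/priorbox',
    'out/anchor1', 'out/anchor2', 'out/anchor3', 'out/anchor4'])

loc = loc.reshape(-1, 4)          # SSD box deltas, one row per prior
conf = conf.reshape(-1, 2)[:, 1]  # person confidence (column 0 is background)
priors = priors.reshape(2, -1, 4) # [0] = prior boxes, [1] = variances
anchors = [a1, a2, a3, a4]        # each is 1 x 3 x 25 x 43
gridH, gridW = a1.shape[2], a1.shape[3]

for p in np.where(conf > CONF_THRESHOLD)[0]:
    # Standard SSD decoding: deltas are relative to the prior box,
    # scaled by the per-prior variances; coordinates are normalized.
    pxmin, pymin, pxmax, pymax = priors[0, p]
    vx, vy, vw, vh = priors[1, p]
    dx, dy, dw, dh = loc[p]
    pw, ph = pxmax - pxmin, pymax - pymin
    cx = vx * dx * pw + (pxmin + pxmax) / 2
    cy = vy * dy * ph + (pymin + pymax) / 2
    bw = np.exp(vw * dw) * pw
    bh = np.exp(vh * dh) * ph
    xmin, ymin = int((cx - bw / 2) * frameW), int((cy - bh / 2) * frameH)
    xmax, ymax = int((cx + bw / 2) * frameW), int((cy + bh / 2) * frameH)

    # Assumption: priors are laid out in the usual SSD order (row, column,
    # anchor), so each prior maps to one cell of one action head.
    anchorId = p % NUM_ANCHORS
    cell = p // NUM_ANCHORS
    row, col = cell // gridW, cell % gridW
    action = ACTIONS[int(np.argmax(anchors[anchorId][0, :, row, col]))]

    print(action, float(conf[p]), (xmin, ymin, xmax, ymax))

You would still want to run non-maximum suppression (e.g. cv2.dnn.NMSBoxes) on the decoded boxes, since many overlapping priors fire for one person. The demo linked above implements the same post-processing in C++ and is the authoritative reference for the exact prior-box parameters.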