Intel® Distribution of OpenVINO™ Toolkit
Community support and discussions about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all things computer vision-related on Intel® platforms.

The "gvaaudiodetect" element reports inference events that are lower than the threshold set

NikhilP
Beginner

Hi,

I am using DLStreamer version 2021.4.X. We are running an audio pipeline with the aclnet model, with the confidence threshold set to 0.8:

location=/home/ubuntu/Work/inputVideo/how_are_you_doing.wav ! decodebin ! audioresample ! audioconvert ! audio/x-raw, channels=1,format=S16LE,rate=16000 ! audiomixer output-buffer-duration=100000000 ! gvaaudiodetect model=/home/ubuntu/Work/public/aclnet/FP32/aclnet.xml model-proc=/home/ubuntu/Work/model_proc/aclnet.json threshold=0.8 sliding-window=0.2 ! gvametaconvert ! gvametapublish file-format=json-lines ! fakesink

I see that inference events are reported, but some are below 0.8 (the first two events in the results, with confidence 0.7 and 0.67). Is this a bug? I expect only events above the 0.8 threshold to be reported.

The results are below; the input clip is attached as a zip file.

Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
{"channels":1,"events":[{"detection":{"confidence":0.7,"label":"Can opening","label_id":35,"segment":{"end_timestamp":1000000000,"start_timestamp":0}},"end_timestamp":1000000000,"event_type":"Can opening","start_timestamp":0}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":0.67,"label":"Cow","label_id":4,"segment":{"end_timestamp":1200000000,"start_timestamp":200000000}},"end_timestamp":1200000000,"event_type":"Cow","start_timestamp":200000000}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":0.99,"label":"Speech","label_id":53,"segment":{"end_timestamp":1400000000,"start_timestamp":400000000}},"end_timestamp":1400000000,"event_type":"Speech","start_timestamp":400000000}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":1.0,"label":"Speech","label_id":53,"segment":{"end_timestamp":1600000000,"start_timestamp":600000000}},"end_timestamp":1600000000,"event_type":"Speech","start_timestamp":600000000}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":1.0,"label":"Speech","label_id":53,"segment":{"end_timestamp":1800000000,"start_timestamp":800000000}},"end_timestamp":1800000000,"event_type":"Speech","start_timestamp":800000000}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":1.0,"label":"Speech","label_id":53,"segment":{"end_timestamp":2000000000,"start_timestamp":1000000000}},"end_timestamp":2000000000,"event_type":"Speech","start_timestamp":1000000000}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":1.0,"label":"Speech","label_id":53,"segment":{"end_timestamp":2200000000,"start_timestamp":1200000000}},"end_timestamp":2200000000,"event_type":"Speech","start_timestamp":1200000000}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":1.0,"label":"Speech","label_id":53,"segment":{"end_timestamp":2400000000,"start_timestamp":1400000000}},"end_timestamp":2400000000,"event_type":"Speech","start_timestamp":1400000000}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":1.0,"label":"Speech","label_id":53,"segment":{"end_timestamp":2600000000,"start_timestamp":1600000000}},"end_timestamp":2600000000,"event_type":"Speech","start_timestamp":1600000000}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":1.0,"label":"Speech","label_id":53,"segment":{"end_timestamp":2800000000,"start_timestamp":1800000000}},"end_timestamp":2800000000,"event_type":"Speech","start_timestamp":1800000000}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":1.0,"label":"Speech","label_id":53,"segment":{"end_timestamp":3000000000,"start_timestamp":2000000000}},"end_timestamp":3000000000,"event_type":"Speech","start_timestamp":2000000000}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":0.99,"label":"Speech","label_id":53,"segment":{"end_timestamp":3200000000,"start_timestamp":2200000000}},"end_timestamp":3200000000,"event_type":"Speech","start_timestamp":2200000000}],"rate":16000}
{"channels":1,"events":[{"detection":{"confidence":0.99,"label":"Speech","label_id":53,"segment":{"end_timestamp":3400000000,"start_timestamp":2400000000}},"end_timestamp":3400000000,"event_type":"Speech","start_timestamp":2400000000}],"rate":16000}
Got EOS from element "pipeline0".
Execution ended after 0:00:00.051476798
Setting pipeline to NULL ...
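As a workaround while the threshold question is open, the json-lines output from gvametapublish can be post-filtered in a small script. The sketch below is illustrative, not part of DLStreamer; the function name `filter_events` is hypothetical, and the field layout simply matches the output shown above:

```python
import json

def filter_events(lines, threshold=0.8):
    """Keep only detection events whose confidence meets the threshold.

    Messages whose events are all below the threshold are dropped entirely.
    """
    kept = []
    for line in lines:
        msg = json.loads(line)
        events = [e for e in msg.get("events", [])
                  if e["detection"]["confidence"] >= threshold]
        if events:
            msg["events"] = events
            kept.append(msg)
    return kept

# Two messages mimicking the output above: one below and one above 0.8.
sample = [
    '{"channels":1,"events":[{"detection":{"confidence":0.7,"label":"Can opening","label_id":35,"segment":{"end_timestamp":1000000000,"start_timestamp":0}},"end_timestamp":1000000000,"event_type":"Can opening","start_timestamp":0}],"rate":16000}',
    '{"channels":1,"events":[{"detection":{"confidence":0.99,"label":"Speech","label_id":53,"segment":{"end_timestamp":1400000000,"start_timestamp":400000000}},"end_timestamp":1400000000,"event_type":"Speech","start_timestamp":400000000}],"rate":16000}',
]
print(len(filter_events(sample)))  # prints 1: only the 0.99 "Speech" message survives
```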


  

4 Replies
NikhilP
Beginner

OK, so maybe I see the problem.

The model_proc (JSON) file for the aclnet model has a "threshold" for each individual output label, and each of these is set to 0.5, so maybe that is what is taking effect. If you have a different theory, do let me know.

Thank you,

Nikhil

Peh_Intel
Moderator

Hi Nikhil,

 

Yes, you are right. The threshold defined in the model_proc file takes effect.

 

For your information, the labels under "output_postproc" in the model_proc must be either strings or objects with index, label and threshold fields.

 

The threshold value set when launching gvaaudiodetect takes effect only if the model_proc contains a plain array of label strings.

 

Example:

"output_postproc": [
    {
        "layer_name": "output",
        "converter": "audio_labels",
        "labels": [
            "Dog",
            "Rooster",
            "Pig"
        ]
    }
]

 

I have attached the model_proc file as well (it needs to be renamed with a .json extension). The output results from these two different model_proc files are the same; they differ only in the way the threshold value is set.
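For contrast, the object form of the labels (where each label carries its own threshold, overriding the element's threshold property) would look roughly like the sketch below; the indices and threshold values here are illustrative, not taken from the actual aclnet model_proc:

```json
"output_postproc": [
    {
        "layer_name": "output",
        "converter": "audio_labels",
        "labels": [
            { "index": 0, "label": "Dog", "threshold": 0.5 },
            { "index": 1, "label": "Rooster", "threshold": 0.5 },
            { "index": 2, "label": "Pig", "threshold": 0.5 }
        ]
    }
]
```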

 

 

Regards,

Peh

 

NikhilP
Beginner

Hi Peh,

 

The aclnet JSON you attached hands control over to the threshold setting of "gvaaudiodetect", which is what I was looking for.

Thank you, this helps!


Regards,

Nikhil

Peh_Intel
Moderator

Hi Nikhil,


I am glad that I was able to help.


This thread will no longer be monitored since this issue has been resolved. If you need any additional information from Intel, please submit a new question.



Regards,

Peh

