In the previous thread, I successfully displayed the output of a person attribute recognition model using a DLStreamer pipeline. But while debugging, I saw that the model reports only a single confidence score, whereas I need the value for each attribute for further post-processing in a gvapython element.
This is my current pipeline:
Hi hungtrieu07,
Thank you for reaching out to us.
Does the issue occur when you run with the GST_DEBUG environment variable set? On another note, could you please share a screenshot of the result with us?
Regards,
Wan
Hi @Wan_Intel,
I think that is the default output when I use "compound" as the method. This is my model-proc.json file:
This is screen output:
Hi hungtrieu07,
Thank you for sharing the model-proc.json file with us.
I replicated the issue with the model-proc.json file and the command you shared above (I removed the gvapython module, since I do not have the file), and I encountered the same issue as you did.
We will further investigate the issue, and we will provide an update here as soon as possible.
Regards,
Wan
Hello Trieu,
Based on your pipeline and model-proc.json, I understand you're seeing only one confidence score for all attributes when you need individual scores for each attribute. This is happening because you're using the "compound" method in your output post-processing configuration.
The "compound" method in gvaclassify combines all attributes into a single classification result with one confidence score. This isn't what you want for attribute recognition where you need separate confidence scores for each attribute. This can be solved by changing the method from "compound" to "multi-label" in your model-proc.json.
For example:
{
"json_schema_version": "2.2.0",
"input_preproc": [
{
"params": {
"resize": "aspect-ratio",
"range": [
0.0,
1.0
]
}
}
],
"output_postproc": [
{
"layer_name": "output",
"attribute_name": "person-attributes",
"labels": [
"accessoryHat",
"hairLong",
"hairShort",
"upperBodyShortSleeve",
"upperBodyBlack",
"upperBodyBlue",
"upperBodyBrown",
"upperBodyGreen",
"upperBodyGrey",
"upperBodyOrange",
"upperBodyPink",
"upperBodyPurple",
"upperBodyRed",
"upperBodyWhite",
"upperBodyYellow",
"upperBodyLongSleeve",
"lowerBodyShorts",
"lowerBodyShortSkirt",
"lowerBodyBlack",
"lowerBodyBlue",
"lowerBodyBrown",
"lowerBodyGreen",
"lowerBodyGrey",
"lowerBodyOrange",
"lowerBodyPink",
"lowerBodyPurple",
"lowerBodyRed",
"lowerBodyWhite",
"lowerBodyYellow",
"lowerBodyLongSkirt",
"footwearLeatherShoes",
"footwearSandals",
"footwearShoes",
"footwearSneaker",
"carryingBackpack",
"carryingMessengerBag",
"carryingLuggageCase",
"carryingSuitcase",
"personalLess30",
"personalLess45",
"personalLess60",
"personalLarger60",
"personalLess15",
"personalMale",
"personalFemale"
],
"converter": "label",
"method": "multi-label"
}
]
}
Individual values can later be processed in gvapython like this (using the gstgva VideoFrame API; in DL Streamer, classification results are attached to each region as tensors):
def process_frame(frame):
    for roi in frame.regions():
        if roi.label() == "person":
            # each gvaclassify result is attached to the region as a tensor
            for tensor in roi.tensors():
                print(f"Attribute: {tensor.label()}, Confidence: {tensor.confidence()}")
                # Add your custom post-processing here
    return True
Would this be an acceptable solution for you?
Hi @Witold_Intel, thanks for the reply. After I changed the "method" from "compound" to "multi-label", I received the output in this screenshot.
As you can see, the output still contains only one class. I'm using this pipeline to test the model; the model-proc-2.json file is the model-proc example you wrote above:
This is a sample of model output from the debug terminal:
0:00:18.617587372 52779 0x79a3bc000b70 INFO jsonconverter jsonconverter.cpp:347:to_json:<gvametaconvert0> JSON message: {"objects":[{"detection":{"bounding_box":{"x_max":0.9991917759059632,"x_min":0.9120402472025191,"y_max":0.8080614210376353,"y_min":0.5386163791807945},"confidence":0.6215878129005432,"label":"person","label_id":3},"h":291,"person-attributes":{"confidence":0.9990977048873901,"label":"hairShort","label_id":2,"model":{"name":"main_graph"}},"region_id":824,"roi_type":"person","w":167,"x":1751,"y":582}],"resolution":{"height":1080,"width":1920},"timestamp":13800000000}
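As a side note, the structure above can be inspected with a short script; this is a sketch that parses a trimmed version of the gvametaconvert JSON message shown above (field values abbreviated, not real output):

```python
import json

# Trimmed version of the gvametaconvert message above
message = (
    '{"objects":[{"detection":{"confidence":0.62,"label":"person"},'
    '"person-attributes":{"confidence":0.9991,"label":"hairShort","label_id":2}}]}'
)

data = json.loads(message)
for obj in data["objects"]:
    attrs = obj.get("person-attributes")
    if attrs is not None:
        # With the "label" converter only one label/confidence pair appears here;
        # a working multi-label converter should emit one entry per attribute.
        print(attrs["label"], attrs["confidence"])
```

This makes the symptom visible: `person-attributes` carries a single label and confidence instead of one per attribute.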
Hello Huang,
Many thanks for sharing the outcome. Could you please share the model-proc-2.json so I can check it? Alternatively, you can quickly run a benchmark on the model to check the raw outputs (there should be 45 floats for the 45 attributes):
benchmark_app -m model.xml -i input_image.jpg -d CPU -api sync
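Once you have the 45 raw floats, per-attribute confidences can be recovered by applying a sigmoid and thresholding. A minimal sketch with made-up logits (only the first three of the 45 labels, values are not real model output):

```python
import numpy as np

labels = ["accessoryHat", "hairLong", "hairShort"]  # first 3 of the 45 labels
logits = np.array([-2.0, 0.3, 1.5])                 # illustrative values only

scores = 1.0 / (1.0 + np.exp(-logits))              # sigmoid -> per-attribute confidence
detected = {name: float(s) for name, s in zip(labels, scores) if s > 0.5}
print(detected)
```

Each attribute gets its own score this way, which is exactly what the multi-label post-processing is supposed to produce.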
Hi @Witold_Intel, this is the model-proc-2.json file: https://drive.google.com/file/d/1jiInFPD8x4rjKsNb1QS4kULrC55VCew9/view?usp=sharing
I also ran the benchmark tool; this is the output from the terminal:
(person-attr) (base) hungtrieu07@potato:~/dev/pedestrian-attribute-recognition-pytorch$ benchmark_app -m onnx_models/openvino_model/model.xml -i test_person_attr.png -d CPU -api sync
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2025.0.0-49-2b71586676c
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2025.0.0-49-2b71586676c
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(CPU) performance hint will be set to PerformanceMode.LATENCY.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 21.39 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ] input (node: input) : f32 / [...] / [1,3,224,224]
[ INFO ] Model outputs:
[ INFO ] output (node: output) : f32 / [...] / [1,45]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ] input (node: input) : u8 / [N,C,H,W] / [1,3,224,224]
[ INFO ] Model outputs:
[ INFO ] output (node: output) : f32 / [...] / [1,45]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 189.58 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ] NETWORK_NAME: main_graph
[ INFO ] OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ] NUM_STREAMS: 1
[ INFO ] INFERENCE_NUM_THREADS: 8
[ INFO ] PERF_COUNT: NO
[ INFO ] INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ] PERFORMANCE_HINT: LATENCY
[ INFO ] EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] ENABLE_CPU_PINNING: True
[ INFO ] SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ] MODEL_DISTRIBUTION_POLICY: set()
[ INFO ] ENABLE_HYPER_THREADING: False
[ INFO ] EXECUTION_DEVICES: ['CPU']
[ INFO ] CPU_DENORMALS_OPTIMIZATION: False
[ INFO ] LOG_LEVEL: Level.NO
[ INFO ] CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ] DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ] KV_CACHE_PRECISION: <Type: 'uint8_t'>
[ INFO ] KEY_CACHE_PRECISION: <Type: 'uint8_t'>
[ INFO ] VALUE_CACHE_PRECISION: <Type: 'uint8_t'>
[ INFO ] KEY_CACHE_GROUP_SIZE: 0
[ INFO ] VALUE_CACHE_GROUP_SIZE: 0
[Step 9/11] Creating infer requests and preparing input tensors
[ INFO ] Prepare image /home/hungtrieu07/dev/pedestrian-attribute-recognition-pytorch/test_person_attr.png
[ WARNING ] Image is resized from ((539, 250)) to ((224, 224))
[Step 10/11] Measuring performance (Start inference synchronously, limits: 60000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 40.10 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count: 3819 iterations
[ INFO ] Duration: 60006.01 ms
[ INFO ] Latency:
[ INFO ] Median: 14.93 ms
[ INFO ] Average: 15.63 ms
[ INFO ] Min: 14.04 ms
[ INFO ] Max: 31.55 ms
[ INFO ] Throughput: 63.64 FPS
Hello Hung,
I checked the JSON and noticed two issues:
The output shows only one attribute (e.g., "hairShort") despite using "multi-label" method.
Using "converter": "label" selects the top result, rather than outputting all attributes.
Here's my suggestion to correct it:
{
"json_schema_version": "2.2.0",
"input_preproc": [
{
"params": {
"resize": "aspect-ratio",
"range": [0.0, 1.0]
}
}
],
"output_postproc": [
{
"layer_name": "output",
"attribute_name": "person-attributes",
"labels": [
"accessoryHat",
"hairLong",
"hairShort",
"upperBodyShortSleeve",
"upperBodyBlack",
"upperBodyBlue",
"upperBodyBrown",
"upperBodyGreen",
"upperBodyGrey",
"upperBodyOrange",
"upperBodyPink",
"upperBodyPurple",
"upperBodyRed",
"upperBodyWhite",
"upperBodyYellow",
"upperBodyLongSleeve",
"lowerBodyShorts",
"lowerBodyShortSkirt",
"lowerBodyBlack",
"lowerBodyBlue",
"lowerBodyBrown",
"lowerBodyGreen",
"lowerBodyGrey",
"lowerBodyOrange",
"lowerBodyPink",
"lowerBodyPurple",
"lowerBodyRed",
"lowerBodyWhite",
"lowerBodyYellow",
"lowerBodyLongSkirt",
"footwearLeatherShoes",
"footwearSandals",
"footwearShoes",
"footwearSneaker",
"carryingBackpack",
"carryingMessengerBag",
"carryingLuggageCase",
"carryingSuitcase",
"personalLess30",
"personalLess45",
"personalLess60",
"personalLarger60",
"personalLess15",
"personalMale",
"personalFemale"
],
"converter": "multi_label",
"method": "multi-label",
"threshold": 0.5,
"activation": "sigmoid"
}
]
}
Hello Hung,
Have you been able to read my previous post? Please respond within 3 business days, otherwise I will have to deescalate this issue.
Hi Witold, I'm having problems with my browser, so I'm posting from my phone. I tried your solution, but it returned this error: Unsupported converter: multi_label
This is the log from terminal: https://drive.google.com/file/d/1KdFCw_YcEtZDpRElILWmPQ7oUwkjx4s7/view?usp=sharing
For some reason, I can't post the log in this reply.
Hi Hung,
What's your version of DL Streamer? Newer versions should support the multi-label converter.
You can upgrade through the command:
sudo apt-get update && sudo apt-get install intel-dlstreamer
Hi Witold,
I'm using the latest DLStreamer Docker image, started with these two commands:
xhost +local:docker
docker run -it --user root -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix --rm \
-v /home/hungtrieu07/dev/zvms_models:/home/dlstreamer/dlstreamer/zvms_models \
-v /home/hungtrieu07/dev/video_data:/home/dlstreamer/dlstreamer/video_data \
intel/dlstreamer:latest
Hi Hung,
It's a little strange; "multi-label" should work with that version of DL Streamer. An alternative could be to use "converter": "tensor_to_attr_meta".
Hi Witold,
I'm using "converter": "tensor_to_attr_meta" and it still returned error about unsupported converter. This is the log:
Setting pipeline to PAUSED ...
0:00:00.027776719 2632 0x58bd1fc3db50 WARN xcontext xvcontext.c:546:gst_xvcontext_check_xshm_calls: MIT-SHM extension check failed at XShmAttach. Not using shared memory.
Pipeline is PREROLLING ...
0:00:00.036074797 2632 0x7c582c092000 WARN qtdemux qtdemux.c:554:gst_qtdemux_pull_atom:<qtdemux0> atom has bogus size 1675843197
Redistribute latency...
Redistribute latency...
Redistribute latency...
0:00:00.092254907 2632 0x7c582c092860 WARN GVA_common openvino_image_inference.cpp:1131:configure_image_input: Layout for 'x' input is not explicitly set, so it's defaulted to [?,?,H,W]
0:00:00.328821240 2632 0x7c582c092860 WARN GVA_common openvino_image_inference.cpp:1131:configure_image_input: Layout for 'input' input is not explicitly set, so it's defaulted to [?,?,H,W]
0:00:00.482218718 2632 0x7c582c092860 ERROR GVA_common post_processor_impl.cpp:120:PostProcessorImpl: Post-processing error: Unsupported converter: tensor_to_attr_meta
0:00:00.482245134 2632 0x7c582c092860 ERROR GVA_common post_processor_c.cpp:22:createPostProcessor: Couldn't create post-processor:
Failed to create PostProcessorImpl
Unsupported converter: tensor_to_attr_meta
0:00:00.482285951 2632 0x7c582c092860 WARN gva_base_inference gva_base_inference.cpp:805:gva_base_inference_set_caps:<gvaclassify0> error: base_inference based element initialization has been failed.
0:00:00.482289589 2632 0x7c582c092860 WARN gva_base_inference gva_base_inference.cpp:805:gva_base_inference_set_caps:<gvaclassify0> error:
post-processing is NULL.
ERROR: from element /GstPipeline:pipeline0/GstGvaClassify:gvaclassify0: base_inference based element initialization has been failed.
Additional debug info:
/opt/intel/dlstreamer/src/monolithic/gst/inference_elements/base/gva_base_inference.cpp(805): gva_base_inference_set_caps (): /GstPipeline:pipeline0/GstGvaClassify:gvaclassify0:
post-processing is NULL.
ERROR: pipeline doesn't want to preroll.
Setting pipeline to NULL ...
0:00:00.482370728 2632 0x7c582c092860 WARN basetransform gstbasetransform.c:1379:gst_base_transform_setcaps:<gvaclassify0> FAILED to configure incaps video/x-raw, format=(string)I420, width=(int)1920, height=(int)1080, interlace-mode=(string)progressive, pixel-aspect-ratio=(fraction)1/1, chroma-site=(string)mpeg2, framerate=(fraction)15/1 and outcaps video/x-raw, format=(string)I420, width=(int)1920, height=(int)1080, interlace-mode=(string)progressive, pixel-aspect-ratio=(fraction)1/1, chroma-site=(string)mpeg2, framerate=(fraction)15/1
0:00:00.822543256 2632 0x7c580c0f92c0 WARN GVA_common inference_impl.cpp:896:PushBufferToSrcPad: Inference gst_pad_push returned status: -2
0:00:00.833077333 2632 0x7c580c0f92c0 WARN GVA_common inference_impl.cpp:896:PushBufferToSrcPad: Inference gst_pad_push returned status: -2
0:00:00.833116931 2632 0x7c580c0f92c0 WARN GVA_common inference_impl.cpp:896:PushBufferToSrcPad: Inference gst_pad_push returned status: -2
0:00:00.834551975 2632 0x7c580c0f92c0 WARN GVA_common inference_impl.cpp:896:PushBufferToSrcPad: Inference gst_pad_push returned status: -2
0:00:00.996459395 2632 0x7c580c0f92c0 WARN GVA_common inference_impl.cpp:896:PushBufferToSrcPad: Inference gst_pad_push returned status: -2
Freeing pipeline ...
I'm using Docker image with name "intel/dlstreamer:2025.0.1.3-ubuntu22".
Hi Witold, I used "converter": "tensor_to_attr_meta", but it still returned the "Unsupported converter" error. I also deleted the DLStreamer Docker image, pulled a fresh "intel/dlstreamer:2025.0.1.3-ubuntu22" image, and tested again; still the same error.
If the "gvaclassify" element has problems here, can you guide me on using the "gvainference" element instead?
