landmarks-regression-retail-0009 vs retinaface-resnet50-pytorch - same input, different output

anuragrawal · 07-03-2024

Hi Team,

I am working on a face recognition pipeline, similar to the demo here: https://docs.openvino.ai/2024/omz_demos_face_recognition_demo_python.html

I want to use the retinaface-resnet50-pytorch model for face detection instead of the two models supported in the demo. (https://docs.openvino.ai/2024/omz_models_model_retinaface_resnet50_pytorch.html)

Since RetinaFace also provides face landmarks, I don't want to use the landmarks regression model used in the demo. (https://docs.openvino.ai/2024/omz_models_model_landmarks_regression_retail_0009.html#landmarks-regression-retail-0009)

When I make these replacements, I see totally different landmarks. For the same input image, this is what I see:

retinaface-resnet50-pytorch: [(0.5292039930820465, 0.6474253833293915),
(0.564214069582522, 0.6451758056879043),
(0.5428659155964851, 0.6772208377718926),
(0.5381209135055542, 0.7076690897345543),
(0.5617115259170532, 0.7057589292526245)]

landmarks-regression-retail-0009: [[0.21510038, 0.35669631],
[0.62190449, 0.36558977],
[0.34216839, 0.57063818],
[0.26720583, 0.77790624],
[0.60777581, 0.8013128 ]]

My questions are:

1) The RetinaFace OpenVINO page states that the landmarks are ordered left eye, right eye, nose, left mouth corner, right mouth corner, but the landmarks-regression-retail-0009 page does not. Is the landmark order in the output the same for these two models?

2) Do you know why these two models predict significantly different outputs? I would expect slight differences since they are different models, but they ran on the same input image, so both outputs cannot be accurate.

Let me know if you need anything additional. Thank you!

Iffa_Intel · 07-05-2024

Hi,

To validate this, could you provide the following?

1) The relevant model files
2) Steps & commands that you used in conversion/inferencing
3) OpenVINO sample application that you use/inferencing code

I believe you are using an Intel pre-trained model. Did you implement any modifications to the model?

Cordially,

Iffa

anuragrawal · 07-05-2024

Hi Iffa,

Thanks for your response! Here are the details:

The relevant model files - present in the google drive link.
Steps & commands that you used in conversion/inferencing - For model download and conversion, I cloned the following repos and ran omz_downloader --list models.lst, followed by omz_converter --list models.lst:
1. https://github.com/openvinotoolkit/open_model_zoo/tree/master/demos/face_recognition_demo/python
2. https://github.com/openvinotoolkit/open_model_zoo/tree/master/demos/object_detection_demo/python
OpenVINO sample application that you use/ inferencing code - present in the google drive link.
I believe you are using Intel pre-trained model, did you implement any modifications to the model? - No, I am using Intel pre-trained models without any modifications.

Everything you need to reproduce this issue is here: https://drive.google.com/file/d/1RBs37zlMevG0uHS3SAOE4NcAsM4VPMpn/view?usp=sharing

Steps to reproduce the issue:

1) Create a Python virtual environment and install the following libraries:
pip install opencv-python
pip install openvino
pip install scipy

2) git clone https://github.com/openvinotoolkit/open_model_zoo.git
3) Extract "landmarks_comparison_demo.zip" (present in the Google Drive link above) and place the extracted directory in the open_model_zoo/demos directory

4) cd to the extracted directory "landmarks_comparison_demo" and run the python script using the following command:
python3 retinaface_vs_landmarks-regression.py -i frame849.jpg

Let me know if you need anything additional.

Thank you,

Anurag

Iffa_Intel · 07-08-2024

Hi,

Thanks for the details.

The two models that you are comparing have different architectures:

The Intel pre-trained model landmarks-regression-retail-0009 uses a CNN architecture with the PReLU activation function.

Meanwhile, retinaface-resnet50-pytorch is a PyTorch model with a ResNet50 backbone.

Because their architectures differ, they will produce different outputs.

To know exactly why they differ, you would need to dive into their respective layers and investigate what each one does.

This is more a question about neural network development than about OpenVINO, since OpenVINO mainly handles the inference part, which uses an already developed neural network model.
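
Before concluding that either model is wrong, it may be worth checking the coordinate frame of each output. The sketch below assumes (this is not confirmed anywhere in this thread) that landmarks-regression-retail-0009 reports coordinates normalized to the face crop it receives, while the RetinaFace values posted above are normalized to the full frame. Under that assumption the two sets should agree up to a per-axis scale and offset. The helper name fit_axis_aligned_transform is hypothetical; the numbers are copied from the first post.

```python
import numpy as np

# Landmark values copied from the first post in this thread.
retinaface_pts = np.array([
    (0.5292039930820465, 0.6474253833293915),
    (0.564214069582522, 0.6451758056879043),
    (0.5428659155964851, 0.6772208377718926),
    (0.5381209135055542, 0.7076690897345543),
    (0.5617115259170532, 0.7057589292526245),
])
regression_pts = np.array([
    [0.21510038, 0.35669631],
    [0.62190449, 0.36558977],
    [0.34216839, 0.57063818],
    [0.26720583, 0.77790624],
    [0.60777581, 0.8013128],
])


def fit_axis_aligned_transform(src, dst):
    """Least-squares fit of dst = scale * src + offset, separately for x and y.

    Hypothetical helper: it only tests the assumption that src is expressed
    relative to an axis-aligned crop of the frame in which dst is expressed.
    """
    scale = np.empty(2)
    offset = np.empty(2)
    for axis in range(2):
        # np.polyfit with degree 1 returns [slope, intercept].
        scale[axis], offset[axis] = np.polyfit(src[:, axis], dst[:, axis], 1)
    mapped = src * scale + offset
    residual = np.abs(mapped - dst)
    return scale, offset, residual


scale, offset, residual = fit_axis_aligned_transform(regression_pts, retinaface_pts)
print("fitted scale (crop size as a fraction of the frame):", scale)
print("fitted offset (crop origin in frame units):", offset)
print("largest per-coordinate residual:", residual.max())
```

A residual that is small relative to the fitted scale would suggest that both models locate the same five points in the same order, just in different coordinate frames. A large residual would point to a genuine disagreement or to a different landmark order.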

Cordially,

Iffa

anuragrawal · 07-09-2024

Thanks Iffa!

Iffa_Intel · 07-10-2024

Hi,

Glad that helps.

If you don't have any further inquiries, shall we proceed with closing this thread?

Cordially,

Iffa

Iffa_Intel · 07-18-2024

Hi,

Intel will no longer monitor this thread since this issue has been resolved. If you need any additional information from Intel, please submit a new question.

Cordially,

Iffa