Intel® Distribution of OpenVINO™ Toolkit
Community assistance for the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.
6506 Discussions

Don't understand: 'Reshaping the model to the height and width of the input image'

wb666greene
Beginner
941 Views

I don't understand this section of the hello_reshape_ssd.py sample code installed by the apt install of 2024.2 on Ubuntu 22.04. The apt install is broken, as is being discussed in another thread here, but I have a much weaker system I'd set up with a PIP install in a VENV virtual environment following the YOLOv8-on-GPU tutorial, so I tried this sample code there, where it runs.

 

But in the code snippet below, shouldn't the image be resized to the model size instead?

    image = cv2.imread(image_path)
    # Add N dimension
    input_tensor = np.expand_dims(image, 0)

    log.info('Reshaping the model to the height and width of the input image')
    n, h, w, c = input_tensor.shape
    model.reshape({model.input().get_any_name(): ov.PartialShape((n, c, h, w))})

The MobilenetSSD_v2 model I converted when using OpenVINO 2021.3 and the "special" OpenVINO OpenCV build's dnn module expected images sized to match the model's 300x300 input.

 

This code works if I feed in a "small" image, say 516x394 or 1920x1080: it draws a box that matches what the 2021.3 dnn inference produced. But if I feed in a larger image like 4K, it breaks the person up into two non-overlapping boxes, and if I feed in a cellphone camera image of 4096x3072, the function runs and detects, but the boxes are poorly located and one even has a negative coordinate (-2).

 

I need to run the inference in a loop and the input images can be different sizes every time.

0 Kudos
5 Replies
Iffa_Intel
Moderator
866 Views

Hi,

 

The Hello Reshape SSD sample demonstrates how to do synchronous inference of object detection models using the Shape Inference feature.

 

Hence, the model is reshaped to the image instead of the image being resized to the model.

 

 

Cordially,

Iffa

 

0 Kudos
mariajaeel
Beginner
785 Views

To implement this, you could use cv2.resize to resize each image to 300x300 (or whatever your model expects) before the reshaping and inference steps. Here's a brief modification to your code:

 

image = cv2.imread(image_path)
image_resized = cv2.resize(image, (300, 300)) # Resize to model's expected input size
input_tensor = np.expand_dims(image_resized, 0)

log.info('Reshaping the model to the height and width of the input image')
n, h, w, c = input_tensor.shape
model.reshape({model.input().get_any_name(): ov.PartialShape((n, c, h, w))})

0 Kudos
Iffa_Intel
Moderator
690 Views

Hi,


Intel will no longer monitor this thread since we have provided a solution. If you need any additional information from Intel, please submit a new question. 


Cordially,

Iffa


0 Kudos
mariajaeel
Beginner
489 Views

I just tried to post the solution, but thanks for informing me.


0 Kudos
Peeterjackson
Beginner
215 Views

For anyone working with the hello_reshape_ssd.py sample code and encountering issues with varied image sizes, the key is to match the input image size to the model's expected dimensions. This helps improve detection accuracy and prevents bounding-box misalignment on larger images, which is especially useful if you're resizing high-resolution images in a loop and aiming for consistent detection performance.

 


0 Kudos
Reply