Image aspect ratio - how does it affect accuracy?

idata · ‎08-08-2018

Hi all, I've been playing with the examples in the zoo and have a question about image aspect ratio: how do I ensure maximum accuracy? For example, the SSD MobileNet app resizes images to 300x300, which is a square (1:1) ratio:

# Neural network assumes input images are these dimensions.
SSDMN_NETWORK_IMAGE_WIDTH = 300
SSDMN_NETWORK_IMAGE_HEIGHT = 300

But the videos that it downloads and uses as examples are all 960x540, which is a 16:9 ratio. For example:

https://raw.githubusercontent.com/nealvis/media/master/traffic_vid/bus_station_6094_960x540.mp4

The code uses OpenCV's resize function. Now, let's compare the results of resizing vs. cropping and then resizing. First, this is what resizing that video from 16:9 to 1:1 looks like:

This is what it looks like when you crop first and then resize:

These are clearly quite different, and my question is: how does this affect accuracy? Should I crop my video to 1:1, or do the example models all expect 16:9 video squashed into a 1:1 ratio? To the eye, it would seem more logical that the model would perform better on images that are not distorted by resizing to a different ratio.

Thanks in advance.

idata · ‎08-08-2018

I was thinking about this too recently. I think the aspect ratio doesn't matter as long as it remains consistent. If the images you input to the network are always 16:9, then training on 16:9 images is fine. This is because the pre-processing before input to the network will always apply the same 'squashedness', so the objects you are trying to detect will always be distorted by the same amount.
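
To put a number on that 'squashedness', here is a minimal sketch of the arithmetic for the sizes mentioned in this thread (960x540 video, 300x300 network input). The function name `stretch_factors` is made up for illustration and is not part of the ncappzoo code.

```python
def stretch_factors(src_w, src_h, dst_w, dst_h):
    """Return (scale_x, scale_y, relative_width) for a plain resize.

    relative_width is how wide objects end up relative to their height
    after the resize; 1.0 would mean no distortion.
    """
    scale_x = dst_w / src_w
    scale_y = dst_h / src_h
    return scale_x, scale_y, scale_x / scale_y


# 16:9 video squashed straight into the square network input.
sx, sy, rel = stretch_factors(960, 540, 300, 300)
print(f"x scale {sx:.4f}, y scale {sy:.4f}, relative width {rel:.4f}")
```

For these sizes the horizontal scale is 300/960 and the vertical scale is 300/540, so every object comes out roughly 9/16 as wide as it should be relative to its height. That factor is the same for every 16:9 frame, which is the consistency described above.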

idata · ‎08-08-2018

Yes, that makes sense. I wonder then, how do you tell what aspect ratio a model was trained on? Specifically, the ones in the ncappzoo?

idata · ‎08-14-2018

Doing a center crop and then scaling down to the network input size will give the best results. We have run lots of experiments to validate this.
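
A center crop followed by a scale-down could be sketched roughly as below. This is only an illustration under a few assumptions: frames are NumPy arrays as returned by OpenCV's video capture, the helper names `center_crop_box` and `crop_and_resize` are invented here, and the 300x300 default mirrors the SSD MobileNet constants quoted earlier in the thread. It is not the code used in the ncappzoo examples.

```python
def center_crop_box(width, height):
    """Return (x0, y0, side, side) for the largest centered square."""
    side = min(width, height)
    x0 = (width - side) // 2
    y0 = (height - side) // 2
    return x0, y0, side, side


def crop_and_resize(frame, net_w=300, net_h=300):
    """Center-crop a frame to a square, then scale to the network input size.

    Assumes `frame` is an OpenCV image (NumPy array shaped rows x cols x channels).
    """
    import cv2  # imported here so center_crop_box works without OpenCV installed

    height, width = frame.shape[:2]
    x0, y0, w, h = center_crop_box(width, height)
    cropped = frame[y0:y0 + h, x0:x0 + w]
    # cv2.resize takes the target size as (width, height).
    return cv2.resize(cropped, (net_w, net_h))


# For the 960x540 example video the crop keeps the middle 540x540 region.
print(center_crop_box(960, 540))
```

Note that the crop discards 210 pixels on each side of a 960x540 frame, so objects near the left and right edges will not reach the network at all. Whether that trade-off is acceptable depends on where the objects of interest appear in your video.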

idata · ‎08-16-2018

Thanks Victor, agreed. I ran some experiments myself and came to the same conclusion: warping or distorting the image is a bad idea, even if it can still work.

I think the examples in the ncappzoo should be changed to work on cropped video, otherwise it's just misleading for newcomers who might assume that 16:9 is the desirable ratio.