Hi all, I've been playing with the examples in the zoo and have a question about image aspect ratio: how do I ensure maximum accuracy? For example, the SSD MobileNet app resizes images to 300x300, which is a square (1:1) ratio:
# Neural network assumes input images are these dimensions.
SSDMN_NETWORK_IMAGE_WIDTH = 300
SSDMN_NETWORK_IMAGE_HEIGHT = 300
But the videos that it downloads and uses as examples are all 960x540, which is a 16:9 ratio. For example:
https://raw.githubusercontent.com/nealvis/media/master/traffic_vid/bus_station_6094_960x540.mp4
The code uses OpenCV's resize filter. Now, let's compare the results of resizing vs. cropping then resizing. First, this is what resizing that video from 16:9 to 1:1 looks like:
This is what it looks like when you crop first and then resize:
These are clearly quite different, and my question is: how does this affect accuracy? Should I crop my video to 1:1 instead, or do the example models all expect 16:9 video squashed into a 1:1 ratio? Intuitively, it seems more logical that the model would perform better on images that are not distorted by resizing to a different aspect ratio.
Thanks in advance.
I was thinking about this recently too. I think the aspect ratio doesn't matter as long as it remains consistent between training and inference. If the images you feed to the network are always 16:9, then training on 16:9 images is fine. This is because the pre-processing before the network input always produces the same 'squashedness', so the objects you are trying to detect are always distorted by the same amount.
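To put a number on that 'squashedness', here is a minimal sketch of the arithmetic for the sizes mentioned in this thread (960x540 video, 300x300 network input). The function name `squash_factors` is made up for illustration and is not part of the ncappzoo code.

```python
def squash_factors(src_w, src_h, dst_w, dst_h):
    """Return (horizontal scale, vertical scale, width-to-height distortion)
    for a plain resize that ignores the source aspect ratio."""
    sx = dst_w / src_w   # horizontal scale factor
    sy = dst_h / src_h   # vertical scale factor
    return sx, sy, sx / sy

# 16:9 video frame squashed directly into the square network input.
sx, sy, distortion = squash_factors(960, 540, 300, 300)
print(sx, sy, distortion)
```

For these sizes the distortion works out to 9/16, so every object ends up about 56% as wide, relative to its height, as it was in the original frame. A distortion of 1.0 would mean the aspect ratio is preserved.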
Yes, that makes sense. I wonder, then, how do you tell what aspect ratio a model was trained on? Specifically, the ones in the appzoo?
Doing a center crop and then scaling down to the network input size will give the best results. We have run lots of experiments to validate this.
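As a rough sketch of the center-crop step, the helper below computes the largest centered region of a frame that matches the network's aspect ratio. The name `center_crop_box` is hypothetical and this is not the ncappzoo implementation; it only does the box arithmetic, so it runs without OpenCV.

```python
def center_crop_box(src_w, src_h, dst_w, dst_h):
    """Return (x0, y0, x1, y1) of the largest centered region of a
    src_w x src_h frame that has the same aspect ratio as dst_w x dst_h."""
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_w = src_h * dst_w // dst_h
        crop_h = src_h
    else:
        # Source is taller than (or equal to) the target: keep full width.
        crop_w = src_w
        crop_h = src_w * dst_h // dst_w
    x0 = (src_w - crop_w) // 2
    y0 = (src_h - crop_h) // 2
    return x0, y0, x0 + crop_w, y0 + crop_h

# 960x540 frame cropped to 1:1 for a 300x300 network input.
box = center_crop_box(960, 540, 300, 300)
print(box)  # (210, 0, 750, 540)
```

With an OpenCV frame (a NumPy array indexed as rows, then columns), the crop and scale would then be something like `cropped = frame[y0:y1, x0:x1]` followed by `cv2.resize(cropped, (300, 300))`. Note that for 960x540 video this discards 210 pixel columns on each side, so objects near the left and right edges are no longer seen by the network.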
Thanks, Victor. Agreed: I ran some experiments myself and came to the same conclusion, that warping or distorting the image is a bad idea even if it can still work.
I think the examples in the ncappzoo should be changed to work on cropped video. Otherwise they are misleading for newcomers, who might assume that 16:9 is the desirable ratio.