I want to fully understand how to handle the requirement to align the frame size to multiples of 16/32 (I am using only progressive frames, so I will use the value 16 from here on). I am encoding an external RGB32 buffer to H264 using a pipeline of VPP and ENCODE.
1) Is this needed for both VPP and ENCODE?
2) What actually happens when the frame size is not a multiple of 16? E.g. I have an external frame buffer whose size violates the "multiple of 16" rule. Do I need to actually pad each line to the rounded-up value with empty pixels, or do I just need to set the right width and height while the pitch is set from the actual width? Or should the value always be rounded down, skipping some pixels in the original buffer?
3) What happens in the region of the excess pixels in the output picture when the value is rounded up? If extra padding is added at the end of each pixel row, do the excess pixels get their data from there?
Maybe I am getting something wrong or lacking some knowledge but at this point I'm quite unsure how to successfully tackle the "multiple of 16" issue.
The requirement for frame size to be a multiple of 16 comes from hardware design. The actual resolution used by VPP or the encoder internally will be a multiple of 16 with output cropped to the requested resolution. What happens on the edges if the input is not a multiple of 16 (height and width) will be unpredictable. You may see artifacts or noise. What I'm seeing in my experiments looks like what could generally be expected from uninitialized memory at the edges. Media SDK is robust, but there may still be cases where this could cause the GPU to hang. The application is expected to provide data meeting these resolution requirements for any Media SDK pipeline, but it is up to you to determine whether cropping or padding is most appropriate.
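To make the two options concrete (pad up versus crop down), here is a minimal sketch of the size arithmetic for a progressive frame. The `FrameDims` struct and the `dims_padded`/`dims_cropped` helpers are hypothetical names for illustration; the fields only mirror the Width/Height/CropW/CropH naming used in this thread, and nothing here calls Media SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the size fields of a frame description. */
typedef struct {
    uint16_t Width;   /* allocated width, a multiple of 16 */
    uint16_t Height;  /* allocated height, a multiple of 16 (progressive) */
    uint16_t CropW;   /* visible width */
    uint16_t CropH;   /* visible height */
} FrameDims;

static uint16_t round_up16(uint16_t v)   { return (uint16_t)((v + 15u) & ~15u); }
static uint16_t round_down16(uint16_t v) { return (uint16_t)(v & ~15u); }

/* Option A: pad. Allocate the rounded-up size and keep the real picture
 * size as the visible (crop) region. */
static FrameDims dims_padded(uint16_t w, uint16_t h)
{
    assert(w > 0 && h > 0);
    FrameDims d = { round_up16(w), round_up16(h), w, h };
    return d;
}

/* Option B: crop. Drop trailing columns and rows so that the picture
 * itself has an aligned size. */
static FrameDims dims_cropped(uint16_t w, uint16_t h)
{
    assert(w >= 16 && h >= 16);
    FrameDims d = { round_down16(w), round_down16(h),
                    round_down16(w), round_down16(h) };
    return d;
}
```

For a 1366x768 input, padding gives a 1376x768 allocation with a 1366x768 visible region, while cropping gives 1360x768 and discards the last 6 columns.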
Thanks a lot. I was already leaning towards cropping to solve this, but you cleared up my doubts. One more thing: when cropping is used, can the input buffer and pitch be only as big as the cropped picture? E.g. if I have a frame whose size is not the required multiple of 16, can I set the width and height to aligned values, set the crop values to the actual size, and provide a buffer whose pitch is the cropped width, without the SDK trying to access memory outside the buffer?
It is up to the application to provide surfaces with conforming resolutions. However, you could potentially work with offsets, widths, etc. to skip the cropping step when copying to the VPP input surface. It sounds like you're very close. Good luck!
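One way to read the "work with offsets and widths while copying" suggestion is a row-by-row copy from the application's buffer into a surface that already has aligned dimensions. The sketch below is plain memory handling under stated assumptions: `copy_to_aligned_surface` is a hypothetical helper, the pixel format is RGB32 (4 bytes per pixel), and in a real pipeline the destination pointer and pitch would presumably come from the surface you obtained from the SDK. Padding is filled by repeating the last pixel and the last row, which is a design choice to avoid uninitialized memory at the edges, not something the thread prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BYTES_PER_PIXEL = 4 };  /* RGB32 */

/* Copy a width x height image (source pitch in bytes) into a destination
 * of dst_width x dst_height pixels (destination pitch in bytes).
 * Extra columns repeat the last pixel of the row; extra rows repeat the
 * last source row. */
static void copy_to_aligned_surface(const uint8_t *src, size_t src_pitch,
                                    size_t width, size_t height,
                                    uint8_t *dst, size_t dst_pitch,
                                    size_t dst_width, size_t dst_height)
{
    size_t row_bytes = width * BYTES_PER_PIXEL;

    assert(width > 0 && height > 0);
    assert(dst_width >= width && dst_height >= height);
    assert(src_pitch >= row_bytes);
    assert(dst_pitch >= dst_width * BYTES_PER_PIXEL);

    for (size_t y = 0; y < dst_height; ++y) {
        size_t sy = (y < height) ? y : height - 1;   /* clamp to last row */
        const uint8_t *s = src + sy * src_pitch;
        uint8_t *d = dst + y * dst_pitch;

        memcpy(d, s, row_bytes);                     /* real pixels */
        for (size_t x = width; x < dst_width; ++x)   /* padding columns */
            memcpy(d + x * BYTES_PER_PIXEL,
                   s + (width - 1) * BYTES_PER_PIXEL,
                   BYTES_PER_PIXEL);
    }
}
```

Because the source and destination pitches are separate parameters, the application buffer can stay tightly packed at its real width; only the destination has to satisfy the alignment rule.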
I am experiencing a similar problem that makes absolutely no sense.
I have resolutions that are multiples of 16, and still the frame is distorted if I don't pad the surface.
For example resolution 1366x768@32 needs a padding of 40 bytes per line, since the depth is 4 bytes and 1366 rounded up to the nearest multiple of 16 is 1376 hence (1376 - 1366) * 4 = 40. So when I pad like that vpp works fine.
What I didn't expect is to do the same padding for the resolution 1360x768 as 1360 is a multiple of 16. So, if I don't do padding the image is distorted but if I pad 64 bytes (16*4) the image is displayed properly. I experienced the same with 1680x1050 which is also 16 aligned.
Is there a specific set of resolutions that vpp is expecting at its input or is there a sort of formula that shows how to crop or perform padding?
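The padding arithmetic in this post can be written as a small formula. The sketch below uses hypothetical helper names and assumes only the "round up to a multiple of 16" rule discussed in this thread. By that rule, 1366 pixels at 4 bytes each needs 40 bytes per line, while 1360 and 1680 need none, so the distortion seen at those widths probably has a different cause. Note also that 1050 is not a multiple of 16; the height of a 1680x1050 frame would round up to 1056.

```c
#include <assert.h>
#include <stddef.h>

/* Round value up to the next multiple of `multiple` (must be nonzero). */
static size_t round_up_to(size_t value, size_t multiple)
{
    assert(multiple > 0);
    return (value + multiple - 1) / multiple * multiple;
}

/* Bytes of padding to append to each line so that the line holds a
 * multiple-of-16 number of pixels. */
static size_t line_padding_bytes(size_t width_px, size_t bytes_per_px)
{
    assert(bytes_per_px > 0);
    return (round_up_to(width_px, 16) - width_px) * bytes_per_px;
}
```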
I haven't run into problems with any aligned sizes. Could this be due to the restriction that the memory itself must also be aligned? I am using the SDK in a 64-bit application, so I have this guarantee by default, but I guess you could run into issues in 32-bit applications. You could add a trivial assert to check the alignment of the pointers and see whether that is what breaks your scenario.
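The "trivial assert" could look something like the sketch below. `is_aligned` is a hypothetical helper, and the 16-byte boundary in the usage line is an assumption for illustration; the thread does not state which alignment the SDK actually requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when ptr sits on an `alignment`-byte boundary.
 * `alignment` must be a power of two. */
static int is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
```

Usage would be a single line before handing a buffer to the pipeline, for example `assert(is_aligned(frame_buffer, 16));`, where `frame_buffer` stands for your own pointer.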
The input width and height need to be aligned to a multiple of 16 for Media SDK to process them, and the height needs to be aligned to a multiple of 32 if the input is not progressive. One way to do this, suggested in the tutorials, is to use bitwise operators:
VPPParams.vpp.Out.Width = MSDK_ALIGN16(VPPParams.vpp.Out.CropW);
#define MSDK_ALIGN16(value) (((value + 15) >> 4) << 4)
However, I don't see a problem with 1360x768 input. Are you using samples or tutorials where this issue can be reproduced easily?
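For reference, the rounding macro quoted above can be extended to cover the 32 case for non-progressive input. This is a sketch with hypothetical names (`ALIGN16`, `ALIGN32`, `aligned_height`), not the tutorial's own code. The argument is parenthesized so that lower-precedence expressions, such as a ternary, expand as intended.

```c
#include <assert.h>

/* Round up to a multiple of 16 or 32 with shifts, as in the macro quoted
 * in the thread, but with the argument parenthesized. */
#define ALIGN16(value) ((((value) + 15) >> 4) << 4)
#define ALIGN32(value) ((((value) + 31) >> 5) << 5)

/* Height rule from the thread: a multiple of 16 for progressive input,
 * a multiple of 32 otherwise. */
static unsigned aligned_height(unsigned crop_h, int is_progressive)
{
    assert(crop_h > 0);
    return is_progressive ? ALIGN16(crop_h) : ALIGN32(crop_h);
}
```

With these definitions, a 720-line picture stays at 720 when progressive but needs 736 lines when it is not, since 720 is a multiple of 16 but not of 32.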