Skylake HW HEVC decoding is, unfortunately, not resilient to data loss (for the Main profile, at least). With hardware H.264 decoding, since at least 3rd generation Intel Core processors, in case of data loss during decoding (say, due to packet loss in a networking application), at least for the MFX_PROFILE_AVC_HIGH profile, video will usually become slightly garbled, but it is able to recover from this once the data loss stops occurring. With HEVC, however, once data loss has been encountered, the video becomes garbled and stays garbled. This is even true if no data loss ever occurs again after the initial data loss, and it is even true when using VBR and changing the bit rate _after_ the data loss, which one might expect would reset the garbled state. Oftentimes, upon data loss, the decoder outputs a green color that saturates each frame following the data loss--this looks horrible.
Hopefully, this is something that can be addressed in future versions of the drivers and Media SDK.
Hi Aaron, Can you send the modified sample application and the test stream for your experiment? If this is a bug/gap (it may well be), we want to ensure it is reported. Sometimes it requires the decoder to be non-conformant to recover from encoder errors - so want to understand what your case is.
I don't have a modified sample application or a test stream. This is something that I can easily demonstrate using my own software, but it would be a considerable amount of work to create a sample that demonstrates this situation. Also, this has to do with the decoder, not the encoder. Here is perhaps a better example for the workflow:
-- Encode a 720p YV12 or NV12 input file using sample_encode with HEVC. Use 1-2.5 Mbps as the bit rate, use VBR (although I don't think that matters), and pick a target usage of 3 or 4.
-- Modify sample_decode to skip over some of the chunks of the encoded HEVC file while decoding (that is, after reading these chunks from the file, simply discard them). In my software, I use chunks of size 1370 bytes and I skipped over chunks 1000, 1010, 1015, 1020, 1030, 1050, 1100, 1103, and 1110 (0-based indexing). Since I'm dealing with live video, the results differ slightly each time that I run this, but the video always becomes garbled in some fashion, and the areas of the frame that are garbled never return to normal. The green tint frequently occurs but not always.
-- Somehow view the raw output file from sample_decode. In my software, I convert from NV12 to UYVY using IPP and output via a Blackmagic DeckLink card, so I can immediately see the results. Alternatively, re-encode the raw file using sample_encode (as H.264 or HEVC, doesn't matter), mux it into a container (perhaps using MKVToolNix), and use video playback software to see the results, although this approach is less desirable than viewing the raw video directly.
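For anyone who wants to try the chunk-skipping step without touching sample_decode first, here is a rough sketch of the idea. It is not code from the Media SDK samples: `DropChunks` is a hypothetical helper that works on a stream already loaded into memory, and the chunk size and the dropped indices are whatever the caller passes in (1370 bytes and the nine indices listed above, in my case).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper (not part of the Media SDK samples): treats `stream` as
// an array of fixed-size chunks and returns a copy with every chunk whose
// 0-based index appears in `dropped` removed. The final chunk may be shorter
// than `chunkSize`.
std::vector<std::uint8_t> DropChunks(const std::vector<std::uint8_t>& stream,
                                     std::size_t chunkSize,
                                     const std::set<std::size_t>& dropped)
{
    assert(chunkSize > 0);
    std::vector<std::uint8_t> out;
    out.reserve(stream.size());
    std::size_t index = 0;
    for (std::size_t offset = 0; offset < stream.size(); offset += chunkSize, ++index) {
        if (dropped.count(index) != 0) {
            continue;  // simulate a lost chunk: never hand it to the decoder
        }
        const std::size_t len = std::min(chunkSize, stream.size() - offset);
        const auto first = stream.begin() + static_cast<std::ptrdiff_t>(offset);
        out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(len));
    }
    return out;
}
```

Writing the returned buffer to a new file and decoding that file with an unmodified sample_decode should be roughly equivalent to discarding the chunks inside the reader, assuming the reader does not otherwise depend on chunk boundaries.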
If Intel is interested in making HW HEVC decoding more resilient to data loss, then it probably makes sense for Intel to create a sample for properly testing this sort of thing. Perhaps the legacy video conferencing sample can be resurrected for this purpose.
Hi Aaron, thanks for your feedback on Skylake HEVC hardware decode. I'm setting up a reproducer now. Should be able to let you know more about results and next steps tomorrow.
Hi Aaron. Apologies for getting back a day late. I can reproduce at least part of what you're seeing with HEVC.
Expected behavior: If corruption affects frames used as references, output may be garbled until the next IDR frame. (This is what I'm seeing on my system. Not saying I don't believe errors can persist beyond IDR boundaries, I just don't have a reproducer for this yet.)
I did find a potential bug: the Data.Corrupted flag on the output surface should indicate when the decoder encounters parsing problems. It does for H.264, but not for HEVC. Getting this flag to work is one of the first steps toward better robustness.
One strategy to increase robustness for H.264/MPEG-2 is to reset/restart the decoder if surface.Data.Corrupted != 0. As a temporary workaround for HEVC you could reset after a constant number of frames. Of course the best answer is to make this flag operational as soon as possible.
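A minimal sketch of that decision logic, kept separate from the SDK so it can be read on its own. `ResetPolicy` and its members are hypothetical names, not Media SDK API; the `corrupted` argument stands in for the Corrupted field read from each output surface, and the caller would be the one to actually invoke MFXVideoDECODE_Reset() when the policy says so.

```cpp
#include <cstdint>

// Hypothetical helper: call ShouldReset() once per decoded frame.
// It asks for a reset when the frame is flagged as corrupted, or, as a
// fallback for codecs where the flag is not set, after a fixed number of
// frames. A fallback interval of 0 disables the frame-count fallback.
class ResetPolicy {
public:
    explicit ResetPolicy(std::uint32_t fallbackInterval)
        : interval_(fallbackInterval) {}

    bool ShouldReset(std::uint16_t corrupted)
    {
        ++framesSinceReset_;
        const bool flagged = corrupted != 0;
        const bool expired = interval_ != 0 && framesSinceReset_ >= interval_;
        if (flagged || expired) {
            framesSinceReset_ = 0;  // the caller is expected to reset now
            return true;
        }
        return false;
    }

private:
    std::uint32_t interval_;
    std::uint32_t framesSinceReset_ = 0;
};
```

With HEVC on the current driver, only the frame-count branch would ever fire, since the flag is not being set there; the interval is a guess that trades recovery time against however disruptive a reset turns out to be.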
Resilience is important -- thanks for your help making our decoder more resilient. I also agree with you on extending the samples. I added a randomly corrupting reader to CSmplBitstreamReader in sample_utils.cpp for this investigation. No promises or timelines, but I'll certainly advocate for extending the samples/tutorials in this direction in the future.
You indicated that you introduced data corruption into the sample code in order to test this. Depending on how you implemented that, it may not be quite the same as what I described. The situation that I described involves missing data, not corrupted data. So, one or more chunks of the encoded stream are lost for some reason. In the situation that I described, I forced 9 chunks of size 1370 bytes to be lost, and each was fairly close to the next in terms of its index if the encoded video stream is broken up into an array of chunks of size 1370 bytes. When I encounter a missing chunk, my current approach is to simply disregard the missing chunk. So, for example, if I forced chunk 1000 to be lost, immediately after passing chunk 999 to the decoder, I would then pass chunk 1001 to the decoder. Another approach is to duplicate the contents of the previous chunk (so, in this example, after passing chunk 999 to the decoder, pass chunk 999 to the decoder again, in lieu of the missing chunk 1000, and then move on to chunk 1001), but I don't think that either approach is necessarily any better than the other.
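To make the two ways of handling a missing chunk concrete, here is a small sketch that only computes which chunk indices get passed to the decoder, and in what order. `LossHandling` and `FeedOrder` are hypothetical names used for illustration; nothing here calls the Media SDK.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical illustration of the two approaches described above.
enum class LossHandling {
    Skip,           // disregard the missing chunk: 999, 1001, ...
    RepeatPrevious  // resend the last good chunk instead: 999, 999, 1001, ...
};

// Returns the 0-based chunk indices in the order they would be handed to the
// decoder, given which chunks were lost in transit.
std::vector<std::size_t> FeedOrder(std::size_t chunkCount,
                                   const std::set<std::size_t>& lost,
                                   LossHandling mode)
{
    std::vector<std::size_t> order;
    bool havePrevious = false;
    std::size_t previous = 0;
    for (std::size_t i = 0; i < chunkCount; ++i) {
        if (lost.count(i) == 0) {
            order.push_back(i);
            previous = i;
            havePrevious = true;
        } else if (mode == LossHandling::RepeatPrevious && havePrevious) {
            order.push_back(previous);  // stand-in for the missing chunk
        }
        // With Skip, or if nothing has arrived yet, the lost chunk leaves a gap.
    }
    return order;
}
```

Either way the decoder sees a bitstream that is missing data; a randomly corrupting reader instead hands it the right amount of data with wrong contents, which may exercise different code paths.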
Regarding your suggestion to call MFXVideoDECODE_Reset() after a constant number of frames--I would think the use of Reset() could result in data loss in and of itself. It's not entirely clear from the Media SDK documentation, but I suspect that after calling Reset(), any encoded data that has already been passed to the decoder but hasn't been decoded yet will be lost. It would seem that this could occur even if I drain the decoder before calling Reset(), since I may have passed an incomplete encoded frame to the decoder. In addition, according to the Media SDK Developer's Guide, after calling Reset(), it is necessary to:
Besides the fact that these instructions are written from the perspective of H.264, even following these instructions for H.264 is not such a simple matter. Doing so requires some understanding of the encoded video stream, and one of the reasons to use the Intel Media SDK for encoding and decoding is so that the developer doesn't have to understand the internals of the various codecs. Plus, I never needed to call Reset() when I encountered data loss for an H.264 encoded stream.