I'm trying to use the Media SDK 2012 to build a low-latency display of an RTP stream. As a first cut, I have a DirectShow RTP filter that takes the RTP stream and converts it to an Annex B bytestream. I then connect this filter to the Intel H264 Decoder and then to EVR or VMR9. This all works, i.e. I can see the video in the EVR window. The problem is that the latency of the H264 decode is ~1s. I've modified the decoder filter code to set AsyncDepth=1 and to set MFX_BITSTREAM_COMPLETE_FRAME on the bitstream, with no apparent effect. The RTP filter is passing entire NALs to the decoder.
My development machine is a dual Xeon with an NVIDIA GPU, so it appears that the Media SDK is doing no hardware acceleration at all. Nevertheless, the latency is surprisingly high. Is this a reasonable latency number when decoding without hardware acceleration?
The stream I am decoding is 640 x 480 @15fps constrained to 1Mbit/s.
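One thing that may be worth checking here: MFX_BITSTREAM_COMPLETE_FRAME tells the decoder that the buffer holds a whole frame, and a frame can span several NAL units (SPS, PPS, one or more slices). A minimal sketch of grouping NALs into one Annex B buffer per frame is below. It assumes the sender sets the RTP marker bit on the last packet of each access unit (as RFC 6184 describes) and that each call delivers one complete NAL unit, with any FU-A/STAP-A handling already done upstream. `AccessUnitAssembler` is a hypothetical helper, not part of DirectShow or the Media SDK.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: collects NAL units into one Annex B buffer per frame.
class AccessUnitAssembler {
public:
    // Appends one NAL unit (without start code) to the pending frame.
    // 'marker' is the RTP marker bit of the packet that carried the NAL.
    // Returns true when the frame is complete; 'frame' then holds every NAL
    // of that access unit, each prefixed with a 4-byte start code.
    bool push(const std::uint8_t* nal, std::size_t len, bool marker,
              std::vector<std::uint8_t>& frame) {
        static const std::uint8_t kStartCode[4] = {0, 0, 0, 1};
        pending_.insert(pending_.end(), kStartCode, kStartCode + 4);
        pending_.insert(pending_.end(), nal, nal + len);
        if (!marker) {
            return false;  // more NALs of this frame are still to come
        }
        frame.swap(pending_);  // hand the finished frame to the caller
        pending_.clear();      // start the next frame from an empty buffer
        return true;
    }

private:
    std::vector<std::uint8_t> pending_;
};
```

With something like this in the RTP filter, each downstream sample would carry one whole frame, which is the situation the complete-frame flag is meant to describe. If the sender does not set the marker bit reliably, a different frame boundary test would be needed.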
The DShow decoder filter already has support for low latency via the low latency preset. But if you prefer you can also make your own changes to the filter as you noted.
Based on your description I can't tell what is causing the long latency. The bottleneck could be anywhere. I suggest inserting timer traces in the decoder filter to benchmark the processing of each frame. This will tell you whether the latency is due to the decoder or to another part of your pipeline.
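As a rough illustration of such timer traces, the sketch below records when each input sample enters the decoder and reports the elapsed time when the frame with the same timestamp comes out. `FrameLatencyTracker` and its method names are hypothetical, and it assumes the filter can see matching timestamps on input samples and output frames.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical per-frame latency trace, keyed by presentation timestamp.
class FrameLatencyTracker {
public:
    using Clock = std::chrono::steady_clock;

    // Call when a compressed sample is handed to the decoder.
    void onInput(std::int64_t pts) { inFlight_[pts] = Clock::now(); }

    // Call when the decoded frame with the same timestamp is delivered.
    // Returns the latency in milliseconds, or -1.0 if the pts is unknown.
    double onOutput(std::int64_t pts) {
        auto it = inFlight_.find(pts);
        if (it == inFlight_.end()) {
            return -1.0;
        }
        const std::chrono::duration<double, std::milli> elapsed =
            Clock::now() - it->second;
        inFlight_.erase(it);
        return elapsed.count();
    }

    // Number of samples submitted but not yet output.
    std::size_t pending() const { return inFlight_.size(); }

private:
    std::map<std::int64_t, Clock::time_point> inFlight_;
};
```

The `pending()` count is often as informative as the millisecond value. At 15 fps one frame lasts about 66.7 ms, so a steady 1 s latency corresponds to roughly 15 frames queued somewhere between input and output.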
Since you have a discrete graphics card installed, HW acceleration is likely not enabled. If the Xeon system supports video acceleration and you are able to set up your system so that both graphics devices are active, then there are ways of making HW acceleration via the Intel processor work, but it requires some changes to the decoder filter.
Regardless of HW acceleration, 1s is very large. Even with Media SDK SW decode the latency is quite low.
One more thing: it is important that the stream you are decoding has been encoded for low latency. For instance, this implies no B-frames and a DPB (decoded picture buffer) size of one, as detailed in the following white paper:
I put timer traces in the decoder filter, which is how I came up with my original latency numbers.
I don't have much control over the encoder, unfortunately, so I suspect that the stream either doesn't meet the requirements for a low-latency decode by the Intel decoder or I've corrupted the stream in the RTP filter. I have seen the same stream displayed with reasonably low latency on a Linux machine using gstreamer, so I think the stream itself is basically sound.
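One way to test the first suspicion without touching the encoder is to look at the slice headers and see whether the stream contains B-slices. The sketch below reads `slice_type` from a single NAL unit (start code already removed). It is a simplification: it ignores emulation prevention bytes, which should rarely matter for the first two header fields, and it says nothing about the DPB size, which is signalled in the SPS. `BitReader`, `sliceTypeOf`, and `isBSlice` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads bits MSB-first from a byte buffer.
class BitReader {
public:
    BitReader(const std::uint8_t* data, std::size_t size)
        : data_(data), size_(size), pos_(0) {}

    bool readBit(unsigned& bit) {
        if (pos_ >= size_ * 8) return false;
        bit = (data_[pos_ / 8] >> (7 - pos_ % 8)) & 1u;
        ++pos_;
        return true;
    }

    // Unsigned Exp-Golomb code, ue(v) in the H.264 syntax tables.
    bool readUE(std::uint32_t& value) {
        unsigned bit = 0;
        int zeros = 0;
        for (;;) {
            if (!readBit(bit)) return false;
            if (bit) break;
            if (++zeros > 31) return false;  // malformed or unsupported code
        }
        std::uint32_t code = 1;
        for (int i = 0; i < zeros; ++i) {
            if (!readBit(bit)) return false;
            code = (code << 1) | bit;
        }
        value = code - 1;
        return true;
    }

private:
    const std::uint8_t* data_;
    std::size_t size_;
    std::size_t pos_;
};

// Returns slice_type % 5 (0 = P, 1 = B, 2 = I, 3 = SP, 4 = SI),
// or -1 if the NAL is not a coded slice (types 1 and 5) or is truncated.
int sliceTypeOf(const std::vector<std::uint8_t>& nal) {
    if (nal.size() < 2) return -1;
    const unsigned nalType = nal[0] & 0x1F;
    if (nalType != 1 && nalType != 5) return -1;
    BitReader br(nal.data() + 1, nal.size() - 1);
    std::uint32_t firstMbInSlice = 0;
    std::uint32_t sliceType = 0;
    if (!br.readUE(firstMbInSlice) || !br.readUE(sliceType)) return -1;
    if (sliceType > 9) return -1;  // outside the range the standard defines
    return static_cast<int>(sliceType % 5);
}

bool isBSlice(const std::vector<std::uint8_t>& nal) {
    return sliceTypeOf(nal) == 1;
}
```

Running `isBSlice` over every NAL that leaves the RTP filter and logging any hit would show quickly whether B-frames are present. The absence of B-slices alone would not prove the stream is low-latency friendly, since the decoder may still buffer frames according to what the SPS declares.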
I tried a DirectShow graph with a camera, the Intel H264 encoder, the Intel decoder, and EVR, and it was blazingly fast.