Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Crash when fast forwarding

fre3de
Beginner
554 Views
Hello!

I have built a MPEG-4 decoder based on IPP 5.3.
It's working just fine except when I am fast forwarding a recording in high speeds (more than 2x).
I located the problem to mp4_ShowBits in mp4Stream.c:

Ipp32u mp4_ShowBits(mp4_Info* pInfo, Ipp32s n)
{
Ipp8u* ptr = pInfo->bufptr;
Ipp32u tmp = (ptr[0] << 24) | (ptr[1] << 16) | (ptr[2] << 8) | (ptr[3]);
tmp <<= pInfo->bitoff;
tmp >>= 32 - n;
return tmp;
}

The problem was that ptr[1], 2 and 3 sometimes pointed outside the buffer.
Weird...
However I implemented a check for this but now when I get a crash, all I see is one visible thread with the stack trace:
ippiReconstructCoeffsInter_MPEG4_1u16s

Like I said this only happends when fast forwarding 4x or faster.
Someone that has experienced anything like this or has any comments?

Thanks Fredrik



0 Kudos
12 Replies
fre3de
Beginner
554 Views
Hi Nikolay!

The crash occurs when fast forwarding the file with Windows Media Player.
I don't think any frames are skipped.
I have tested to copy the input buffer to a larger buffer to make sure that there always is some bytes after the buffer which are not used. With that in place the decoder don't crash. However this ugly workaround decreases the performance...
What's strange is that ShowBits (See my previous post) always seems to read 1-3 bytes outside the input buffer. Also at normal playback. Maybe someone could test if you experience the same thing? It doesn't seems to cause any problems except when fast forwarding at high speeds.
Your help is appreciated.
/Fredrik
0 Kudos
fre3de
Beginner
554 Views
Just want to correct myself, 1-3 bits outside the buffer, not bytes.
To test you could add the following lines to mp4_ShowBits:

if ((pInfo->bufptr + 3) <= (pInfo->buffer + pInfo->buflen))
{
OutputDebugString("Read outside the buffer!");
}
/Fredrik
0 Kudos
fre3de
Beginner
554 Views
The input buffer is no different.
The crash occurs when there is no readable data after the frame buffer.
Like I said ShowBits always seem to read 1-3 bits outside the buffer. In normal playrate this goes fine because the data after the inputbuffer is accessable. But when fast forwarding it sometimes crashes when Showbits tries to read that data.

It would really help if you, or someone else, could verify if showbits reads outside the inputbuffer.

/Fredrik


0 Kudos
jimdempseyatthecove
Honored Contributor III
554 Views

Fredrik,

Let me prefix this by I do not use MPEG4 decoder, however from what I can descern of the way mp4_ShowBits is written

Ipp32u mp4_ShowBits(mp4_Info* pInfo, Ipp32s n)
{
Ipp8u* ptr = pInfo->bufptr;
Ipp32u tmp = (ptr[0] << 24) | (ptr[1] << 16) | (ptr[2] << 8) | (ptr[3]);
tmp <<= pInfo->bitoff;
tmp >>= 32 - n;
return tmp;
}

This would require that pInfo->bufptr point to a valid 4 byte buffer. However your symptoms indicate that only the bytes necessary to represent the bit field of interest are necessary (ora bug causing invalid bufptr). To correct for this you will have to massage only the bytes necessary to extract the bit field in the event ofaat end of buffer. Something along the lines of:

Ipp32u mp4_ShowBits(mp4_Info* pInfo, Ipp32s n)
{
Ipp8u* ptr = pInfo->bufptr;
Ipp32u tmp = (ptr[0] << 24);// assume always present
if(pInfo->bitoff+n > 8) tmp |= (ptr[1] << 16); // only if required
if(pInfo->bitoff+n > 16) tmp |= (ptr[2] << 8); // only if required
if(pInfo->bitoff+n > 24) tmp |= (ptr[3]); // only if required
tmp <<= pInfo->bitoff;
tmp >>= 32 - n;
return tmp;
}
Jim Dempsey
0 Kudos
fre3de
Beginner
554 Views
Hi Jim

You are right but like i wrote in my first post:
...
I implemented a check for this but now when I get a crash, all I see is one visible thread with the stack trace:
ippiReconstructCoeffsInter_MPEG4_1u16s
...
However this crash is probably also caused by "someone" reading outside the buffer. Because when I tested to always copy the input buffer to a larger buffer, I did not get any crash at all. But that decreases the performance with about 10%.

So, I think that if I solve the problem that causes showbits to read outside the buffer, I will also solve the other crash. That's why it would be fine if someone could confirm if Showbits reads outside the buffer in their implementation as well.

/Fredrik
0 Kudos
jimdempseyatthecove
Honored Contributor III
554 Views

Fredrik,

The original code would crash _only_ if the function caused a read beyond end of buffer AND when one or more of the bytes beyond the end of the buffer were in un-mapped memory. i.e would cause a page fault and where the faulting page represented un-mapped memory. If the memory just beyond the end of the buffer happened to reside in a mapped page (allocated or not allocated) then the reference to beyond the end of the buffer would not have faulted (not crashed). i.e. junk read but also ignored.

If you modify the code as I proposed in the last message you can also add assertion tests to see if the fast forward is presenting the function with a ligitimate bit offset and where the number of bits result in valid bytes of the buffer (eventhough the last byte of the quadruple is outside the buffer). The modified code as previously presented would provide protection for this case.

Now then, if you add the assertion, as suggested above, you can also test for invalid entry arguments. (And take appropriate corrective action.)

What you might notice is fast forward "may" be performing

display last frame
delete buffer
display next frame (non-existant)

Jim Dempsey

0 Kudos
jimdempseyatthecove
Honored Contributor III
554 Views

Fredrik,

Also note, assume buffer is 1024 bytes (0:1023). Assume also that the piece of data to be read is the first 4 bits of the last byte in the buffer. The call could legitimately specify a byte offset of 1023,a bit offset of 0, and length of 4 bits. The code as-was written would read 4 bytes beginning at byte offset 1023 (reading 3 bytes beyond the buffer) and yet return 4 bits of valid data. If those 3 bytes beyond end of buffer happen to be in mapped virtual memory then not crash would occure, if they happened to be in un-mapped memory then a crash would occure.

You may be assuming in the above example that the caller would specify a byte offset of 1020 (for you to read the last 4bytes 1020:1023), a bit offset of 24 bits and a length of 4 bits (to reference the 1st 4 bits of the last byte of the buffer). Both sets of calling args would specify a valid 4 bit piece of data yet when passing in a byte offset of 1023 the original code would be in error but the error would be benign if it so happened that the extra bytes were in mapped memory (but crash if not).

Jim

0 Kudos
jimdempseyatthecove
Honored Contributor III
554 Views

Nikolay,

>>Yes MPEG-4 bitstream functions can read outside buffer. The checking of buffer's limit decreases the speed significantly.

Then the correct programming procedure would be to allocate the decoder bufferto

RequiredBufferSize + PossibleByteReadBeyond

In this case PossibleByteReadBeyond = 3(or 4 if you permit a call to return 0 bits and called with the bit pointer after last bit in buffer).

The reason for the padd is without the padd

If by chance the end of the required buffer aligns with the end of a Virtual Memory Page boundry then accessing 1, 2 or 3 bytes beyond end of buffer _may_ access unmapped virtual memory (and crash). This may have been the reason for the program crash observed by the original poster.

With the padd, then the function can freely access 1, 2 or 3 bytes beyond end of buffer without possibility of causing crash.

Jim Dempsey

0 Kudos
fre3de
Beginner
554 Views
Hi Nikolay,

I have made performance tests on the solution with the extra inputbuffer (with 3 bytes padding in the end) and actually the decrease in performance is very small.
This will be my solution but I don't like it...
I can't really understand why this should be necessary. Even with the checks (like the ones Jim posted) in Showbits the decoder does crash somewhere in the intel-libs. (stacktrace ippiReconstructCoeffsInter_MPEG4_1u16s). If you always have to have a +3 bytes inputbuffer to make sure to not get a crash, you could think that should be mentioned in the docs... maybe it is? :)

/Fredrik
0 Kudos
fre3de
Beginner
554 Views
Hi Jim!

I do have control over the allocation of the input buffer but that is an ugly fix for something that is wrong in the decoder, and it would not solve the problem for already existing recordings + the recordings would be 3 bytes bigger per frame. :) Like you say it would not have an negative affect on the performance though...
But I will go for the extra copying, thanks for your help.
/Fredrik

0 Kudos
jimdempseyatthecove
Honored Contributor III
554 Views

Fredrik,

You are missing the point. You allocate 3 bytes larger than frame (I suggest allocating 4 bytes larger than frame). Then declare the buffer size as the correct size for the frame (3 bytes less than allocated size or 4 bytes less than allocated size). Note, if you allocate multiple frame buffers at once,then allocate one more frame buffer than necessary but use one less than allocated. The extra buffer will contain the 3 bytes of padd but is otherwise never used.

From looking at the original function it looks like whomever defined the format is using bigendien internal format (thus requiring you to swap byte order). If you are going to add the copy bufferwhy not byte swap during the copy then not swap during the bit field extraction?

Jim Dempsey

0 Kudos
fre3de
Beginner
554 Views
Hi Jim,

I am not sure what you think that I am missing?
This is what I do in the decoder:
1. Allocate a buffer (copyBuffer) that most likely is big enough to contain a frame. (this is just done once.)
2. Check that the copybuffer is > inputbuffer +4, if not, reallocate to the new size (+1000 to avoid many new allocations).
3. Copy the inputbuffer to the copybuffer.
4. Setbufferpointer to the copybuffer. The correct inputlength is used.

/Fredrik
0 Kudos
Reply