Solved: oneVPL sw-impl fails with MFX_ERR_ABORTED on DecodeFrameAsync

gonta · ‎03-17-2023

Intel oneVPL version: 2023.0.0

Windows 10

Visual Studio 2022

Hi,

I am developing an app based on the sample_decode code.
My app receives an AVC/HEVC stream, extracts a GOP, and calls CDecodingPipeline::RunDecoding(). Once it returns MFX_ERR_MORE_DATA, the app extracts another GOP and calls CDecodingPipeline::RunDecoding() again, and so on.

The hardware implementation works fine, but the software implementation does not decode any frames and returns MFX_ERR_MORE_DATA on the second call, then fails with MFX_ERR_ABORTED on the third call.

Can you suggest how I can work around this error?

I have modified the sample_decode code so that this situation can be reproduced.
I have attached the code and its log.

Thank you.

RemyaP_Intel · ‎03-21-2023

Hi,

Thank you for posting in Intel communities.

Could you please share the processor details with us?

We tried running the sample_decode code from GitHub on our Windows machine and it works fine. We are checking internally with our team, as the issue could be with the code you have developed.

We will get back to you at the earliest.

Regards,

Remya Premdas

gonta · ‎03-21-2023

Thank you for the response.

I have confirmed the problem in the following two environments.

(1) Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz 3.60 GHz / Windows 10 Pro 21H2 64bit

(2) 11th Gen Intel(R) Core(TM) i5-1145G7 @ 2.60GHz 2.61 GHz / Windows 10 Enterprise 21H2 64bit

I have built sample_decode on (1).

Note that the hardware implementation works fine; the software one does not work.

I used the input stream cars_320x240.h264, obtained from https://github.com/oneapi-src/oneVPL-cpu/blob/master/test/content/cars_320x240.h264.

Thank you.

RemyaP_Intel · ‎03-24-2023

Hi,

We checked internally with our team and understood that the CPU implementation is currently for reference only and not intended to be competitive in terms of performance or features.

In the current VPL CPU implementation, the MFX_BITSTREAM_EOS flag in the input bitstream's DataFlag field is ignored by MFXVideoDECODE_DecodeFrameAsync(), which can cause this type of issue with longer bitstreams.

Also, CPU implementation decode error recovery is limited. So, using the VPL GPU implementation is highly recommended.

If this resolves your issue, make sure to accept this as a solution. This would help others with a similar issue. Thank you!

Regards,

Remya Premdas

gonta · ‎03-27-2023

Thank you for the response.
Sorry to hear that.
Hopefully this will be fixed in future editions!

I tried leaving MFX_BITSTREAM_EOS unset, but that did not change the result.

On the other hand, from my research, calling DecodeFrameAsync() with the "bs" argument set to null seems to reproduce the problem.
I suspect that this signals the end of the stream to the CPU implementation, which then stops processing any later input, resulting in the error.
So, instead of calling DecodeFrameAsync() with a null "bs" argument, I changed the code to break out of the loop, which avoids the problem.

Is this workaround reasonable?
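
To make the suspicion above concrete, here is a toy C++ model of it. It is only a sketch of the hypothesis, not the oneVPL API and not the CPU implementation's actual behavior: `LatchingDecoder`, `LatchStatus`, and `FeedGops` are hypothetical names, and the status values merely echo MFX_ERR_MORE_DATA and MFX_ERR_ABORTED.

```cpp
#include <vector>

// Toy status codes; they echo oneVPL names but are NOT the real mfxStatus enum.
enum class LatchStatus { Ok, MoreData, Aborted };

// Hypothetical decoder that treats a null input as end of stream and
// refuses any input that arrives afterwards (the behavior suspected above).
class LatchingDecoder {
public:
    LatchStatus DecodeFrame(const int* packet) {
        if (!packet) {            // null input: latch the end-of-stream state
            ended_ = true;
            return LatchStatus::MoreData;
        }
        if (ended_) return LatchStatus::Aborted;  // input after end of stream
        ++decoded_;
        return LatchStatus::Ok;
    }
    int decoded() const { return decoded_; }

private:
    bool ended_ = false;
    int decoded_ = 0;
};

// Feed GOPs one at a time, optionally sending a null packet after each GOP.
// Returns the status of the last call made.
LatchStatus FeedGops(LatchingDecoder& dec,
                     const std::vector<std::vector<int>>& gops,
                     bool null_after_each_gop) {
    LatchStatus last = LatchStatus::MoreData;
    for (const auto& gop : gops) {
        for (const int& p : gop) {
            last = dec.DecodeFrame(&p);
            if (last == LatchStatus::Aborted) return last;
        }
        if (null_after_each_gop) last = dec.DecodeFrame(nullptr);
    }
    return last;
}
```

In this model, sending a null packet after the first GOP makes the first packet of the second GOP fail, while skipping the null call lets every packet through. Whether the real CPU implementation latches in this way is an assumption.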

RemyaP_Intel · ‎03-29-2023

Hi,

We are checking on this with the team. We will get back to you with an update.

Regards,

Remya Premdas

RemyaP_Intel · ‎04-04-2023

Hi,

We checked with our internal team. At the end of the bitstream, the application is expected to call MFXVideoDECODE_DecodeFrameAsync() repeatedly with a NULL bitstream pointer to drain any frames still cached within the decoder, until the function returns mfxStatus::MFX_ERR_MORE_DATA and no more frames remain in the internal queues.

Breaking out of the loop before the drain is likely to lose the last few frames unless the input is configured for low latency. So, this may be an acceptable workaround for this particular bitstream, but it's not a general solution.
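
The effect of skipping the drain can be illustrated with a small self-contained C++ model. This is a sketch under stated assumptions, not oneVPL code: `ToyDecoder`, `ToyStatus`, `DecodeAll`, and the `delay` parameter are hypothetical, and the internal cache is modeled as a simple FIFO that holds back `delay` frames.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Toy status codes; they echo oneVPL names but are NOT the real mfxStatus enum.
enum class ToyStatus { Ok, MoreData };

// Hypothetical stand-in for a decoder that caches up to `delay` frames.
class ToyDecoder {
public:
    explicit ToyDecoder(std::size_t delay) : delay_(delay) {}

    // packet != nullptr: cache one frame; emit the oldest only once more
    //                    than `delay_` frames are cached.
    // packet == nullptr: drain mode; emit cached frames until none remain.
    ToyStatus DecodeFrame(const int* packet, int* out) {
        if (packet) {
            queue_.push_back(*packet);
            if (queue_.size() <= delay_) return ToyStatus::MoreData;
        } else if (queue_.empty()) {
            return ToyStatus::MoreData;  // nothing left to drain
        }
        *out = queue_.front();
        queue_.pop_front();
        return ToyStatus::Ok;
    }

private:
    std::size_t delay_;
    std::deque<int> queue_;
};

// Feed every packet; if `drain` is true, keep calling with a null packet
// until the decoder reports MoreData. Returns the frames that came out.
std::vector<int> DecodeAll(const std::vector<int>& packets,
                           std::size_t delay, bool drain) {
    ToyDecoder dec(delay);
    std::vector<int> frames;
    int out = 0;
    for (const int& p : packets) {
        if (dec.DecodeFrame(&p, &out) == ToyStatus::Ok) frames.push_back(out);
    }
    while (drain && dec.DecodeFrame(nullptr, &out) == ToyStatus::Ok) {
        frames.push_back(out);
    }
    return frames;
}
```

With five packets and a delay of 2, `DecodeAll` returns all five frames when draining and only the first three when the loop exits early. With a delay of 0, which stands in for a low-latency configuration, nothing is lost either way.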

Is your query answered? If yes, can we go ahead and close this case?

Regards,

Remya Premdas

gonta · ‎04-05-2023

Thank you for your response.
Let me just check a few more things.

You mentioned an "internal queue". Is there any way to control its size? I believe that if we could set this size to 1, we would not lose the last few frames.
I thought mfxVideoParam::AsyncDepth would handle this, but in the CPU implementation this value is already set to 1. Or is control of this parameter also not implemented in the CPU implementation?

Thank you.

RemyaP_Intel · ‎04-18-2023

Hi,

Sorry for the delay in response. The VPL CPU implementation is synchronous only (the AsyncDepth value is set to 1). Also, note that VPL CPU is only a reference implementation and there won't be any more changes or maintenance in the future.

Regards,

Remya Premdas

gonta · ‎04-18-2023

Thank you for the response.

You mention that the CPU implementation is "synchronous" (AsyncDepth value is set to 1) only,
while you also mention that there is an "internal queue".

Would you please tell me which is correct?

Also, am I correct in understanding that there is no way to set the size of the "internal queue"?

Thank you.

RemyaP_Intel · ‎04-20-2023

Hi,

Yes. As mfxVideoParam::AsyncDepth is set to 1, there is no way to change the size of the internal queue.

--

Regards,

Remya Premdas

gonta · ‎04-23-2023

I understood the CPU implementation as follows:

1. it is a synchronous API
2. it has a multi-stage internal queue
3. there is no API to control the number of internal queue stages

It seemed unnatural to me to have a multi-stage internal queue even though it is a synchronous API, but I understand that it is not improbable and that this is simply the way it is.

Thank you for answering my question!

RemyaP_Intel · ‎04-24-2023

Hi,

Glad to know that your query is resolved. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.

Regards,

Remya Premdas