
MFXVideoCORE_SyncOperation returns MFX_ERR_DEVICE_FAILED

dr_asik

I have code that calls MFXVideoCORE_SyncOperation in a loop until it stops returning MFX_WRN_IN_EXECUTION, and then verifies that the final status is MFX_ERR_NONE. I specify a timeout of 500 milliseconds per call, which is generally more than enough for the operation to complete. However, from time to time and without prior warning, I get MFX_WRN_IN_EXECUTION for 3-4 iterations of the loop (~2 seconds), and then the next call to SyncOperation returns MFX_ERR_DEVICE_FAILED. This is bad because I have to reset everything.
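For readers following along, here is a minimal sketch of the polling loop described above. It is not the poster's code: the `SyncStatus` enum and the `waitForFrame` helper are hypothetical stand-ins so the example compiles without the SDK headers, and the `sync` callable stands in for a call such as `MFXVideoCORE_SyncOperation(session, syncp, waitMs)`. The retry budget is also an assumption; the original post does not say how the loop is bounded.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for the mfxStatus values involved
// (MFX_ERR_NONE, MFX_WRN_IN_EXECUTION, MFX_ERR_DEVICE_FAILED).
enum class SyncStatus { Done, InExecution, DeviceFailed };

// Calls `sync` repeatedly until it stops reporting "in execution" or the
// retry budget is exhausted. Returns the last status seen and writes the
// number of calls made to `attemptsOut` (if non-null).
SyncStatus waitForFrame(const std::function<SyncStatus(unsigned)>& sync,
                        unsigned waitMs, int maxAttempts, int* attemptsOut) {
    assert(maxAttempts > 0);
    SyncStatus status = SyncStatus::InExecution;
    int attempts = 0;
    while (attempts < maxAttempts && status == SyncStatus::InExecution) {
        status = sync(waitMs);  // each call may block for up to waitMs
        ++attempts;
    }
    if (attemptsOut != nullptr) {
        *attemptsOut = attempts;
    }
    return status;
}
```

Bounding the loop means a stuck operation surfaces as a leftover `InExecution` status rather than an endless wait, and the attempt count can be logged alongside a device failure to confirm the "3-4 iterations, then failure" pattern.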

I tried running the "Intel Media SDK Tracer" to get more information, but even if I check "Per-frame logging" it doesn't give me any info past initialization, which seems to be working fine - it does decode fine until it returns the error message above.

I also tried running the system analyzer, but now that one crashes after printing the following:

The following versions of Media SDK API are supported by platform/driver:

        Version Target  Supported       Dec     Enc

        1.0     HW      No

Last lines in the trace log when I run this tool are:

INFO :invoking LoadLibrary(C:\Program Files\Intel\Media SDK 2014 R2 for Clients\tools\mediasdk_sys_analyzer\libmfxsw32.dll)
INFO :can't find DLL: GetLastErr()=0x7e
INFO :loading default library libmfxsw32.dll
INFO :invoking LoadLibrary(libmfxsw32.dll)
INFO :loaded module C:\Program Files\Intel\Media SDK 2014 R2 for Clients\bin\win32\libmfxsw32.dll
INFO :MFXInit(MFX_IMPL_SOFTWARE,ver=1.1,session=0x0051E368)

So it seems that it crashes afterwards.

I'm on Windows 8.1 64-bit with an Intel HD 4600 not connected to a monitor. I have two monitors connected to two separate NVIDIA cards.

 

dr_asik

Also, Windows reports the driver as 10.18.10.3907 dating from 2014-08-05.

Sravanthi_K_Intel

Hello there,

The following forum thread discusses the issue (Dispatcher issue) you are seeing with multiple monitors - https://software.intel.com/en-us/forums/topic/519877. And it also discusses a fix for the same. This is a temporary patch to fix the issue (we are hoping to get this fixed in the next release). Hope this helps.

dr_asik

Thanks for your reply. Actually I realized I was running the original 2014 code rather than the more recent 2014 R2 release. Simply updating to that seems to have fixed the issue (I've not seen random MFX_ERR_DEVICE_FAILED since then), however I still applied the patch suggested in the thread.

dr_asik

Confirmed NOT fixed. We are still getting MFX_ERR_DEVICE_FAILED in the scenario described above even with the latest release and the suggested patch applied. Please advise.

Sravanthi_K_Intel

Hello there - I re-opened this thread and have asked some driver experts to take a look at it. In the meantime, can you let us know whether you are using DirectX 9 or 11? If you are using 9, this is a known DX9 limitation: DX9 operates under the assumption that a display monitor is attached, and it fails if it cannot find one. Since MSDK uses DX9, it fails as well. Support for running without an attached monitor was added in DX11 by Microsoft, btw.

dr_asik

The implementation we pass to MFXVideoSession::Init is MFX_IMPL_HARDWARE_ANY, and for mfxVideoParam::IOPattern we are using MFX_IOPATTERN_OUT_SYSTEM_MEMORY. We are only using the MSDK for hardware-accelerated decoding into system memory. We don't specify usage of any version of DirectX anywhere in the code. I must add that the approach works fine in general even with no monitor attached - the errors are quite random in nature.

Sravanthi_K_Intel

Okay, so far you have brought up the following issues: MFX_ERR_DEVICE_FAILED occurrences, and sys analyzer and tracer errors. Here are some suggestions to isolate the problem. Can you run the following sequence of steps and let us know, for each step, whether it "WORKS" or you see "MFX_ERR_DEVICE_FAILED"?

  1. Run sys analyzer w/o tracer 
  2. Remove/disable dGfx
  3. Run sys analyzer w/o tracer
  4. Run prebuilt decode
  5. Run prebuilt encode
  6. Try your application
  7. Re-enable dGfx
  8. Do steps 4-6 again

When you run the above sequence of steps, let us know what stages run without any issue ("WORKS") and where/when you see the error again. This can be really helpful. Thanks.

dr_asik

When you say remove/disable dGfx, you mean connecting monitors only to the Intel graphics output and removing the discrete graphics cards?

Sravanthi_K_Intel

Yes, that is right. 

Sravanthi_K_Intel

Hello there, here are some updates regarding the issue you are seeing after discussing with some developers. Since you are seeing the error rarely and at random times, the dispatcher/device drivers hypothesis "may" not be the issue.

1. Can you please let us know what codec you are decoding - VC1, MPEG2, or H264?

2. Is there a change in any of the video format parameters in the input stream - resolution change in the input stream for example? If so, there is a possibility it is not being handled correctly (i.e. not handling Reset and Sync operations in correct order, not re-allocating surfaces when MFX_ERR_INVALID_VIDEO_PARAM is returned).

3. Can you please capture the input video stream and send it to us - that can be very helpful to us to find the issue.

If you get your tracer working, your per-frame tracer log will show any format changes in the input stream as MFX_WRN_VIDEO_PARAM_CHANGED or MFX_ERR_INCOMPATIBLE_VIDEO_PARAM. Let me know if you can get the tracer log; otherwise I can send you a working tracer application. Also, if you have results from the experiments we agreed to in the previous two posts, please let us know.
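As an illustration of the format-change handling mentioned in this post, the sketch below maps decode statuses to the next step an application would typically take. The `DecodeStatus` and `Action` enums and the `nextAction` function are hypothetical names invented for this example so that it compiles without the SDK headers; the mapping reflects a reading of the SDK documentation for `MFXVideoDECODE_DecodeFrameAsync` and should be checked against the reference manual for your release.

```cpp
#include <cassert>

// Hypothetical stand-ins for the mfxStatus values a decode call may return:
// MFX_ERR_NONE, MFX_ERR_MORE_DATA, MFX_ERR_MORE_SURFACE,
// MFX_WRN_VIDEO_PARAM_CHANGED, MFX_ERR_INCOMPATIBLE_VIDEO_PARAM, anything else.
enum class DecodeStatus { Ok, MoreData, MoreSurface, ParamChanged,
                          IncompatibleParam, Other };

// What the application does next (names are illustrative only).
enum class Action { UseFrame, FeedBitstream, SupplySurface, RefreshParams,
                    DrainAndReinit, Abort };

inline Action nextAction(DecodeStatus status) {
    switch (status) {
        case DecodeStatus::Ok:           return Action::UseFrame;
        case DecodeStatus::MoreData:     return Action::FeedBitstream;
        case DecodeStatus::MoreSurface:  return Action::SupplySurface;
        // New sequence header, but the current surface allocation still fits:
        // re-read the stream parameters and keep decoding.
        case DecodeStatus::ParamChanged: return Action::RefreshParams;
        // Current allocation no longer fits: drain buffered frames, then
        // re-allocate surfaces and re-initialize the decoder.
        case DecodeStatus::IncompatibleParam: return Action::DrainAndReinit;
        case DecodeStatus::Other:        return Action::Abort;
    }
    assert(false && "unhandled DecodeStatus");
    return Action::Abort;
}
```

Keeping this decision in one place makes it easier to log every format-related status, which is the information the per-frame tracer would otherwise provide.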

 

dr_asik

So I did a bunch of testing. My computer has Intel HD 4600 Graphics with the latest (2014-08-05) drivers, an NVIDIA GTX 550 in the first PCI-Express slot and an NVIDIA GTX 750 Ti in the second PCI-Express slot, running drivers 344.11. Running Win8.1 64-bit.

So I tried the following:

  • 1 monitor on Intel HD Graphics: no repro
  • 1 monitor on Intel and 1 on GTX 550: no repro
  • 1 monitor on Intel and 1 on GTX 750 Ti: no repro
  • 1 monitor on GTX 750 Ti: repro
  • 1 monitor on GTX 550: no repro (update: repro)
  • 1 monitor on GTX 550 and 1 on GTX 750 Ti: repro

So basically it repros when no monitor is connected to the Intel and one monitor is connected to the GTX 750 Ti, as far as I can tell. Activating or deactivating devices not connected to monitors in device manager seems to have no influence on the results.

There is no format change in the stream when the error happens, and it happens at random points, never twice in the same place. Sometimes it repros immediately, sometimes it takes several minutes. This happens with a variety of streams with different resolutions, some low, some very high, at low and high framerates; all are H.264, however. Only one stream at a time is decoding.

The system analyzer application does work when the tracer is not attached, however it strangely reports that HW is not supported when I don't have a monitor plugged into the Intel HD Graphics, even though I'm running Windows 8.1 and even though when I query for HW availability from my application (MFXVideoSession::Init using MFX_IMPL_HARDWARE_ANY) it returns success. Perhaps that has something to do with the problem?

 

Sravanthi_K_Intel

Thank you for the information. The results are interesting - especially since there is no clear pattern and 1 monitor on GTX 550 configuration is working fine. Based on these results, can you please give us some follow-up information? I wish I had a simpler answer, but we suspect this issue may have more to do with NVidia/Optimus, or DX9 drivers. Can you tell us if the NVidia cards came with the system you are using or were they external?

1. For the last 3 scenarios above, can you please send us the dxDiag and sysAnalyzer output?

(One other thing: when using MFX_IMPL_HARDWARE_ANY, you may want to OR it with MFX_IMPL_VIA_D3D9 or MFX_IMPL_VIA_D3D11, depending on your system config. Otherwise, DX11 is considered the default configuration in the latest releases of MSDK.)
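A small sketch of what "OR it with MFX_IMPL_VIA_D3D11" means in practice. The constants below are local stand-ins so the example compiles without the SDK; their values are what I understand `mfxcommon.h` to define, but treat them as assumptions and use the real `MFX_IMPL_*` names in application code. `makeImpl`, `baseOf`, and `viaOf` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

using Impl = std::uint32_t;  // stand-in for mfxIMPL

// Assumed to mirror the SDK header values; verify against mfxcommon.h.
constexpr Impl kImplHardwareAny = 0x0004;  // MFX_IMPL_HARDWARE_ANY
constexpr Impl kImplViaD3D9     = 0x0200;  // MFX_IMPL_VIA_D3D9
constexpr Impl kImplViaD3D11    = 0x0300;  // MFX_IMPL_VIA_D3D11
constexpr Impl kBaseMask        = 0x00ff;  // low byte: base implementation
constexpr Impl kViaMask         = 0x0f00;  // next nibble: access API

// Combines a base implementation with an access API.
inline Impl makeImpl(Impl base, Impl via) {
    assert((base & ~kBaseMask) == 0 && (via & ~kViaMask) == 0);
    return base | via;
}

// Splits a combined value back into its two parts.
inline Impl baseOf(Impl impl) { return impl & kBaseMask; }
inline Impl viaOf(Impl impl)  { return impl & kViaMask; }
```

With the real headers, the equivalent request would be passed as `MFX_IMPL_HARDWARE_ANY | MFX_IMPL_VIA_D3D11` to `MFXVideoSession::Init`. Splitting the implementation reported back by the session the same way shows which access API was actually selected.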

Sravanthi_K_Intel

Hello there - "This morning I reproduced it with one monitor on the GTX 550 Ti." --> This is good news; at least there is a clear pattern now. Thanks for the dxDiag info - I have passed it along to the driver experts for follow-up, and will get back to you as soon as we have more information.

Sravanthi_K_Intel

(In interest of other customers watching this thread - please update to latest driver/product)

We have been trying to improve the experience with the usage of multi-monitor configuration, and have been making progress. The common limitation with this support is the sheer number of combinations that need to be validated and tested. Having said that, our latest driver releases have had some fixes and we highly recommend you update to the latest product and the drivers.

If the issue persists, please send the configuration (system/driver/product etc) details and reproduce the issue with our SAMPLES or TUTORIALS that can be found here - https://software.intel.com/en-us/intel-media-server-studio-support/code-samples

Matej_K_1

Hi Sravanthi,

I'm really hoping you're still working on this. The issue persists even with the latest drivers and Media SDK. It is very frustrating for both us and our customers. I'm attaching trace and analyzer output. It shows the encoding process up until SyncOperation returns MFX_ERR_DEVICE_FAILED.

Matej_K_1

After spending a couple of days on this, I found that the following conditions need to be met in order to experience the problem:

  1. Display connected to dedicated GPU and no display connected to the Intel GPU. The other way around everything works perfectly.
  2. I was only able to reproduce this while transcoding ("simple" encode-only will not trigger the error), but only if the decoder was not quicksync. For example libav h.264 decoding + quicksync h.264 encoding reliably trigger the error, but quicksync h.264 decoding + quicksync h.264 encoding doesn't (at all). 
  3. The resolution must be less than 1080p. This is quite peculiar, I was never able to reproduce this while encoding 1080p video
  4. It is only reproducible with async depth > 1
  5. It doesn't matter whether there is actual video data in the framebuffer or just zeros. 

Point 2 is most problematic. No matter what I try I can't reproduce this outside of our application. Maybe it is extra CPU utilization from the decoder that contributes to the error? When I remove the decoding from the process completely (just "generating" empty frames) I do not get the error.

So I'm unable to reproduce this with MFX examples / tutorials, but it's not because of lack of trying. I'm however able and willing to provide customized binary of our application that will convert a file, hopefully failing in the process. Please advise.

Sravanthi_K_Intel

Hi Matej,

Thanks for getting back on this. Regarding your message "Display connected to dedicated GPU and no display connected to the Intel GPU" -> is your implementation using "MFX_IMPL_HARDWARE|MFX_IMPL_VIA_D3D11"? The limitation of D3D9 is that it needs the monitor connected to the iGfx, but this limitation was removed in the D3D11 implementation.

Regarding (2) and (4), there is an implementation difference between libav decode and MSDK decode. For one, MSDK uses asynchronous non-blocking calls and has a pipeline approach. Integrating the MSDK decoder with the MSDK encoder works without issue because both adhere to the same pipeline and command-queue method. Also, AsyncDepth > 1 is the case where you can process multiple frames (or surfaces) in parallel and wait for them to sync before moving forward. When AsyncDepth = 1, we wait for every frame/surface to complete before proceeding with the next one (so the pipeline is effectively blocking).
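To make the AsyncDepth behaviour described above concrete, here is a small model of the bookkeeping an application might do around its sync points. It is a sketch only: `InFlightQueue` is a hypothetical class, frame ids stand in for `mfxSyncPoint` values, and no SDK calls are made.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Tracks submitted-but-unsynced operations, limited by an async depth.
class InFlightQueue {
public:
    explicit InFlightQueue(std::size_t asyncDepth) : depth_(asyncDepth) {
        assert(asyncDepth >= 1);
    }

    // Records a newly submitted frame. Returns the ids the caller must sync
    // right now (oldest first) to stay within the depth limit.
    std::vector<int> submit(int frameId) {
        pending_.push_back(frameId);
        std::vector<int> toSync;
        while (pending_.size() >= depth_) {
            toSync.push_back(pending_.front());
            pending_.pop_front();
        }
        return toSync;
    }

    // At end of stream, every remaining operation must be synced in order.
    std::vector<int> drain() {
        std::vector<int> rest(pending_.begin(), pending_.end());
        pending_.clear();
        return rest;
    }

    std::size_t inFlight() const { return pending_.size(); }

private:
    std::size_t depth_;
    std::deque<int> pending_;
};
```

With a depth of 1, every `submit` immediately returns the frame just submitted, so the caller syncs each frame before submitting the next one. With a depth of 4, the first three submissions return nothing and the fourth returns the oldest frame, so several operations overlap.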

Reg (3), this is peculiar indeed. 

I have referred your thread to an expert who has prior experience in dealing with libav+MSS integration. I will send you an update as soon as we have more info.

Can you send us the reproducer code as well? I have sent you my email ID in the message - please send the code to that email address.
