New Contributor II

Exception in the encoding core; MFXVideoCORE_SyncOperation permanently returns MFX_ERR_UNKNOWN afterwards

Application pseudo code:
1. create session
2. create h264 encoder on that session, init it, prepare buffers
3. encode frames from raw nv12-file
4. flush encoder, wait for last output frame
5. release encoder and all resources, except session
6. goto 2
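
The cycle above can be sketched roughly as follows. This is only an illustration of the object lifetimes (one long-lived session, one encoder per cycle); `Session`, `Encoder` and `runCycles` are hypothetical stand-ins, not Media SDK types or calls.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the SDK session (step 1): lives for the whole test.
struct Session {
    int encodersCreated = 0;
};

// Hypothetical stand-in for the H.264 encoder (steps 2-5): one instance per cycle.
class Encoder {
public:
    explicit Encoder(Session& s) : session_(s) { ++session_.encodersCreated; }
    // Pretend to submit one raw frame; returns the number of frames queued so far.
    int encodeFrame() { return ++queued_; }
    // Pretend to flush: hand back every queued frame and empty the queue.
    int flush() { int n = queued_; queued_ = 0; return n; }
private:
    Session& session_;
    int queued_ = 0;
};

// Runs `cycles` iterations of steps 2-6 on a single session and returns
// the total number of frames that came out of the encoder.
inline long runCycles(Session& session, int cycles, int framesPerCycle) {
    long produced = 0;
    for (int c = 0; c < cycles; ++c) {
        auto enc = std::make_unique<Encoder>(session);  // step 2
        for (int f = 0; f < framesPerCycle; ++f)        // step 3
            enc->encodeFrame();
        produced += enc->flush();                       // step 4
        // step 5: `enc` is destroyed at the end of this iteration; the session is kept
    }
    return produced;
}
```

In the real application each stand-in would wrap the corresponding SDK objects; the point is only that the session outlives every encoder instance.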

The input file and encoding parameters are the same on every cycle. Debug build, x86. MFX_IMPL_SOFTWARE only. imsdk 1.7.
Some time after the application starts (anywhere from several minutes to several days, depending on the run), I get this in the debugger output:
"First-chance exception at 0x595798f3 in imsdk_h264enc.exe: 0xC0000005: Access violation reading location 0x000000c0."
After that, MFXVideoCORE_SyncOperation returns MFX_ERR_UNKNOWN, and all subsequent MFXVideoCORE_SyncOperation calls (with the same arguments) keep returning MFX_ERR_UNKNOWN.

Exception point details:
C:\PLANG\Intel\Media SDK 2013 R2\bin\win32\libmfxsw32.dll, offset 0x001988F3 from section ".text" begin
595798D7 mov ecx,dword ptr [edi+1D34h]
595798DD push edx
595798DE lea edx,[ecx+eax*2]
595798E1 mov eax,dword ptr [esi+20h]
595798E4 lea ecx,[eax+edx*2]
595798E7 mov edx,dword ptr [edi+34h]
595798EA mov ecx,dword ptr [edx+ecx*4]
595798ED mov eax,dword ptr [edi+1B94h]
595798F3 add ecx,dword ptr [eax+0C0h] *********** it is here *********
595798F9 mov edx,dword ptr [esi+3Ch]
595798FC push ecx
595798FD push edx
595798FE push edi
595798FF lea ecx,[ebp+0Bh]
59579902 mov edx,esi

PS:
I have now realized there is a small chance that the problem is provoked by a CRT version conflict (I got "LINK : warning LNK4098: defaultlib 'MSVCRT' conflicts with use of other libs" at build time, because the application is built in debug mode).
I'll make my own mfx_dispatch build with the same settings as the application (debug/mt-dll-crt/vs2010), rebuild the application with it, and run the tests again.
I'll post the results here in a couple of days.

Accepted Solutions
Employee

Hi dj_alek,

I've asked my colleague to get back to you on this, since he explored this question at an earlier stage.

Regarding the SW DLL exception issue. We plan to resolve the issue in Media SDK 2014, which will be released at the end of this year.

Regards,
Petter

69 Replies
Employee

Hi,

Please let us know if you're able to reproduce the issue again based on your recent findings. If so, since the issue seems somewhat tricky to reproduce, please try to provide a code reproducer.

Regards,
Petter 

New Contributor II

The original post was based on a large application in which imsdk was being tested (for reliability, stability, memory/handle leaks, etc.; we are choosing an encoder for our applications).
I made a small excerpt from that application in order to post it here. It was built with the genuine libmfxmd.lib (shipped with imsdk) as a release build (to avoid MSVCRT conflicts).

When I started this small application, I got an exception in a different place (after 2 hours): "First-chance exception at 0x507ff45b in imsdk_h264enc.exe: 0xC0000005: Access violation reading location 0x04ffb760."
Exception point:
C:\PLANG\Intel\Media SDK 2013 R2\bin\win32\libmfxsw32.dll, offset 0x0019E45B from section ".text" begin
507FF43E test dl,dl
507FF440 js 507FF447
507FF442 mov ecx,dword ptr [ecx+70h]
507FF445 jmp 507FF452
507FF447 mov edx,dword ptr [ecx+7Ch]
507FF44A mov ecx,dword ptr [ecx+74h]
507FF44D add edx,eax
507FF44F mov dl,byte ptr [edx+esi]
507FF452 add eax,esi
507FF454 lea eax,[ecx+eax*4]
507FF457 test dl,dl
507FF459 jne 507FF48A
507FF45B movzx ecx,word ptr [eax] *********** it is here *********
507FF45E cmp cx,0FFFFh
507FF462 jl 507FF48A
507FF464 cmp cx,1
507FF468 jg 507FF48A
507FF46A movzx eax,word ptr [eax+2]
507FF46E cmp ax,0FFFFh
507FF472 jl 507FF48A
507FF474 cmp ax,1
507FF478 jg 507FF48A
507FF47A cmp byte ptr [ebp-15h],dl
507FF47D mov byte ptr [ebp-1Eh],dl
507FF480 sete bl

The next run of the small application resulted (after 10 hours) in: "First-chance exception at 0x558aeb6a in imsdk_h264enc.exe: 0xC0000005: Access violation reading location 0x000000c0."
This time the aftermath was somewhat different: after the exception, MFXVideoCORE_SyncOperation kept returning MFX_WRN_IN_EXECUTION indefinitely.
Exception point:
C:\PLANG\Intel\Media SDK 2013 R2\bin\win32\libmfxsw32.dll, offset 0x001FDB6A from section ".text" begin
558AEB47 push 0
558AEB49 push 0
558AEB4B movsx edx,dl
558AEB4E and ebx,3
558AEB51 push ebx
558AEB52 mov ebx,dword ptr [ebp+18h]
558AEB55 push 10h
558AEB57 push ebx
558AEB58 movsx ebx,byte ptr [edx+esi+30B0h]
558AEB60 mov edx,dword ptr [esi+edx*4+302Ch]
558AEB67 mov esi,dword ptr [ecx+ebx*4]
558AEB6A add esi,dword ptr [edx+0C0h] *********** it is here *********
558AEB70 mov edx,dword ptr [ebp-8]
558AEB73 mov dword ptr [ebp+0Ch],eax
558AEB76 mov ecx,eax
558AEB78 sar ecx,2
558AEB7B imul ecx,edi
558AEB7E add ecx,dword ptr [ebp-0Ch]
558AEB81 sar edx,2
558AEB84 add ecx,esi
558AEB86 add edx,ecx
558AEB88 push edi
558AEB89 push edx
558AEB8A mov edx,dword ptr [ebp-4]

x86 application, MFX_IMPL_SOFTWARE, imsdk 1.7, msvs2010 (with latest updates), i7-3770, win7x64.
nv12-file that was used as input in the tests is here: https://docs.google.com/file/d/0B8SCkOT4os4HNXVHdHpFT0gzYkk/edit?usp=sharing

I suspect the most likely cause of the exceptions is the large input PTS (in-pts) values.
Good luck in your investigation!

New Contributor II

One more point which results in MFX_ERR_UNKNOWN: "First-chance exception at 0x5754cc6c in imsdk_h264enc.exe: 0xC0000005: Access violation reading location 0x000000c0."
C:\PLANG\Intel\Media SDK 2013 R2\bin\win32\libmfxsw32.dll, offset 0x001EBC6C from section ".text" begin
5754CC41 movzx eax,byte ptr [edi+58236E90h]
5754CC48 movzx ecx,byte ptr [edi+58236E80h]
5754CC4F shl eax,4
5754CC52 add eax,ecx
5754CC54 mov ecx,dword ptr [ebp-1A8h]
5754CC5A add eax,dword ptr [ecx+1B8h]
5754CC60 mov ecx,dword ptr [ebp-1B0h]
5754CC66 mov ecx,dword ptr [ecx+1B94h]
5754CC6C mov esi,dword ptr [ecx+0C0h] *********** it is here *********
5754CC72 add esi,dword ptr [ebp-1E8h]
5754CC78 test dl,dl
5754CC7A jne 5754CD0A
5754CC80 movzx edx,byte ptr [eax]
5754CC83 mov byte ptr [esi],dl
5754CC85 movzx ecx,byte ptr [eax+1]
5754CC89 mov byte ptr [esi+1],cl
5754CC8C movzx edx,byte ptr [eax+2]
5754CC90 mov byte ptr [esi+2],dl
5754CC93 movzx ecx,byte ptr [eax+3]
5754CC97 mov edx,dword ptr [ebp-204h]

Any ideas?

Black Belt

First, please run it under WinDbg, where you can perform automated analysis of the access violation exception. Judging by the assembly listings, the exception seems to occur when loading a pointer to some structure, maybe at 507FF454 lea eax,[ecx+eax*4]. Can you inspect with WinDbg what eax contains?

New Contributor II

Got one under windbg.

(5d0.1278): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\PLANG\Intel\Media SDK 2013 R2\bin\win32\libmfxsw32.dll -
eax=00000000 ebx=01f6fbb0 ecx=00007000 edx=0324dd30 esi=02596454 edi=04c31440
eip=55ae98f3 esp=01f6f7f8 ebp=01f6f810 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010246
libmfxsw32!MFXVideoVPP_GetVPPStat+0x18bd83:
55ae98f3 0388c0000000 add ecx,dword ptr [eax+0C0h] ds:002b:000000c0=????????

Code around it (offset 0x001998F3 from libmfxsw32.dll begin):
[plain]55ae98cb 8b4e3c mov ecx,dword ptr [esi+3Ch]
55ae98ce 56 push esi
55ae98cf 57 push edi
55ae98d0 e82b120600 call libmfxsw32!MFXVideoVPP_GetVPPStat+0x1ecf90 (55b4ab00)
55ae98d5 eb32 jmp libmfxsw32!MFXVideoVPP_GetVPPStat+0x18bd99 (55ae9909)
55ae98d7 8b8f341d0000 mov ecx,dword ptr [edi+1D34h]
55ae98dd 52 push edx
55ae98de 8d1441 lea edx,[ecx+eax*2]
55ae98e1 8b4620 mov eax,dword ptr [esi+20h]
55ae98e4 8d0c50 lea ecx,[eax+edx*2]
55ae98e7 8b5734 mov edx,dword ptr [edi+34h]
55ae98ea 8b0c8a mov ecx,dword ptr [edx+ecx*4]
55ae98ed 8b87941b0000 mov eax,dword ptr [edi+1B94h]
55ae98f3 0388c0000000 add ecx,dword ptr [eax+0C0h] ds:002b:000000c0=???????? *********** it is here *********
55ae98f9 8b563c mov edx,dword ptr [esi+3Ch]
55ae98fc 51 push ecx
55ae98fd 52 push edx
55ae98fe 57 push edi
55ae98ff 8d4d0b lea ecx,[ebp+0Bh]
55ae9902 8bd6 mov edx,esi
55ae9904 e8f7b50500 call libmfxsw32!MFXVideoVPP_GetVPPStat+0x1e7390 (55b44f00)
55ae9909 8b4d0c mov ecx,dword ptr [ebp+0Ch]
55ae990c 8b5658 mov edx,dword ptr [esi+58h]
55ae990f 8901 mov dword ptr [ecx],eax
55ae9911 8a450b mov al,byte ptr [ebp+0Bh][/plain]

Call stack:
[plain]# ChildEBP RetAddr Args to Child
00 01f6f810 55ae9ca4 04c31440 01f6f864 04c31440 libmfxsw32!MFXVideoVPP_GetVPPStat+0x18bd83
01 01f6fba0 55980e07 01f6fc10 55980cde 025bd9f4 libmfxsw32!MFXVideoVPP_GetVPPStat+0x18c134
02 01f6fc10 5598207d 026c0020 05e67aac 00031213 libmfxsw32!MFXVideoVPP_GetVPPStat+0x23297
03 01f6fc70 55968139 026c0020 05d15f40 00000001 libmfxsw32!MFXVideoVPP_GetVPPStat+0x2450d
04 01f6fda8 55bdb277 0033dba8 f61391bc 00000000 libmfxsw32!MFXVideoVPP_GetVPPStat+0xa5c9
05 01f6fde0 55bdb301 00000000 01f6fdf8 770a33aa libmfxsw32!MFXVideoVPP_GetVPPStat+0x27d707
06 01f6fdec 770a33aa 0033dcb0 01f6fe38 77a99ef2 libmfxsw32!MFXVideoVPP_GetVPPStat+0x27d791
07 01f6fdf8 77a99ef2 0033dcb0 789f6a8f 00000000 kernel32!BaseThreadInitThunk+0x12
08 01f6fe38 77a99ec5 55bdb29d 0033dcb0 00000000 ntdll!RtlInitializeExceptionChain+0x63
09 01f6fe50 00000000 55bdb29d 0033dcb0 00000000 ntdll!RtlInitializeExceptionChain+0x36[/plain]

Was this information helpful? Or do you need an exception at exactly the "lea eax,[ecx+eax*4]" point?

Black Belt

>>>55ae98f3 0388c0000000 add ecx,dword ptr [eax+0C0h] ds:002b:000000c0=????????>>>


This is the culprit of the access violation error.

Go backward from this call site: 55ae9ca4 04c31440 01f6f864 04c31440 libmfxsw32!MFXVideoVPP_GetVPPStat.


Use the ln (list nearest symbols) command.
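
For what it is worth, the dump already tells us why the address is 0x000000c0: eax is 00000000 (it was just loaded from [edi+1B94h]) and the instruction reads [eax+0C0h], so this looks like a member access through a null pointer. A minimal sketch of that arithmetic, assuming a made-up structure layout (`FrameLike` and its members are hypothetical; the real structure inside libmfxsw32.dll is unknown):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout, for illustration only.
struct FrameLike {
    std::uint8_t  other[0xC0];  // whatever precedes the field
    std::uint32_t field;        // the member read by "[eax+0C0h]"
};

static_assert(offsetof(FrameLike, field) == 0xC0, "field sits at offset 0xC0");

// Address the CPU touches when reading `field` through a pointer whose value is `base`.
constexpr std::uintptr_t accessedAddress(std::uintptr_t base) {
    return base + offsetof(FrameLike, field);
}
```

With `base` equal to 0 this gives 0xC0, which matches the "reading location 0x000000c0" in the exception message.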

New Contributor II

I have already restarted the application under WinDbg, so I need to wait again...

Black Belt

Try different commands while debugging:

~*KL

~

bp libmfxsw32!MFXVideoVPP_GetVPPStat (breakpoint on faulting function)

p (tracing)

p

New Contributor II

Got one more, at another place (and I didn't do anything with windbg afterwards - it was left at the exception point).

(11a4.156c): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\PLANG\Intel\Media SDK 2013 R2\bin\win32\libmfxsw32.dll -
eax=05cabfa0 ebx=05aa2500 ecx=05cb6f20 edx=05ce6d00 esi=00000000 edi=01ecea70
eip=558cf45b esp=01ecdee8 ebp=01ecdf60 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010246
libmfxsw32!MFXVideoVPP_GetVPPStat+0x1918eb:
558cf45b 0fb708 movzx ecx,word ptr [eax] ds:002b:05cabfa0=????

Code around it (offset 0x0019F45B from libmfxsw32.dll begin):
[plain]558cf439 03d0 add edx,eax
558cf43b 8a1432 mov dl,byte ptr [edx+esi]
558cf43e 84d2 test dl,dl
558cf440 7805 js libmfxsw32!MFXVideoVPP_GetVPPStat+0x1918d7 (558cf447)
558cf442 8b4970 mov ecx,dword ptr [ecx+70h]
558cf445 eb0b jmp libmfxsw32!MFXVideoVPP_GetVPPStat+0x1918e2 (558cf452)
558cf447 8b517c mov edx,dword ptr [ecx+7Ch]
558cf44a 8b4974 mov ecx,dword ptr [ecx+74h]
558cf44d 03d0 add edx,eax
558cf44f 8a1432 mov dl,byte ptr [edx+esi]
558cf452 03c6 add eax,esi
558cf454 8d0481 lea eax,[ecx+eax*4]
558cf457 84d2 test dl,dl
558cf459 752f jne libmfxsw32!MFXVideoVPP_GetVPPStat+0x19191a (558cf48a)
558cf45b 0fb708 movzx ecx,word ptr [eax] ds:002b:05cabfa0=???? *********** it is here *********
558cf45e 6683f9ff cmp cx,0FFFFh
558cf462 7c26 jl libmfxsw32!MFXVideoVPP_GetVPPStat+0x19191a (558cf48a)
558cf464 6683f901 cmp cx,1
558cf468 7f20 jg libmfxsw32!MFXVideoVPP_GetVPPStat+0x19191a (558cf48a)
558cf46a 0fb74002 movzx eax,word ptr [eax+2]
558cf46e 6683f8ff cmp ax,0FFFFh
558cf472 7c16 jl libmfxsw32!MFXVideoVPP_GetVPPStat+0x19191a (558cf48a)
558cf474 6683f801 cmp ax,1
558cf478 7f10 jg libmfxsw32!MFXVideoVPP_GetVPPStat+0x19191a (558cf48a)
558cf47a 3855eb cmp byte ptr [ebp-15h],dl
558cf47d 8855e2 mov byte ptr [ebp-1Eh],dl[/plain]

Call stack:
[plain]# ChildEBP RetAddr Args to Child
00 01ecdf60 558e1175 05aa2594 01ece610 04cb1440 libmfxsw32!MFXVideoVPP_GetVPPStat+0x1918eb
01 01ecf570 558c9bf1 01ecf908 558c9bf1 04cb1440 libmfxsw32!MFXVideoVPP_GetVPPStat+0x1a3605
02 01ecf900 55760e07 05ab4344 55760e07 770a1400 libmfxsw32!MFXVideoVPP_GetVPPStat+0x18c081
03 01ecf968 5576207d 028a0020 05acac2c 00035569 libmfxsw32!MFXVideoVPP_GetVPPStat+0x23297
04 01ecf9cc 55748139 028a0020 05b18b00 00000002 libmfxsw32!MFXVideoVPP_GetVPPStat+0x2450d
05 01ecfb04 559bb277 01f5dba8 ca2ac3a9 00000000 libmfxsw32!MFXVideoVPP_GetVPPStat+0xa5c9
06 01ecfb3c 559bb301 00000000 01ecfb54 770a33aa libmfxsw32!MFXVideoVPP_GetVPPStat+0x27d707
07 01ecfb48 770a33aa 01f5dcb0 01ecfb94 77a99ef2 libmfxsw32!MFXVideoVPP_GetVPPStat+0x27d791
08 01ecfb54 77a99ef2 01f5dcb0 790b9369 00000000 kernel32!BaseThreadInitThunk+0x12
09 01ecfb94 77a99ec5 559bb29d 01f5dcb0 00000000 ntdll!RtlInitializeExceptionChain+0x63
0a 01ecfbac 00000000 559bb29d 01f5dcb0 00000000 ntdll!RtlInitializeExceptionChain+0x36[/plain]
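
Because the DLL is loaded at a different base on every run, the two WinDbg dumps are easier to compare as module-relative offsets. The sketch below uses only addresses copied from the dumps in this thread; the helper names (`moduleBase`, `rva`) are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Module load base implied by an address and its reported offset from the module start.
constexpr std::uint32_t moduleBase(std::uint32_t address, std::uint32_t offsetInModule) {
    return address - offsetInModule;
}

// Offset of an address within a module loaded at `base`.
constexpr std::uint32_t rva(std::uint32_t address, std::uint32_t base) {
    return address - base;
}

// First dump: eip=55ae98f3, reported offset 0x001998F3.
constexpr std::uint32_t kBaseRun1 = moduleBase(0x55ae98f3u, 0x001998F3u);
// Second dump: eip=558cf45b, reported offset 0x0019F45B.
constexpr std::uint32_t kBaseRun2 = moduleBase(0x558cf45bu, 0x0019F45Bu);

// Return addresses taken from the two call stacks (frame 02 and frame 03).
constexpr std::uint32_t kRetRun1 = rva(0x5598207du, kBaseRun1);
constexpr std::uint32_t kRetRun2 = rva(0x5576207du, kBaseRun2);
```

Both return addresses come out at the same offset, so the two crashes seem to be reached through the same outer call path even though the faulting instructions differ.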

I can't understand why we are doing these manipulations with WinDbg. This is an obvious bug in libmfxsw32.dll. Trying to find an error without debug symbols and source code is an irrational waste of time. These posts are aimed at the imsdk developers, not at self-troubleshooting...

Can you explain to me the ultimate goal of these WinDbg operations?

Black Belt

I was thinking of tracing back from the faulting IP and inspecting the context at every step, but as you pointed out, that should be done by the library developers.

New Contributor II

But it seems that the developers are not in a hurry to deal with the problem. The topic was opened almost two weeks ago, but I have not even received a "we will inspect" reply. Sad...

PS:
My guess that the large in-pts values are the cause of the exceptions was not confirmed. Replacing the line
[cpp]QWORD qwInPTS = 3000000030999; // ~ 1 year[/cpp]
with
[cpp]QWORD qwInPTS = 324000000; // 1 hour[/cpp]
still gives exceptions...
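
For reference, the "1 hour" and "~ 1 year" comments are consistent with timestamps counted in 90 kHz ticks. A small sketch of the conversion, assuming that tick rate (the helper names are made up for the example):

```cpp
#include <cassert>
#include <cstdint>

// Assumption: PTS values are expressed in 90 kHz clock ticks.
constexpr std::uint64_t kTicksPerSecond = 90000;

// Whole seconds represented by a PTS value.
constexpr std::uint64_t ticksToSeconds(std::uint64_t pts) {
    return pts / kTicksPerSecond;
}

// Whole days represented by a PTS value.
constexpr std::uint64_t ticksToDays(std::uint64_t pts) {
    return ticksToSeconds(pts) / 86400;
}
```

Under that assumption 324000000 is exactly 3600 seconds, and 3000000030999 is a little over 385 days.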

Black Belt

It is not easy to understand what exactly caused the access violation error. I think a pointer somehow got corrupted, maybe during the function call-ret sequence; the two registers that could be involved are ECX and EAX.

New Contributor II

Inaccurate memory management or critical section usage, forgetting to reset some member or variable, and so on. There are many potential sources of such a problem. The code of the test application is very simple and can be verified by anyone. Tasks like this need to be solved with the library source code. So, we should wait for the developers' response...

PS:
The next run of the test application ended, after 1 hour of work and without any exception, with MFXVideoCORE_SyncOperation returning MFX_WRN_IN_EXECUTION indefinitely.
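
Until the library is fixed, a test harness can at least avoid hanging on this by bounding the wait. A minimal sketch, assuming the caller wraps one MFXVideoCORE_SyncOperation call (with a short timeout) in a callback; the enum below is a local stand-in that mirrors the three status codes named in this thread, and `waitBounded` is a hypothetical helper, not an SDK function.

```cpp
#include <cassert>
#include <functional>

// Local stand-ins mirroring the mfxStatus codes mentioned in this thread.
enum SyncStatus { SYNC_ERR_NONE = 0, SYNC_ERR_UNKNOWN = -1, SYNC_WRN_IN_EXECUTION = 1 };

enum class WaitResult { Done, Failed, TimedOut };

// Calls `sync` until it completes, fails, or `maxAttempts` is exhausted.
inline WaitResult waitBounded(const std::function<SyncStatus()>& sync, int maxAttempts) {
    for (int i = 0; i < maxAttempts; ++i) {
        SyncStatus st = sync();
        if (st == SYNC_ERR_NONE) return WaitResult::Done;
        if (st != SYNC_WRN_IN_EXECUTION) return WaitResult::Failed;  // e.g. SYNC_ERR_UNKNOWN
        // Still in execution: try again.
    }
    return WaitResult::TimedOut;
}
```

On `Failed` or `TimedOut` the harness can log the cycle number and tear everything down instead of spinning forever.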

Black Belt

Completely agree with you.

Employee

Hi,

sorry for the delay on this topic, this forum item slipped between the cracks. We are looking into the issue right now.

Regards,
Petter 

New Contributor II

Cheers!
I have some more information. Disabling the use of RunInIndividualThread (so that all CEncoder instances are sequentially created, used and destroyed on the same thread) still produces the exception...

Employee

Hi,

I ran 2x 4 hour runs of the workload you provided today using the same system configuration. So far I have not observed any crashes, thus no way to debug the problem.

I will try again with a few more variations tomorrow.

Regards,
Petter 

New Contributor II

And I'll also run tests on several different computers overnight.

Black Belt

Petter Larsson (Intel) wrote:

Hi,

I ran 2x 4 hour runs of the workload you provided today using the same system configuration. So far I have not observed any crashes, thus no way to debug the problem.

I will try again with a few more variations tomorrow.

Regards,
Petter 

It seems that the issue could be bound to a specific computer (the one on which the access violation occurs).
