Peter_B_7
Beginner
187 Views

Skylake + latest drivers = Heap corruption when closing video encoder?

Hello,

we are developing a video playback and recording system using the Intel Media SDK. Experiments are conducted on two platforms:

A) The test system based on i5 4590T + Windows 7

B) The target hardware using an i5 6600 + Windows 8.1 Embedded.

Both systems use the same SDK version. System B has the latest graphics drivers, while system A uses slightly older ones (since the latest versions support only 6th generation CPUs for now).

The software we have developed so far runs smoothly on system A but has an issue on system B. The issue appears once MFXVideoENCODE has processed at least one frame. After that, deleting the MFXVideoENCODE object does something to the mfxSession that causes a heap corruption when the respective MFXVideoSession is deleted later on.

The issue can easily be reproduced using the older (but very convenient) simple_3_encode provided with the package mediasdk-tutorials-0.0.3. We did the following modifications:

1st) We added

    #define _CRTDBG_MAP_ALLOC

    #include "crtdbg.h"

to get a better overview of memory-related issues.

2nd) We added { } braces so that the variable "session" (constructed after the sink initialization) goes out of scope after mfxENC.Close() is called and surfaces are freed.
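The two modifications can be sketched roughly as follows. This is a minimal illustration and not the tutorial code: `FakeSession`, `FakeEncoder` and `runScopedTeardown` are hypothetical stand-ins for the Media SDK objects, and they only record the order in which teardown happens so that the effect of the added braces is visible.

```cpp
#ifdef _MSC_VER
// Modification 1: map CRT allocations to their debug versions (MSVC only).
#define _CRTDBG_MAP_ALLOC
#include <crtdbg.h>
#endif

#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the session object, NOT the Media SDK class.
struct FakeSession {
    std::vector<std::string>& log;
    explicit FakeSession(std::vector<std::string>& l) : log(l) {}
    ~FakeSession() { log.push_back("session destroyed"); }
};

// Hypothetical stand-in for the encoder, which is bound to a session.
struct FakeEncoder {
    std::vector<std::string>& log;
    bool open = true;
    explicit FakeEncoder(FakeSession& s) : log(s.log) {}
    void Close() { open = false; log.push_back("encoder closed"); }
    ~FakeEncoder() {
        if (open) Close();
        log.push_back("encoder destroyed");
    }
};

// Modification 2: the braces end the lifetime of "session" at a known point,
// after the encoder has been closed and the surfaces have been freed.
std::vector<std::string> runScopedTeardown() {
    std::vector<std::string> log;
    {
        FakeSession session(log);
        FakeEncoder encoder(session);
        log.push_back("frame encoded");
        encoder.Close();
        log.push_back("surfaces freed");
    }   // encoder is destroyed first, then session (reverse declaration order)
    log.push_back("after scope");
    assert(!log.empty());
    return log;
}
```

With the real classes, the closing brace is where the session destructor runs, which is where the debug build is reported to break.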

Running the code in debug mode breaks to MFXVideoSession::Close with the message "HEAP: Free Heap block xxxxx modified at yyyyy after it was freed" showing on the output.

Playing around, we declared MFXVideoSession::m_session as a public member and set it to NULL before "session" goes out of scope. This kind of "solves" the problem in terms of crashing. However, it causes some memory to leak, which isn't really acceptable either.
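Why the workaround trades the crash for a leak can be illustrated with a small sketch. It assumes that the wrapper's destructor simply passes its handle to a close function, and that the close function does nothing for a null handle. `fakeOpen`, `fakeClose` and `FakeSessionWrapper` are hypothetical stand-ins, not Media SDK functions.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical registry of handles that were opened and not yet closed.
static std::set<int> g_liveHandles;
static int g_nextHandle = 1;

// Opens a new handle; 0 is reserved as the null handle.
int fakeOpen() {
    const int h = g_nextHandle++;
    assert(h != 0);
    g_liveHandles.insert(h);
    return h;
}

// Releases a handle. Returns false and releases nothing for unknown or null handles.
bool fakeClose(int h) {
    return g_liveHandles.erase(h) == 1;
}

// Stand-in wrapper with the handle made public, as in the experiment.
struct FakeSessionWrapper {
    int m_session;
    FakeSessionWrapper() : m_session(fakeOpen()) {}
    ~FakeSessionWrapper() {
        fakeClose(m_session);
        m_session = 0;
    }
};

// Returns how many handles are still live after the wrapper went out of scope.
std::size_t liveHandlesAfterScope(bool nullHandleFirst) {
    g_liveHandles.clear();
    {
        FakeSessionWrapper session;
        if (nullHandleFirst) session.m_session = 0;   // the workaround
    }
    return g_liveHandles.size();
}
```

In this model, nulling the handle means the destructor has nothing to release, so whatever the handle owned stays allocated.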

Is this a known issue with current drivers or the current SDK? As said, none of this can be observed on our hardware configuration A. Also, MFXVideoDECODE works fine on our hardware configuration B.

Thanks for any advice,

Peter

22 Replies
Peter_B_7
Beginner
172 Views

Addition: The same issue seems to apply to the screen capture plugin, which we use for capturing the screen's content. Destroying the MFXVideoSession after the decoder (which employs the screen capture plugin) breaks to MFXVideoSession::Close. NULLing MFXVideoSession::m_session prior to destroying MFXVideoSession works but causes memory to leak.

Thanks,

Peter

Surbhi_M_Intel
Employee
172 Views

Hi Peter, 

We don't have a known heap corruption issue on the latest SKL driver. Would it be possible for you to send us a reproducer for your issue, along with the command line to run it, so that our team can debug it?

Thanks,
Surbhi

 

AaronL
Beginner
172 Views

Peter,

I've encountered occasional heap corruption issues when I use system memory, as opposed to hardware memory, with both H.264 and HEVC encoding and very rarely when decoding.  This was an issue for me with Ivy Bridge and has continued all the way through to Skylake and through various versions of the Intel Media SDK and driver install package.  I'm fairly certain that it is caused by a bug somewhere in the Intel Media SDK and/or driver install package.  However, it isn't something that I can reproduce reliably, and as such, I didn't think there was any point to reporting it on this forum, as there was no chance that Intel support was going to do anything other than ask for a reproducer.  I spent a good deal of time going through my code and running memory analysis tools in an attempt to uncover the cause of the issue but was unable to do so.  Since switching to video memory, I have yet to encounter the problem.

AaronL

Surbhi_M_Intel
Employee
172 Views

Hi Aaron, 

Thanks for indicating that there is a possible bug somewhere in the SDK that is causing the heap corruption issue. However, to debug this problem we need a reproducer, so that we can understand where the bug is coming from and resolve it. I understand that providing a reproducer is sometimes difficult, but any pointers, such as tracer calls or the application itself (sent to us privately), can help us resolve the issue in future releases. Good to know that the video memory optimization has been stable for you; it's a great pointer for other customers.
If you do come across the issue again, please provide us with the application or pointers so that we can reproduce and debug it locally.

Thanks,
Surbhi

Peter_B_7
Beginner
172 Views

Hi all,

Aaron, thanks for the input! Unfortunately we cannot switch to video memory: We manipulate the data between decoding and encoding, and reading video memory has shown to kill performance.

Surbhi, I've attached the code that shows the issue. It is taken from mediasdk-tutorials-0.0.3\simple_3_encode, so I'm including only the changes we made. The changes also remove the need for command line parameters. The code breaks 100% of the time on the mentioned Skylake system. I've also attached a screenshot of the stack trace taken from VS; hopefully it will help you pinpoint the exact location.

Some more information that might be helpful:

libmfxhw64.dll: File Version 7.15.12.15, Product Version: 7.0.1540.326

igdumdim64.dll: File + Product Version: 20.19.15.4352

igfxcmrt64.dll: File + Product Version: 5.0.0.1133

I believe all these files were obtained with the latest packages that were available 2-3 weeks ago.

If there is any additional information I can provide you with, please let me know; and best: also some words explaining how to obtain that information :-)

Debug versions / PDB files of the Media SDK probably cannot be made available, right?

Thanks,

Peter

AaronL
Beginner
172 Views

Peter B. wrote:

Hi all,

Aaron, thanks for the input! Unfortunately we cannot switch to video memory: We manipulate the data between decoding and encoding, and reading video memory has shown to kill performance.

I do the same, but the last step writes to video memory instead of system memory.  For example, I use IPP for color conversion from YUY2 to NV12.  Rather than writing the new pixel values into system memory, they are written into a video memory allocation.  Do everything you want in system memory until you are ready for NV12, and write that to video memory.  At least with HEVC, I've found that passing video memory to the encoder, as opposed to system memory, improves performance.
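The idea of writing the final NV12 result straight into the destination surface can be sketched as below. This is only an illustration under assumptions: the post uses IPP for the conversion, whereas this is a plain C++ loop, and `dstY`, `dstUV` and `dstPitch` are assumed to come from a locked video-memory surface (how that lock is obtained is not shown). The function name `yuy2ToNv12` is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Converts packed YUY2 (byte order Y0 U Y1 V) to NV12 (Y plane followed by an
// interleaved UV plane at half vertical resolution). The output is written
// directly through dstY/dstUV, which are assumed to point into the target
// surface. Chroma of each pair of rows is averaged. width and height must be even.
void yuy2ToNv12(const std::uint8_t* src, std::size_t srcPitch,
                std::uint8_t* dstY, std::uint8_t* dstUV, std::size_t dstPitch,
                int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    for (int y = 0; y < height; y += 2) {
        const std::uint8_t* row0 = src + static_cast<std::size_t>(y) * srcPitch;
        const std::uint8_t* row1 = row0 + srcPitch;
        std::uint8_t* outY0 = dstY + static_cast<std::size_t>(y) * dstPitch;
        std::uint8_t* outY1 = outY0 + dstPitch;
        std::uint8_t* outUV = dstUV + static_cast<std::size_t>(y / 2) * dstPitch;
        for (int x = 0; x < width; x += 2) {
            const std::uint8_t* p0 = row0 + 2 * x;   // Y0 U Y1 V, upper row
            const std::uint8_t* p1 = row1 + 2 * x;   // Y0 U Y1 V, lower row
            outY0[x] = p0[0];
            outY0[x + 1] = p0[2];
            outY1[x] = p1[0];
            outY1[x + 1] = p1[2];
            outUV[x] = static_cast<std::uint8_t>((p0[1] + p1[1] + 1) / 2);      // U
            outUV[x + 1] = static_cast<std::uint8_t>((p0[3] + p1[3] + 1) / 2);  // V
        }
    }
}
```

All intermediate processing stays in system memory; only this last step touches the destination surface, and it only writes to it.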

AaronL

Peter_B_7
Beginner
172 Views

Aaron, thanks again! This indeed sounds promising, so we'll give it a shot...

Thanks,

Peter

Surbhi_M_Intel
Employee
172 Views

Hi Peter, 

Sorry it took me a while to get to this post. I am looking at it now. I hope you have the latest Media SDK 2016 and the latest driver installed on your system, since outdated versions often turn out to be the problem. As for your application, I will test it on a Skylake system, hopefully by tomorrow, and discuss it with an expert if I am able to reproduce your issue. One thing I forgot to mention before is that the tutorials have never been validated on Skylake; they were last validated on the 4th generation, i.e. the Haswell architecture, so it is possible to see an issue on a 6th generation processor. If you want to verify the MSDK encoder on the 6th generation, you can refer to the samples, which were validated on 6th generation Core processors.

Thanks,
Surbhi

Peter_B_7
Beginner
172 Views

Hi Surbhi,

thanks for your answer. As said, the SDK and drivers are whatever was available at the beginning of March 2016.

I have just checked the current SDK sample version you provided, i.e.

    Intel Media SDK Samples 2016.0.0.142\sample_encode\src

Same thing: Running the code consistently breaks to MFXVideoSession::Close, which is called in CEncodingPipeline::Close(), i.e. m_mfxSession.Close();

Could you reproduce the issue?

Thanks,

Peter

Sergey_Anufriev
Beginner
172 Views

Good day. 

We have a similar problem with heap corruption after calling MFXClose.

1) heap corruption occurs only if MFXVideoENCODE_EncodeFrameAsync is called more than once

2) heap corruption occurs at random addresses

3) "Free Heap block XX modified at YY after it was freed", where the data at YY is always 4 zero bytes. Is something writing into already freed memory?

Hw&Sw : i7 6700, Win10 x64, msdk 2016 (api 1.17)
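The kind of check behind a "modified after it was freed" message can be illustrated with a toy model: freed memory is filled with a known pattern, and a later scan reports the first byte that no longer matches. This is a sketch under assumptions, not the actual heap implementation. The fill value and the names `FreedBlock`, `firstModifiedOffset` and `writeZeroThroughStalePointer` are chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative fill byte; real debug heaps use their own patterns.
constexpr std::uint8_t kFreedFill = 0xDD;

// Toy model of a freed block that a checking heap keeps around for inspection.
struct FreedBlock {
    std::vector<std::uint8_t> bytes;
    explicit FreedBlock(std::size_t size) : bytes(size, kFreedFill) {}
};

// Returns the offset of the first byte that no longer holds the fill pattern,
// or -1 if the block is intact.
long firstModifiedOffset(const FreedBlock& block) {
    for (std::size_t i = 0; i < block.bytes.size(); ++i) {
        if (block.bytes[i] != kFreedFill) return static_cast<long>(i);
    }
    return -1;
}

// Simulates a stale pointer writing a 32-bit zero into already freed memory.
void writeZeroThroughStalePointer(FreedBlock& block, std::size_t offset) {
    assert(offset + sizeof(std::uint32_t) <= block.bytes.size());
    const std::uint32_t zero = 0;
    std::memcpy(block.bytes.data() + offset, &zero, sizeof zero);
}
```

In this model, four zero bytes at a varying offset would be consistent with some component writing a 32-bit zero through a pointer it still holds after the block was released.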

gparag1983
Beginner
172 Views

Hello,

Currently I am using Intel(R)_Media_SDK_2016.0.1 and Intel Media SDK Samples 2016 6.0.0.142.

Earlier, we released our product with a 4th generation CPU. We are doing MPEG and H264 encoding on IQSV.

I am currently testing our application on a 6th generation CPU.

I attach images of my 6th generation system configuration.

But with the 6th generation processor, MPEG and H264 encoding on IQSV stops at random times: sometimes after 5 minutes, sometimes after 1 hour. Before encoding stops, memory usage starts to increase. I have tested almost all the 6th generation IQSV drivers uploaded on the Intel site, and we face the same issue with all of them.

We face the same issue with the 4th generation CPU as well. Only with Intel graphics driver version 15.33.18.64.3496 do we not see it. With all other drivers (later than 15.33.18.64.3496), encoding also stops on the 4th generation machine.

Regards

Parag Gandhi

gparag1983
Beginner
172 Views

Hello,

Any update?

Regards

Parag Gandhi

Peter_B_7
Beginner
172 Views

Hello,

we have been running some more tests on the heap corruptions for the past few hours, so here are some new observations. Interestingly, the issue (on our system) seems to depend heavily on whether a debugger is attached to the running software:

- Running e.g. sample_encode with a debugger attached (F5 in VS) reliably crashes to MFXVideoSession::Close with the aforeposted stack trace and heap corruption message.

- Running the software without a debugger (CTRL+R in VS) shows none of these issues.

As seen so far, the same applies to our software, at least when compiling in release mode: Compiling in debug mode crashes to MFXVideoSession::Close, regardless of whether a debugger is attached or not.

I am not aware of the magic happening when attaching a debugger... but as far as I can see, when compiling in release mode, the same DLLs are used, regardless of whether a debugger is attached or not. Compiling in debug mode can of course result in different DLLs being used, which could explain why the debug build of our software crashes even without a debugger attached. Still, I never saw a debug version of either the MFX libraries or DirectX. Any ideas what could be causing these issues?
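One well-known difference that may be relevant here: on Windows, a process launched under a debugger normally gets the operating system's debug heap, which adds checks such as the one for writes to freed blocks, unless the environment variable `_NO_DEBUG_HEAP` is set to `1`. Whether that is what happens in this case is an assumption. The sketch below only shows how an application could log a hint about its own situation; `debugHeapLikely` and `debugHeapLikelyForThisProcess` are made-up helper names.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

#ifdef _WIN32
#include <windows.h>
#endif

// Pure decision helper, so the logic can be checked on any platform.
// Assumption: the debug heap is active when a debugger was present at launch
// and the process did not opt out via _NO_DEBUG_HEAP=1.
bool debugHeapLikely(bool debuggerPresent, const char* noDebugHeapEnv) {
    const bool optedOut =
        noDebugHeapEnv != nullptr && std::strcmp(noDebugHeapEnv, "1") == 0;
    return debuggerPresent && !optedOut;
}

// Hint for the current process. IsDebuggerPresent() reports whether a debugger
// is attached right now, which is not the same as "was launched under one",
// so the result is only an indication.
bool debugHeapLikelyForThisProcess() {
#ifdef _WIN32
    const bool attached = IsDebuggerPresent() != 0;
#else
    const bool attached = false;   // not applicable outside Windows
#endif
    return debugHeapLikely(attached, std::getenv("_NO_DEBUG_HEAP"));
}
```

If that is the mechanism, the run without a debugger would not be free of the stray write; it would just not be checked for it.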

During development and especially debugging, it seemed to me that the MFX libraries don't like debuggers very much: When using and hitting breakpoints, MFX processors (decode, encode, vpp) are often lost and return errors when calling them again.

Hoping this can be of any help to others...

Cheers,

Peter

 

Surbhi_M_Intel
Employee
172 Views

Parag, 
I see you are getting support through Naveen and you have already provided the reproducer. Continue working with him to get more insights, if communication is stuck at some point please let us know here. 

Surbhi

Surbhi_M_Intel
Employee
172 Views

Peter, 
I was able to reproduce the issue with the tutorial code you provided. I have filed an issue and reported it to the validation team. I will let you know as soon as we are able to find the root cause.

Surbhi

Surbhi_M_Intel
Employee
172 Views

Sergey, 
Your problem could be similar to Peter's. Are you also seeing the issue only in debug mode, while your code works well in release mode? If not, please send us a reproducer for your problem and we will debug that issue as well.

Surbhi

Sergey_Anufriev
Beginner
172 Views

Hello Surbhi,

thanks for your answer. Yes, the problem occurs in the debug mode.

But you understand that in release mode, valid data may be overwritten by these 4 bytes.

Sergey_Anufriev
Beginner
172 Views

driver version : 20.19.15.4463

msdk 2016 R2 (api 1.19)

problem still exists ...

 

Peter_B_7
Beginner
172 Views

Hello,

sorry, I have been away from work for a while...

Thanks Sergey for testing the newest SDK along with the latest driver...!

The topic is receiving more attention here now since we are getting closer to production. As part of that, we have been running a lot of tests with release builds. An important use case for us is to switch between 2D and 3D encoding when the user asks for it. Hence we destroy encoders along with their sessions quite often in order to free up resources for the next encoder needing a different configuration. As could be expected from the heap corruption seen in debug builds, our application crashes every now and then while destroying the session (i.e. a debug output message printed just prior to session destruction is seen, while a message printed right after session destruction is not).

Is there any news on this? Is the issue still being investigated?

Thanks,

Peter

 

OTorg
New Contributor II
19 Views

It may be the same (or closely related) issue as described here:

https://software.intel.com/en-us/forums/intel-media-sdk/topic/475624?page=1#comment-1779278

Reproducer applications (that pretty quickly discover the problems) are also provided.
