We're using the Intel Media SDK (MSDK) to decode H.264 video streams in our software, rendered with D3D11/DXGI. We discovered a GPU memory leak when stopping and starting streams (observed primarily as "System GPU memory" in Process Explorer). We verified with the DX debug layer that all objects were released correctly, and then turned to "sample_decode" to check whether the problem was in our implementation.
Unfortunately, the memory issue can be seen even with sample_decode, with only minor changes. The only changes made in the code are:
- Added a for-loop to run the same decode/render task 5 times instead of only once.
- Ignored the result of the RegisterClass call (so the run does not abort because the class is already registered).
Code is attached, along with a compiled debug .exe.
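The looped repro can be sketched as follows (a minimal sketch with assumed names: RunDecodePipeline is a hypothetical stand-in for sample_decode's actual decode/render task, which the real modification simply wraps in a loop inside main()):

```cpp
#include <cassert>

// Hypothetical stand-in for sample_decode's decode/render task: the
// real code creates the MSDK session, decodes, renders and tears
// everything down. Here it only counts how often it ran.
static int g_runs = 0;
int RunDecodePipeline() { ++g_runs; return 0; }  // 0 == success

// Modification 1: run the whole task 5 times instead of once, so
// "System GPU memory" can be compared between runs in Process Explorer.
// Modification 2 (not shown): inside the pipeline, the result of
// RegisterClass() is ignored, because the window class is already
// registered on every iteration after the first.
int RunRepro() {
    for (int run = 0; run < 5; ++run)
        if (RunDecodePipeline() != 0) return 1;
    return 0;
}
```

If teardown were complete, GPU memory should return to its baseline between iterations; the repro shows it does not in the D3D11 rendering case.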
* sample_decode based on "2018 R2" samples
* GPU memory issues seem to occur when rendering with D3D11
* With D3D9, no memory leak could be observed
* Tested on multiple machines, but results below were running on a Skylake processor (HD 530)
D3D11, rendered:
h264 -hw -d3d11 -r -async 4 -rgb4 -i c:\temp\bbb_sunflower_1080p_30fps_normal.mp4.264
=> GPU memory increasing

D3D11, not rendered:
h264 -hw -d3d11 -async 4 -rgb4 -i c:\temp\bbb_sunflower_1080p_30fps_normal.mp4.264
=> GPU memory returns to 0 between runs

D3D9, rendered:
h264 -hw -d3d -r -async 4 -rgb4 -i c:\temp\bbb_sunflower_1080p_30fps_normal.mp4.264
=> GPU memory returns to 0 between runs

D3D11, software decode, rendered:
h264 -sw -d3d11 -r -async 4 -rgb4 -i c:\temp\bbb_sunflower_1080p_30fps_normal.mp4.264
=> GPU memory increasing
See screenshots from Process Explorer below.
D3D11 - rendering
D3D11 - not rendering
D3D9 - rendering
Could somebody from Intel look into this issue? Could it be something that needs to be handled differently in the code to mitigate it, i.e. something not released correctly? Obviously sample_decode is built to run once, but since we are looking into how to open and close decoding streams with MSDK and D3D11, we would hope that the sample at least initializes and closes everything correctly anyway.
Best Regards,
Carl
- Tags:
- Development Tools
- Graphics
- Intel® Media SDK
- Intel® Media Server Studio
- Media Processing
- Optimization
Below is the output of Media System Analyzer:
Intel(R) Media Server Studio 2017 - System Analyzer (64-bit)

The following versions of Media SDK API are supported by platform/driver [opportunistic detection of MSDK API > 1.20]:

Version    Target  Supported  Dec  Enc
1.0-1.27   HW      Yes        X    X
1.0-1.27   SW      Yes        X    X
(each version from 1.0 through 1.27 is listed individually in the original output, all with identical support)

Graphics Devices:
Name                      Version         State
Intel(R) HD Graphics 530  25.20.100.6444  Running / Full Power
NVIDIA GeForce GTX 970    25.21.14.1616   Running / Full Power

System info:
CPU: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz
OS: Microsoft Windows 10 Pro
Arch: 64-bit

Installed Media SDK packages (be patient...processing takes some time):
Intel(R) Media SDK 2018 R2 - HEVC GPU accelerated Encoder
Intel(R) Media SDK 2018 R2 - Media Samples
Intel(R) Media Server Studio 2017 - Video Quality Caliper
Intel(R) Media SDK 2018 R2 - Software Development Kit
Intel(R) Media SDK 2018 R2 - Documentation for HEVC
Intel(R) Media SDK 2018 R2 - HEVC SW Encoder
Samples for Intel(R) Media SDK 2017 for Windows*
Intel(R) Media SDK 2018 R2 - HEVC SW Decoder

Installed Media SDK DirectShow filters:

Installed Intel Media Foundation Transforms:
Intel(R) Hardware M-JPEG Decoder MFT : {00C69F81-0524-48C0-A353-4DD9D54F9A6E}
Hi,
Has anybody looked at this at all?
I can mention that our software is primarily used in industrial environments, where stability is key. Because of this issue we are now considering hardware decoding solutions other than Intel's, which would be unfortunate since the Intel solution otherwise seems promising.
If any more information is needed, I'll be happy to provide it.
Best Regards,
Carl
Hi Carl,
Sorry for the late response, I have looked at your description and I can reproduce the issue.
This is a memory management issue where the app or library doesn't clean up GPU memory for each run. I didn't see the memory increase during a run; it only happens between stopping and starting again.
I have submitted an investigation request to dev team and will keep you updated.
Mark
Hi Mark,
Ok, good. Correct: it seems that when closing a decoding session, not all memory is released correctly, so after too many video stream switches this causes our application to crash. (Our application switches video streams on operator command or automatically by …)
Best Regards,
Carl
Any news on this issue?
Best Regards,
Carl
At some point, sample_decode was updated to use CComPtr instead of raw pointers (which is a good thing). But instead of taking the pointer's address through CComPtr's operator& (which asserts that the slot is empty), the code takes the address of the raw .p member directly, which prevents the leak detection from working.
If you replace the line in sample_common/d3d11_device.cpp, line 262:
hres = m_pSwapChain->GetBuffer(0, __uuidof( ID3D11Texture2D ), (void**)&m_pDXGIBackBuffer.p);
with
hres = m_pSwapChain->GetBuffer(0, __uuidof( ID3D11Texture2D ), (void**)&m_pDXGIBackBuffer);
You get an assert (in debug) indicating that the COM pointer was leaked.
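The mechanism can be sketched with a toy wrapper (assumed names throughout; this is not the actual ATL CComPtr or a real D3D interface, just enough to show why writing through .p hides the leak that operator& would catch):

```cpp
#include <cassert>

// Toy COM-like object counting live instances (a stand-in, not a
// real D3D interface).
struct FakeCom {
    static int live;
    FakeCom()  { ++live; }
    void Release() { --live; delete this; }  // toy single-ref Release
};
int FakeCom::live = 0;

// Minimal CComPtr-like wrapper modeled on ATL's debug behavior:
// operator& asserts the slot is empty before handing out its address,
// so overwriting a held pointer through operator& trips the assert.
// Writing through the public .p member bypasses that check entirely.
template <typename T>
struct ComPtrSketch {
    T* p = nullptr;
    ~ComPtrSketch() { if (p) p->Release(); }
    T** operator&() { assert(p == nullptr); return &p; }
};

// Simulates IDXGISwapChain::GetBuffer handing out a referenced object.
void FakeGetBuffer(FakeCom** out) { *out = new FakeCom(); }

// Two acquisitions through .p with no Release in between, i.e. the
// pattern in the sample's render loop: the first object is dropped
// without being released and leaks. Going through operator& instead
// would have hit the assert on the second call.
int leaked_after_two_acquisitions() {
    {
        ComPtrSketch<FakeCom> ptr;
        FakeGetBuffer(&ptr.p);   // first frame: slot was empty, fine
        FakeGetBuffer(&ptr.p);   // second frame: silently overwrites
    }                            // destructor releases only the second
    return FakeCom::live;        // 1 -> one object leaked
}
```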
Pascal
Hi! Thanks for the response, but I'm not sure I follow all the way.
Do you mean that the texture (m_pDXGIBackBuffer) is leaking? If so, wouldn't it be leaked either once (per running session) or once per frame? Neither case seems to match as far as I can tell: the leaked memory is too large for once, and too small for once per frame.
We're not using CComPtr in our code (for legacy reasons we explicitly want to allocate/deallocate), so I'm not too familiar with it. What would you suggest as the solution for the sample in this case? (If you have time to answer.)
Best Regards,
Carl
Hello Carl,
I'm not too sure about the size of the leak, since there are some resources that are leaked per frame. Also, I did not try to work out how you measured the memory leak. I just remembered that the original sample had a few COM leaks, and apparently they are still here. To detect the leaks, in d3d11_device.cpp, replace all occurrences of "&pointer.p" with "&pointer"; to fix them, place a pointer.Release() call before each spot where the pointer is re-acquired. I tried your sample and had to do this in three places (m_pInputViewLeft, m_pOutputView and m_pDXGIBackBuffer).
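The fix pattern can be sketched like this (a minimal sketch with assumed toy names; FakeCom and Holder stand in for a real D3D interface and the CComPtr member, and FakeAcquire stands in for GetBuffer/CreateRenderTargetView): releasing the held reference before each re-acquisition leaves nothing alive at teardown.

```cpp
#include <cassert>

// Toy COM-like object that counts live instances (a stand-in for the
// back buffer / render target views; not a real D3D type).
struct FakeCom {
    static int live;
    FakeCom()  { ++live; }
    void Release() { --live; delete this; }  // toy single-ref Release
};
int FakeCom::live = 0;

// Stand-in for the CComPtr member holding the resource between frames.
struct Holder {
    FakeCom* p = nullptr;
    void Release() { if (p) { p->Release(); p = nullptr; } }
};

// Simulates acquiring a new referenced object each frame.
void FakeAcquire(FakeCom** out) { *out = new FakeCom(); }

// Render loop with the fix applied: Release() before re-acquiring.
int live_after_frames(int frames) {
    Holder h;
    for (int i = 0; i < frames; ++i) {
        h.Release();        // the fix: drop last frame's reference first
        FakeAcquire(&h.p);
    }
    h.Release();            // final teardown
    return FakeCom::live;   // 0 when nothing leaked
}
```

Without the Release() inside the loop, every iteration after the first would orphan one object, which matches a leak that grows with rendered frames.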
Cheers,
Pascal
Hello Carl,
Following up on your original message: I used Process Explorer to confirm that the memory leak during rendering disappears after fixing the COM leaks.
Pascal
Hi Carl,
Sorry for the late response. I submitted the issue and the dev team is investigating it.
I just checked the status, and it looks like they have no progress yet; I will try to push.
About Pascal's suggestion: it seems like a workaround rather than a direct fix of the underlying bug, but could it be a clue for investigating your problem?
Mark
Pascal & Mark,
I'll look into this. I hope it provides a clue and a solution, but I'm not fully convinced yet, since I'd expect the problem to leak either more or less memory than what we actually observe. Also, our code does not use CComPtr, and with the D3D11 debug layer (ReportLiveDeviceObjects) we cannot see any leaked D3D resources; then again, maybe we still have a similar problem that the debug layer cannot pick up. Anyway, I hope this leads to an answer!
I'll get back when I've been able to look into it!
Best Regards,
Carl
Hi,
I can confirm this seems to solve the issue in the sample! Now we can hopefully do some digging and find out why our own code (based on another, older sample) behaves the way it does.
I'll update as soon as I have more information.
Best Regards,
Carl
Pascal, thanks for this. Can you confirm what the mitigating action is and where, precisely, you fixed it? I would like to put this fix into my own codebase too. I'm guessing that checking and releasing the pointer before use is the thing to do.
Hi Carl,
If you want my help, you can post how you fixed it. I can at least do a history check, and I could also tell the dev team to speed up the investigation.
Mark
Hi,
Now we've had time to look at this again.
For the sample, I did exactly what Pascal suggested: in d3d11_device.cpp I replaced the occurrences of "&pointer.p" with "&pointer" to detect the leaks, and added pointer.Release() calls before the pointers are re-acquired, in the three places Pascal mentioned (m_pInputViewLeft, m_pOutputView and m_pDXGIBackBuffer).
Whether this is the "correct" way to fix the sample I'm not sure (regarding the usage of CComPtr), but for us it was enough to prove that the problem was in this sample, and it gave us clues as to why our codebase (based on another, older sample) was having problems. We've also solved our problems, so for me this issue is cleared.
But it should be fixed in the sample for next release I'd think.
Best Regards,
Carl
Thanks so much, this is a great help!
Let me tell the dev team and make sure it is fixed for the next release.
Mark
Carl L. wrote:
<cut>...and it gave us clues as to why our codebase was having problems (based on another older sample). We've also solved our problems, so for me this issue is cleared.
Hi Carl, I'm facing a very similar issue, but my solution architecture is completely different compared to the sample provided, please take a look at my thread if you can.
Can you explain how you were able to solve your specific issue?
I hope it will shed some light on mine.
Thank you,
Fabio
Hi Fabio and Carl,
BTW, do you see the issue with samples from GitHub https://github.com/Intel-Media-SDK/MediaSDK/tree/master/samples ?
Regards,
Dmitry
Hi Dmitry, it's very difficult for me to recreate the same conditions with your sample: I'm decoding 40 live streams inside a C# project, then stopping and restarting them on a timer.
I've written a simple C++ class that wraps the mfx calls; can you help me debug it?
This is my FrameAllocator alloc callback:
MfxHelper& self = *(MfxHelper*)pthis;
self.RealSurfaceNumber = request->NumFrameSuggested + self.mfx_video_params->AsyncDepth;
self.mfx_surfaces = (mfxFrameSurface1**)calloc(self.RealSurfaceNumber, sizeof(mfxFrameSurface1*));
for (int i = 0; i < self.RealSurfaceNumber; i++)
{
    self.mfx_surfaces[i] = (mfxFrameSurface1*)calloc(1, sizeof(mfxFrameSurface1));
    self.mfx_surfaces[i]->Info = self.mfx_video_params->mfx.FrameInfo;
    self.mfx_surfaces[i]->Data.MemId = (mfxMemId)(i + 1);
    self.outer_mids.push_back(self.mfx_surfaces[i]->Data.MemId);
}
response->mids = &self.outer_mids.front();
response->NumFrameActual = self.RealSurfaceNumber;

D3D11_TEXTURE2D_DESC desc = {};
desc.Width = self.mfx_video_params->mfx.FrameInfo.Width;
desc.Height = self.mfx_video_params->mfx.FrameInfo.Height;
desc.MipLevels = 1;
desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_NV12;
desc.SampleDesc.Count = 1;
desc.SampleDesc.Quality = 0;
desc.Usage = D3D11_USAGE_DEFAULT;
desc.BindFlags = D3D11_BIND_DECODER | D3D11_BIND_SHADER_RESOURCE;

self.textures = (ID3D11Texture2D**)calloc(self.RealSurfaceNumber, sizeof(ID3D11Texture2D*));
HRESULT hr;
for (int counter = 0; counter < self.RealSurfaceNumber; counter++)
{
    ID3D11Texture2D *texture;
    hr = self.d3d11Device->CreateTexture2D(&desc, NULL, &texture);
    self.textures[counter] = texture;
}
return MFX_ERR_NONE;
Then, in the GetHDL callback:
MfxHelper& self = *(MfxHelper*)pthis;
mfxHDL d3d_handle = self.textures[(int)mid];
mfxHDLPair *pPair = (mfxHDLPair*)handle;
pPair->first = d3d_handle;
pPair->second = (mfxHDL)(UINT_PTR)0;
return MFX_ERR_NONE;
This is the DecodeAsync (called from C#)
mfxFrameSurface1* pWorkSurface = this->findFreeSurface();
if (&pWorkSurface != NULL)
{
    mfxFrameSurface1* pOutSurface = NULL;
    mfxSyncPoint* sync = (mfxSyncPoint*)calloc(1, sizeof(mfxSyncPoint));
    this->lock_object.lock();
    if (this->disposing == false)
    {
        this->LastError = MFXVideoDECODE_DecodeFrameAsync(this->mfx_session, this->mfx_bitstream, pWorkSurface, &pOutSurface, sync);
        if (this->LastError == MFX_ERR_NONE)
            this->LastError = MFXVideoCORE_SyncOperation(this->mfx_session, *sync, 5000);
        if (this->LastError == MFX_ERR_NONE)
            this->onFrameReady(this->textures[(int)pOutSurface->Data.MemId]);
    }
    this->lock_object.unlock();
    free(sync);
}
And at the end the finalizer:
if (this->mfx_frame_allocator != NULL)
{
    do
    {
        mfxFrameSurface1* pWorkSurface = this->findFreeSurface();
        if (&pWorkSurface != NULL)
        {
            mfxFrameSurface1* pOutSurface = NULL;
            mfxSyncPoint* sync = (mfxSyncPoint*)calloc(1, sizeof(mfxSyncPoint));
            this->LastError = MFXVideoDECODE_DecodeFrameAsync(this->mfx_session, NULL, pWorkSurface, &pOutSurface, sync);
            free(sync);
        }
    } while (this->LastError != MFX_ERR_MORE_DATA);
}
this->LastError = MFXVideoDECODE_Close(this->mfx_session);
for (int i = 0; i < this->RealSurfaceNumber; i++)
    free(this->mfx_surfaces[i]);
free(this->mfx_surfaces);
std::vector<mfxMemId>().swap(this->outer_mids);
this->outer_mids.clear();
this->outer_mids.shrink_to_fit();
if (this->mfx_frame_allocator != NULL)
{
    this->mfx_frame_allocator->pthis = NULL;
    this->mfx_frame_allocator->Alloc = NULL;
    this->mfx_frame_allocator->Free = NULL;
    this->mfx_frame_allocator->GetHDL = NULL;
    this->mfx_frame_allocator->Lock = NULL;
    this->mfx_frame_allocator->Unlock = NULL;
    free(this->mfx_frame_allocator);
}
this->onFrameReady = NULL;
delete[] this->mfx_bitstream->Data;
free(this->mfx_bitstream);
free(this->mfx_video_params);
free(this->mfx_version);
free(this->mfx_implementation);
free(this->mfx_init_params);
for (int i = 0; i < this->RealSurfaceNumber; i++)
    this->textures[i]->Release();
free(this->textures);
this->LastError = MFXClose(this->mfx_session);
Do you spot something wrong?
Thank you
Hi Carl,
Sorry for the late response,
We have confirmed the issue has been solved with our latest code and sample.
We tested the four commands you posted at the beginning, under the following conditions:
- driver 25.20.100.6444
- a similar machine in GTA (i5-6600, HD 530)
- Media SDK 2019 R1
- sample_decode from https://github.com/Intel-Media-SDK/samples
Let me know if you have any questions.
Mark Liu