Media (Intel® Video Processing Library, Intel Media SDK)
Access community support with transcoding, decoding, and encoding in applications using media tools like Intel® oneAPI Video Processing Library and Intel® Media SDK
Announcements
The Intel Media SDK project is no longer active. For continued support and access to new features, Intel Media SDK users are encouraged to read the transition guide on upgrading from Intel® Media SDK to Intel® Video Processing Library (VPL), and to move to VPL as soon as possible.
For more information, see the VPL website.

Intel graphic HD 500/510 dxva2 decoding

dumonteil__david
Beginner
4,385 Views

Hello.

I wrote a C++ software to do h264 hardware video decoding, using DXVA2.

Source code here :

H264Dxva2Decoder

This program works very well with  NVIDIA cards and Intel HD Graphics 4000 (windows7). I think this program respects correctly h264 specification from DXVA2_ModeH264_E device guid.

Unfortunately, with Intel graphic HD 500 and 510, decoding is not correct, for both windows7 and windows10. Drivers are updated.

If you run my program, you will see that there is no error, but the decoded video frame are not correct with Intel graphic HD 500/510. But it is ok with Intel HD Graphics 4000.

Can you help me to understand why Intel graphic HD 500/510 does not decode correctly, using dxva2.

 

0 Kudos
1 Solution
Mark_L_Intel1
Moderator
4,366 Views

I just contact with dev team and  confirm the followings:

"For MSDK decoders, it’s doesn’t matter if stream has 3 or 4 byte start codes. When MSDK sends slices to the driver, MSDK always passes 3 byte start code to the driver whatever original slice in bitstream had."

So it seems confirmed what I said, DXVA doesn't go to MSDK but runtime(user mode driver), from product point of view, this is clear out of the Media SDK scope, but from user experience point of view, we should consider this situation.

I will send your suggestion to dev team.

Mark

View solution in original post

0 Kudos
23 Replies
dumonteil__david
Beginner
3,943 Views

I found the problem.

Nalu buffer with start code 0x00, 0x00, 0x00, 0x01 are not handled.

Start code 0x00, 0x00, 0x01 is ok.

0 Kudos
Mark_L_Intel1
Moderator
3,943 Views

Hi David,

I am quite sure the hardware decoder should handle both 3 and 4 bytes startcodes.

But if you use DXVA2, I am not sure where the problem happened. It should be better if you can use the same video stream to run sample_decode.exe in our sample code.

You can go to Media SDK landing page and install Media SDK to try the sample code.

https://software.intel.com/en-us/media-sdk

Mark

0 Kudos
dumonteil__david
Beginner
3,943 Views

Hello.

In fact, Intel HD Graphics 4000 and Intel graphic HD 500 have this problem, using 4 bytes start code. If i use 3 bytes start code, it's ok, with shades. I tried with sample_decode.exe.

I use this sample movie : Movie sample (there is different resolution)

My program :

-> 1280*720 with 3 bytes start code = ok

-> 1920*1080 with 3 bytes start code = artifacts on top of the movie (first few lines)

-> 4 bytes start code = bad for all resolution

Intel media SDK :

-> 1280*720 with 3/4 bytes start code = ok

- -> 1920*1080 with 3/4 bytes start code = at the beginning (50 frames ~), video is green, after it's ok.

Check the movie file with 1920*1080 resolution, and let me know if you have green frame like me, with Intel media SDK.

Hope we find a solution.

 

 

 

 

0 Kudos
dumonteil__david
Beginner
3,943 Views

Hello.

I proposed to add movie file parser on Intel-Media-SDK issues

-> no answer... 

I proposed pull request on Intel-Media-SDK pull

-> no answer : closed (good news)

Here

-> no answer : see below

So, what should i think about it.

Intel, you want to make your product great, or not.

if so, tell me, I will not waste my time.

I receive advertising like :

Discover with Intel: Transformation solutions for any industry. "Access to business information can enable data-driven decisions and create opportunity. We are here to help you manage today’s critical workloads-and set you up for the future. Your digital transformation starts now."

So.

0 Kudos
Mark_L_Intel1
Moderator
3,943 Views

Hi David,

Sorry for the late response, there was a time when I was busying on other project and your post probably pushed down to the bottom when I came back.

I am going to try the clip you linked in our sample_decode.exe, did you change the stream when you tried? If you did, you can just upload here.

The other thing I was wondering is the hardware, if the issue happened only on a particular hardware, then it might take time to fix since they are old hardware.

Could you clarify the 2 questions?

I will not wait for your answer, I will use the default video from your link and try it on a SkyLake machine with Media SDK 2018R2 in Windows 10.

Mark

0 Kudos
dumonteil__david
Beginner
3,943 Views

Hello.

thank you for your reply. (How can i delete my previous comment ?).

- About the stream, the only thing i changed is that i replaced the four bytes nalu size with start code. That's all.

- The Intel HD Graphics 4000 and Intel graphic HD 500 behave the same. For Intel graphic HD 500, all is up to date : I bought the laptop this year, and I choose Intel GPU to test my program.

I can update my program to generate h264 video stream, so you can check, and if you are not sure my video stream is correct. But like I said, it is perfect with Nvidia GPU for all start code.

About the video sample : the video stream with 1280*720 resolution and 3 bytes start code displays well for both my program and latest Intel media SDK.

0 Kudos
Mark_L_Intel1
Moderator
3,943 Views

No worries about your comment, this shows the true situation of our support and I will think it as a valuable feedback.

About my goal, I might not explain clearly, I want to use the stream file you used and run it with our sample code, could you posted it here? 

I think I have explain the reason why I want to check the bitstream in our sample first, there might be issue related to DXVA framework, I want to isolate any platform issues related to DXVA, driver or OS.

Sorry I still don't run your original stream since our server block the website because of the security, but I will find a way to download it and try it later today.

Mark

0 Kudos
dumonteil__david
Beginner
3,943 Views

Hello.

Here are the files. One with 3 bytes start code, and the other with 4 bytes start code.

For now, i just checked them with Media Player Classic-HC, and they play well. I will retry on my Intel Laptop tomorrow.

I will post more info about the laptop configuration, and some screenshots about the problem, if needed.

Here is the project i use to generate raw h264 video stream : CheckMp4Files

Thanks for your interest.

0 Kudos
Mark_L_Intel1
Moderator
3,943 Views

Great!

Let me try it today and see if I can reproduce it. My processor has HD530 and I hope I can reproduce it with our sample code.

Mark

0 Kudos
Mark_L_Intel1
Moderator
3,943 Views

Sorry for the late response,

I have done my test with sample_decode.exe but didn't find any issue. Could you also check it on your platform with the same command line? My concern is, the issue might related to some hardware and driver combination.

If you already tested with sample_decode.exe, please let me know your driver version and GPU version. I will submit a bug.

Here is my system configuration:

Processor Core i5-6440HQ @2.60GHz

GPU: HD Graphics 530

OS: Windows 10 Enterprise Ver 1709

Driver: 25.20.100.6444

Here are the logs:

~\Intel® Media SDK 2018 R2 - Media Samples 8.4.27.378\_bin\x64>sample_decode.exe h264 -i Beach10890_1280-720-3bytes.h264
Decoding Sample Version 8.4.27.378


Input video     AVC
Output format   NV12
Input:
  Resolution    1280x720
  Crop X,Y,W,H  0,0,1280,720
Output:
  Resolution    1280x720
Frame rate      29.97
Memory type             system
MediaSDK impl           hw
MediaSDK version        1.27

Decoding started
Frame number: 1048, fps: 513.310, fread_fps: 0.000, fwrite_fps: 0.000
Decoding finished

~\Intel® Media SDK 2018 R2 - Media Samples 8.4.27.378\_bin\x64>sample_decode.exe h264 -i Beach10890_1280-720-4bytes.h264
Decoding Sample Version 8.4.27.378


Input video     AVC
Output format   NV12
Input:
  Resolution    1280x720
  Crop X,Y,W,H  0,0,1280,720
Output:
  Resolution    1280x720
Frame rate      29.97
Memory type             system
MediaSDK impl           hw
MediaSDK version        1.27

Decoding started
Frame number: 1048, fps: 683.747, fread_fps: 0.000, fwrite_fps: 0.000
Decoding finished

Mark

0 Kudos
dumonteil__david
Beginner
3,943 Views

Ok.

So, you agree, both video streams are correct.

Why with dxva2 and Intel, 3 bytes start code is ok, while 4 bytes start code is not ok :

dxva2.png

Configuration :

Processor: Intel(R) Celeron(R) CPU N3350 @ 1.10GHz (2CPUs)

GPU: HD Graphics 510

OS: Windows 10 Family 64bits Ver 1809

Driver: 26.20.100.6709

0 Kudos
Mark_L_Intel1
Moderator
3,943 Views

Are you using sample_decode.exe?

For Windows, if you are using MFT, the software path might be different. Rather than through Media SDK layer, it might go directly to user mode driver(media runtime which is distributed via driver package). The only one thing could make this happens, the parsing is run under driver layer.

I also double check sample_decode.exe with rendering and didn't reproduce as your screen capture.

I happen to have a Liva Z with N3350, let me try it in Windows 10 1809 with sample_decode.exe.

Mark

0 Kudos
Mark_L_Intel1
Moderator
3,943 Views

I tried on Liva Z, and still no issue.

I found the processor on this device is N4200;

Windows Pro 1809.

Driver was 25.20.100.6373 by default and changed to 26.20.100.6709 after update with DSA.

So I actually tested 2 drivers, I believe 6709 is DCH driver.

Let me check where bitstream parser should be, it seems like not in hardware, this is why DXVA could take some role.

Mark

0 Kudos
dumonteil__david
Beginner
3,943 Views

Thank you.

Yes, sample_decode.exe has no problem to play those streams (3 or 4 bytes start code).


I don't use MFT, i use Dxva2. Did not you see the source code that I provided ? H264Dxva2Decoder

I use DXVA2_ModeH264_E device guid, and DXVA_Slice_H264_Short (ConfigBitstreamRaw == 2).


I found this documentation : intel-graphics-developers-guides

Under "Open Source Graphics Programmers Reference Manual", I downloaded "Volume 8: Media VDBOX" (intel_os_gfx_prm_vol8_-_media_vdbox_0.pdf) :

 

DxvaShort.png

 

This document only talk about 3 bytes start code, not 4 bytes start code... So it seems Intel's Dxva2 implementation does not handle 4 bytes start code, according to this document.

Now i think, I just have to change the BSNALunitDataLocation from the DXVA_Slice_H264_Short structure, when GPU is Intel.

 

I just do this program for fun, but it would be nice if on Windows, we could have direct access to Intel Gpu decoder.

 

 

0 Kudos
dumonteil__david
Beginner
3,943 Views

Ok. I think I understand now.

From "DirectX Video Acceleration Specification for H.264/AVC Decoding" :

BSNALunitDataLocation
If wBadSliceChopping is 0 or 1, this member locates the NAL unit with
nal_unit_type equal to 1, 2, or 5 for the current slice. The value is the byte offset,
from the start of the bitstream data buffer, of the first byte of the start code prefix in
the byte stream NAL unit that contains the NAL unit with nal_unit_type equal to 1, 2,
or 5. (The start code prefix is the start_code_prefix_one_3bytes syntax element. The
byte stream NAL unit syntax is defined in Annex B of the H.264/AVC specification.
The current slice is the slice associated with this slice control data structure.)

I'm sorry, I misread the specifications.

It seems that Nvidia is more permissive with this.

Thank you for your time.

 

0 Kudos
Mark_L_Intel1
Moderator
3,943 Views

Thanks for the quick response,

I think the manual has a confusion of start code but it should be basically right. Media SDK should handle 3 and 4 bytes without problem. I think we have confirmed this by testing. The problem you got should because DXVA might access to media runtime directly and didn't go through Media SDK.

Based on my understanding but I might be wrong, the manual only specifies start code prefix as 3 bytes long:

Start Code Prefix

Here is Annex B Byte stream NAL unit syntax:

Byte Stream NAL unit syntax

  • First of all, NAL unit start code prefix is always 3 byte length (start_code_prefix_one_3bytes).
  • Depending on NAL unit type and position additional one 0x00 “zero_byte” may be appended before “start_code_prefix_one_3bytes”. Usually this pair (zero_byte + start_code_prefix_one_3bytes) is called ”4-byte start code”. 
  • Standard clearly defines cases when “4-byte start code” is used: for SPS, PPS, and for first NAL unit of coded picture (latter is regardless of NAL unit type). For these cases MSDK uses ”4-byte start code” as specification requires. In other cases start code is always 3-byte.

Mark

0 Kudos
Mark_L_Intel1
Moderator
3,943 Views

I just saw your new post after I sent mine.

You are welcome, this discussion are very good and I like it.

So your solution should be: detect the NALU type and set BSNALunitDataLocation bit for Intel platform?

Any suggestion to our product?

Mark

0 Kudos
dumonteil__david
Beginner
3,943 Views

Now that I understand what's going on, at first, I would say I should respect Dxva2 specifications.

So, doing this, I don't really need to know, what GPU is in used. My first tests seem conclusive. I play with BSNALunitDataLocation.

But...

I always think that start code is a bad thing, and I will not speak about emulation prevention byte, the trick to fill the deficiencies.

With all these things, "start code/emulation prevention byte/nalu size (1/2 or 4)", the decoding process requires additional buffer management, that are not necessary, in my opinion.

So, the problem is with people who are doing the specifications. I think they forget that the decoding process must be quick, - very quick. One person encodes the file, millions play it. Because of their specification, we need to do some code contortion, who are counterproductive, and unecessary, again in my opinion.

My only suggestion : if Intel is in the mpeg/h264/h265 consortium, tells them to stop using stupid start code. Just use nalu size, and with 4 bytes only, or 8 bytes if they think that in the future, it will be needed...

 

 

 

0 Kudos
Mark_L_Intel1
Moderator
4,367 Views

I just contact with dev team and  confirm the followings:

"For MSDK decoders, it’s doesn’t matter if stream has 3 or 4 byte start codes. When MSDK sends slices to the driver, MSDK always passes 3 byte start code to the driver whatever original slice in bitstream had."

So it seems confirmed what I said, DXVA doesn't go to MSDK but runtime(user mode driver), from product point of view, this is clear out of the Media SDK scope, but from user experience point of view, we should consider this situation.

I will send your suggestion to dev team.

Mark

0 Kudos
dumonteil__david
Beginner
3,494 Views

Thank you for the details.

If one day, you provide direct access to Intel GPU decoder on Windows, i will play with it.

 

0 Kudos
Reply