I have both Coffee Lake and Skylake boxes here and managed to get both H264 encode/decode and H265 encode/decode working. The H265 was MFX_PROFILE_HEVC_MAIN using MFX_FOURCC_NV12, i.e. 8 bit only. Apparently you don't need to manually load a plugin to get it to work on these platforms.
Anyway I've now moved onto trying to get MFX_PROFILE_HEVC_MAIN10 with MFX_FOURCC_P010 HEVC working. I think things are somewhat different here, i.e. I need to manually load a plugin first. I'm not 100% sure when and where to do this, however. Right now I'm loading the plugin as follows:
auto status = MFXVideoUSER_Load(*My_QSV_Session, &MFX_PLUGINID_HEVCE_HW, 1);
I do this if and only if the caller asks for HEVC Main10. OK, I can choose one of the following plugins:
Some of them don't exist, i.e. aren't in my Intel(R) Media SDK 2018 R2 Plugin folder. So my first question is where do I find them all? Secondly neither sample_encode nor my own project can successfully initialise against HEVC Main10. Is the above MFXVideoUSER_Load statement all I have to do to load a plugin? If I want the fastest hardware encode (we control customer CPU platform quite tightly, i.e. no need for CUDA-style encoding acceleration), which of the above options should I choose?
My command line with sample_encode is:
sample_encode h265 -i "..\..\gops\in_p010\1.raw" -o output.265 -w 1280 -h 720 -p010 -hw -d3d11 -CodecProfile 2
The raw frame is p010 format (verified as such) and of the correct size. I'd like -hw (hardware) and preferably -d3d11, with the Main10 profile which I believe is "2". sample_encode fails with MFX_ERR_MEMORY_ALLOC in MFXVideoENCODE::Init.
I get incompatible params from Query with -d3d11, so I reverted it to use the SysMemAllocator. Using that I get Unsupported when I come to Init my encoder. The parameters in my project were:
...mfx.CodecId = MFX_CODEC_HEVC;
...mfx.CodecProfile = MFX_PROFILE_HEVC_MAIN10;
...mfx.FrameInfo.FourCC = MFX_FOURCC_P010;
These match the ones sample_encode submits (it fails too).
What mistake have I made?
I managed to get sample_encode working by removing the -d3d11 flag. In my own codebase, the error was calling SetFrameAllocator. If we're not using d3d11 (and it appears HEVC Main 10 encode does not support that method of fetching/feeding surfaces), we should not call SetFrameAllocator at all.
Thanks so much,
Sorry for the late response; I was looking into it and trying to reproduce it.
I really appreciate your detailed information and description. I want to reproduce it on my side so I can submit a bug. Could you attach the 10-bit raw video file you used? You can cut it short to fit this website's size limit, as long as it still reproduces the issue.
Thanks for your reply. I'm using a single frame, thinking that if I get as far as encoder->Init and then encoder->EncodeFrameAsync, I can count the experiment a success (that is to say, I can then analyse the parameters that worked for sample_encode and see how they differ from the parameters I myself used that failed). My codebase will take all frames from a given directory and encode them of course.
Anyway, the frame is attached (the zip contains 1.raw). It's a P010 format raw image. I verified it with the 7yuv tool (1280x720 420 P10). I extracted it from the sample 10 bit movie (jellyfish-20-mbps-hd-hevc-10bit.mkv) here. I resized it with ffmpeg as follows:
ffmpeg -i "jellyfish-20-mbps-hd-hevc-10bit.mkv" -c:v libx265 -vf scale=1280x720:flags=bicubic "jellyfish 1280x720.mkv"
and then extracted all the frames as follows:
ffmpeg -i "jellyfish 1280x720.mkv" -vframes 16000 -pix_fmt p010le ..\..\gops\in_p010\%d.raw
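For anyone repeating this, a quick sanity check on the size of each extracted frame might look like the sketch below. It assumes P010 stores every sample in 16 bits with 4:2:0 chroma subsampling and no row padding; the function name P010FrameBytes is made up for illustration and is not part of the Media SDK.

```cpp
#include <cassert>
#include <cstddef>

// Number of bytes in one tightly packed P010 frame (no row padding assumed).
// P010 is 4:2:0 with 16 bits per sample: a full-resolution luma plane
// followed by one half-resolution plane of interleaved U/V samples.
std::size_t P010FrameBytes(std::size_t width, std::size_t height) {
    assert(width % 2 == 0 && height % 2 == 0);  // 4:2:0 needs even dimensions
    const std::size_t bytesPerSample = 2;
    const std::size_t luma = width * height * bytesPerSample;
    // (width/2) x (height/2) chroma positions, each holding a U and a V sample.
    const std::size_t chroma = (width / 2) * (height / 2) * 2 * bytesPerSample;
    return luma + chroma;
}
```

Under those assumptions a 1280x720 frame comes to 2,764,800 bytes, so each extracted .raw file can be compared against that figure.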
It wouldn't surprise me if sample_encode just doesn't like the fact it's a single frame, i.e. that the stream runs dry before it gets a chance to do an encode. It was rather curious that it didn't like d3d11 surfaces though.
Thank you for your assistance.
By the way, in my usual dumb way I finally twigged how the plugins are named. MFX_PLUGINID_HEVCD_HW has a D after HEVC, which marks a decoder, and MFX_PLUGINID_HEVCE_HW has an E, which marks an encoder.
Let me go over this one more time, with a more coherent replication strategy (no resizing and so forth and not using any of my own code):
First I downloaded a 10 bit 1920 x 1080 video file from here:
Then I used ffmpeg (a recent build with libx265) to extract a 10 bit p010le raw stream from it:
ffmpeg -i "jellyfish-40-mbps-hd-hevc-10bit.mkv" -pix_fmt p010le -f rawvideo "in.raw"
Next I used sample_encode as follows, successfully producing what I think is a 10 bit H265 output file:
sample_encode h265 -i "in.raw" -o "output.265" -w 1920 -h 1080 -p010 -hw -CodecProfile 2
Finally I used sample_decode as follows:
sample_decode -r h265 -hw -p010 -d3d11 -p 33a61c0b4c27454ca8d85dde757c6f8e -i "output.265"
sample_decode throws an exception (access violation) in CDecodingPipeline::SyncOutputSurface at m_pDeliveredEvent->Reset(). The weird thing is I cannot find anywhere in the code of sample_decode that m_pDeliveredEvent is ever given a value so I'm not really surprised by this. m_pDeliverOutputSemaphore is also null, which implies even if m_pDeliveredEvent has a value, we'll fail on the next statement (->Post()).
If I don't supply -d3d11 or -hw to sample_decode, the allocator fails at CreateHWDevice(). If I supply either -d3d11 or -d3d9, it fails with the above exception.
I feel I'm missing something really important here but I'm not sure what. My ultimate goal is to encode and decode H265 Main 10 frames.
So I gave up trying to render the scene and just wrote the decoded output to a file. It should be a raw p010 file, i.e.:
sample_decode h265 -hw -i output.265 -o test.out
test.out was over 5 GB (about right), and I ran ffplay to display it. So to recap, I have done the following:
(1) Downloaded a video claiming to be H265 Main 10
(2) Extracted the raw video with a recent build of ffmpeg into a (large) file, using the flag -pix_fmt p010le at 1920x1080
(3) Encoded the raw video with sample_encode using the flags h265 -i "in.raw" -o "output.265" -w 1920 -h 1080 -p010 -hw -CodecProfile 2
(4) Decoded the encoded file from (3) using sample_decode h265 -hw -i output.265 -o test.out
(5) Played test.out with ffplay, which gives me a green screen (there appears to be some coherent motion but it's very green and very, very dim).
Some help would be appreciated.
I have solved the encoding issue. Even though after query FrameInfo.Shift is 0, it appears I have to shift the data >> 6 (both the luma and the chroma) before I hand it to the encoder, i.e. the standard p010le format isn't what's expected. The encoder itself was fine (I swapped out the frame loading code for code that just made a fixed checker pattern in memory and the checker pattern was rock solid). VLC can read the encoded files no problem which is a definite milestone.
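A minimal sketch of the shift described above, assuming the input is standard p010le (the 10 significant bits sit in the upper bits of each 16-bit word) and that the luma and chroma samples of a frame have been read from the raw file into one contiguous run of 16-bit values. ShiftP010ToLsb is a hypothetical helper name, not a Media SDK call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Move each 10-bit sample from the upper bits of its 16-bit word down to the
// lower bits by shifting right 6 places. The same operation applies to luma
// and chroma, so the whole frame can be treated as one array of samples.
void ShiftP010ToLsb(std::uint16_t* samples, std::size_t count) {
    assert(samples != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        samples[i] = static_cast<std::uint16_t>(samples[i] >> 6);
    }
}
```

For a tightly packed frame, count would be width * height * 3 / 2 (all luma samples plus the interleaved chroma samples). If the data is copied into a surface whose pitch is wider than the frame, the shift would have to be applied row by row instead.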
I will continue to investigate the decoding. Now I've solved the encoding, this will probably be simpler to understand.
I got decoding of H265 Main 10 working. Interestingly it's OK to use a d3d11 allocator with the decoder and very few (i.e. no) other changes were needed to the core decoding loop. No shifting of output data needed either. I also didn't have to manually load a plugin for either encode or decode (once I'd fixed my other problems), so that was a red herring. I'm now going to see if I can get a software decode version of Main 10 working (with plugin). I'm pessimistic about performance here.
Sorry for posting this stream of consciousness but I hate threads that ask questions where the poster never returns with his conclusions!
Sorry for the late response, I am on a trip this week.
Glad to hear that you solved it. Yes, our codec works with 10-bit content, but the buffer management has to do some conversion and change the bit alignment compared with 8-bit when reading in a file with 10-bit color.
I should point you to our tutorial code, which has simplified sample code for 10-bit.
You actually made a good point about the user experience. Since our sample code is written for demo purposes, using it as a starting point for your own code runs into hurdles, and the error messages do not make it easy to figure out the root cause. We are discussing how to address these issues in the future.
Thanks so much for all the feedback.
Thanks Liu. Those tutorials are actually really useful. For future reference, there are still quite a few oddities that you won't read about anywhere and can only discover by attempting them. One was the fact you cannot encode Main 10 with a D3D 9/11 allocator and the other was that if you're not using a D3D 9/11 allocator you should not call SetFrameAllocator otherwise it'll fail. I don't know if these are bugs or inconsistencies in the API.
I noticed another kind-of oddity in sample_encode too. In the AllocFrames function of pipeline_encode.cpp, for surfaces that aren't "external" (are system memory, not D3D, presumably) the programmer locks the surface to "get YUV pointers". The surface is never unlocked. This is presumably because Lock/Unlock is a NOP for system-allocated surfaces in this context and no flags are set. It doesn't look right but appears to work. Some knowledge of API internals is needed to see this, which is kind-of bad.
Anyway what I wouldn't give for a hardware 10 bit H264 impl :p.
All the best.
Thanks so much,
This is all good feedback and shows the full path of your learning curve. We have to improve, and I will send this to the dev team.
For the allocator issue in sample_encode.exe, that call should be avoided. Since Media SDK was focused on codecs, we didn't put full effort into memory allocation and render handling because we treated them as the user's concern, but we really have to reconsider this.
For lock/unlock, this is basically an application-level locking mechanism and definitely not a hardware-level lock.