Novice

HEVC per-LCU QP Control performance

Hi,

I have a question about per-LCU QP control: what kind of performance should I expect from it on HEVC?

I compared H264 with and without EnableMBQP at 1280x720 and measured 339 fps without EnableMBQP and 595 fps with it. Enabling EnableMBQP therefore increases throughput.

With H265, on the other hand, throughput is 180 fps without EnableMBQP and drops to 71 fps with it.

In short, is it normal to see a significant performance drop in HEVC when enabling EnableMBQP, while H264 gains throughput?
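
To put the two results on the same scale, the figures quoted above can be expressed as relative changes. This is only a small arithmetic sketch; the function name `percent_change` is made up for illustration and is not part of any SDK.

```python
def percent_change(before_fps: float, after_fps: float) -> float:
    """Relative throughput change in percent (positive means faster)."""
    return (after_fps - before_fps) / before_fps * 100.0


# Throughput figures quoted in this post for 1280x720 encoding:
# (fps without EnableMBQP, fps with EnableMBQP).
measurements = {
    "H264": (339.0, 595.0),
    "H265": (180.0, 71.0),
}

for codec, (without_mbqp, with_mbqp) in measurements.items():
    change = percent_change(without_mbqp, with_mbqp)
    print(f"{codec}: {without_mbqp:.0f} -> {with_mbqp:.0f} fps ({change:+.1f}%)")
```

With these numbers, H264 gains roughly three quarters of its baseline throughput, while H265 loses more than half of it.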

 

Environment information
OS: Ubuntu 20.04
CPU: Intel Core i9-9900K @ 3.60 GHz (Coffee Lake)
SDK and drivers:
- Intel Media SDK Version 20.2.1 (also tested on 20.4.pre)
- Media Driver 20.2.0
- Gmmlib 20.2.2
- libva 2.8.0
- libva-utils 2.8.0

 

Thanks

Novice

The following functions seem to take significantly longer to execute during encoding when we enable "EnableMBQP" (roughly 10x longer execution time):

DDI_VA::SubmitTask

TaskManager::SubmitTask

VAPacker::SubmitTask

 

Moderator

Thanks Tran,


This is very important and I want to report it to the dev team.


Do you have the measurement numbers and the application you used?


I would also like to reproduce it, if you can give me the command line.


Mark


Novice

Hi Mark,

We tried two different applications: 1) a modified sample_encode and 2) sample_multi_transcode.

1) Modified sample_encode

We took the functions that configure the parameters and fill the MB/CU QP values from sample_multi_transcode (CTranscodingPipeline::SetEncCtrlRT and CTranscodingPipeline::FillMBQPBuffer) and imported them into sample_encode. Then we ran the following commands:

HD HEVC encoding:

sample_encode h265 -hw -i input.yuv -o output.265 -f 60 -h 1920 -w 1080 -cqp -mbqp

Results:

libva info: VA-API version 1.8.0
libva info: User environment variable requested driver 'iHD'
libva info: Trying to open /opt/intel/mediasdk/lib64//iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_8
libva info: va_openDriver() returns 0
Encoding Sample Version 20.2.0

Input file format YUV420
Output video HEVC
Source picture:
Resolution 1088x1920
Crop X,Y,W,H 0,0,1080,1920
Destination picture:
Resolution 1088x1920
Crop X,Y,W,H 0,0,1080,1920
Frame rate 60.00
QPI 26
QPP 28
QPB 30
Gop size 65535
Ref dist 8
Ref number 4
Idr Interval 0
Target usage balanced
Memory type system
Media SDK impl hw
Media SDK version 1.33

Processing started
Frame number: 207
Encoding fps: 47

 

H264 HD encoding:

sample_encode h264 -hw -i input.yuv -o output.264 -f 60 -h 1920 -w 1080 -cqp -mbqp

We obtained:

libva info: VA-API version 1.8.0
libva info: User environment variable requested driver 'iHD'
libva info: Trying to open /opt/intel/mediasdk/lib64//iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_8
libva info: va_openDriver() returns 0
Encoding Sample Version 20.2.0

Input file format YUV420
Output video AVC
Source picture:
Resolution 1088x1920
Crop X,Y,W,H 0,0,1080,1920
Destination picture:
Resolution 1088x1920
Crop X,Y,W,H 0,0,1080,1920
Frame rate 60.00
QPI 0
QPP 0
QPB 0
Gop size 256
Ref dist 4
Ref number 3
Idr Interval 0
Target usage balanced
Memory type system
Media SDK impl hw
Media SDK version 1.33

Processing started
Frame number: 207
Encoding fps: 319

Processing finished

 

In short, for an HD video, throughput is 319 fps when encoding with H264 and drops to 47 fps with HEVC.

For comparison, we added frame-level QP control to the sample code, and the H265 throughput with it is about 115 fps.

Another strange observation appears when we run multiple encodes simultaneously:

Capture.PNG

A single stream has reduced total throughput (47 fps). However, when running 2 or more parallel HEVC encoding processes, the total throughput is about 120 fps.

We therefore suspect an issue in the LCU QP control when running a single stream.

 

2) sample_multi_transcode

We first ran a transcode from H264 to H265 with:

sample_multi_transcode -i::h264 test_1080.264 -o::h265 test_1080_trans.265 -hw -cqp -extmbqp

Results:

Multi Transcoding Sample Version 20.2.0

libva info: VA-API version 1.8.0
libva info: User environment variable requested driver 'iHD'
libva info: Trying to open /opt/intel/mediasdk/lib64//iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_8
libva info: va_openDriver() returns 0
Session 0:
[WARNING] Configuration changed on the Query() call
mfx.LowPower:0 changed to mfx.LowPower:32
ext.265P.PicHeightInLumaSamples:1080 changed to ext.265P.PicHeightInLumaSamples:1088
Pipeline surfaces number (DecPool): 12
MFX HARDWARE Session 0 API ver 1.33 parameters:
Input video: AVC
Output video: HEVC

Session 0 was NOT joined with other sessions

Transcoding started
..
Transcoding finished

Common transcoding time is 6.19848 sec
-------------------------------------------------------------------------------
*** session 0 [0x555555705cc8] PASSED (MFX_ERR_NONE) 6.19812 sec, 215 frames, 34.688 fps
-i::h264 test_1080.264 -o::h265 test_1080_trans.265 -hw -cqp -extmbqp

-------------------------------------------------------------------------------

The test PASSED

 

Then we ran a transcode from H265 to H264:

sample_multi_transcode -i::h265 test_1080.265 -o::h264 test_1080_trans.264 -hw -cqp -extmbqp

We obtained the following:

Multi Transcoding Sample Version 20.2.0

libva info: VA-API version 1.8.0
libva info: User environment variable requested driver 'iHD'
libva info: Trying to open /opt/intel/mediasdk/lib64//iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_8
libva info: va_openDriver() returns 0
Session 0:
Pipeline surfaces number (DecPool): 10
MFX HARDWARE Session 0 API ver 1.33 parameters:
Input video: HEVC
Output video: AVC

Session 0 was NOT joined with other sessions

Transcoding started
..
Transcoding finished

Common transcoding time is 1.67442 sec
-------------------------------------------------------------------------------
*** session 0 [0x555555705cc8] PASSED (MFX_ERR_NONE) 1.6741 sec, 211 frames, 126.038 fps
-i::h265 test_1080.265 -o::h264 test_1080_trans.264 -hw -cqp -extmbqp

-------------------------------------------------------------------------------

The test PASSED

 

We observe a drop in performance when transcoding to H265: 126 fps for H264 output versus 35 fps for H265 output.
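
The fps figures can be pulled out of the session summary lines in the logs above with a small parser. This is a sketch that assumes the summary line always has the shape seen in these two runs; the regular expression and the function name `parse_session_line` are illustrative and not part of the samples.

```python
import re

# Matches the tail of a session summary line such as:
# "*** session 0 [0x...] PASSED (MFX_ERR_NONE) 6.19812 sec, 215 frames, 34.688 fps"
SESSION_LINE = re.compile(
    r"PASSED \((?P<status>\w+)\)\s+(?P<seconds>[\d.]+) sec, "
    r"(?P<frames>\d+) frames, (?P<fps>[\d.]+) fps"
)


def parse_session_line(line: str) -> dict:
    """Extract status, duration, frame count and fps from one summary line."""
    match = SESSION_LINE.search(line)
    if match is None:
        raise ValueError(f"not a session summary line: {line!r}")
    return {
        "status": match["status"],
        "seconds": float(match["seconds"]),
        "frames": int(match["frames"]),
        "fps": float(match["fps"]),
    }


# The two summary lines copied from the logs in this post.
to_hevc = parse_session_line(
    "*** session 0 [0x555555705cc8] PASSED (MFX_ERR_NONE) 6.19812 sec, 215 frames, 34.688 fps"
)
to_avc = parse_session_line(
    "*** session 0 [0x555555705cc8] PASSED (MFX_ERR_NONE) 1.6741 sec, 211 frames, 126.038 fps"
)

ratio = to_avc["fps"] / to_hevc["fps"]
print(f"H264 output is {ratio:.1f}x faster than H265 output in these runs")
```

The reported fps is simply frames divided by the session time, so the same parser can be used to cross-check a log.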

 

Thanks

Moderator

Hi Tran,


Sorry for the late response and thanks for the detailed report.


I have reported this issue to the dev team and will update you with the results of the investigation.


Mark


Moderator

Hi Tran,

I am working with the developers on your request. They use sample_multi_transcode (SMT) for all of their performance tests. Could you run it as well? Since your original test used a modified sample_encode, these tests would be more comparable to theirs.

./sample_multi_transcode -i::h265 test.h265 -o::h264 out.h264 -cqp

./sample_multi_transcode -i::h265 test.h265 -o::h264 out.h264 -cqp -extmbqp

./sample_multi_transcode -i::h265 test.h265 -o::h265 out.h265 -cqp

./sample_multi_transcode -i::h265 test.h265 -o::h265 out.h265 -cqp -extmbqp
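
The four runs above form a 2x2 matrix (output codec, with or without -extmbqp). A minimal sketch for generating them is shown here; it uses only the flags that appear in the commands above, and the function name `build_commands` and its default arguments are assumptions for illustration.

```python
from itertools import product


def build_commands(input_file="test.h265", binary="./sample_multi_transcode"):
    """Return the four command lines as argument lists, in the order listed above."""
    commands = []
    for codec, use_mbqp in product(("h264", "h265"), (False, True)):
        cmd = [binary, "-i::h265", input_file, f"-o::{codec}", f"out.{codec}", "-cqp"]
        if use_mbqp:
            cmd.append("-extmbqp")
        commands.append(cmd)
    return commands


for cmd in build_commands():
    # Each list can be passed to subprocess.run(cmd, capture_output=True, text=True)
    # on a machine where the sample binary is available.
    print(" ".join(cmd))
```

Keeping the commands as argument lists makes it easy to run each configuration several times and compare the reported fps.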

Mark

 

Moderator

Hi Tran,


I received a conclusion from the developers and hope it answers your question:

It turned out that AVC sets the QP parameters to 0 by default, which makes encoding slower than with the "extmbqp" parameter. That parameter sets higher QP values, so we get a higher encoding rate.

For HEVC, the default QP is 26.

The expected performance with the "extmbqp" parameter depends on which QP values the MBQP map sets.

If it sets higher QP values than the ones we set ourselves (qpi, qpp, qpb) or the defaults, performance improves; if it sets lower values, performance decreases.

If we set the same QP parameters (qpi, qpp, qpb) for AVC and HEVC when encoding, the two codecs show the same performance difference.

  • For AVC:

sample_multi_transcode -i::h265 test.h265 -o::h264 out.h264 -cqp -qpi 26 -qpp 26 -qpb 26

sample_multi_transcode -i::h265 test.h265 -o::h264 out.h264 -cqp -extmbqp

  • For HEVC:

sample_multi_transcode -i::h265 test.h265 -o::h265 out.h265 -cqp -qpi 26 -qpp 26 -qpb 26

sample_multi_transcode -i::h265 test.h265 -o::h265 out.h265 -cqp -extmbqp
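
The rule described above (higher QP in the map than the frame-level QP means faster encoding, lower means slower) can be written down as a rough sketch. The function `expected_direction` and the QP map values are hypothetical; they are not what the sample writes into the map, and comparing the mean map QP with the frame QP is only a crude proxy for real encoder behaviour.

```python
from statistics import mean
from typing import Sequence


def expected_direction(frame_qp: int, qp_map: Sequence[int]) -> str:
    """Compare the mean per-block QP with the frame-level QP (crude proxy)."""
    map_mean = mean(qp_map)
    if map_mean > frame_qp:
        return "faster"   # coarser quantization than the baseline
    if map_mean < frame_qp:
        return "slower"   # finer quantization than the baseline
    return "similar"


# Hypothetical per-block QP map, for illustration only.
hypothetical_map = [20] * 64

# Default QPs as stated in the explanation above: 0 for AVC, 26 for HEVC.
print("AVC  (default QP 0): ", expected_direction(0, hypothetical_map))
print("HEVC (default QP 26):", expected_direction(26, hypothetical_map))
```

Under this simplified model, the same map can speed up AVC and slow down HEVC purely because the two baselines differ, which is why the commands above fix qpi, qpp and qpb to 26 for both codecs.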


Please let me know if you can confirm this.


Mark

