Media (Intel® oneAPI Video Processing Library, Intel Media SDK)
Access community support with transcoding, decoding, and encoding in applications using media tools from Intel. This includes Intel® oneAPI Video Processing Library and Intel® Media SDK.

HyperEncoder

ShiYue
Novice
619 Views

H264 encoding using HyperEncoder slows down

11 Replies
Rahila_T_Intel
Moderator
584 Views

Hi,


Thank you for posting in Intel communities.


Could you please provide the following details?

1) Exact OS details

2) Processor details

3) Kernel details 

4) MediaSDK version

5) Sample Reproducer (sample code, steps to reproduce, commands you've used, etc.)


Thanks


ShiYue
Novice
518 Views

Thanks for your reply!

Here are the details of our environment:
1) Exact OS details

Edition Windows 11 Pro
Version 22H2
Installed on 12/21/2022
OS build 22621.525
Experience Windows Feature Experience Pack 1000.22634.1000.0

2) Processor details

Device name DESKTOP-KB824A6
Processor 12th Gen Intel(R) Core(TM) i5-12400 2.50 GHz
Installed RAM 16.0 GB (15.8 GB usable)
Device ID 483AB5A6-FE2A-4D55-A44E-72A6BD223C02
Product ID 00330-80000-00000-AA775
System type 64-bit operating system, x64-based processor
Pen and touch No pen or touch input is available for this display

3) Kernel details

iGPU: Intel(R) UHD Graphics 730
dGPU: Intel(R) Arc(TM) A380 Graphics

4) MediaSDK version

Intel oneVPL, MFX_VERSION = 2006

5) Sample Reproducer
We use sample_encode.exe (https://github.com/oneapi-src/oneVPL/blob/master/tools/legacy/sample_encode/README.md) to encode YUV frames to H.264.

With Hyper encode on:
Command Line: .\sample_encode.exe h264 -i test.yuv -o test.h264 -dGfx -dual_gfx::on -w 1920 -h 1080 -nv12 -idr_interval 0 -d3d11 -async 30 -g 30 -r 1 -u 4 -lowpower:on -n 8928
Encoding time is 38 seconds


With hyper encode off:
Command Line: .\sample_encode.exe h264 -i test.yuv -o test.h264 -dGfx -dual_gfx::off -w 1920 -h 1080 -nv12 -idr_interval 0 -d3d11 -async 30 -g 30 -r 1 -u 4 -lowpower:on -n 8928
Encoding time is 36 seconds
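
For easier comparison, the two timings above can be converted to average throughput. This is only a small arithmetic sketch based on the figures reported in this post (8928 frames from `-n 8928`, 38 seconds with Hyper Encode on, 36 seconds with it off); it assumes the reported times cover the whole run.

```python
# Figures reported in this post: -n 8928 frames,
# 38 s with Hyper Encode on, 36 s with Hyper Encode off.
FRAMES = 8928
SECONDS_ON = 38
SECONDS_OFF = 36


def fps(frames: int, seconds: float) -> float:
    """Average throughput in frames per second."""
    return frames / seconds


fps_on = fps(FRAMES, SECONDS_ON)
fps_off = fps(FRAMES, SECONDS_OFF)

# Relative increase in wall-clock time when Hyper Encode is enabled.
slowdown_pct = (SECONDS_ON / SECONDS_OFF - 1) * 100

print(f"Hyper Encode on:  {fps_on:.1f} fps")
print(f"Hyper Encode off: {fps_off:.1f} fps")
print(f"Extra time with Hyper Encode on: {slowdown_pct:.1f}%")
```

That works out to roughly 235 fps with Hyper Encode on versus 248 fps with it off, a difference of about 5.6% in total time.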

Rahila_T_Intel
Moderator
498 Views

Hi,

 

Thanks for your patience.

 

Currently, in the sample_encode example, "-idr_interval 0" is not supported by HEVC Hyper Encode.

 

 

 

Thanks

 

ShiYue
Novice
484 Views

This is not about H.265; we are asking about H.264 encoding.

Rahila_T_Intel
Moderator
464 Views

Hi,

 

HyperEncode is not guaranteed to improve performance in every case. The result depends on several factors, including how well the hardware is being utilized. One way to get better performance from HyperEncode is the '-perf_opt n' option, which preloads the first n frames of the input stream into buffers. Parameters such as the async value and GOP size can also have a large impact on performance.
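
As a rough illustration of what preloading input frames involves, the sketch below estimates the size of one raw NV12 frame and the input data rate implied by the numbers posted earlier in this thread (1920x1080 from `-w 1920 -h 1080`, 8928 frames in 36 seconds). The preload count of 30 is a hypothetical value for n, and whether reading the raw input is actually a bottleneck in this case is an assumption, not something established in this thread.

```python
def nv12_frame_bytes(width: int, height: int) -> int:
    """Size of one 8-bit NV12 frame: a full-resolution Y plane plus an
    interleaved UV plane subsampled by two in each dimension."""
    return width * height * 3 // 2


WIDTH, HEIGHT = 1920, 1080   # from -w 1920 -h 1080 in the posted command
FRAMES, SECONDS = 8928, 36   # reported run with Hyper Encode off
PRELOAD = 30                 # hypothetical value for n in -perf_opt n

frame_bytes = nv12_frame_bytes(WIDTH, HEIGHT)
preload_bytes = PRELOAD * frame_bytes
read_rate = frame_bytes * FRAMES / SECONDS  # bytes of raw input per second

print(f"One NV12 frame: {frame_bytes} bytes")
print(f"Preloading {PRELOAD} frames: {preload_bytes / 1e6:.1f} MB")
print(f"Raw input rate at the reported speed: {read_rate / 1e6:.0f} MB/s")
```

At the reported speed the encoder consumes on the order of 770 MB/s of raw input, so it may be worth checking whether file reads limit throughput before comparing encoder configurations.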

 

Moreover, if you have multiple GPUs on your system, it is recommended to split the stream into multiple segments and utilize the available GPUs simultaneously to improve performance.

More information and example code for selecting from among multiple GPUs is included in the guide here
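
The segment-splitting idea above can be sketched as follows. This is a minimal illustration, not code from the guide: the function name is hypothetical, and aligning segment boundaries to the GOP size (so that each segment starts on a keyframe) is an assumption about how the segments would be made independently encodable.

```python
def split_into_segments(total_frames: int, gop_size: int, parts: int):
    """Split [0, total_frames) into up to `parts` contiguous frame ranges.

    Every range starts on a multiple of gop_size; only the last range may
    end on a partial GOP. Ranges are returned as (start, end) with end
    exclusive.
    """
    if total_frames <= 0 or gop_size <= 0 or parts <= 0:
        raise ValueError("all arguments must be positive")
    total_gops = -(-total_frames // gop_size)  # ceiling division
    base, extra = divmod(total_gops, parts)
    segments = []
    start_gop = 0
    for i in range(parts):
        gops = base + (1 if i < extra else 0)
        if gops == 0:
            continue  # more parts requested than GOPs available
        start = start_gop * gop_size
        end = min((start_gop + gops) * gop_size, total_frames)
        segments.append((start, end))
        start_gop += gops
    return segments


# Values from the command line posted earlier: -n 8928, -g 30, and two GPUs.
segments = split_into_segments(8928, 30, 2)
for gpu, (start, end) in enumerate(segments):
    print(f"GPU {gpu}: frames {start}..{end - 1}")
```

Each range would then be encoded in its own session on a different GPU and the resulting bitstreams concatenated in order.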

 

Also, could you please let us know the resolution of your input stream? You might not see a substantial performance improvement from HyperEncode if the resolution is not high enough (>4K).

 

Thanks

 

Rahila_T_Intel
Moderator
203 Views

Hi,


We have not heard back from you.

Is your issue resolved? Can we close the case?


Thanks



ShiYue
Novice
197 Views

Thanks for your reply!

 

The resolution is 4096x2060. In the sample_encode example, using the '-perf_opt n' option improved performance.

But it doesn't work in my application. Is there anything else I should pay attention to?

1. GPU utilization is not high (40%). Will using Hyper Encode make encoding slower?

2. Are there any problems with encoding and decoding at the same time?

 

Rahila_T_Intel
Moderator
138 Views

Hi,


Sorry for the delay.


Please refer to the article below for more information on HyperEncode.

https://github.com/oneapi-src/oneVPL-intel-gpu/blob/main/doc/HyperEncode_FeatureDeveloperGuide.md

You can also refer to the HyperEncode examples in the Media Delivery repo (https://github.com/intel/media-delivery/tree/master/scripts), which were designed to boost transcode performance.


Please find our responses to your queries below:


1. The GPU utilization is not high (40%). Will using Hyper Encode make encoding slower?


Ans.: To improve GPU utilization, you can create multiple streams and encode them in multiple sessions. Use parfiles in SMT (sample_multi_transcode) to process multiple streams concurrently. This helps avoid a slowdown caused by pipeline inefficiencies. Example: https://github.com/intel/media-delivery/blob/master/scripts/sample-multi-transcode-HEVC-8K-hyperenc2...


2. Are there any problems with encoding and decoding at the same time?


Ans.: No, there should not be any problems. Just create separate streams for encode and decode, as shown in this example: https://github.com/intel/media-delivery/blob/master/scripts/sample-multi-transcode-HEVC-1080p.sh

Decoding on the GPU is more efficient. Frames are decoded into video memory, so you avoid the overhead of working with large amounts of CPU memory and converting it to a tiled format.
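
For the first answer above, a parfile is a plain text file in which each line describes one session. The sketch below generates such a file for several input streams. It assumes the `-i::<codec> <input> -o::<codec> <output>` per-line form used by sample_multi_transcode parfiles; the file names and the helper function are hypothetical, and any additional per-session options should be taken from the linked example scripts and the tool's README rather than from this sketch.

```python
def build_parfile(inputs, codec="h264"):
    """Return parfile text with one transcode session per input file."""
    lines = []
    for index, path in enumerate(inputs):
        output = f"out_{index}.{codec}"  # hypothetical output naming scheme
        lines.append(f"-i::{codec} {path} -o::{codec} {output}")
    return "\n".join(lines) + "\n"


# Hypothetical input files, one per concurrent session.
inputs = ["stream_0.h264", "stream_1.h264", "stream_2.h264"]
parfile_text = build_parfile(inputs)
print(parfile_text, end="")
```

The generated text can be saved to a file and passed to sample_multi_transcode with the `-par` option, so that all sessions run in a single process.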


We hope this answers your questions. If so, please accept this reply as a solution; this will help others with a similar issue.


Thanks


Rahila_T_Intel
Moderator
91 Views

Hi,


We have not heard back from you.  Is your query clarified?

Could you please provide an update?


Thanks


Rahila_T_Intel
Moderator
38 Views

Hi,


We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.


Thanks

