Media (Intel® Video Processing Library, Intel Media SDK)

HyperEncoder

ShiYue
Novice
1,980 Views

H264 encoding using HyperEncoder slows down

0 Kudos
11 Replies
Rahila_T_Intel
Moderator
1,945 Views

Hi,


Thank you for posting in Intel communities.


Could you please provide the following details?

1) Exact OS details

2) Processor details

3) Kernel details 

4) MediaSDK version

5) Sample reproducer (sample code, steps to reproduce, commands you've used, etc.)


Thanks


0 Kudos
ShiYue
Novice
1,879 Views

Thanks for your reply!

Here are the details of our environment:
1) Exact OS details

Edition Windows 11 Pro
Version 22H2
Installed on 12/21/2022
OS build 22621.525
Experience Windows Feature Experience Pack 1000.22634.1000.0

2) Processor details

Device name DESKTOP-KB824A6
Processor 12th Gen Intel(R) Core(TM) i5-12400 2.50 GHz
Installed RAM 16.0 GB (15.8 GB usable)
Device ID 483AB5A6-FE2A-4D55-A44E-72A6BD223C02
Product ID 00330-80000-00000-AA775
System type 64-bit operating system, x64-based processor
Pen and touch No pen or touch input is available for this display

3) Kernel details

iGPU: Intel(R) UHD Graphics 730
dGPU: Intel(R) Arc(TM) A380 Graphics

4) MediaSDK version

Intel OneVPL, MFX_VERSION = 2006

5) Sample Reproducer
We use sample_encode.exe (https://github.com/oneapi-src/oneVPL/blob/master/tools/legacy/sample_encode/README.md) to encode YUV frames into H.264.

With Hyper encode on:
Command Line: .\sample_encode.exe h264 -i test.yuv -o test.h264 -dGfx -dual_gfx::on -w 1920 -h 1080 -nv12 -idr_interval 0 -d3d11 -async 30 -g 30 -r 1 -u 4 -lowpower:on -n 8928
Encoding time is 38 seconds


With hyper encode off:
Command Line: .\sample_encode.exe h264 -i test.yuv -o test.h264 -dGfx -dual_gfx::off -w 1920 -h 1080 -nv12 -idr_interval 0 -d3d11 -async 30 -g 30 -r 1 -u 4 -lowpower:on -n 8928
Encoding time is 36 seconds
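
To put the two timings on a common scale, here is a minimal sketch that converts them into average throughput. It uses only the frame count from -n and the two wall-clock times reported above; the function and variable names are illustrative.

```python
# Figures taken from the two runs reported above.
FRAMES = 8928  # value passed to -n


def throughput_fps(frames: int, seconds: float) -> float:
    """Average number of frames encoded per second of wall-clock time."""
    return frames / seconds


fps_on = throughput_fps(FRAMES, 38.0)   # run with -dual_gfx::on
fps_off = throughput_fps(FRAMES, 36.0)  # run with -dual_gfx::off
slowdown_pct = (38.0 / 36.0 - 1.0) * 100.0

print(f"Hyper Encode on:  {fps_on:.1f} fps")
print(f"Hyper Encode off: {fps_off:.1f} fps")
print(f"Slowdown with Hyper Encode on: {slowdown_pct:.1f}%")
```

That works out to roughly 235 fps with Hyper Encode on versus 248 fps with it off, so the run with Hyper Encode on is about 5.6% slower.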

0 Kudos
Rahila_T_Intel
Moderator
1,859 Views

Hi,

 

Thanks for your patience.

 

Currently, in the sample_encode example, "-idr_interval 0" is not supported by HEVC Hyper Encode.

 

 

 

Thanks

 

0 Kudos
ShiYue
Novice
1,845 Views

This is not about H.265; we are asking about H.264 encoding.

0 Kudos
Rahila_T_Intel
Moderator
1,825 Views

Hi,

 

HyperEncode is not guaranteed to improve performance in every case. The result depends on several related parameters and on how the hardware is being utilized. One way to get better performance from HyperEncode is to use the '-perf_opt n' option, which preloads the first n frames of the input stream into buffers. Parameters such as the async value and the GOP size can also greatly affect performance.
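
As a rough illustration of why preloading input frames might matter, the sketch below estimates how fast raw NV12 data has to be read for the 1920x1080 run reported earlier in this thread. It assumes every one of the 8928 frames is read from the input file once and that reading overlaps with encoding; neither assumption is confirmed in the thread, and the names are illustrative.

```python
def nv12_frame_bytes(width: int, height: int) -> int:
    """NV12 holds a full-size luma plane plus a half-size interleaved chroma plane."""
    return width * height * 3 // 2


frame_bytes = nv12_frame_bytes(1920, 1080)   # -w 1920 -h 1080 -nv12
total_bytes = frame_bytes * 8928             # -n 8928
read_rate_mb_s = total_bytes / 36.0 / 1e6    # over the 36-second run

print(f"Bytes per frame:    {frame_bytes}")
print(f"Total input size:   {total_bytes / 1e9:.1f} GB")
print(f"Required read rate: {read_rate_mb_s:.1f} MB/s")
```

Under those assumptions the input amounts to about 27.8 GB, read at roughly 771 MB/s. If the storage cannot sustain that rate, file reading rather than the encoder could be what limits throughput.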

 

Moreover, if you have multiple GPUs on your system, it is recommended to split the stream into multiple segments and utilize the available GPUs simultaneously to improve performance.

More information and example code for selecting from among multiple GPUs is included in the guide here
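
The segment-splitting idea can be sketched as follows. This is only an illustration of how frame ranges might be divided on GOP boundaries, one range per GPU; it does not call any oneVPL API, and the function name and the even split are assumptions.

```python
from typing import List, Tuple


def split_into_segments(total_frames: int, gop_size: int,
                        num_gpus: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, one per GPU.

    Every range starts on a GOP boundary so that each segment can begin
    with its own key frame and be encoded independently of the others.
    """
    total_gops = -(-total_frames // gop_size)  # ceiling division
    base, extra = divmod(total_gops, num_gpus)
    segments = []
    start_gop = 0
    for gpu in range(num_gpus):
        gops = base + (1 if gpu < extra else 0)
        if gops == 0:
            continue  # more GPUs than GOPs: leave this GPU idle
        start = start_gop * gop_size
        end = min((start_gop + gops) * gop_size, total_frames)
        segments.append((start, end))
        start_gop += gops
    return segments


# Values from the command line earlier in the thread: -n 8928 -g 30, two GPUs.
print(split_into_segments(8928, 30, 2))
```

With two GPUs of unequal speed, an even split is unlikely to be optimal; weighting the number of GOPs per GPU by measured throughput would be a natural refinement.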

 

Also, could you please let us know the resolution of your input stream? You might not get a substantial performance improvement from HyperEncode if the resolution is not high enough (>4K).

 

Thanks

 

0 Kudos
Rahila_T_Intel
Moderator
1,564 Views

Hi,


We have not heard back from you.

Is your issue resolved? Can we close the case?


Thanks



0 Kudos
ShiYue
Novice
1,558 Views

Thanks for your reply!

 

The resolution is 4096x2060. In the sample_encode example, using the '-perf_opt n' option improved performance.

But it doesn't work in my application. Is there anything else I should pay attention to?

1. The GPU utilization is not high (40%). Will using Hyper Encode be slower?

2. Are there any problems with encoding and decoding at the same time?

 

0 Kudos
Rahila_T_Intel
Moderator
1,499 Views

Hi,


Sorry for the delay.


Please refer to the article below for more information on HyperEncode.

https://github.com/oneapi-src/oneVPL-intel-gpu/blob/main/doc/HyperEncode_FeatureDeveloperGuide.md

You can also refer to the HyperEncode examples in the Media Delivery repo (https://github.com/intel/media-delivery/tree/master/scripts), which are designed to boost transcode performance.


Please find our responses to your questions below:


1. The GPU utilization is not high (40%). Will using Hyper Encode be slower?


Ans.: To improve GPU utilization, you can create multiple streams and encode them in multiple sessions. Use parfiles in sample_multi_transcode (SMT) to process multiple streams concurrently. This helps avoid a slowdown due to pipeline inefficiencies. Example: https://github.com/intel/media-delivery/blob/master/scripts/sample-multi-transcode-HEVC-8K-hyperenc2gpu.sh
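
As a minimal sketch of the parfile approach, the script below generates a parfile with one transcode session per line. The file names are hypothetical, and only the -i::h264 and -o::h264 options are used; any further per-session options should be checked against the sample_multi_transcode README and the linked script.

```python
from typing import List, Tuple


def build_parfile(jobs: List[Tuple[str, str]]) -> str:
    """Build parfile text: one line per session, each an H.264-to-H.264 transcode."""
    lines = [f"-i::h264 {src} -o::h264 {dst}" for src, dst in jobs]
    return "\n".join(lines) + "\n"


# Hypothetical input/output names, four concurrent sessions.
jobs = [(f"input_{i}.h264", f"output_{i}.h264") for i in range(4)]
parfile_text = build_parfile(jobs)
print(parfile_text, end="")
```

Saving the output as, say, multi_stream.par and running sample_multi_transcode -par multi_stream.par would then start the four sessions concurrently.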


2. Are there any problems with encoding and decoding at the same time?


Ans.: No, there should not be any problems. Just create separate streams for encode and decode, as shown in this example: https://github.com/intel/media-delivery/blob/master/scripts/sample-multi-transcode-HEVC-1080p.sh

Decoding on GPU is more optimal. You're decoding to video memory, so you don't have all of the overhead of working with large amounts of CPU memory and converting it to a tiled format.


We hope this answers your questions. If so, please accept this as a solution; doing so helps others with similar issues.


Thanks


0 Kudos
Rahila_T_Intel
Moderator
1,452 Views

Hi,


We have not heard back from you.  Is your query clarified?

Could you please provide an update?


Thanks


0 Kudos
Rahila_T_Intel
Moderator
1,399 Views

Hi,


We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.


Thanks


0 Kudos