H264 encoding using HyperEncoder slows down
Hi,
Thank you for posting in Intel communities.
Could you please provide the following details?
1) Exact OS details
2) Processor details
3) Kernel details
4) MediaSDK version
5) Sample reproducer (sample code, steps to reproduce, commands you've used, etc.)
Thanks
Thanks for your reply!
Here are the details of our environment:
1) Exact OS details
Edition Windows 11 Pro
Version 22H2
Installed on 12/21/2022
OS build 22621.525
Experience Windows Feature Experience Pack 1000.22634.1000.0
2) Processor details
Device name DESKTOP-KB824A6
Processor 12th Gen Intel(R) Core(TM) i5-12400 2.50 GHz
Installed RAM 16.0 GB (15.8 GB usable)
Device ID 483AB5A6-FE2A-4D55-A44E-72A6BD223C02
Product ID 00330-80000-00000-AA775
System type 64-bit operating system, x64-based processor
Pen and touch No pen or touch input is available for this display
3) Kernel details
iGPU: Intel(R) UHD Graphics 730
dGPU: Intel(R) Arc(TM) A380 Graphics
4) MediaSDK version
Intel OneVPL, MFX_VERSION = 2006
5) Sample Reproducer
We use sample_encode.exe (https://github.com/oneapi-src/oneVPL/blob/master/tools/legacy/sample_encode/README.md) to encode the YUV frames into H264.
With Hyper encode on:
Command Line: .\sample_encode.exe h264 -i test.yuv -o test.h264 -dGfx -dual_gfx::on -w 1920 -h 1080 -nv12 -idr_interval 0 -d3d11 -async 30 -g 30 -r 1 -u 4 -lowpower:on -n 8928
Encoding time is 38 seconds
With hyper encode off:
Command Line: .\sample_encode.exe h264 -i test.yuv -o test.h264 -dGfx -dual_gfx::off -w 1920 -h 1080 -nv12 -idr_interval 0 -d3d11 -async 30 -g 30 -r 1 -u 4 -lowpower:on -n 8928
Encoding time is 36 seconds
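For comparison, the two timings can be converted into throughput. This is a minimal sketch, assuming that -n 8928 is the number of frames encoded in both runs and that the reported times are wall-clock seconds; the function name is only illustrative.

```python
def frames_per_second(frame_count: int, seconds: float) -> float:
    """Average encoding throughput over a whole run."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return frame_count / seconds


FRAMES = 8928  # value passed to -n in both command lines above

fps_hyper_on = frames_per_second(FRAMES, 38.0)   # -dual_gfx::on run
fps_hyper_off = frames_per_second(FRAMES, 36.0)  # -dual_gfx::off run

# Relative increase in encoding time when Hyper Encode is switched on.
slowdown_percent = (38.0 / 36.0 - 1.0) * 100.0

print(f"on:  {fps_hyper_on:.1f} fps")
print(f"off: {fps_hyper_off:.1f} fps")
print(f"Hyper Encode on is {slowdown_percent:.1f}% slower")
```

Under those assumptions the run with Hyper Encode on reaches roughly 235 fps against 248 fps with it off, which is about 5.6% slower.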
Hi,
Thanks for your patience.
Currently, in the sample_encode example, "-idr_interval 0" is not supported by HEVC Hyper Encode.
Thanks
This is not about H265; we are talking about H264 encoding.
Hi,
It is not guaranteed that using HyperEncode will always improve performance. It depends on various associated metrics and how the hardware is being utilized. One way of getting better performance from HyperEncode is to use the '-perf_opt n' option, which preloads the first n frames from the input stream into buffers. Also, parameters such as the async value and GOP size can greatly impact performance.
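As an illustration, the command line from the reproducer earlier in this thread can be extended with '-perf_opt n'. This is a minimal sketch in Python that only assembles the argument list; the helper name and the value 30 for n are assumptions for illustration, and every other flag is copied from the original command.

```python
from typing import List, Optional


def build_encode_command(dual_gfx: str,
                         perf_opt_frames: Optional[int] = None) -> List[str]:
    """Assemble the sample_encode arguments used earlier in this thread."""
    if dual_gfx not in ("on", "off"):
        raise ValueError("dual_gfx must be 'on' or 'off'")
    cmd = [
        "sample_encode.exe", "h264",
        "-i", "test.yuv", "-o", "test.h264",
        "-dGfx", f"-dual_gfx::{dual_gfx}",
        "-w", "1920", "-h", "1080", "-nv12",
        "-idr_interval", "0", "-d3d11",
        "-async", "30", "-g", "30", "-r", "1", "-u", "4",
        "-lowpower:on", "-n", "8928",
    ]
    if perf_opt_frames is not None:
        if perf_opt_frames <= 0:
            raise ValueError("perf_opt_frames must be positive")
        # Preload the first n input frames into buffers before encoding.
        cmd += ["-perf_opt", str(perf_opt_frames)]
    return cmd


# n = 30 is an arbitrary example value, not a recommendation.
command = build_encode_command("on", perf_opt_frames=30)
print(" ".join(command))
```

Building the list once makes it easy to time the same run with Hyper Encode on and off, and with or without preloading, while changing only one parameter at a time.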
Moreover, if you have multiple GPUs on your system, it is recommended to split the stream into multiple segments and utilize the available GPUs simultaneously to improve performance.
More information and example code for selecting from among multiple GPUs is included in the guide here.
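The segment split can be planned so that each segment starts on a GOP boundary. This is a minimal sketch, assuming a fixed GOP size and one segment per GPU; the function name is hypothetical and the code only computes frame ranges, it does not call any oneVPL API.

```python
import math
from typing import List, Tuple


def split_into_segments(total_frames: int, gop_size: int,
                        num_devices: int) -> List[Tuple[int, int]]:
    """Return [start, end) frame ranges, one per device, aligned to GOPs."""
    if total_frames < 0 or gop_size <= 0 or num_devices <= 0:
        raise ValueError("invalid arguments")
    total_gops = math.ceil(total_frames / gop_size)
    base, extra = divmod(total_gops, num_devices)
    segments = []
    start = 0
    for device in range(num_devices):
        # The first `extra` devices take one additional GOP each.
        gops = base + (1 if device < extra else 0)
        end = min(start + gops * gop_size, total_frames)
        if end > start:
            segments.append((start, end))
        start = end
    return segments


# Values from the reproducer: 8928 frames, GOP size 30, two GPUs.
segments = split_into_segments(8928, 30, 2)
print(segments)
```

Each range can then be encoded in its own session on a different GPU, and the resulting bitstreams concatenated in order. Closed GOPs are assumed so that segments do not reference frames across a boundary.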
Also, could you please let us know the resolution of your input stream? You might not get a substantial performance improvement from HyperEncode if the resolution is not high enough (>4K).
Thanks
Hi,
We have not heard back from you.
Is your issue resolved? Can we close the case?
Thanks
Thanks for your reply!
The resolution is 4096x2060. In the sample_encode example, using the '-perf_opt n' option improved performance.
But it doesn't work in my application. Is there anything else I should pay attention to?
1. The GPU utilization is not high (40%). Will using Hyper Encode be slower?
2. Are there any problems with encoding and decoding at the same time?
Hi,
Sorry for the delay.
Please refer to the article below to learn more about HyperEncode.
https://github.com/oneapi-src/oneVPL-intel-gpu/blob/main/doc/HyperEncode_FeatureDeveloperGuide.md
You can also refer to the HyperEncode examples available in the Media Delivery repo (https://github.com/intel/media-delivery/tree/master/scripts), which were designed to boost transcode performance.
Please find our responses to your queries below:
1. The GPU utilization is not high (40%). Will using Hyper Encode be slower?
Ans.: To improve GPU utilization, you can create multiple streams and encode them using multiple sessions. Use parfiles in SMT (sample_multi_transcode) to process multiple streams concurrently. This helps avoid a slowdown due to pipeline inefficiencies. Example: https://github.com/intel/media-delivery/blob/master/scripts/sample-multi-transcode-HEVC-8K-hyperenc2gpu.sh
2. Are there any problems with encoding and decoding at the same time?
Ans.: No, there should not be any problems. Just create separate streams for encode and decode, as shown in this example: https://github.com/intel/media-delivery/blob/master/scripts/sample-multi-transcode-HEVC-1080p.sh
Decoding on the GPU is more efficient: you decode directly to video memory, so you avoid the overhead of working with large amounts of CPU memory and converting it to a tiled format.
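To make the parfile idea concrete, here is a minimal sketch in Python that writes one transcode session per line. The file names are hypothetical, and only the basic input and output options are shown; the Hyper Encode and device-selection options used in the linked scripts are not reproduced here and should be taken from those scripts.

```python
from typing import Iterable, List, Tuple


def make_parfile_lines(streams: Iterable[Tuple[str, str]],
                       codec: str = "h264") -> List[str]:
    """Build one parfile line (one session) per (input, output) pair."""
    lines = []
    for input_path, output_path in streams:
        lines.append(f"-i::{codec} {input_path} -o::{codec} {output_path}")
    return lines


# Hypothetical stream names; each pair becomes its own concurrent session.
streams = [("in_0.h264", "out_0.h264"), ("in_1.h264", "out_1.h264")]
parfile_text = "\n".join(make_parfile_lines(streams)) + "\n"
print(parfile_text, end="")
```

The resulting text would be saved to a file (for example streams.par) and passed to sample_multi_transcode with its -par option, so that all sessions listed in the file run concurrently.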
We hope this clarifies your queries. If so, please accept this as a solution; this would help others with a similar issue.
Thanks
Hi,
We have not heard back from you. Is your query clarified?
Could you please provide an update?
Thanks
Hi,
We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.
Thanks