Highest Encoding Performance Settings

seth_p_1
Beginner
936 Views

Does anyone have a link to example code or example settings for the highest performance h264 encoding possible?  I ran a test and MFX_IMPL_HARDWARE_ANY and MFX_IMPL_SOFTWARE had a difference of 100 FPS versus 44 FPS.  I can't imagine that should be the case.

What settings are MOST responsible for a performance increase? 

13 Replies
Petter_L_Intel
Employee

Hi,

"Highest" performance is a very relative term.

Encoder performance depends on many different parameters and decisions. Below are just a few:
- Codec profile: a lower profile means less complex computation and thus greater performance.
- TargetUsage: the speed setting gives greater performance.
- Bitrate: a low bitrate leads to greater performance.
- Surface type: the selected surface type may also affect overall performance. For instance, using system memory surfaces with HW encode implies an internal surface copy, which costs performance.
- Asynchronous vs. synchronous pipeline: the pipeline implementation also has a large impact on overall performance for single-channel workloads.

Keep in mind that performance is always a trade-off against quality. If you set the encode parameters for very high performance, the result will also be lower quality.
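As a rough illustration of the list above, the sketch below writes the knobs down as two configurations and shows which ones differ. The `EncodeSettings` class, the helper `changed_fields`, the bitrates, and the async depth are all hypothetical and chosen for illustration only. The strings are the names of Media SDK constants, and the fields echo the `mfxVideoParam` members `mfx.CodecProfile`, `mfx.TargetUsage`, `mfx.TargetKbps`, `IOPattern`, and `AsyncDepth`. This is not an SDK call sequence, just a way to keep the speed-related settings in one place.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EncodeSettings:
    """Hypothetical container; fields echo mfxVideoParam members."""
    codec_profile: str   # mfx.CodecProfile
    target_usage: str    # mfx.TargetUsage
    target_kbps: int     # mfx.TargetKbps
    io_pattern: str      # IOPattern (where the input surfaces live)
    async_depth: int     # AsyncDepth (frames allowed in flight)


# Speed-leaning choices, following the bullet list above.
speed_leaning = EncodeSettings(
    codec_profile="MFX_PROFILE_AVC_BASELINE",
    target_usage="MFX_TARGETUSAGE_BEST_SPEED",
    target_kbps=2000,                           # assumed value
    io_pattern="MFX_IOPATTERN_IN_VIDEO_MEMORY",
    async_depth=4,                              # assumed value
)

# A slower configuration to compare against.
slower = EncodeSettings(
    codec_profile="MFX_PROFILE_AVC_HIGH",
    target_usage="MFX_TARGETUSAGE_BEST_QUALITY",
    target_kbps=8000,                           # assumed value
    io_pattern="MFX_IOPATTERN_IN_SYSTEM_MEMORY",
    async_depth=1,
)


def changed_fields(a: EncodeSettings, b: EncodeSettings) -> dict:
    """Return {field: (value_in_a, value_in_b)} for fields that differ."""
    da, db = asdict(a), asdict(b)
    return {name: (da[name], db[name]) for name in da if da[name] != db[name]}


for name, (fast, slow) in changed_fields(speed_leaning, slower).items():
    print(f"{name}: {fast} vs {slow}")
```

Changing one field at a time and re-running the same benchmark is a simple way to see which setting matters most on a given machine.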

Regards,
Petter 

seth_p_1
Beginner

Thanks.   I'm looking for the differential factors that will pull the hardware encoding units away from straight software.  So, all quality components being within an acceptable range, what parameters must be tweaked to force the greatest fixed function execution of the coding subsystems (ME, entropy coding, etc).  I found the target after a little while.  Bitrate surprised me a little (that must not be linear - up to X you work hard to keep quality and lower bits, after X you can just start throwing away data).

In the current setup Async wasn't pulling away.  But I imagine that at some point in various loads it will.

One extra thing that perhaps you might have come across: is there any way to mix encoding between a discrete GPU and Intel's hardware?  The GPU in this scenario will have lots of extra power to do things like DCT, etc that would not only take load off the CPU but also reduce the transfer bandwidth.  My gut says it's all or nothing.  What do you think?

Seth

Petter_L_Intel
Employee

Hi Seth,

If you are interested in exploring the impact of async vs. sync on Media SDK performance, I suggest you check out the first few chapters of the Media SDK tutorial here: http://software.intel.com/en-us/articles/intel-media-sdk-tutorial
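To see why the pipeline style matters, here is a toy timing model, not Media SDK code. It assumes the encoder can accept a new frame every `stage_ms` and returns each frame `latency_ms` after accepting it. `depth` is the number of frames the application allows in flight before it waits on the oldest one, in the spirit of the SDK's `AsyncDepth` setting. The function names and all numbers are assumptions for illustration.

```python
def total_time_ms(n_frames: int, depth: int, stage_ms: float, latency_ms: float) -> float:
    """Time until the last frame completes in a simple pipelined-encoder model."""
    finish = []      # completion time of each submitted frame
    hw_free = 0.0    # earliest time the encoder can accept another frame
    for i in range(n_frames):
        # With `depth` frames in flight, frame i must wait for frame i - depth.
        ready = finish[i - depth] if i >= depth else 0.0
        start = max(ready, hw_free)
        hw_free = start + stage_ms
        finish.append(start + latency_ms)
    return finish[-1]


def fps(n_frames: int, depth: int, stage_ms: float, latency_ms: float) -> float:
    """Frames per second implied by the model."""
    return n_frames / (total_time_ms(n_frames, depth, stage_ms, latency_ms) / 1000.0)


# Assumed numbers: 100 frames, 5 ms per pipeline stage, 20 ms per-frame latency.
print(f"sync  (depth 1): {fps(100, 1, 5.0, 20.0):.1f} fps")   # 50.0 fps
print(f"async (depth 4): {fps(100, 4, 5.0, 20.0):.1f} fps")   # about 194 fps
```

In this model the synchronous loop pays the full latency for every frame, while the asynchronous loop is limited only by how often the encoder can accept a frame. If the application cannot supply frames quickly enough to keep several in flight, the two cases converge, which may be why async does not pull away under light load.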

Intel Media SDK does not support discrete GPUs.

Regards,
Petter 

seth_p_1
Beginner

Thanks, Petter.  I did try the various tutorials.  I think load will differentiate.

BTW, I see that you are from Intel.  Is that correct?  Although it may not be able to integrate with discrete GPU hardware, is there any way to feed the SDK data that is more fully prepared than bitplanes?  That should free EU resources and reduce transfer bandwidth.

The alternate solution we're looking at is NVENC - h264 coding on the GPU itself.  That has several advantages in latency at the expense of price.

Thanks again,

seth_p_1
Beginner

btw, do you know of a fast path to the linux beta program?  i was hoping i might try a cloud test on an ivy farm using hd4000 gpu this weekend while i have time

andy4us
Beginner

I'm somewhat confused why you would think the difference between using a software codec and a dedicated hardware codec should be small. The whole point is that in software you can't do it fast enough. I'm actually surprised you got 44 fps! That's quite a bit more than I was able to achieve. Fundamentally, it is that difference that caused Intel to implement a hardware codec in the first place.

Andy

seth_p_1
Beginner

You might have misread.  Or I might have misstated.  The difference I was seeing was smaller than expected.  The difference I would expect is large.

Petter_L_Intel
Employee

Seth,

What do you mean by "data that is more fully prepared than bitplanes"? And please also elaborate on why you think using NVENC on a discrete card would provide lower latency.

Regarding the Linux beta program: if you signed up and requested beta access, you will be contacted shortly.

Regards,
Petter 

seth_p_1
Beginner

from what it looks like now (in the linux documentation), dct/me/etc is already happening on the cpu's gpu (ENC) in prep to feed PAK (MFX/VCE).  if there's horsepower available, duplicating some of ENC on the discrete GPU would reduce transfer bandwidth (multiple HD streams) and reduce load on the cpu.

regarding lower latency, i mean reducing latency in a hybrid discrete + cpu environment.  obviously if all rendering and encoding happens on the CPU you can hit low latency.  if you leverage a discrete GPU for rendering, transferring across the bus and utilizing frame or memory pools will incrementally increase latency.
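A back-of-the-envelope sketch of the bus-transfer part of that cost, under stated assumptions: uncompressed NV12 frames (12 bits per pixel), 1080p at 60 fps, and an effective bus bandwidth of 8 GB/s. The bandwidth figure and the function names are assumptions, not measurements, and the model ignores synchronization and frame-pool queuing, which may well dominate in practice.

```python
def nv12_frame_bytes(width: int, height: int) -> int:
    """Size of one NV12 frame: a full-resolution luma plane plus half-size interleaved chroma."""
    return width * height * 3 // 2


def transfer_ms(n_bytes: int, bandwidth_gb_per_s: float) -> float:
    """Time to move n_bytes across a bus with the given effective bandwidth (GB = 1e9 bytes)."""
    return n_bytes / (bandwidth_gb_per_s * 1e9) * 1000.0


def stream_mb_per_s(width: int, height: int, frames_per_s: int) -> float:
    """Sustained bandwidth of one uncompressed NV12 stream, in MB/s (MB = 1e6 bytes)."""
    return nv12_frame_bytes(width, height) * frames_per_s / 1e6


frame = nv12_frame_bytes(1920, 1080)          # 3110400 bytes
print(f"frame size:         {frame} bytes")
print(f"copy time per frame: {transfer_ms(frame, 8.0):.3f} ms")   # assumed 8 GB/s
print(f"one 1080p60 stream:  {stream_mb_per_s(1920, 1080, 60):.1f} MB/s")
```

Under these assumptions the raw copy is a small fraction of a 60 fps frame interval, so the added latency in a hybrid setup would come mainly from how frames are queued and synchronized between the two devices.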

Petter_L_Intel
Employee

Hi Seth,

Media SDK does not support access to granular parts of the encoding process. The ENC and PAK stages are executed as one single operation via the EncodeFrameAsync() call. All of the encode stages are executed on the Intel HD Graphics part of the processor.

Regarding latency: HW-accelerated encode/decode/frame processing using Media SDK provides very low latency. If you have concerns about low-latency usages, please provide more details about your pipeline. We can help ensure you configure the components for optimal latency.

Regards,
Petter 

seth_p_1
Beginner

Think of this in the context of web gaming, so latency (user interaction, reaction) is very important.  Using NVENC is very expensive (monetarily), so if it's not necessary we'd like a best-of-both-worlds option.  BTW, I haven't heard from anyone since last week regarding the Linux SDK.  Do you know if there's someone to check in with?

seth_p_1
Beginner

BTW, is it truly the case that the linux graphics stack (including encoding library) is open here:

https://01.org/linuxgraphics/downloads/2013/2013q1-intel-graphics-stack-release

if that's so that's a pretty compelling case to go very intel.  Are there any portions of the encoding library that aren't open?

Petter_L_Intel
Employee

Hi Seth,

Regarding latency: refer to the following post, since it may relate well to the game capture case you describe:
http://software.intel.com/en-us/forums/topic/391281 

Regarding Intel Media SDK for Server usage: see this recent post:
http://software.intel.com/en-us/forums/topic/386795 

I do not have control over the Linux SDK beta request procedure. You should be contacted shortly.

Regards,
Petter 
