Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Real time encoding of 1080p

Robert_Jongbloed
Beginner
1,265 Views

I am paraphrasing a different thread I have running here, but this is a bit more general a question.

Is it possible to do real-time encoding of 1080p video at 30 fps, preferably using H.264, though any codec would do?

Is it even possible to do 720p?

Has anyone done this?

0 Kudos
16 Replies
Robert_Jongbloed
Beginner
1,265 Views

I gather from the profound silence on this question that no one has ever done real-time encoding of H.264 for HD.

Has anyone done it for any resolution?

0 Kudos
Jeffrey_M_Intel1
Employee
1,265 Views

Try Media SDK.  To start, please see

http://software.intel.com/en-us/vcsource/tools/media-sdk

and

http://software.intel.com/en-us/articles/intel-media-sdk-tutorial

For real-time encode/transcode, you're likely to see that Media SDK is faster than real time (even for HD).  One way around this is to just sleep for any remaining time after each frame is finished.
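
A minimal sketch of that "sleep for the remaining time" idea, using only standard C++. FramePacer is a hypothetical helper written for illustration; it is not part of Media SDK or IPP, and the actual encode call is left out.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: paces a loop to a fixed frame rate by sleeping
// until each frame's deadline.
class FramePacer {
public:
    using Clock = std::chrono::steady_clock;

    explicit FramePacer(double fps)
        : period_(periodFor(fps)), next_(Clock::now() + period_) {}

    // Call once after each frame has been encoded. Sleeps for whatever is
    // left of the frame period; returns immediately if the deadline has
    // already passed (the encoder overran).
    void waitForNextFrame() {
        std::this_thread::sleep_until(next_);
        // Advance from the previous deadline, not from "now", so small
        // sleep overshoots do not accumulate into drift.
        next_ += period_;
    }

private:
    static Clock::duration periodFor(double fps) {
        assert(fps > 0.0);
        return std::chrono::duration_cast<Clock::duration>(
            std::chrono::duration<double>(1.0 / fps));
    }

    Clock::duration period_;
    Clock::time_point next_;
};
```

In an encode loop this would be one pacer.waitForNextFrame() call after each encoded frame. If the encoder falls behind, the calls return immediately until it has caught up.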

 

 

0 Kudos
Robert_Jongbloed
Beginner
1,265 Views

Sigh. Unfortunately, that library doesn't have a feature I need. :-(

There is no way, that I could see from the documentation, for me to set the maximum NAL unit size in bytes. This is required for encoding RFC 3984 compliant packetisation mode zero RTP packets.
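
To put a number on that limit: in packetisation mode zero each RTP packet carries exactly one whole NAL unit, so the NAL unit has to fit in what is left of the MTU after the headers. A rough sketch of the arithmetic, assuming IPv4 without options and an RTP header without CSRC entries or extensions (maxNalBytes is just an illustrative name):

```cpp
#include <cassert>
#include <cstddef>

// Minimum header sizes in bytes. Assumptions: IPv4 without options,
// RTP without CSRC entries or header extensions.
const std::size_t kIpv4HeaderBytes = 20;
const std::size_t kUdpHeaderBytes  = 8;
const std::size_t kRtpHeaderBytes  = 12;

// Largest NAL unit (including its one-byte NAL header) that fits in a
// single RTP packet for the given link MTU.
inline std::size_t maxNalBytes(std::size_t mtuBytes) {
    const std::size_t overhead =
        kIpv4HeaderBytes + kUdpHeaderBytes + kRtpHeaderBytes;
    assert(mtuBytes > overhead);
    return mtuBytes - overhead;
}
```

For a 1500-byte Ethernet MTU this gives 1460 bytes; in practice one would pick something a bit smaller to leave room for tunnels, SRTP and the like.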

The IPP samples have the feature but are stupidly slow; Media SDK is (reportedly) fast but doesn't have the feature.

I can't believe that I am the only one, or the first, to want to do this. You can't do an H.323/SIP endpoint without it! What am I missing?

0 Kudos
Thomas_Jensen1
Beginner
1,265 Views

Anyone who has ever tried to transcode a movie will know that real-time H.264 encoding of HD is not possible unless you have a hardware-assisted encoder, e.g. CUDA.

You could do it with IPP, but then you'd have to have a 128-core system...
Much cheaper to just use a few-core system with a 3D gamer card from ATI or NVIDIA, and then some H.264 encoder that can fully utilize the 3D card.

I don't know of any SDK of that type, but I do know that there are many commercial transcoder products that utilize 3D cards.

0 Kudos
Robert_Jongbloed
Beginner
1,265 Views

Thomas Jensen wrote:

Anyone who has ever tried to transcode a movie will know that real-time H.264 encoding of HD is not possible unless you have a hardware-assisted encoder, e.g. CUDA.

I do not believe this to be the case. Check out http://www.videolan.org/developers/x264.html who claim: "Achieves dramatic performance, encoding 4 or more 1080p streams in realtime on a single consumer-level computer.".

There is also a huge difference between trying to encode a broadcast quality movie, and a videotelephony call. That is really what all those H.264 "profiles" are about, are they not?

Thomas Jensen wrote:

You could do it with IPP, but then you'd have to have a 128-core system...
Much cheaper to just use a few-core system with a 3D gamer card from ATI or NVIDIA, and then some H.264 encoder that can fully utilize the 3D card.

Some simple "back of the envelope" calculations on what I have observed with IPP so far suggest that it should be able to do it. If I disable multi-threading, I get about 6 fps encoding speed. That is with it stuck on a single core. Clearly 30 fps does not require 128 cores, but only 5, maybe 6 with overhead. However, there appears to be a bug in IPP where, if I turn on multi-threading, it creates 8 threads, each using about 5% CPU, and the net encoding rate goes down!
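
Spelled out, the estimate is just the target rate divided by the single-core rate, rounded up. A small sketch, where the efficiency factor is my own assumption standing in for threading overhead (encoders rarely scale perfectly linearly):

```cpp
#include <cassert>
#include <cmath>

// Estimated number of cores needed to reach targetFps, given the frame
// rate measured on one core. 'efficiency' in (0, 1] is an assumed scaling
// factor: 1.0 means perfect linear scaling across cores.
inline int coresNeeded(double targetFps, double singleCoreFps,
                       double efficiency = 1.0) {
    assert(targetFps > 0.0 && singleCoreFps > 0.0);
    assert(efficiency > 0.0 && efficiency <= 1.0);
    return static_cast<int>(
        std::ceil(targetFps / (singleCoreFps * efficiency)));
}
```

coresNeeded(30.0, 6.0) gives 5, and an assumed 85% scaling efficiency pushes it to 6, which is where the "5, maybe 6" comes from.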

0 Kudos
Thomas_Jensen1
Beginner
1,265 Views

I was surely overstating when writing 128 cores, but that was on purpose, to get the discussion going.

When doing 1080p H.264, you are talking about Full HD, or actually, Blu-ray Disc/Blu-ray formats.
I can imagine that IPP can do it with 6 cores, but a computer doing 6 cores at full throttle is more or less useless for any other purpose while processing.

When I said to take a normal computer, add a 3D card to it, and then let the encoder utilize it via e.g. CUDA, I was forgetting Intel's new IPP asynchronous library, which it seems can utilize Intel's new CPU/GPU fixed-function capability to do accelerated 1080p H.264 encoding. This looks promising, except for the fact that if you want to create an end-user product or function, it would be very limited if it required one particular CPU family only, excluding all AMD and all older Intel CPUs.

0 Kudos
Jeffrey_M_Intel1
Employee
1,265 Views

With the right speed/quality presets x264 can also do faster-than-realtime HD transcodes -- but there are many ways to do that as mentioned in this thread, including Media SDK.  As mentioned before, everything depends on the level of speed vs. quality required.

It is true that the hardware does not support setting NAL size. This is a known limitation with many recorded change requests, and it is understood that this can be especially hard for older decoders which have difficulty with split NALs. However, we hope that this limitation can be worked around until a fix is available.

Media SDK is now Intel's only hardware enabling product dedicated to media codecs.  We're hoping the new IPP OpenCL and Asynchronous functions will play a role in media pipelines, but not via access to fixed function codec components.  The current previews are not there yet, but the goal is support for pipelines like this:

(Media SDK decode)->(IPP frame processing)->(Media SDK encode)

The great part about the new GPU-enabled functions is that, in theory, all of this can be done efficiently without extra CPU<->GPU synchronization and copies. The focus is on GPU acceleration today, but long term we hope to also support CPU/software pipelines for increased throughput (CPU and GPU pipelines at the same time) and broader portability.

0 Kudos
Roman_T_
New Contributor I
1,265 Views

Hi all!

Sorry for the off-topic post.

My task is very similar but less complicated and less heavy on the CPU.

I just need to decode an MP4 file containing H.264 and display its frames in a window (a kind of playback).

I found an IPP sample that plays a video file, but I can't find any documentation for it.

Can anybody suggest some kind of manual on H.264 decoding?

0 Kudos
Robert_Jongbloed
Beginner
1,265 Views

Thomas Jensen wrote:

When doing 1080p H.264, you are talking about Full HD, or actually, Blu-ray Disc/Blu-ray formats.
I can imagine that IPP can do it with 6 cores, but a computer doing 6 cores at full throttle is more or less useless for any other purpose while processing.

I intimated it in the previous post, but let me be more explicit. I am talking about a videotelephony call; think video conference. Nothing to do with streaming, or storage files, or Blu-ray, which have different requirements and limitations. It is not unusual to have 1080p on the decode side in a video conference, but we wish it for the encoder as well. It is not a big deal for us if the PC is not usable for anything else while in this mode.

And it should be possible, with the right software, and/or settings.

0 Kudos
Robert_Jongbloed
Beginner
1,265 Views

Jeffrey Mcallister (Intel) wrote:

With the right speed/quality presets x264 can also do faster-than-realtime HD transcodes -- but there are many ways to do that as mentioned in this thread, including Media SDK.  As mentioned before, everything depends on the level of speed vs. quality required.

I agree, but I am yet to find the correct settings for all my requirements in the Intel offerings.

Jeffrey Mcallister (Intel) wrote:

It is true that the hardware does not support setting NAL size. This is a known limitation with many recorded change requests, and it is understood that this can be especially hard for older decoders which have difficulty with split NALs. However, we hope that this limitation can be worked around until a fix is available.

And this is the show stopper for us. There are just too many systems out there that cannot do fragmented (FU) NAL units. And encoding a single 20 kbyte UDP packet isn't an option either. We need to limit the size of encoded H.264 slices not by macroblocks, but by bytes.
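
To illustrate what that byte limit means in practice, here is a rough sketch of a check one could run over an encoder's output, assuming the output is an Annex B byte stream (NAL units separated by 00 00 01 or 00 00 00 01 start codes). The function names are made up for this example and belong to neither IPP nor Media SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the size in bytes of each NAL unit in an Annex B byte stream,
// excluding start codes and the zero bytes that pad them.
inline std::vector<std::size_t> nalUnitSizes(const std::uint8_t* data,
                                             std::size_t size) {
    assert(data != nullptr || size == 0);
    std::vector<std::size_t> sizes;
    std::size_t nalStart = 0;
    bool inNal = false;

    // A NAL unit never ends in a zero byte, so zeros just before a start
    // code (or at the end of the stream) are padding, not payload.
    auto closeNal = [&](std::size_t end) {
        while (end > nalStart && data[end - 1] == 0) --end;
        sizes.push_back(end - nalStart);
    };

    std::size_t i = 0;
    while (i + 3 <= size) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            if (inNal) closeNal(i);
            nalStart = i + 3;  // payload begins right after the start code
            inNal = true;
            i += 3;
        } else {
            ++i;
        }
    }
    if (inNal) closeNal(size);
    return sizes;
}

// True if every NAL unit could be sent as a single, unfragmented RTP
// payload of at most maxBytes.
inline bool allFitInSinglePacket(const std::vector<std::size_t>& sizes,
                                 std::size_t maxBytes) {
    for (std::size_t s : sizes)
        if (s > maxBytes) return false;
    return true;
}
```

An encoder that limits slices by macroblock count gives no guarantee that this check passes for a given byte limit; one that limits them by bytes does.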

I am unwilling to spend the time rewriting our code for Media SDK when I am fairly sure it will not meet the requirement. Though, if this turns out to be the only way, I will push it up my hierarchy, and see what compromises can be made.

Jeffrey Mcallister (Intel) wrote:

Media SDK is now Intel's only hardware enabling product dedicated to media codecs.  We're hoping the new IPP OpenCL and Asynchronous functions will play a role in media pipelines, but not via access to fixed function codec components.  The current previews are not there yet, but the goal is support for pipelines like this:

(Media SDK decode)->(IPP frame processing)->(Media SDK encode)

The great part about the new GPU-enabled functions is that, in theory, all of this can be done efficiently without extra CPU<->GPU synchronization and copies. The focus is on GPU acceleration today, but long term we hope to also support CPU/software pipelines for increased throughput (CPU and GPU pipelines at the same time) and broader portability.

I was hoping for a solution that did not require special hardware, just a reasonably late model Intel CPU. As said in another post, I don't care if the CPU is maxed out doing it, so long as it gets there.

0 Kudos
Sergey_K_Intel
Employee
1,265 Views

Hi Robert,

It happened that I've just checked 1080p encoding on Haswell (4 cores @ 2.3 GHz). With baseline profile the encoding rate is 33.6 FPS. Media info is:

Video
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Baseline@L4.0
Format settings, CABAC : No
Format settings, ReFrames : 2 frames
Width : 1 920 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Frame rate : 30.000 fps
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive

100% CPU load. Sorry )).

Regards,
Sergey 

0 Kudos
Robert_Jongbloed
Beginner
1,265 Views

Sergey Khlystov (Intel) wrote:

It happened that I've just checked 1080p encoding on Haswell (4 cores @ 2.3 GHz). With baseline profile the encoding rate is 33.6 FPS. Media info is:

....

100% CPU load. Sorry )).

Thank you very much, 100% CPU I can deal with. That was with Media SDK? Or IPP?

0 Kudos
Sergey_K_Intel
Employee
1,265 Views

Hi Robert,

It was IPP.

I believe that with Media SDK the results will be much better, but we (@ IPP) don't yet have a working sample of that kind. I would like to hope that the Media SDK folks will do this to push their SDK.

Regards,
Sergey 

0 Kudos
Robert_Jongbloed
Beginner
1,265 Views

Sergey Khlystov (Intel) wrote:

Hi Robert,

It was IPP.

I believe that with Media SDK the results will be much better, but we (@ IPP) don't yet have a working sample of that kind. I would like to hope that the Media SDK folks will do this to push their SDK.

Regards,
Sergey 

Until Media SDK supports the equivalent of max_slice_size in IPP, it will be incompatible with a lot of videotelephony (SIP/H.323) systems out there.

0 Kudos
Robert_Jongbloed
Beginner
1,265 Views

Did some more testing.

First, my older i7 (Bloomfield) is really not very good (4.6 fps). A borrowed Q9400 (Yorkfield-6M) is much faster (18 fps with max_slice_size=0 and 7.5 fps with max_slice_size=1400). I have no idea about those processors' relative capabilities or performance. Maybe if I got the latest CPU it would make the 30 fps, but that seems ... excessive.

Second, what is a "Haswell" processor?

Finally, my test code is really fairly simple: read a frame from a file, encode it. If people could cast a quick eye over the code below, is there something stupid in there?

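  // Encoder settings: Baseline profile, level 4.0, two reference frames, CAVLC entropy coding.
  // The frame width and height in EncoderParams.m_info.videoInfo are assumed to be set elsewhere (not shown).
  UMC::H264EncoderParams EncoderParams;
  EncoderParams.profile_idc = H264_PROFILE_BASELINE;
  EncoderParams.level_idc = 40;
  EncoderParams.num_ref_frames = 2;
  EncoderParams.entropy_coding_mode_flag = 0;

  // Buffer for one raw YUV 4:2:0 input frame.
  UMC::VideoData raw;
  if (raw.Init(EncoderParams.m_info.videoInfo.m_iWidth, EncoderParams.m_info.videoInfo.m_iHeight, YUV420) != UMC_OK)
    return 1;
  if (raw.Alloc() != UMC_OK)
    return 1;

  // Output buffer for one encoded frame, sized at one byte per pixel.
  UMC::MediaData enc;
  if (enc.Alloc(EncoderParams.m_info.videoInfo.m_iWidth*EncoderParams.m_info.videoInfo.m_iHeight) != UMC_OK)
    return 1;

  UMC::H264VideoEncoder encoder;
  if (encoder.Init(&EncoderParams) != UMC_OK)
    return 1;

  vm_tick start = vm_time_get_tick();
  unsigned frames = 0;
  unsigned bytes = 0;

  // y4m (the input stream) and line (the frame header buffer) are declared elsewhere (not shown).
  // Loop on the read results rather than on eof(): eof() only becomes true after a read has failed,
  // so testing it first would encode one stale frame at the end of the file.
  while (y4m.getline(line, sizeof(line)) &&
         y4m.read((char *)raw.GetBufferPointer(), raw.GetBufferSize())) {
    if (encoder.GetFrame(&raw, &enc) != UMC_OK)
      return 1;

    ++frames;
    bytes += enc.GetDataSize();
    enc.Reset();
  }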

0 Kudos
Roman_T_
New Contributor I
1,265 Views

Sergey Khlystov (Intel) wrote:

Hi Robert,

It was IPP.

I believe that with Media SDK the results will be much better, but we (@ IPP) don't yet have a working sample of that kind. I would like to hope that the Media SDK folks will do this to push their SDK.

Regards,
Sergey 

Hi Sergey!

As far as I know, IPP is a low-level library, and it's difficult to organize a process such as H.264 encoding without additional software like UMC (although it may be possible).

Did you use UMC + IPP, or was it your own solution + IPP?

0 Kudos