Hi,
I have been doing some performance testing with your encode and decode samples, comparing the results with the same operations done on the same system using FFMPEG.
Encoding is significantly faster: about 4x FFMPEG's speed, while using only 10% of the CPU.
Decoding is another matter: FFMPEG wins. Overall, a decode on the Intel GPU takes twice as long as it does with FFMPEG. Would you have any idea why that might be? My guess is that the large amount of YUV data produced is bottlenecked coming back from the GPU to the main data bus.
The system is a Supero X10SLH-F: Xeon E3-1285 v3 with the P4700 GPU and C226 chipset, 8 virtual cores at 3.6 GHz, 12 GB of memory. The test clip is the open-source computer animation "Big Buck Bunny": H.264, 720p, about 10 minutes long. The OS is CentOS 7.
Here's the decode: ./sample_decode_drm h264 -i big-buck-bunny_294_1280x720.h264 -o /dev/null -hw
The output is thrown away to minimize any I/O delays in writing out the YUV data.
Thanks.
John
Hi John,
To measure pure decoding performance, simply remove the "-o /dev/null".
With "-o /dev/null", if you look at the sample_decode output you will see a non-zero fwrite_fps, and the decode fps is usually the same as the fwrite_fps (i.e., I/O bound). If you remove the "-o" option entirely, fwrite_fps is 0 and the decode fps is a much higher number.
Try this command: ./sample_decode_drm h264 -i big-buck-bunny_294_1280x720.h264
Let us know what you see.
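If it helps when you scale the test up, here is a minimal sketch of how several decode sessions could be launched in parallel and timed. This is only an illustration: the binary name and clip are taken from the command above, and both are assumed to be in the current directory.

```python
#!/usr/bin/env python3
"""Rough sketch: launch N simultaneous sample_decode_drm sessions and time them."""
import subprocess
import sys
import time

# Number of simultaneous sessions, e.g. "python3 run_decodes.py 5"
n_sessions = int(sys.argv[1]) if len(sys.argv) > 1 else 1

# Same command as above, with no "-o" so no YUV output is written.
cmd = ["./sample_decode_drm", "h264", "-i", "big-buck-bunny_294_1280x720.h264"]

start = time.time()
# Start every session at once so they all compete for the GPU concurrently.
procs = [subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
         for _ in range(n_sessions)]
for p in procs:
    p.wait()
elapsed = time.time() - start

print(f"{n_sessions} session(s) finished in {elapsed:.1f} s "
      f"({elapsed / n_sessions:.2f} s per session)")
```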
Yes, that makes the difference. Now the numbers are comparable to encode.
The elapsed GPU times are about 70% of the FFMPEG times for runs of 1, 5, 10, 15, and 20 simultaneous decodes.
Host CPU use is 4, 8, 10, 12, and 15% respectively for those runs; FFMPEG runs at 99%+ for 5 sessions and above.
Thank you.
Glad we got this sorted out. Keep us posted on your evaluation. Thanks!
Do you know which codec you were using when doing the benchmark? Was it x264?
Yes. This is the FFMPEG codec:
DEV.LS h264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (encoders: libx264 libx264rgb )
Just so you can do a sanity check on my GPU results, here's what I have. I'm using the well-known Big Buck Bunny video at 720p (big-buck-bunny_294_1280x720.mp4: ISO Media, MPEG v4 system, version 1).
We are considering adding your GPU to our media server for use in live video conferencing. Thus, encoding/decoding need to be done in real time.
My test system is an Intel(R) Xeon(R) CPU E3-1285 v3 @ 3.60GHz, 8 virtual cores, 12 GB memory.
I ran a series of decode tests and then encode tests. Results are below (forgive the formatting, please).
| INTEL GPU DECODE | Overall Elapsed Time (s) | Elapsed Time per Session (s) | Average Host CPU Use (%) |
|---|---|---|---|
| 1 Session | 9 | 9 | 4 |
| 5 Simultaneous Sessions | 49 | 9.8 | 8 |
| 10 Simultaneous Sessions | 95 | 9.5 | 10.5 |
| 15 Simultaneous Sessions | 144 | 9.6 | 12 |
| 20 Simultaneous Sessions | 193 | 9.65 | 15 |

| INTEL GPU ENCODE | Overall Elapsed Time (s) | Elapsed Time per Session (s) | Average Host CPU Use (%) |
|---|---|---|---|
| 1 Session | 66 | 66 | 3.6 |
| 5 Simultaneous Sessions | 167 | 33.4 | 7.9 |
| 10 Simultaneous Sessions | 339 | 33.9 | 8.7 |
| 15 Simultaneous Sessions | 508 | 33.7 | 9.4 |
| 20 Simultaneous Sessions | 674 | 33.7 | 9.8 |
So, if only encoding is done, it looks like 17 simultaneous sessions is the maximum that can complete within 600 seconds, the length of the video being encoded (10 minutes).
Decoding is better, with about 62 sessions potentially running in real time.
For both decoding the input into a conference and then encoding the output (the real scenario), I figure the GPU can support about 13 simultaneous sessions.
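For reference, here is the simple scaling behind those numbers, assuming the 20-session timings scale linearly and that a real-time session has to finish within the 600-second clip:

```python
VIDEO_LENGTH_S = 600   # the clip is ~10 minutes long

# Measured wall-clock times for 20 simultaneous sessions (seconds, from the tables above).
DECODE_20 = 193
ENCODE_20 = 674

# Sessions that can finish within real time, assuming linear scaling.
# Rounded down, since a partial session still has to finish.
decode_only = int(20 * VIDEO_LENGTH_S / DECODE_20)      # 62
encode_only = int(20 * VIDEO_LENGTH_S / ENCODE_20)      # 17

# A conferencing stream needs one decode plus one encode.
per_stream_s = (DECODE_20 + ENCODE_20) / 20             # ~43.4 s of GPU time per stream
decode_and_encode = int(VIDEO_LENGTH_S / per_stream_s)  # 13

print(decode_only, encode_only, decode_and_encode)
```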
Given the system, does this sound about right?
Thanks.
John
Hi John, the numbers for decode and encode are in line with what I am observing on my system (which is similar to yours). Yes, the decoder is much, much faster than the encoder, as expected. If you have more questions on performance, please send me a message.
This answer (http://stackoverflow.com/questions/20367326/which-lib-is-better-transcoder-for-live-camera-ffmpeg-vs-intel-media-sdk) says that there is a trade-off between quality and CPU usage.
How significant is the quality drop when transcoding in a screen capture application?
Hi, if I use only the "h264_qsv" option in ffmpeg, which license do I need? I only use ffmpeg to capture HTTP streams with h264_qsv enabled. What happens after 30 days with the Media Server Studio Community Edition 2017?
Thank you