Media (Intel® Video Processing Library, Intel Media SDK)
Access community support with transcoding, decoding, and encoding in applications using media tools like Intel® oneAPI Video Processing Library and Intel® Media SDK

Xeon vs i3 performance anomalies

Andrew_J_
Beginner

Hi,

We have three platforms below all running the same configuration - stock CentOS 7.4 with Media Server Studio 2018 R2.

- Xeon E3-1585v5 (Supermicro X11SSH-GF-1585)
- Xeon E3-1285v4 (Supermicro X10SLH-F)
- Core i3-6100 (Supermicro X11SSH-LN4F)

We are experiencing some strange performance stats when comparing transcoding throughput.  All testing below has been completed using sample_multi_transcode with the following options:

$ sample_multi_transcode -i::h264 test_es.264 -async 4 -o::h264 out.264 -b target_bitrate

What we are finding is that performance with the 1585v5 at smaller frame sizes is much lower than we would have expected, and in most cases it is beaten by the Broadwell Xeon (1285v4) and the Skylake i3. Figures below are for a single instance of sample_multi_transcode.

240p
1585v5 = 1270 fps
1285v4 = 1614 fps
i3-6100 = 1560 fps

576p
1585v5 = 371 fps
1285v4 = 413 fps
i3-6100 = 381 fps

720p
1585v5 = 528 fps
1285v4 = 558 fps
i3-6100 = 466 fps

1080p
1585v5 = 312 fps
1285v4 = 328 fps
i3-6100 = 257 fps

2160p
1585v5 = 111 fps
1285v4 = 106 fps
i3-6100 = 79 fps
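
To make the comparison easier to read, the figures above can be normalized against one platform. The following is only a minimal sketch: the fps values are copied from the lists above, while the helper name `relative_to` and the choice of the i3-6100 as baseline are assumptions made for illustration.

```python
# Single-instance throughput (fps) as reported in the lists above.
fps = {
    "240p":  {"1585v5": 1270, "1285v4": 1614, "i3-6100": 1560},
    "576p":  {"1585v5": 371,  "1285v4": 413,  "i3-6100": 381},
    "720p":  {"1585v5": 528,  "1285v4": 558,  "i3-6100": 466},
    "1080p": {"1585v5": 312,  "1285v4": 328,  "i3-6100": 257},
    "2160p": {"1585v5": 111,  "1285v4": 106,  "i3-6100": 79},
}


def relative_to(baseline, table):
    """Return each platform's fps divided by the baseline platform's fps.

    `relative_to` is a hypothetical helper for this post, not part of any SDK.
    """
    return {
        res: {cpu: round(value / row[baseline], 2) for cpu, value in row.items()}
        for res, row in table.items()
    }


ratios = relative_to("i3-6100", fps)
for res, row in ratios.items():
    # One line per resolution: throughput relative to the i3-6100.
    print(res, row)
```

Normalized this way, the 1585v5 sits at about 0.81x the i3-6100 at 240p and only pulls ahead to about 1.41x at 2160p.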

The only time the 1585v5 comes out on top (and not by much) is when processing UHD frames. Running our own transcoding application, which uses the SDK directly, gives us very similar results, as does executing multiple transcoding jobs in parallel. Given the larger number of EUs, we were expecting to see a much higher level of performance across the board with the v5 Xeon than we seem to be achieving. Are we missing something, or is this expected behavior?


Thanks in advance,

Andrew
 

Mark_L_Intel1
Moderator

Hi Andrew,

I checked some of our validation results for a test case in which one input stream fans out to multiple output streams, and I don't think what you are seeing is the expected result.

From your test, it seems you are running a 1-to-1 transcode with sample_multi_transcode. Can you confirm?

There are also cases in which certain content affects the test differently. What kind of content are you using?

I also looked at the performance on different platforms and I didn't see a big difference between Core and Xeon.

Please also be careful with the test configuration: make sure the clip is long enough, since the first 10 seconds are spent setting up the system. The longer you run, the more stable the environment is.
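
As a rough illustration of why the warm-up period matters, here is a minimal sketch that computes throughput only over the part of a run after the first 10 seconds. It assumes your own application can log (elapsed seconds, total frames done) pairs; the function name `steady_state_fps` and the sample values are hypothetical, and sample_multi_transcode is not claimed to produce such a log.

```python
def steady_state_fps(samples, warmup_s=10.0):
    """Return fps measured only after the warm-up window.

    samples: list of (elapsed_seconds, total_frames_done), increasing in time.
    warmup_s: seconds at the start of the run to exclude (10 s per the advice above).
    """
    after = [(t, n) for t, n in samples if t >= warmup_s]
    if len(after) < 2:
        raise ValueError("run is too short to measure past the warm-up window")
    (t0, n0), (t1, n1) = after[0], after[-1]
    return (n1 - n0) / (t1 - t0)


# Hypothetical progress log: slower during the first 10 s, steady afterwards.
samples = [(0, 0), (5, 500), (10, 1500), (20, 4500), (30, 7500)]

steady = steady_state_fps(samples)             # excludes the first 10 s
whole_run = samples[-1][1] / samples[-1][0]    # naive average over the full run
print(steady, whole_run)
```

With these made-up numbers the whole-run average understates the steady-state rate, and the gap shrinks as the clip gets longer.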

Please also change "-async" to 1 and set "-hw" explicitly.

Mark
