The Intel Media SDK project is no longer active. For continued support and access to new features, Intel Media SDK users are encouraged to read the transition guide on upgrading from Intel® Media SDK to Intel® Video Processing Library (VPL), and to move to VPL as soon as possible.
For more information, see the VPL website.

H264 Decoding performance on Linux.


I'm porting a Windows transcoder to Linux and have run into a problem with the speed of decoding 1080p H.264 frames; encoding works as expected. Comparing the same feed on a low-end Pentium running Linux (N4200, 1.1 GHz) against my Windows development machine (i5-6500, 3.2 GHz), the encode takes about 2x longer (as expected), but the decode takes about 20x longer (over 26 milliseconds compared to a little over 1 millisecond). Is there something obvious I'm doing wrong, or is this CPU not fully supported?

I'm running Ubuntu 19.10 server.

CPU: Intel(R) Pentium(R) CPU N4200 @ 1.10GHz

Installed the mfx packages: sudo apt-get -y install libmfx1 libmfx-tools libva-drm2 vainfo intel-media-va-driver-non-free
Running: I see

libva info: VA-API version 1.5.0
libva info: va_getDriverName() returns 0
libva info: User requested driver 'iHD'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/
libva info: Found init function __vaDriverInit_1_5
libva info: va_openDriver() returns 0
awk: fatal: cannot open file `/etc/network/interfaces' for reading (No such file or directory)

As to my code: I'm opening up the VA display (_vaDisplay is shared between my encode/decode sessions)

    int _card = open("/dev/dri/card0", O_RDWR); /* primary card */
    VADisplay _vaDisplay = vaGetDisplayDRM(_card);
    int majorVersion = 0, minorVersion = 0;     /* filled in by vaInitialize */
    VAStatus vaStatus = vaInitialize(_vaDisplay, &majorVersion, &minorVersion);

Opening up my session:

    mfxVersion min_version = { 3, 1 };  // { Minor, Major }: minimum API version 1.3, needed for low latency.
    mfxStatus initStatus = MFXInit(MFX_IMPL_HARDWARE_ANY, &min_version, &session);
    initStatus = MFXVideoCORE_SetHandle(session, MFX_HANDLE_VA_DISPLAY, (mfxHDL)_vaDisplay);  // added for Linux

And the basic decode, which is what I timed (I call DecodeHeader beforehand and create the necessary surfaces):

        sts = MFXVideoDECODE_DecodeFrameAsync(_session, &bitstream, inputFrame, &surface, &syncPoint);
        sts = MFXVideoCORE_SyncOperation(_session, syncPoint, MSDK_DEC_WAIT_INTERVAL);
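
When timing the two calls above, it can help to measure them separately, since the first only submits work and the second blocks until the frame is ready. The following is a minimal sketch, not the poster's code: `DecodeTiming` and `time_decode` are hypothetical names, and the two callables are stand-ins for the real `MFXVideoDECODE_DecodeFrameAsync` and `MFXVideoCORE_SyncOperation` calls.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Wall-clock split of one decode step into its two phases, in milliseconds.
struct DecodeTiming {
    double submit_ms;  // time spent inside the asynchronous submit call
    double sync_ms;    // time spent blocked waiting for the result
};

// Runs `submit` and then `sync`, timing each with a monotonic clock.
// Both callables are opaque here; in a transcoder they would wrap the
// submit and sync calls of the decode session (assumption).
inline DecodeTiming time_decode(const std::function<void()>& submit,
                                const std::function<void()>& sync) {
    using clock = std::chrono::steady_clock;
    using ms = std::chrono::duration<double, std::milli>;

    assert(submit && sync);  // both phases must be provided

    const auto t0 = clock::now();
    submit();
    const auto t1 = clock::now();
    sync();
    const auto t2 = clock::now();

    return { ms(t1 - t0).count(), ms(t2 - t1).count() };
}
```

In the snippet above this might be used roughly as follows (same variables as in the post):

    DecodeTiming t = time_decode(
        [&] { sts = MFXVideoDECODE_DecodeFrameAsync(_session, &bitstream, inputFrame, &surface, &syncPoint); },
        [&] { sts = MFXVideoCORE_SyncOperation(_session, syncPoint, MSDK_DEC_WAIT_INTERVAL); });

A large `sync_ms` with a small `submit_ms` would suggest the time is spent in the hardware pipeline or in a memory copy rather than in the call overhead.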


Hi Michael,

25 ms for the decoder is a lot. Something must be wrong. A few things worth checking:

1) Please check the behavior with the stock MSDK samples. For example, does sample_decode show the same low performance? Please check both with and without the "-vaapi" option.

2) There is a known issue on APL (N4200): copying frames from video memory to system memory takes a long time (this can still be mitigated by using GPUCopy). In a transcoding scenario, however, no such copy should occur, because a correctly written application sets IOPattern to OUT_VIDEO for the decoder and IN_VIDEO for the encoder.

BTW, please feel free to submit an issue at GitHub: . I rarely check the Intel Developer Zone for updates.
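
Point 2 can be sketched as follows. This is an illustration under stated assumptions, not code from the SDK samples: `TranscodeIoPatterns`, `video_memory_patterns` and `decoder_copies_to_system_memory` are hypothetical helpers, the header path `<mfx/mfxvideo.h>` is an assumption about where the headers are installed, and the fallback constants are assumed to mirror the values in `mfxstructures.h` (verify them against your installed headers).

```cpp
#include <cassert>
#include <cstdint>

#if __has_include(<mfx/mfxvideo.h>)
#include <mfx/mfxvideo.h>
#else
// Fallback so the sketch compiles without the SDK headers installed.
// Values are assumed to match mfxstructures.h.
typedef std::uint16_t mfxU16;
enum {
    MFX_IOPATTERN_IN_VIDEO_MEMORY   = 0x01,
    MFX_IOPATTERN_IN_SYSTEM_MEMORY  = 0x02,
    MFX_IOPATTERN_OUT_VIDEO_MEMORY  = 0x10,
    MFX_IOPATTERN_OUT_SYSTEM_MEMORY = 0x20
};
#endif

// Hypothetical helper type: IOPattern values for a decode -> encode chain.
struct TranscodeIoPatterns {
    mfxU16 decoder;  // would go into the decoder's mfxVideoParam::IOPattern
    mfxU16 encoder;  // would go into the encoder's mfxVideoParam::IOPattern
};

// True when decoded frames are requested in system memory, which is the
// case that forces a video-to-system copy.
inline bool decoder_copies_to_system_memory(mfxU16 decoder_io_pattern) {
    return (decoder_io_pattern & MFX_IOPATTERN_OUT_SYSTEM_MEMORY) != 0;
}

// Keeps decoded frames in video memory on both sides of the chain.
inline TranscodeIoPatterns video_memory_patterns() {
    TranscodeIoPatterns p = { MFX_IOPATTERN_OUT_VIDEO_MEMORY,
                              MFX_IOPATTERN_IN_VIDEO_MEMORY };
    assert(!decoder_copies_to_system_memory(p.decoder));
    return p;
}
```

The two values would be assigned to the `IOPattern` field of the decoder's and encoder's `mfxVideoParam` before the respective Init calls. If the decoder in the original code is configured with `MFX_IOPATTERN_OUT_SYSTEM_MEMORY`, the copy described in point 2 is a likely candidate for the extra time.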



