
Performance of copy to system memory after hardware decoding

EtienneCo

Hello, I am trying to use VA-API in GStreamer to decode JPEGs in hardware before my application post-processes the decoded images in system memory.

I found the article about the performance bottleneck of copying from video memory to system memory (https://software.intel.com/content/www/us/en/develop/articles/copying-accelerated-video-decode-frame...), but I am not sure how well optimized the copy is in the GStreamer elements that use VA-API.

For testing, I compared two GStreamer pipelines on several test JPEGs on Ubuntu 20.10.
This pipeline, using vaapijpegdec, takes 2.9 seconds to run (I first tried the iHD driver, but it took 19 seconds):

user@ark1220-desktop:~/testimages$ export LIBVA_DRIVER_NAME=i965
user@ark1220-desktop:~/testimages$ gst-launch-1.0 -v multifilesrc location="%03d.jpg" index=0 ! jpegparse ! vaapijpegdec ! filesink location=out
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Got context from element 'vaapidecode_jpeg0': gst.gl.GLDisplay=context, gst.gl.GLDisplay=(GstGLDisplay)"\(GstGLDisplayX11\)\ gldisplayx11-0";
Got context from element 'vaapidecode_jpeg0': gst.vaapi.Display=context, gst.vaapi.Display=(GstVaapiDisplay)"\(GstVaapiDisplayGLX\)\ vaapidisplayglx0";
/GstPipeline:pipeline0/GstJpegParse:jpegparse0.GstPad:src: caps = image/jpeg, parsed=(boolean)true, format=(string)I420, width=(int)688, height=(int)512, framerate=(fraction)1/1
/GstPipeline:pipeline0/GstVaapiDecode_jpeg:vaapidecode_jpeg0.GstPad:sink: caps = image/jpeg, parsed=(boolean)true, format=(string)I420, width=(int)688, height=(int)512, framerate=(fraction)1/1
Redistribute latency...
/GstPipeline:pipeline0/GstVaapiDecode_jpeg:vaapidecode_jpeg0.GstPad:src: caps = video/x-raw, format=(string)NV12, width=(int)688, height=(int)512, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, chroma-site=(string)jpeg, colorimetry=(string)bt601, framerate=(fraction)1/1
/GstPipeline:pipeline0/GstFileSink:filesink0.GstPad:sink: caps = video/x-raw, format=(string)NV12, width=(int)688, height=(int)512, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, chroma-site=(string)jpeg, colorimetry=(string)bt601, framerate=(fraction)1/1
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
Got EOS from element "pipeline0".
Execution ended after 0:00:02.924764261
Setting pipeline to NULL ...
Freeing pipeline ...

 

This other pipeline, using software decoding (jpegdec), takes 2.06 seconds to run:

$ gst-launch-1.0 -v multifilesrc location="%03d.jpg" index=0 ! jpegparse ! jpegdec ! fakesink

Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
/GstPipeline:pipeline0/GstJpegParse:jpegparse0.GstPad:src: caps = image/jpeg, parsed=(boolean)true, format=(string)I420, width=(int)688, height=(int)512, framerate=(fraction)1/1
/GstPipeline:pipeline0/GstJpegDec:jpegdec0.GstPad:sink: caps = image/jpeg, parsed=(boolean)true, format=(string)I420, width=(int)688, height=(int)512, framerate=(fraction)1/1
/GstPipeline:pipeline0/GstJpegDec:jpegdec0.GstPad:src: caps = video/x-raw, format=(string)I420, width=(int)688, height=(int)512, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, chroma-site=(string)jpeg, colorimetry=(string)1:4:0:0, framerate=(fraction)1/1
/GstPipeline:pipeline0/GstFakeSink:fakesink0.GstPad:sink: caps = video/x-raw, format=(string)I420, width=(int)688, height=(int)512, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, chroma-site=(string)jpeg, colorimetry=(string)1:4:0:0, framerate=(fraction)1/1
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
Got EOS from element "pipeline0".
Execution ended after 0:00:02.062320412
Setting pipeline to NULL ...
Freeing pipeline ...

 

Is this approximately the expected performance, that is, does the copy from video memory to system memory take as much time as the decoding itself? Or is this level of performance unexpected, meaning the GStreamer pipeline should be optimized? This is on an E3940 CPU.
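
To put the numbers in perspective, here is a rough sketch of how much data has to cross from video memory to system memory per image, based only on the caps and timings in the logs above. It assumes tightly packed 4:2:0 frames (NV12 and I420 both use 12 bits per pixel); real buffers may be larger because of stride padding. The helper name yuv420_frame_bytes is only for illustration. Note also that the first run ends in filesink while the second ends in fakesink, so the gap between the two timings may include file writes and is not necessarily pure copy cost.

```python
def yuv420_frame_bytes(width: int, height: int) -> int:
    """Bytes in one tightly packed 4:2:0 frame (NV12 or I420).

    Assumption: no stride or height padding, which real VA-API or
    GStreamer buffers may add.
    """
    luma = width * height
    # Chroma is subsampled by 2 in both directions; round up for odd sizes.
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    chroma = 2 * chroma_w * chroma_h  # one U and one V sample per 2x2 block
    return luma + chroma


# Values taken from the caps and "Execution ended after" lines in the logs.
WIDTH, HEIGHT = 688, 512
VAAPI_SECONDS = 2.924764261
SOFTWARE_SECONDS = 2.062320412

frame_bytes = yuv420_frame_bytes(WIDTH, HEIGHT)
gap_seconds = VAAPI_SECONDS - SOFTWARE_SECONDS

print(f"decoded frame size: {frame_bytes} bytes")
print(f"vaapijpegdec run is slower by {gap_seconds:.3f} s")
```

Each decoded frame is only about half a megabyte, so the total copy volume depends mostly on how many test images are in the directory.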

Thanks a lot
Etienne
