Hi,
First, I am sorry for my poor English.
I am testing video conferencing with ffmpeg. I assumed that a HW encoder must be faster than a SW encoder, so I tried using QSV to reduce the H.264 encoding time.
I tested 2 cases.
The first test encodes raw video data from the file system to H.264 in a file.
- Raw video (yuyv422) -> H.264
- Total 500 frames
- The SW encoder (libx264) takes 2.483 sec to encode all frames.
- The HW encoder (h264_qsv) takes 0.682 sec.
The result is as good as I expected: the HW encoder is much faster (about 3.6x).
The following are the ffmpeg options I used.
SW encoder : time ~/ffmpeg_build/bin/ffmpeg -loglevel verbose -pix_fmt yuyv422 -video_size 640x480 -f rawvideo -i ./640x480dump.raw -f avi -c:v h264 ./640x480dump.avi
HW encoder : time ~/ffmpeg_build/bin/ffmpeg -loglevel verbose -pix_fmt yuyv422 -video_size 640x480 -f rawvideo -i ./640x480dump.raw -f avi -c:v h264_qsv ./640x480dump.avi
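(As a cross-check, not how I produced the numbers above: ffmpeg's own -benchmark flag reports the encode time directly, without an external timer.)
~/ffmpeg_build/bin/ffmpeg -benchmark -loglevel verbose -pix_fmt yuyv422 -video_size 640x480 -f rawvideo -i ./640x480dump.raw -f avi -c:v h264_qsv ./640x480dump.avi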
The second test streams the webcam to UDP packets.
- The SW encoder (libx264) gives 210 ms of latency from the webcam to the display on the receiver's monitor.
- The HW encoder (h264_qsv) gives 260 ms of latency.
In this case, the HW encoder is slower than the SW encoder. I wonder whether the Quick Sync HW encoder is unsuitable for video conferencing or anything else that needs low latency. The following are the options I used and the log text.
SW encoder(libx264) : ffmpeg -loglevel verbose -input_format yuv422p -video_size 640x480 -framerate 30 -f v4l2 -i /dev/video2 -preset ultrafast -tune zerolatency -f h264 -c:v libx264 udp://127.0.0.1:20001
HW encoder : ffmpeg -loglevel verbose -input_format yuv422p -video_size 640x480 -framerate 30 -f v4l2 -i /dev/video2 -preset veryfast -scenario videoconference -async_depth 1 -int_ref_cycle_dist 1 -f h264 -c:v h264_qsv udp://127.0.0.1:20001
The following is the log text for the HW encoder.
-------------------------------------------------------------------------------------------------
ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
configuration: --prefix=/home/highvolt/ffmpeg_build pkg-config-flags=--static --extra-libs='-lpthread -lm' --ld=g++ --enable-gpl --enable-gnutls --enable-libfreetype --enable-libx264 --enable-libvpl --enable-nonfree
libavutil 58. 2.100 / 58. 2.100
libavcodec 60. 3.100 / 60. 3.100
libavformat 60. 3.100 / 60. 3.100
libavdevice 60. 1.100 / 60. 1.100
libavfilter 9. 3.100 / 9. 3.100
libswscale 7. 1.100 / 7. 1.100
libswresample 4. 10.100 / 4. 10.100
libpostproc 57. 1.100 / 57. 1.100
[video4linux2,v4l2 @ 0x5604a3b50100] fd:3 capabilities:84a00001
Input #0, video4linux2,v4l2, from '/dev/video2':
Duration: N/A, start: 42298.683949, bitrate: 147456 kb/s
Stream #0:0: Video: rawvideo, 1 reference frame (YUY2 / 0x32595559), yuyv422, 640x480, 147456 kb/s, 30 fps, 30 tbr, 1000k tbn
Stream mapping:
Stream #0:0 -> #0:0 (rawvideo (native) -> h264 (h264_qsv))
Press [q] to stop, [?] for help
[graph 0 input from stream 0:0 @ 0x5604a3b55bc0] w:640 h:480 pixfmt:yuyv422 tb:1/1000000 fr:30/1 sar:0/1
[auto_scale_0 @ 0x5604a3b6ca80] w:iw h:ih flags:'' interl:0
[format @ 0x5604a3b6a7c0] auto-inserting filter 'auto_scale_0' between the filter 'Parsed_null_0' and the filter 'format'
[auto_scale_0 @ 0x5604a3b6ca80] w:640 h:480 fmt:yuyv422 sar:0/1 -> w:640 h:480 fmt:nv12 sar:0/1 flags:0x00000004
[h264_qsv @ 0x5604a3b54940] Encoder: input is system memory surface
[h264_qsv @ 0x5604a3b54940] Use Intel(R) oneVPL to create MFX session, the required implementation version is 1.1
[AVHWDeviceContext @ 0x5604a3d8bc00] Trying to use DRM render node for device 0.
[AVHWDeviceContext @ 0x5604a3d8bc00] libva: VA-API version 1.18.0
[AVHWDeviceContext @ 0x5604a3d8bc00] libva: User requested driver 'iHD'
[AVHWDeviceContext @ 0x5604a3d8bc00] libva: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
[AVHWDeviceContext @ 0x5604a3d8bc00] libva: Found init function __vaDriverInit_1_18
[AVHWDeviceContext @ 0x5604a3d8bc00] libva: va_openDriver() returns 0
[AVHWDeviceContext @ 0x5604a3d8bc00] Initialised VAAPI connection: version 1.18
[AVHWDeviceContext @ 0x5604a3d8bc00] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 23.1.6 ().
[AVHWDeviceContext @ 0x5604a3d8bc00] Driver not found in known nonstandard list, using standard behaviour.
[h264_qsv @ 0x5604a3b54940] Initialized an internal MFX session using hardware accelerated implementation
[h264_qsv @ 0x5604a3b54940] Using the variable bitrate (VBR) ratecontrol method
[h264_qsv @ 0x5604a3b54940] profile: avc high; level: 30
[h264_qsv @ 0x5604a3b54940] GopPicSize: 256; GopRefDist: 3; GopOptFlag: closed; IdrInterval: 0
[h264_qsv @ 0x5604a3b54940] TargetUsage: 7; RateControlMethod: VBR
[h264_qsv @ 0x5604a3b54940] BufferSizeInKB: 375; InitialDelayInKB: 187; TargetKbps: 1000; MaxKbps: 1500; BRCParamMultiplier: 1
[h264_qsv @ 0x5604a3b54940] NumSlice: 1; NumRefFrame: 2
[h264_qsv @ 0x5604a3b54940] RateDistortionOpt: OFF
[h264_qsv @ 0x5604a3b54940] RecoveryPointSEI: OFF
[h264_qsv @ 0x5604a3b54940] VDENC: OFF
[h264_qsv @ 0x5604a3b54940] Entropy coding: CABAC; MaxDecFrameBuffering: 2
[h264_qsv @ 0x5604a3b54940] NalHrdConformance: ON; SingleSeiNalUnit: ON; VuiVclHrdParameters: OFF VuiNalHrdParameters: ON
[h264_qsv @ 0x5604a3b54940] FrameRateExtD: 1; FrameRateExtN: 30
[h264_qsv @ 0x5604a3b54940] IntRefType: 0; IntRefCycleSize: 0; IntRefQPDelta: 0
[h264_qsv @ 0x5604a3b54940] MaxFrameSize: 230400; MaxSliceSize: 0
[h264_qsv @ 0x5604a3b54940] BitrateLimit: ON; MBBRC: OFF; ExtBRC: OFF
[h264_qsv @ 0x5604a3b54940] Trellis: auto
[h264_qsv @ 0x5604a3b54940] RepeatPPS: OFF; NumMbPerSlice: 0; LookAheadDS: 2x
[h264_qsv @ 0x5604a3b54940] AdaptiveI: OFF; AdaptiveB: OFF; BRefType:off
[h264_qsv @ 0x5604a3b54940] MinQPI: 0; MaxQPI: 0; MinQPP: 0; MaxQPP: 0; MinQPB: 0; MaxQPB: 0
[h264_qsv @ 0x5604a3b54940] DisableDeblockingIdc: 0
[h264_qsv @ 0x5604a3b54940] SkipFrame: no_skip
[h264_qsv @ 0x5604a3b54940] PRefType: default
[h264_qsv @ 0x5604a3b54940] TransformSkip: unknown
[h264_qsv @ 0x5604a3b54940] IntRefCycleDist: 1
[h264_qsv @ 0x5604a3b54940] LowDelayBRC: OFF
[h264_qsv @ 0x5604a3b54940] MaxFrameSizeI: 0; MaxFrameSizeP: 0
[h264_qsv @ 0x5604a3b54940] ScenarioInfo: 2
Output #0, h264, to 'udp://127.0.0.1:20001':
Metadata:
encoder : Lavf60.3.100
Stream #0:0: Video: h264, 1 reference frame, nv12(tv, progressive), 640x480 (0x0), q=2-31, 1000 kb/s, 30 fps, 30 tbn
Metadata:
encoder : Lavc60.3.100 h264_qsv
Side data:
cpb: bitrate max/min/avg: 0/0/1000000 buffer size: 0 vbv_delay: N/A
frame= 1835 fps= 30 q=15.0 size= 7223kB time=00:01:01.13 bitrate= 968.0kbits/s speed=0.999x
-------------------------------------------------------------------------------------------------
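One thing I notice in the log above: GopRefDist: 3 means the encoder uses B-frames, so it has to buffer frames for reordering before it can emit output. As a sketch (I have not verified that it closes the gap), adding -bf 0 should force GopRefDist to 1 and remove that reordering delay:
ffmpeg -loglevel verbose -input_format yuyv422 -video_size 640x480 -framerate 30 -f v4l2 -i /dev/video2 -preset veryfast -scenario videoconference -async_depth 1 -int_ref_cycle_dist 1 -bf 0 -f h264 -c:v h264_qsv udp://127.0.0.1:20001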
Hi,
Thank you for posting in Intel Communities. We are working on this internally and we will get back to you with an update.
Thanks,
Alekhya
Hi,
We apologize for the delay. We were able to encode webcam video to UDP packets with and without Quick Sync, just as you did:
Software Encoder: [screenshot]
Hardware Encoder: [screenshot]
To understand your issue better, we would like to know how you are calculating the latency, along with some more information:
- The kernel version and processor details of the system on which you are reproducing this issue.
- The steps/formula you used to calculate the latency from the webcam to the display on the monitor.
- One more quick question: by "displaying on monitor", did you mean an external device (e.g. a monitor or TV), or your computer/laptop screen?
Regards,
Alekhya
Hi,
1. How I measured the latency
1.1 I ran a simple timer application that displays the current timestamp on my monitor.
1.2 I captured the timestamp with my webcam and displayed the captured picture on my monitor too.
1.3 Then I took a photo with my phone camera showing both the live timestamp and the captured picture.
1.4 I take the difference between the live timestamp and the timestamp in the captured picture as the latency (a worked example follows this list).
1.5 Please see the attached test_way.jpg.
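For example (with hypothetical numbers): if the live timer in the photo reads 10:15:02.740 and the re-displayed capture reads 10:15:02.530, the end-to-end latency is 740 - 530 = 210 ms.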
2. Player
I used ffplay to display the UDP packets. The following are the options I used.
~/ffmpeg_build/bin/ffplay -fflags nobuffer -flags low_delay -framedrop -vcodec h264 udp://127.0.0.1:20001
ffplay version 6.0 Copyright (c) 2003-2023 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
configuration: --prefix=/home/highvolt/ffmpeg_build pkg-config-flags=--static --extra-libs='-lpthread -lm' --ld=g++ --enable-gpl --enable-gnutls --enable-libfreetype --enable-libx264 --enable-libvpl --enable-libv4l2 --enable-nonfree
libavutil 58. 2.100 / 58. 2.100
libavcodec 60. 3.100 / 60. 3.100
libavformat 60. 3.100 / 60. 3.100
libavdevice 60. 1.100 / 60. 1.100
libavfilter 9. 3.100 / 9. 3.100
libswscale 7. 1.100 / 7. 1.100
libswresample 4. 10.100 / 4. 10.100
libpostproc 57. 1.100 / 57. 1.100
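Note that ffplay adds its own demuxing/display buffering on top of the encoder latency. As a sketch (the values are guesses, not tuned), shrinking the input probing may trim a bit more startup delay:
~/ffmpeg_build/bin/ffplay -fflags nobuffer -flags low_delay -probesize 32 -analyzeduration 0 -framedrop -vcodec h264 udp://127.0.0.1:20001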
3. Information
3.1 My laptop
3.1.1 Model : Dell Inspiron 15 7570
3.1.2 CPU : See cpu_info.txt
3.1.3 Memory : See mem_info.txt
3.1.4 GPU :
*-display
description: VGA compatible controller
product: UHD Graphics 620 (Whiskey Lake)
vendor: Intel Corporation
physical id: 2
bus info: pci@0000:00:02.0
version: 00
width: 64 bits
clock: 33MHz
capabilities: pciexpress msi pm vga_controller bus_master cap_list rom
configuration: driver=i915 latency=0
resources: irq:145 memory:a4000000-a4ffffff memory:80000000-8fffffff ioport:5000(size=64) memory:c0000-dffff
*-display
description: 3D controller
product: GP108M [GeForce MX150]
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:01:00.0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress bus_master cap_list rom
configuration: driver=nouveau latency=0
resources: irq:146 memory:a2000000-a2ffffff memory:90000000-9fffffff memory:a0000000-a1ffffff ioport:4000(size=128) memory:a3000000-a307ffff
3.2 Devices
3.2.1 Monitor : Dell S2340Lc
3.2.2 Webcam : Logitech logi HD1080p
3.3 OS
Ubuntu 20.04
Linux version 5.15.0-79-generic (buildd@lcy02-amd64-014) (gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #86~20.04.2-Ubuntu SMP Mon Jul 17 23:27:17 UTC 2023
Please tell me if there is anything else you need that I missed.
Thank you.
cglee
Hi,
Thank you for sharing all the details we requested. We have contacted the admin team regarding this issue and will get back to you soon with an update.
Thanks,
Alekhya
Hi,
We got an update from the admin team. According to the information, the output format from your camera is YUY2/YUYV (yuyv422 in FFmpeg). However, Intel HW (VME mode) does not support H.264 encoding from YUY2/YUYV input; it supports only NV12 for H.264 encoding. Please refer to https://github.com/intel/media-driver/blob/master/docs/media_features.md#hardwarepak--shadermedia-kernelvme-encoding.
In addition, the HW encoder accepts data in graphics memory, whereas the data from your camera is in system memory.
A capture log showing the camera's YUY2 output:
Input #0, dshow, from 'video=USB Video Device':
  Duration: N/A, start: 31681.955346, bitrate: N/A
  Stream #0:0: Video: rawvideo, 1 reference frame (YUY2 / 0x32595559), yuyv422(tv, bt470bg/bt709/unknown, topleft), 640x480, 30 fps, 30 tbr, 10000k tbn
The command with the HW encoder therefore does two more things than the one with the SW encoder:
- Convert YUY2 to NV12, which is done in FFmpeg (on the CPU).
- Upload the data from system memory to graphics memory, which is done in the oneVPL GPU runtime.
So it is possible for the command with the HW encoder to be slower than the command with the SW encoder. (Note that this is pipeline overhead, not HW encoder vs. SW encoder.)
You may use hwupload and vpp_qsv to speed up the command with the HW encoder; please refer to the command below.
ffmpeg -y -init_hw_device qsv -loglevel verbose -f lavfi -i yuvtestsrc=size=640x480,format=yuyv422 -vf "hwupload=extra_hw_frames=16,vpp_qsv=format=nv12" -preset veryfast -scenario videoconference -async_depth 1 -int_ref_cycle_dist 1 -f h264 -c:v h264_qsv qsv.mp4
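Since the sample above uses a synthetic lavfi source, the same idea adapted to the v4l2 webcam input from this thread would look roughly like this (a sketch, untested on your exact hardware):
ffmpeg -init_hw_device qsv -loglevel verbose -input_format yuyv422 -video_size 640x480 -framerate 30 -f v4l2 -i /dev/video2 -vf "hwupload=extra_hw_frames=16,vpp_qsv=format=nv12" -preset veryfast -scenario videoconference -async_depth 1 -int_ref_cycle_dist 1 -f h264 -c:v h264_qsv udp://127.0.0.1:20001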
If this resolves your issue, please accept it as the solution; this helps others with similar queries.
Thanks,
Alekhya
Thank you Alekhya for your reply.
Over the last several weeks, I have tried using QSV (Media SDK) directly, without ffmpeg or libav.
Finally, I found a solution for reducing the latency. The library keeps a pool of surface frames that subsequent frames refer to, and it must hold at least 2 frames while encoding. That imposes a mandatory latency of 2 frame times, usually 66 ms or more at 30 fps. There are no official options to remove this mandatory buffering, only some tricky combinations of settings; a sketch of the closest ffmpeg-level equivalent follows.
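For anyone who wants to try the same direction from ffmpeg first, these are the flags that map to those buffering knobs: no B-frames, a single reference frame, and no pipelining. I cannot promise they remove the mandatory 2-frame buffering:
ffmpeg -loglevel verbose -input_format yuyv422 -video_size 640x480 -framerate 30 -f v4l2 -i /dev/video2 -preset veryfast -scenario videoconference -bf 0 -refs 1 -async_depth 1 -f h264 -c:v h264_qsv udp://127.0.0.1:20001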
I referred to this thread. I hope it helps someone facing the same latency problem.
Thank you
cglee
Hi,
Glad to know that your issue is resolved. If you need any further assistance, please post a new question, as this thread will no longer be monitored by Intel.
Regards,
Alekhya