Media (Intel® Video Processing Library, Intel Media SDK)
Access community support with transcoding, decoding, and encoding in applications using media tools like Intel® oneAPI Video Processing Library and Intel® Media SDK
Announcements
The Intel Media SDK project is no longer active. For continued support and access to new features, Intel Media SDK users are encouraged to read the transition guide on upgrading from Intel® Media SDK to Intel® Video Processing Library (VPL), and to move to VPL as soon as possible.
For more information, see the VPL website.

Why is the hardware encoder slower with video memory than with system memory?

wenquan__po

Hi!

 

  • I ran sample_encode using system memory with these options:

sample_encode.exe h264 -i D:\crowd_run_2160p50.yuv -o D:\crowd_run_2160p50.h264 -w 3840 -h 2160 -r 1 -async 1 -gpb:off -g 600 -x 1 -b 4000 -cbr -hw

The result is:

Frame number: 500
Encoding fps: 17

Encode one frame latency:  36.57ms

 

  • Then I ran sample_encode using video memory with these options:

sample_encode.exe h264 -i D:\crowd_run_2160p50.yuv -o D:\crowd_run_2160p50.h264 -w 3840 -h 2160 -r 1 -async 1 -gpb:off -g 600 -x 1 -b 4000 -cbr -d3d11 -hw

The result is:

Frame number: 500
Encoding fps: 8

Encode one frame latency: 65.25 ms

 

The docs say that using video memory with the hardware encoder gives the best performance, but the results listed above don't match that.

 

  • My computer's info:

Windows 10 1903

i7-7700 CPU @ 3.60GHz

 

Could anyone explain this to me?

 

Thanks!

 

1 Solution
Dmitry_E_Intel
Employee

Hi!

 

Let me explain. The underlying HW always uses video memory; it simply doesn't know about system memory. So a system->video memory copy is always present somewhere in the pipeline:

- When you run sample_encode with video memory, the sample itself calls Lock/Map (of the D3D9/D3D11 interfaces) on the input surface, writes data to it (e.g. from a YUV file on disk), and calls Unlock/Unmap. The system->video copy/conversion happens here.

- When you run sample_encode with system memory, you fill a surface in system memory and pass it to the MediaSDK encoder. MSDK then makes the system->video copy internally, and it may use internal optimizations (like GPUCopy). Also, please note that you are reading a pretty big YUV file from disk. Using system memory, you indirectly introduce some asynchronous processing here: while you load the current frame from the YUV file into system memory, MSDK can do the system->video copy for the previous frame. This can also give some performance benefit.
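
To illustrate the overlap described above, here is a minimal timing model in Python. It is only a sketch: the stage durations are made-up values rather than measurements, the function names are hypothetical, and it assumes a free input surface is always available so the reader never has to wait. It does not call Media SDK at all.

```python
def serial_time(n_frames, read_ms, copy_ms, encode_ms):
    """Total time when each frame is read, copied and encoded
    before the next frame is touched (no overlap)."""
    return n_frames * (read_ms + copy_ms + encode_ms)


def overlapped_time(n_frames, read_ms, copy_ms, encode_ms):
    """Total time when reading frame i+1 from disk overlaps with
    the copy + encode of frame i (a two-stage pipeline)."""
    read_done = 0.0  # time at which the latest frame is in system memory
    gpu_done = 0.0   # time at which the latest frame is copied and encoded
    for _ in range(n_frames):
        read_done += read_ms  # reads run back to back
        # copy + encode starts once the frame is read and the GPU side is idle
        gpu_done = max(gpu_done, read_done) + copy_ms + encode_ms
    return gpu_done


if __name__ == "__main__":
    # Hypothetical per-frame costs in milliseconds, for illustration only.
    print(serial_time(500, 20, 10, 30))      # 30000
    print(overlapped_time(500, 20, 10, 30))  # 20020.0
```

With these illustrative numbers the overlapped pipeline is limited by the slower stage (copy + encode) instead of the sum of all stages, which is the kind of benefit the system-memory path can pick up indirectly.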

 

BTW, you can double-check the impact of YUV file reading with the "-perf_opt <value> -n <N>" options. They preload a number (<value>) of frames from the YUV file into video memory and encode them N times. I'm not sure whether these options were already in the last Windows MediaSDK release, but they are available in sample_encode from GitHub: https://github.com/Intel-Media-SDK/MediaSDK/tree/master/samples/sample_encode
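
To get a feel for how much disk traffic the YUV input represents, the arithmetic can be sketched as follows. This assumes the input is 8-bit 4:2:0 (1.5 bytes per pixel, as in I420 or NV12), which the thread does not state explicitly; the helper names are hypothetical.

```python
def frame_bytes(width, height, bytes_per_pixel=1.5):
    """Size of one raw frame; 1.5 bytes/pixel assumes 8-bit 4:2:0."""
    return int(width * height * bytes_per_pixel)


def read_rate_mb_s(width, height, fps):
    """Disk read rate (in 10^6 bytes per second) needed to feed the encoder."""
    return frame_bytes(width, height) * fps / 1e6


if __name__ == "__main__":
    print(frame_bytes(3840, 2160))         # 12441600 bytes per frame
    print(500 * frame_bytes(3840, 2160))   # 6220800000 bytes for 500 frames
    print(read_rate_mb_s(3840, 2160, 17))  # about 211.5 MB/s at 17 fps
    print(read_rate_mb_s(3840, 2160, 8))   # about 99.5 MB/s at 8 fps
```

Under that assumption each 3840x2160 frame is about 12.4 MB, so file reading is a non-trivial part of the per-frame cost, and preloading frames takes it out of the measurement.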

 
