Media (Intel® oneAPI Video Processing Library, Intel Media SDK)
Access community support with transcoding, decoding, and encoding in applications using media tools from Intel. This includes Intel® oneAPI Video Processing Library and Intel® Media SDK.

Why is the hardware encoder slower with video memory than with system memory?

wenquan__po
Beginner

Hi!

 

  • I run sample_encode using system memory with these options:

sample_encode.exe h264 -i D:\crowd_run_2160p50.yuv -o D:\crowd_run_2160p50.h264 -w 3840 -h 2160 -r 1 -async 1 -gpb:off -g 600 -x 1 -b 4000 -cbr -hw

The result is:

Frame number: 500
Encoding fps: 17

Encode one frame latency:  36.57ms

 

  • And then I run sample_encode using video memory with these options:

sample_encode.exe h264 -i D:\crowd_run_2160p50.yuv -o D:\crowd_run_2160p50.h264 -w 3840 -h 2160 -r 1 -async 1 -gpb:off -g 600 -x 1 -b 4000 -cbr -d3d11 -hw

The result is:

Frame number: 500
Encoding fps: 8

Encode one frame latency: 65.25 ms

 

The docs say that using video memory with the hardware encoder gives the best performance, but the results listed above don't match that.

 

  • My computer's info:

Windows 10 1903

Intel Core i7-7700 CPU @ 3.60GHz

 

Could anyone explain this for me?

 

Thanks!

 

Dmitry_E_Intel
Employee

Hi!

 

Let me explain. The underlying HW always works with video memory; it simply doesn't know about system memory. So a system->video memory copy is always present somewhere in the pipeline:

- When you run sample_encode with video memory, the sample itself calls Lock/Map (of the D3D9/D3D11 interfaces) on the input surface, writes data to it (e.g. from a YUV file on disk), and calls Unlock/Unmap. The system->video copy/conversion happens here.

- When you run sample_encode with system memory, you fill a surface in system memory and pass it to the Media SDK encoder. MSDK then makes the system->video copy internally, and it may use internal optimizations (like GPUCopy). Also note that you are reading a pretty big YUV file from disk. With system memory you indirectly introduce some asynchronous processing: while you load the current frame from the YUV file into system memory, MSDK can do the system->video copy for the previous frame. This can also give some performance benefit.
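The overlap described in the second point can be illustrated with a small toy model. This is only a sketch: the function names and the per-frame timings are hypothetical, not measured values and not part of Media SDK, and it assumes there are enough buffers for the file read to run ahead of the encoder.

```python
def serial_ms(n_frames, read_ms, copy_ms, encode_ms):
    """Total time when every frame is read, copied and encoded back to back."""
    return n_frames * (read_ms + copy_ms + encode_ms)


def overlapped_ms(n_frames, read_ms, copy_ms, encode_ms):
    """Total time when reading frame i+1 overlaps copy+encode of frame i.

    Two-stage pipeline: stage 1 is the file read, stage 2 is the
    system->video copy plus the encode. Stage 2 of a frame starts once
    the frame has been read and the previous frame has left stage 2.
    """
    read_done = 0.0
    work_done = 0.0
    for _ in range(n_frames):
        read_done += read_ms
        work_done = max(read_done, work_done) + copy_ms + encode_ms
    return work_done


def fps(n_frames, total_ms):
    """Convert a total run time in milliseconds to frames per second."""
    return n_frames * 1000.0 / total_ms


if __name__ == "__main__":
    # Hypothetical timings, chosen only to show the shape of the effect.
    n, read, copy, enc = 500, 20.0, 10.0, 30.0
    print("serial fps:    ", fps(n, serial_ms(n, read, copy, enc)))
    print("overlapped fps:", fps(n, overlapped_ms(n, read, copy, enc)))
```

In this model the serial path pays for the file read on every frame, while the overlapped path hides the read behind the copy and encode of the previous frame, so its throughput is bounded by the slower of the two stages.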

 

BTW, you can double-check the impact of YUV file reading with the "-perf_opt <value> -n <N>" options. They preload a number (<value>) of frames from the YUV file into video memory and encode them N times. I'm not sure whether these options were already in the last Windows Media SDK release, but they are available in sample_encode from GitHub: https://github.com/Intel-Media-SDK/MediaSDK/tree/master/samples/sample_encode
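To get a feel for how much file I/O is involved, here is a rough back-of-the-envelope calculation. It assumes the input is 8-bit 4:2:0 YUV (1.5 bytes per pixel), which is an assumption about the file and not something stated above; the helper names are made up for illustration.

```python
def yuv420_frame_bytes(width, height):
    """Size of one 8-bit 4:2:0 frame: full-size luma plus two quarter-size chroma planes."""
    return width * height * 3 // 2


def read_rate_mb_per_s(width, height, frames_per_second):
    """Disk read rate (in MB/s, 1 MB = 10**6 bytes) needed to feed the encoder."""
    return yuv420_frame_bytes(width, height) * frames_per_second / 1e6


if __name__ == "__main__":
    w, h = 3840, 2160
    print("bytes per frame:", yuv420_frame_bytes(w, h))
    print("bytes for 500 frames:", 500 * yuv420_frame_bytes(w, h))
    print("MB/s at 17 fps:", read_rate_mb_per_s(w, h, 17))
    print("MB/s at 8 fps:", read_rate_mb_per_s(w, h, 8))
```

Under that assumption one 3840x2160 frame is 12,441,600 bytes, so 17 fps already means reading about 211.5 MB/s from disk, which is why preloading frames is a useful way to separate file I/O from encoder performance.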

 
