Media (Intel® Video Processing Library, Intel Media SDK)
Access community support with transcoding, decoding, and encoding in applications using media tools like Intel® oneAPI Video Processing Library and Intel® Media SDK

How to know if the GPU can handle the required stream load?

rshal2
New Contributor II
576 Views

Hello,

I've run the encode and decode demos on our platform.

Now we want to do a simple test to check whether the system can handle the required load.

It should handle a load of 6 concurrent HD streams.

Now, I thought that in order to test this I could run sample_encode_drm 6 times, each as a separate process. Is that a good way to do this test? I think it is equivalent to 6 sessions running in parallel, right?
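
As a rough illustration of the "6 separate processes" test described above, the following is a minimal sketch of a launcher that starts several copies of a command at once and reports the total wall time. The helper name run_concurrent is hypothetical, and the command shown is only a placeholder: the actual sample_encode_drm arguments are not given in this thread and are not guessed here.

```python
import subprocess
import sys
import time


def run_concurrent(command, copies):
    """Start `copies` instances of `command` at once.

    Returns (exit_codes, wall_seconds) once every instance has finished.
    """
    start = time.perf_counter()
    procs = [subprocess.Popen(command) for _ in range(copies)]
    exit_codes = [p.wait() for p in procs]
    return exit_codes, time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder command (an assumption): substitute the real
    # sample_encode_drm invocation for your input file and codec.
    command = [sys.executable, "-c", "pass"]
    codes, elapsed = run_concurrent(command, copies=6)
    print(f"exit codes: {codes}, wall time: {elapsed:.2f} s")
```

If the 6 concurrent encodes of a clip finish in less wall time than the clip's playback duration, the average throughput is at least real time. That alone does not show that no individual frame would miss its deadline with a live source.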

Another thing I thought of is using the transcode sample with 6 sessions, although from what I read each session will do both decode and encode, so I guess it will result in higher GPU usage than in our real scenario.

The problem is that the demo runs on files, not on real live video, so how can I know that with live video there will be no frame drops etc.? Should I use some analyzer or the metrics monitor?

Thank you,

Ran

7 Replies
Surbhi_M_Intel
Employee

Hi Ran, 

You can approach this problem in a couple of ways, as you suggested above. I think the best way to test performance on your platform is to use sample_multi_transcode. The sample supports an N:N transcode model (N decodes and N encodes) and a 1:N transcode model (1 decode and N encode sessions), which I believe is close to what you want, since you are testing for live video, which will arrive as an encoded stream rather than YUV.
Currently the sample is designed to read from a file, so to create a live streaming demo you can change the read part of the sample code to read a live stream instead of a file.
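
One way to approximate a live source without changing hardware is to pace the file reads so that frames arrive at the capture rate instead of as fast as the disk allows. The sketch below is an assumption-laden illustration, not part of the Media SDK samples: paced_frames is a hypothetical helper, and the frame size assumes raw NV12 input at 1920x1080.

```python
import io
import time


def paced_frames(stream, frame_bytes, fps, clock=time.monotonic, sleep=time.sleep):
    """Yield (index, frame) from `stream` no faster than `fps` frames per second."""
    period = 1.0 / fps
    start = clock()
    index = 0
    while True:
        frame = stream.read(frame_bytes)
        if len(frame) < frame_bytes:
            return  # end of input (a trailing partial frame is discarded)
        due = start + index * period  # absolute deadline avoids drift
        delay = due - clock()
        if delay > 0:
            sleep(delay)
        yield index, frame
        index += 1


if __name__ == "__main__":
    # Assumption: raw NV12 frames, 1.5 bytes per pixel at 1920x1080.
    frame_bytes = 1920 * 1080 * 3 // 2
    fake_file = io.BytesIO(bytes(frame_bytes * 3))  # stand-in for a real .yuv file
    for index, frame in paced_frames(fake_file, frame_bytes, fps=30):
        print(f"frame {index}: {len(frame)} bytes")
```

Deadlines are computed from the start time rather than from the previous frame, so a slow consumer does not shift the schedule of later frames.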

Hope it helps. 

Thanks,
Surbhi


 

rshal2
New Contributor II

Dear Surbhi,

Thank you very much for the valuable information.

I would like to ask:

1. Do you have any suggestions or ideas on how to simulate live video reading?

2. Can the metrics monitor help in evaluating GPU capability for our purpose (6 encode, 1 decode pipeline)? I assume that reading from a YUV file should be similar to reading from a live video source, but I am not sure how to interpret the metrics monitor results. I assumed that the best result is to see low values, but reading the following thread

https://software.intel.com/en-us/forums/intel-media-sdk/topic/558123

I see, according to the link, that using async=1 gives lower CPU usage? But the user guide says that performance should improve as async gets higher. So I'm not sure I understand.

Thank you very much,

Ran

 

rshal2
New Contributor II

Hi Surbhi,

One more question if I may.

About the multi transcode demo: do you think I can change it to support multiple encodes from YUV files? As I understand it, the current demo does not support this (only multiple transcodes).

Regards,

Ran

Surbhi_M_Intel
Employee

Hi Ran,

Metrics monitor is only available on Linux, not on Windows. It can definitely help in interpreting what is happening on your GPU stack with respect to video encoding, decoding and processing. The important parameter to check is the video usage; see the metrics description table in the metrics monitor manual - https://software.intel.com/sites/default/files/managed/0b/5a/metricsmon-man.pdf. So, for the 1 decode and 6 encode concurrent sessions, check the video and render usage.
Async depth plays a major role when there is a single transcode/encode session. A higher async depth means more asynchronous operations are in flight before the sync operation is called, which improves performance by keeping the system fully loaded in the case of a single-stream transcode. An in-depth discussion of async and join operations is in
https://software.intel.com/en-us/articles/aync-and-join-operation-in-media-sdk-multi-transcoding. I will get back to you with a suggestion for setting up live video streaming to test frame drops.
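
The async depth idea can be sketched as a small model: submit operations without waiting, and only block on the oldest one once the number in flight reaches the async depth. This is a simplified illustration, not Media SDK code. The submit and sync callables are hypothetical stand-ins for an asynchronous submit call (such as MFXVideoENCODE_EncodeFrameAsync) and the corresponding wait (such as MFXVideoCORE_SyncOperation).

```python
from collections import deque


def run_pipeline(frames, async_depth, submit, sync):
    """Keep up to `async_depth` operations in flight before waiting on the oldest."""
    in_flight = deque()
    results = []
    for frame in frames:
        in_flight.append(submit(frame))  # non-blocking submit, returns a handle
        if len(in_flight) >= async_depth:
            results.append(sync(in_flight.popleft()))  # block on the oldest handle
    while in_flight:  # drain what is left at end of stream
        results.append(sync(in_flight.popleft()))
    return results


if __name__ == "__main__":
    log = []

    def submit(frame):
        log.append(f"submit {frame}")
        return frame

    def sync(handle):
        log.append(f"sync {handle}")
        return handle

    run_pipeline(range(4), async_depth=2, submit=submit, sync=sync)
    print(log)
```

With async_depth=1 every submit is followed immediately by its sync, so the pipeline is fully serialized. Larger values let later submits overlap with work still in progress, at the cost of more buffered frames and higher latency.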

Thanks,
Surbhi

 

rshal2
New Contributor II

Hi Surbhi,

Thank you very much for the helpful information.

We are using media SDK server (Linux, CentOS).

The golden rule when reading metrics monitor output is: the lower the GPU usage, the better, i.e. the GPU does its job and still has idle time.

The only thing is that metrics monitor does not validate that the requirements are met; it just gives GPU usage. In other words, how do I know if frames get dropped because the GPU was busy doing something else, or even because it decided to sleep instead of doing its job?

I want to validate this only because of the strange behaviour mentioned in another thread, in which using a larger async depth resulted in larger CPU usage!

https://software.intel.com/en-us/forums/intel-media-sdk/topic/558123

Regards,

Ran

Surbhi_M_Intel
Employee

Metrics monitor is specifically for media-related GPU stats, i.e. MFX or FF logic. It is often hard to find a tool on Linux that monitors such GPU stats. It doesn't tell you overall GPU usage or which functions are occupying the GPU. It seems what you are looking for is VTune Amplifier, which can explicitly tell which GPU functions are running and whether the GPU was handling tasks other than media decode/encode. It is part of the Media Server Studio Pro edition; you can download the evaluation edition to see whether it gives you the information you are looking for. Check the website and documentation for more details - https://software.intel.com/en-us/intel-vtune-amplifier-xe. I don't believe it has the capability to look for frame drops; you can write your own piece of code to incorporate that check in your application.
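
A minimal sketch of such a check, assuming the application places a bounded queue between the capture side and the encoder: when the encoder falls behind and the queue is full, the newly captured frame is discarded and counted. The class DropCountingQueue and its method names are hypothetical and are not part of Media SDK.

```python
import queue


class DropCountingQueue:
    """Bounded hand-off between a live source and an encoder that counts drops."""

    def __init__(self, max_frames):
        self._queue = queue.Queue(maxsize=max_frames)
        self.offered = 0
        self.dropped = 0

    def offer(self, frame):
        """Called by the capture side; never blocks. Returns False if the frame was dropped."""
        self.offered += 1
        try:
            self._queue.put_nowait(frame)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def take(self, timeout=None):
        """Called by the encoder side; blocks until a frame is available."""
        return self._queue.get(timeout=timeout)


if __name__ == "__main__":
    q = DropCountingQueue(max_frames=2)
    for n in range(3):
        q.offer(f"frame-{n}")  # the third offer finds the queue full
    print(f"offered={q.offered} dropped={q.dropped}")
```

A non-zero dropped count at the end of a run with a paced or live source indicates that the encoder could not keep up at some point, which GPU utilization figures alone would not reveal.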

Thanks,
Surbhi 

 

 

rshal2
New Contributor II

Hi, 

If I understand correctly, metrics monitor shows GPU usage in more general categories (4 categories), while VTune Amplifier gives more detailed information on usage in each HW block. It seems that metrics monitor is good enough for understanding the GPU usage.
I think it just needs to be run in the exact scenario that we require (live video instead of files):

I tested it with the transcode sample with only 2 HD files (1920x1080) and it shows 100% (!) in both VIDEO2 and VIDEO.
I think that since our scenario is concerned with live video, not files, we had better check with live video instead. I suspect that Media SDK tries to keep the HW busy in order to achieve the best performance (and the shortest run time), and this brings the GPU usage to 100%.
I think I had better add a capability to the demo for reading from a live video device, and I hope that I will then see that our HW is capable of encoding the streams without frame drops, and that GPU usage in metrics monitor is below 100%.
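
A file-based run processes frames as fast as it can, so 100% usage is expected there and says little about the live case. A rough estimate of the live load can be derived from the throughput of the file-based run, as sketched below. This assumes load scales linearly with frame rate, which may not hold in practice, and the numbers (240 fps aggregate offline throughput, 30 fps per stream) are hypothetical, not measurements from this thread.

```python
def realtime_load(offline_fps, streams, stream_fps):
    """Estimate the fraction of capacity needed to sustain the live streams.

    offline_fps: aggregate frames per second reached by the file-based run
                 while the GPU was fully busy.
    Assumes load scales linearly with frame rate.
    """
    required_fps = streams * stream_fps
    return required_fps / offline_fps


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    load = realtime_load(offline_fps=240, streams=6, stream_fps=30)
    print(f"estimated load: {load:.0%}")
```

A result well below 1.0 suggests headroom; a result near or above 1.0 suggests that frame drops are likely with a live source.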

For the live video capability I already have an idea (starting a virtual device driver, vivi, which outputs a color bar pattern, and reading from it using the V4L2 APIs). I hope that this can be added to the demo to make testing easier.

About frame drops: how does Media SDK report this parameter?

Thank you,

Ran 
