sam_p_
Beginner

Significantly reduced performance when using 2 threads for two streams instead of 1 thread for 1 stream

Processor Type: 4790S
Driver Version: What driver?
Operating System: Ubuntu 12.04
Media SDK System Analyzer: What is this?
Quick Reproducer Code: 
Concise Description of the Issue: I get less than 45 fps per stream when doing 2 streams but over 120 on a single stream. 
Priority: Medium
Input File:
Tracer log(if required):

When I do the following

MFXCloneSession(session,&session2);
// Largely duplicated setup of session2
MFXJoinSession(session,session2);

and run both sessions, I get at most 40 frames per second on each, but if I don't run "session2" I get 190 fps. Obviously 40*2 is 80, not 190. What could cause this non-linear performance decrease?
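To put a number on the gap, here is a minimal sketch of the arithmetic, using only the figures quoted in this post (40 fps per stream with two streams, 190 fps with one). The function names are made up for illustration and are not part of the Media SDK.

```cpp
// Aggregate throughput of several identical streams, in frames per second.
double aggregate_fps(double per_stream_fps, int streams) {
    return per_stream_fps * streams;
}

// Fraction of the single-stream throughput that the multi-stream run retains.
// A value near 1.0 would mean the streams simply split the available throughput.
double retained_fraction(double single_fps, double per_stream_fps, int streams) {
    return aggregate_fps(per_stream_fps, streams) / single_fps;
}
```

With these numbers, aggregate_fps(40.0, 2) is 80 and retained_fraction(190.0, 40.0, 2) is about 0.42, so the two-stream run delivers less than half of the total throughput of the single-stream run.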

2 Replies
sam_p_
Beginner

Asked another way: what initialization do I have to do on a session I plan on joining? And what does "After joining, the two sessions share thread and resource scheduling for asynchronous operations." do for me?

Nina_K_Intel
Employee

Hi Sam,

Sorry for the late reply. When sessions are joined, they share a common task scheduler and thread pool. As a consequence, frame surfaces can be passed between components residing in different sessions without explicit synchronization (i.e. without SyncOperation, which waits for a task to complete), so you can build an asynchronous pipeline of functions residing in different sessions (asynchronous execution is better for performance). This is especially useful when you want to build complex pipelines such as 1 decode -> 2 vpp + encode.

Additionally, having one thread pool slightly reduces memory usage. With SW processing, joined sessions also avoid thread oversubscription and improve performance. With HW-accelerated processing, however, joining sessions does not have any performance impact.

So, looking at your result, I agree that it is pretty unexpected. Could you please give us more details: what are these sessions doing, which input/output codecs and resolutions are used, and what is the topology? Is it simple (one in -> one out transcoding) x 2, or something different? If these sessions use the same frame surface pool, are you allocating enough surfaces to let both sessions operate without waiting for a surface to free up?
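The surface-pool question can be sanity-checked with a rough sketch like the one below. It assumes, as a conservative rule of thumb rather than a documented SDK requirement, that a pool shared by several sessions should hold at least the sum of what each session needs on its own. The struct and function names are hypothetical. In a real application the per-session count would typically come from the NumFrameSuggested field that QueryIOSurf fills in for each component.

```cpp
#include <vector>

// Hypothetical per-session requirement (not an SDK type). The value would
// typically be taken from NumFrameSuggested as reported by QueryIOSurf.
struct SessionNeed {
    int frames_needed;
};

// Assumed sizing rule: a shared pool needs the sum of the per-session needs.
int required_shared_surfaces(const std::vector<SessionNeed>& needs) {
    int total = 0;
    for (const SessionNeed& n : needs) {
        total += n.frames_needed;
    }
    return total;
}

// True if the allocated pool covers every session at once under that rule.
bool pool_is_sufficient(int allocated, const std::vector<SessionNeed>& needs) {
    return allocated >= required_shared_surfaces(needs);
}
```

For example, if two sessions each need 10 surfaces, a pool that was sized for one session (10 surfaces) fails this check, and under this assumption the sessions would spend time waiting for each other to release surfaces.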

Probably the best thing to start with is to run the sample_multi_transcode app from our samples package and see if it gives the same strange results. It can run one or several transcoding sessions, and it also has an option to join the sessions.

Regards,

Nina
