Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Parallel execution of networks... what's the best option?

Poca__Ramon
Beginner

Hi there,

I have a Python application that reads frames from a camera, runs face detection first, and then extracts characteristics for every detected face.

For that second phase, I would like to execute the second network(s) in parallel to speed up the frame rate. My question is:

Can a single ExecutableNetwork be called from multiple threads, or will it lock and serialize the processing? Is it better to instantiate several copies of the network for parallel processing?

 

Luis_at_Intel
Moderator

Hi Poca, Ramon,

Not sure I understand what your problem statement is. Do you want to run the same network twice to speed up the frame rate? Or send multiple infer requests to the same network?

There is a gender_age sample program in the NCAPPZOO for OpenVINO where two networks are used with multiple MYRIAD plugins, but I believe it is the same concept as the one provided by the Multi-Device plugin. You can use the Multi-Device plugin to assign inference requests to any available devices so that those requests execute in parallel (if that is your problem statement), for example if you have a CPU/GPU and multiple MYRIAD devices (NCS) or HDDL. I'd also suggest taking a look at the Optimization Guide, more specifically the section that discusses the performance aspects of running multiple requests simultaneously, for a better understanding. Hope this answers your question.
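For illustration, the Multi-Device plugin is selected through the device name given when the network is loaded. The following is only a minimal sketch, assuming the legacy Inference Engine Python API (IECore, read_network, load_network) and the "MULTI:<device list>" naming; multi_device_name and load_on_multi are hypothetical helper names, and the model paths are placeholders you would supply yourself.

```python
def multi_device_name(devices):
    """Build a device string such as 'MULTI:MYRIAD,CPU' from a priority-ordered list."""
    if not devices:
        raise ValueError("at least one device name is required")
    return "MULTI:" + ",".join(devices)


def load_on_multi(model_xml, model_bin, devices, num_requests=4):
    """Load one network on the MULTI device (assumes OpenVINO is installed)."""
    # Imported lazily so multi_device_name() can be used without OpenVINO.
    from openvino.inference_engine import IECore

    ie = IECore()
    net = ie.read_network(model=model_xml, weights=model_bin)
    # num_requests asks for several infer requests that the plugin can
    # distribute across the listed devices.
    return ie.load_network(network=net,
                           device_name=multi_device_name(devices),
                           num_requests=num_requests)


# Example device string for one MYRIAD stick plus the CPU (assumed setup).
device_string = multi_device_name(["MYRIAD", "CPU"])
```

The order of the list is meant as a priority order; which devices are actually present on your machine is something to check on your own setup.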

Best Regards,

Luis

 

Poca__Ramon
Beginner

Hi Luis,

The first option :D. Say I detect 4 faces in a frame, and I want age and gender for all of them in parallel (if I have 4 CPU cores, I'd like each one to process one of the faces).

My question was more related to the Python API. If I spawn 4 threads and each calls the same exec_network.infer(...face...):

- Does the exec_network "lock" execution for the OTHER threads (think of something like the Python global interpreter lock)? Or will it accept the calls from other threads and execute them in parallel?

- If it locks, can I create 4 exec_network(s) and manually call infer on each of them?

I believe it shouldn't lock (so no need to create extra networks). Can you confirm that?
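The pattern being asked about can be sketched as follows. This is only an illustration of the threading side: StubExecNetwork is a hypothetical stand-in, not the OpenVINO class, and its sleep merely simulates native work that releases the GIL. Whether the real exec_network.infer overlaps like this is exactly the open question, so nothing here should be read as confirmation.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class StubExecNetwork:
    """Hypothetical stand-in for a loaded network; NOT the OpenVINO class."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.max_active = 0  # highest number of overlapping infer() calls seen

    def infer(self, inputs):
        with self._lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.05)  # simulates native work that releases the GIL
        with self._lock:
            self.active -= 1
        # Fake "result": just the size of the input it was given.
        return {"age": len(inputs["data"])}


def infer_faces(exec_network, faces, workers=4):
    """Issue one infer() call per face from a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input faces.
        return list(pool.map(lambda face: exec_network.infer({"data": face}), faces))


stub = StubExecNetwork()
faces = [[0] * n for n in (1, 2, 3, 4)]  # four fake "face crops"
results = infer_faces(stub, faces)
```

To test the real object, one option is to pass the actual exec_network to infer_faces and compare the wall-clock time for 4 faces against 4 sequential calls; if the times are about the same, the calls are effectively being serialized.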

Poca__Ramon
Beginner

Better example: I want to run a facial landmarks network on every face in every frame to paint a clown nose over each nose in real time, so I want to run as many landmark calculations in parallel as possible.

Luis_at_Intel
Moderator

Thanks for the clarification. I believe the documentation for the CPU plugin mentions some parallelization options (since you mentioned having 4 cores), along with device-specific optimizations. I am not sure whether spawning threads from Python would speed up your application; I'd suggest taking a look at the supported configuration parameters to make better use of the CPU-specific settings.
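As a rough illustration of what such CPU-specific settings look like, the legacy Inference Engine Python API accepts them as a dictionary of string keys and values when the network is loaded. This is a minimal sketch, assuming the CPU_THROUGHPUT_STREAMS and CPU_THREADS_NUM keys from the CPU plugin documentation; build_cpu_config and load_for_throughput are hypothetical helper names, the model paths are placeholders, and the right values for a given machine would need to be measured.

```python
def build_cpu_config(streams="CPU_THROUGHPUT_AUTO", threads=None):
    """Return a CPU plugin config dict; all values are passed as strings."""
    config = {"CPU_THROUGHPUT_STREAMS": str(streams)}
    if threads is not None:
        config["CPU_THREADS_NUM"] = str(threads)
    return config


def load_for_throughput(model_xml, model_bin, num_requests=4):
    """Load a network on the CPU with several infer requests (assumes OpenVINO is installed)."""
    # Imported lazily so build_cpu_config() can be used without OpenVINO.
    from openvino.inference_engine import IECore

    ie = IECore()
    net = ie.read_network(model=model_xml, weights=model_bin)
    # One infer request per face is an assumption taken from the 4-face example.
    return ie.load_network(network=net,
                           device_name="CPU",
                           config=build_cpu_config(),
                           num_requests=num_requests)


default_config = build_cpu_config()
pinned_config = build_cpu_config(streams=2, threads=4)
```

With several infer requests available, the asynchronous route (start_async on each request, then waiting for the results) may be an alternative to spawning Python threads, though whether it is faster for this workload is not established in this thread.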

But to answer your question more directly: I don't think the exec_network locks. For more information, please check the ExecutableNetwork class reference in the documentation. Hope this answers your question!

 

Regards,

Luis
