Hi,
I need to optimise my OpenVINO object detection code. I found that it is possible to run multiple instances of the OpenVINO toolkit, one per processor core. Could you please brief me on what I should do for that? Any sample code would be helpful, and a quick reply would be much appreciated.
regards,
Gina
Hi Gina,
What inference device are you targeting: CPU, GPU, or something else? Which OS, Linux or Windows, and are you using the Python or the C++ API? Actually, it does not matter too much: provided you run your operations asynchronously and feed the inference device fast enough, once you reach over 90% device load that is all you can get.
The best examples of how to optimize inference speed are the async samples (object_detection_demo_ssd_async, object_detection_demo_yolov3_async). Once you are fully async, I see no point in trying to operate at the core level. The SDK and plug-ins take care of this in an optimal way, e.g. MKL-DNN will ensure all CPU cores are utilized and clDNN will make sure all GPU EUs are fully loaded, provided you push frames fast enough.
Are the async samples helpful? Try the -pc parameter to get an idea of the per-layer inference cost; you can then estimate the best speed you can get. In code, that corresponds to:
plugin.SetConfig({ { PluginConfigParams::KEY_PERF_COUNT, PluginConfigParams::YES } });
...
printPerformanceCounts(*async_infer_request_curr, std::cout);
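If you end up on the Python API, a roughly equivalent sketch looks like the following (this assumes the 2019-era IEPlugin/IENetwork Python bindings; the model paths and the dummy input are placeholders, not from the demos):

```python
# Sketch: enable per-layer performance counters from Python,
# mirroring the C++ KEY_PERF_COUNT snippet above.
import numpy as np
from openvino.inference_engine import IENetwork, IEPlugin

plugin = IEPlugin(device="CPU")
plugin.set_config({"PERF_COUNT": "YES"})            # Python counterpart of KEY_PERF_COUNT=YES

net = IENetwork(model="model.xml", weights="model.bin")   # placeholder paths
exec_net = plugin.load(network=net, num_requests=1)

input_blob = next(iter(net.inputs))
dummy = np.zeros(net.inputs[input_blob].shape, dtype=np.float32)
exec_net.requests[0].infer({input_blob: dummy})     # run once so the counters are populated

# Per-layer timing, similar to printPerformanceCounts() in the C++ demos
for layer, stats in exec_net.requests[0].get_perf_counts().items():
    print(layer, stats)
```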
nikos
Hi nikos,
I need to run it on the CPU. OS: Linux, and I am using the Python Inference Engine API.
Yes, that should be possible too. I don't use Python much for performance work, but async is supported from Python as well. Please see the relevant sections in the documentation, and also refer to the good async Python samples posted in this forum; I think there are some by another member ( https://github.com/PINTO0309 ). I am not sure whether OpenVINO ships with async Python samples of its own.
async_infer(inputs=None)
Description: Starts asynchronous inference of the infer request and fills the outputs array.
Parameters: inputs - a dictionary with input layer names as keys and numpy.ndarray objects of proper shape with input data for the layer as values.
Return value: None
Usage example:
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].async_infer({input_blob: image})
>>> exec_net.requests[0].wait()
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)), 0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
       5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
       2.26027006e-03, 2.12283316e-03 ...])
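To actually overlap frame grabbing and pre-processing with inference, the usual pattern (as in object_detection_demo_ssd_async) is to keep two requests in flight and ping-pong between them. A rough sketch, assuming the same IEPlugin-style API as the excerpt above; the model and video paths and the post-processing are placeholders:

```python
# Sketch of the async "ping-pong" pattern: while one request runs,
# the next frame is read, pre-processed, and submitted.
import cv2
import numpy as np
from openvino.inference_engine import IENetwork, IEPlugin

plugin = IEPlugin(device="CPU")
net = IENetwork(model="model.xml", weights="model.bin")    # placeholder paths
input_blob = next(iter(net.inputs))
out_blob = next(iter(net.outputs))
n, c, h, w = net.inputs[input_blob].shape

exec_net = plugin.load(network=net, num_requests=2)        # two requests in flight
cur_id, next_id = 0, 1

cap = cv2.VideoCapture("input.mp4")                        # placeholder source
ret, frame = cap.read()
while ret:
    ret, next_frame = cap.read()

    # submit the current frame without blocking
    blob = cv2.resize(frame, (w, h)).transpose((2, 0, 1)).reshape((n, c, h, w))
    exec_net.start_async(request_id=next_id, inputs={input_blob: blob})

    # collect the result of the request started on the previous iteration
    if exec_net.requests[cur_id].wait(-1) == 0:
        detections = exec_net.requests[cur_id].outputs[out_blob]
        # ... draw boxes / post-process here ...

    cur_id, next_id = next_id, cur_id
    frame = next_frame
```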
Gina, if you have a look at the samples provided with OpenVINO 2019 R3, there is a C++ and a Python benchmark application supplied. It detects the number of CPU cores and optimizes the number of parallel and queued inferences for each specific device, e.g. CPU, GPU, Myriad, etc. I highly recommend running this app to see what's possible.
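For reference, the Python version can be invoked along these lines (the script location is an assumption based on a default 2019 R3 install; adjust the paths to your setup):

```sh
# benchmark_app.py typically lives under deployment_tools/tools/benchmark_tool
python3 benchmark_app.py -m /path/to/model.xml -d CPU -api async
```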
The other thing to try to get more speed is to use the MULTI option for your inference, e.g. MULTI:CPU,GPU, as that will use both the CPU and the on-chip GPU. There is an example at https://github.com/intel-iot-devkit/smart-video-workshop/blob/master/hardware-heterogeneity/Multi-devices.md.
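From Python, loading onto the MULTI device might look like the sketch below, assuming the IECore API available in 2019 R3 (the model paths and the number of requests are placeholders; requests are scheduled across the devices automatically):

```python
# Sketch: load the same network on both CPU and the integrated GPU via MULTI.
from openvino.inference_engine import IECore, IENetwork

ie = IECore()
net = IENetwork(model="model.xml", weights="model.bin")    # placeholder paths

# A few extra in-flight requests help keep both devices busy.
exec_net = ie.load_network(network=net, device_name="MULTI:CPU,GPU", num_requests=4)
```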