I've read somewhere that the Intel Neural Compute Stick uses only half of its capacity if the number of inference requests is set to 1, and that getting maximum performance requires 4 inference requests.
Is this true, and is it officially confirmed by Intel? If so, why?
Thank you for posting on the Intel® communities. We have moved this thread to the correct forum; the team in charge will get back to you soon.
Intel Customer Support Technician
The OpenVINO Inference Engine lets you perform inference in synchronous or asynchronous mode with an arbitrary number of infer requests (the number of infer requests may be limited by the capabilities of the target device).
For Intel® Neural Compute Stick 2, four inference requests for each Executable Network are recommended for optimum performance.
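
As a rough illustration of the recommendation above, here is a minimal Python sketch that keeps four infer requests in flight on one executable network. It assumes the legacy `openvino.inference_engine` Python API (2021.x-style releases) and the `MYRIAD` device name; the helper names `infer_round_robin` and `run_on_ncs2`, and the model paths passed to them, are hypothetical and not from this thread. Treat it as a sketch under those assumptions, not as Intel's reference implementation.

```python
from collections import deque

NUM_REQUESTS = 4  # value recommended above for Intel Neural Compute Stick 2


def infer_round_robin(requests, frames, make_inputs, read_output):
    """Keep up to len(requests) inferences in flight; return outputs in input order.

    `requests` only need `async_infer(inputs=...)` and `wait(timeout)` methods,
    which is what the legacy Inference Engine InferRequest objects provide.
    """
    results = []
    free = deque(range(len(requests)))  # request slots with no job running
    pending = deque()                   # busy slots, oldest submission first
    for frame in frames:
        if not free:
            # All slots are busy: collect the oldest result to free its slot.
            slot = pending.popleft()
            requests[slot].wait(-1)  # -1 blocks until the result is ready
            results.append(read_output(requests[slot]))
            free.append(slot)
        slot = free.popleft()
        requests[slot].async_infer(inputs=make_inputs(frame))
        pending.append(slot)
    # Drain the jobs that are still running after the last submission.
    while pending:
        slot = pending.popleft()
        requests[slot].wait(-1)
        results.append(read_output(requests[slot]))
    return results


def run_on_ncs2(model_xml, model_bin, frames):
    """Hypothetical wrapper: load a model on the stick with four infer requests."""
    # Imported here so the helper above can be used without OpenVINO installed.
    from openvino.inference_engine import IECore

    ie = IECore()
    net = ie.read_network(model=model_xml, weights=model_bin)
    input_name = next(iter(net.input_info))
    output_name = next(iter(net.outputs))
    exec_net = ie.load_network(
        network=net, device_name="MYRIAD", num_requests=NUM_REQUESTS
    )
    return infer_round_robin(
        exec_net.requests,
        frames,
        make_inputs=lambda frame: {input_name: frame},
        # Copy the buffer because the request's blob is reused by later jobs.
        read_output=lambda req: req.output_blobs[output_name].buffer.copy(),
    )
```

With a single request, each frame has to finish before the next one is submitted; with four, a new frame is submitted as soon as any slot is free. Calling `run_on_ncs2("model.xml", "model.bin", frames)` with your own IR files and preprocessed input arrays would exercise this on the stick; newer OpenVINO releases use a different API, so the calls may need adapting.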