Are there any benefits in using a batch size bigger than 1 on CPU as opposed to allocating the same model twice other than saving some RAM?
I have 2 threads. Each one needs to access the OpenVINO model. As of now, each thread initializes and accesses a dedicated instance of the model. I am not concerned about memory usage, only about processing speed. Would it be beneficial to allocate a single instance of the model with a batch size of 2 instead?
I know that increasing the batch size can be beneficial on GPUs, but this model is running on CPU. I am wondering whether a speedup can be expected on CPUs as well.
The batch size is determined by how you feed data to the model.
For example, if you pass 4 RGB images at once, the input shape would be [4,3,277,277].
This is equivalent to using a batch size of 4.
The 4 images are packed together, and inference is run on all 4 of them in a single call.
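To make the shape concrete, here is a minimal sketch that builds the input shape for a given batch size. It assumes an NCHW layout and reuses the 3x277x277 image size from the example above; `batched_shape` is a hypothetical helper written for illustration, not part of OpenVINO.

```python
from math import prod

def batched_shape(batch_size, channels=3, height=277, width=277):
    """Return the NCHW input shape for a batch of same-sized images.

    Hypothetical helper for illustration; the defaults come from the
    [4,3,277,277] example and are not tied to any particular model.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [batch_size, channels, height, width]

single = batched_shape(1)         # one image per inference call
batch_of_four = batched_shape(4)  # four images packed into one call

# The batched input holds exactly four times as many values.
print(single, prod(single))
print(batch_of_four, prod(batch_of_four))
```

Only the first dimension changes; the per-image dimensions stay the same.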
A batch of 4 takes longer to come back than a single image, so the response time (latency) per request is higher.
However, the throughput (fps) would generally be higher.
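The trade-off between response time and fps is simple arithmetic, sketched below. The timings are made-up placeholders chosen only to illustrate the calculation, not measurements of any OpenVINO model; you would need to time your own model on your own CPU to see whether batching actually helps.

```python
def throughput_fps(batch_size, batch_latency_s):
    """Images processed per second when one call handles `batch_size` images."""
    if batch_latency_s <= 0:
        raise ValueError("batch_latency_s must be positive")
    return batch_size / batch_latency_s

# Hypothetical timings, for illustration only (not measured).
latency_batch_1 = 0.010  # seconds per call with 1 image
latency_batch_4 = 0.030  # seconds per call with 4 images

fps_batch_1 = throughput_fps(1, latency_batch_1)
fps_batch_4 = throughput_fps(4, latency_batch_4)

# With these placeholder numbers, each call takes longer at batch size 4,
# yet more images are processed per second overall.
print(f"batch 1: latency {latency_batch_1 * 1000:.0f} ms, {fps_batch_1:.1f} fps")
print(f"batch 4: latency {latency_batch_4 * 1000:.0f} ms, {fps_batch_4:.1f} fps")
```

If the measured latency for a batch grows as fast as the batch size itself, the fps gain disappears, which is why measuring on the target CPU matters.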