When I run a typical convolutional network on the CPU, the execution time is very unpredictable.
If I run inference (net.infer()) in a loop:
I get a time of 30 milliseconds with very little variation from run to run.
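For reference, my timing harness for the single-network case looks roughly like this (a minimal sketch; `dummy_infer` is a hypothetical stand-in for the actual `net.infer()` call, so the numbers printed here are not the real latencies):

```python
import time

def dummy_infer():
    # Hypothetical stand-in for net.infer(); the real call runs the
    # loaded network on the CPU plugin.
    time.sleep(0.001)

# Time each inference call individually to see run-to-run variation.
latencies_ms = []
for _ in range(100):
    t0 = time.perf_counter()
    dummy_infer()
    latencies_ms.append((time.perf_counter() - t0) * 1000.0)

print("mean: %.2f ms, min: %.2f ms, max: %.2f ms"
      % (sum(latencies_ms) / len(latencies_ms),
         min(latencies_ms), max(latencies_ms)))
```

With the real network in place of `dummy_infer`, this consistently reports around 30 ms with very little spread.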
However, when I interleave multiple networks and invoke them alternately:
the average latency rises significantly, to 67 milliseconds, and the worst case is over 200 milliseconds. Could this be a cache-performance problem? Is there any way to improve it?
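The interleaved case is measured the same way, alternating between the loaded networks on each iteration (again a minimal sketch; `infer_a`/`infer_b` are hypothetical stand-ins for two separate `net.infer()` calls):

```python
import time

def infer_a():
    # Hypothetical stand-in for the first network's net.infer()
    time.sleep(0.001)

def infer_b():
    # Hypothetical stand-in for the second network's net.infer()
    time.sleep(0.001)

# Alternate between the two networks, timing each call separately,
# then report the average and the worst-case latency.
latencies_ms = []
for i in range(100):
    t0 = time.perf_counter()
    (infer_a if i % 2 == 0 else infer_b)()
    latencies_ms.append((time.perf_counter() - t0) * 1000.0)

avg_ms = sum(latencies_ms) / len(latencies_ms)
print("avg: %.2f ms, worst: %.2f ms" % (avg_ms, max(latencies_ms)))
```

With the real networks substituted in, this is the pattern that produces the 67 ms average and >200 ms worst case described above.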
Processor: Intel Core i7-8700K with 32 GB RAM
OS: Ubuntu 16.04
OpenVINO version: 2018.2.319 (dated July 2018)
Network: 4 convolution layers + 3 FC layers
Model file size: ~8 MB (32-bit float)
Target: CPU, AVX2 (SSE4 gives similar performance)
Sorry for the late response; we are busy with multiple requests.
I would like to reproduce what you observed. Could you tell me the steps?
Which model and sample are you using, and which code did you use for inference?