Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.
6480 Discussions

OpenVINO inference (latency and throughput) for multiple batches

Shravanthi
Beginner
2,160 Views

Hello,

I am collecting inference results for 1000 images/iterations for different models at multiple batch sizes. I referred to the sample code given in the documentation for a single batch and collected inference results in FPS; those results matched the numbers published for the Intel Core i7. However, when I customized the code for larger batch sizes, my throughput (FPS) decreased instead of increasing. I changed the batch size of the model using model.reshape([batch_size, 3, H, W]), and the inference code I used earlier is below. Later, I modified the latency equation to latency = time_ir/(num_images*batch_size), and with that change the numbers increase relative to the lower batch sizes. Could you please confirm whether my approach is correct or whether any changes are required? I have attached the script for reference.
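For context, the setup before the timing loops below looks roughly like this (a simplified sketch, not the exact attached script; H = W = 224 for this model):

from openvino.runtime import Core
import numpy as np

batch_size = 16

# Read the ONNX model and reshape it to a static batch size
core = Core()
model = core.read_model("resnet50-v1-7.onnx")
model.reshape([batch_size, 3, 224, 224])
compiled_model = core.compile_model(model, "CPU")

# input_image holds batch_size images stacked along the first axis
input_image = np.zeros((batch_size, 3, 224, 224), dtype=np.float32)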

Timing code used earlier

import time

num_images = 1000

start = time.perf_counter()

# Run inference num_images times
for _ in range(num_images):
    compiled_model([input_image])

end = time.perf_counter()
time_ir = end - start

print('time', time_ir)
print(
    f"Batch size: {batch_size} \n"
    f"IR model in OpenVINO Runtime/CPU: {(time_ir / num_images):.4f} \n"
    f"seconds per image, FPS: {num_images / time_ir:.2f}")

 

Modified timing code

import time

num_images = 1000

start = time.perf_counter()

# Run inference num_images times; each call processes batch_size images
for _ in range(num_images):
    compiled_model([input_image])

end = time.perf_counter()
time_ir = end - start
print('time:', time_ir)

# Count every image in every batch when computing per-image latency
latency = time_ir / (num_images * batch_size)
print("latency:", latency)
print("throughput:", 1 / latency)

Results obtained from modified code:

Shravanthi_0-1698217518094.png

 

Regards,

Shravanthi J

 

6 Replies
Iffa_Intel
Moderator
2,131 Views

Hi,


Could you share:

  1. OpenVINO version you are using
  2. Did you use OpenVINO Notebooks? (The code you shared is in Notebook format, .ipynb.)
  3. Relevant model files
  4. Did you customize/change anything in the model?



Cordially

Iffa



Shravanthi
Beginner
2,119 Views

Hi,

 

  1. OpenVINO version you are using - 2023.1
  2. Did you use OpenVINO Notebooks? - Yes, I used the sample code shared in the documentation, but I made some customizations for multi-batch inferencing
  3. Relevant model files - resnet50-v1-7.onnx; source: vision/classification/resnet/model/resnet50-v1-7.onnx in the onnx/models repository on GitHub
  4. Did you customize/change anything in the model? - Only the batch size was changed

 

Regards,

Shravanthi J

Iffa_Intel
Moderator
2,089 Views

From my side, the FPS for multiple batches is slightly better compared to a single batch.

 

Iffa_Intel_0-1698368555245.png

However, for the multiple-batch run, I didn't see the part where you re-define batch_size to be larger than 1; hence, that is actually still single-batch inference.

 

This batching is usually handled by the OpenVINO Model Optimizer; specifically, it is defined during the conversion from the original model format into IR.

To use a larger batch, the code is not as straightforward as the one you shared. You must convert the model again, specifying a new input shape, and reshape the input frames (see the sketch after the link below).

You may refer to the 109-throughput-tricks notebook, specifically the section "OpenVINO IR + bigger batch".
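Roughly, the idea is to stack individual preprocessed frames into one batched tensor before inference (a minimal sketch; frame_0 .. frame_3 stand for your own preprocessed CHW arrays):

import numpy as np

# Stack N frames of shape (3, H, W) into one (N, 3, H, W) batch
batch = np.stack([frame_0, frame_1, frame_2, frame_3])
results = compiled_model([batch])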

 

Cordially,

Iffa

 

 

Shravanthi
Beginner
2,071 Views

Hi,

 

Below is the code for re-defining the batch size. Here I have reshaped the model and the input image to batch size 16 and then collected inferences for the larger batch. Is this not the right way to collect inferences for multiple batches, and is it mandatory to convert to IR format when re-defining the batch size or making any other changes to the model?

 

Shravanthi_1-1698393699031.png

 

Shravanthi_0-1698393670289.png

Regards,

Shravanthi J

 

Iffa_Intel
Moderator
1,961 Views

Hi,

For better understanding & simplicity, I suggest you use the OpenVINO PyPI package.

 

Your original-format model (ONNX) has a dynamic shape, [?,3,224,224].

To infer it, you need to feed in its data shape, as shown below:

1_benchmarkONNX.png
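For reference, the command in the screenshot is along these lines (the shape value here is an example):

benchmark_app -m resnet50-v1-7.onnx -data_shape [1,3,224,224]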

 

 

As I mentioned previously, the shape is usually managed by the OpenVINO Model Optimizer.

For conversion, you can provide the desired shape to MO (Model Optimizer).

 

IR with dynamic shape [-1,3,224,224]:

mo -m model.onnx --input_shape [-1,3,224,224]

2_convertir.png

 

Infer it with benchmark_app (you can replace the -1 with any desired value):

benchmark_app -m model.xml -data_shape [5,3,224,224]

3_benchmarkir.png

 

If you want to change the IR model's shape, simply run your ONNX model through MO again; however, bear in mind that the shape you pass MUST align with your original ONNX shape.

 

Each time you make changes to your original model, you need to run it through MO again to generate an IR model that reflects your latest changes.

 

 

Regarding your question about FPS, my findings are:

Batch 1 - 19.60 FPS

4_batch1.png

Batch 2 - 25.56 FPS

5_batch2.png

Batch 4 - 28.11 FPS

6_batch4.png

Batch 8 - 27.66 FPS

7_batch8.png

If FPS does not improve when the batch size is increased, it may be that you have already reached the optimal batch size for your model.

However, bear in mind that there are other factors that affect FPS (system background activities, inference precision, etc.).

You may refer here.

 

To write your own code like that in the OpenVINO Notebooks, you need to use the OpenVINO convert_model() function to convert the model into IR; for inference, you'll need to further study OpenCV and adapt the 109-throughput-tricks example to your use case.
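A minimal sketch of that conversion step in Python (assuming OpenVINO 2023.1, where convert_model and save_model are exposed from the openvino package):

import openvino as ov

# Convert the ONNX model to OpenVINO IR (the Python counterpart of the mo command above)
ov_model = ov.convert_model("resnet50-v1-7.onnx")
ov.save_model(ov_model, "model.xml")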

 

You may also refer to OpenVINO Open Model Zoo for more inferencing examples.

 

Cordially,

Iffa

 

 

 

Iffa_Intel
Moderator
1,875 Views

Hi,


Intel will no longer monitor this thread since this issue has been resolved. If you need any additional information from Intel, please submit a new question.


Cordially,

Iffa

