Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.
6480 Discussions

OpenVINO inference (latency and throughput) for multiple batches

Shravanthi
Beginner
2,160 Views

Hello,

I am collecting inference results for 1000 images/iterations for different models at multiple batch sizes. I referred to the sample code given in the documentation for a single batch and collected inference results in FPS; those results matched the numbers published for the Intel Core i7. However, when I customized the code for larger batch sizes, my throughput (FPS) decreased instead of increasing. I changed the batch size of the model using model.reshape([batch_size, 3, H, W]), and the inference code I used earlier is below. Later, I modified the latency equation to latency = time_ir/(num_images*batch_size), and with that change the numbers increase relative to the lower batch sizes. Could you please confirm whether my approach is correct or whether any changes are required? I have attached the script for reference.
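For context, the setup before the timing loops below looks roughly like this (a simplified sketch, not the exact attached script; H = W = 224 for this model):

from openvino.runtime import Core
import numpy as np

batch_size = 16

# Read the ONNX model and reshape it to a static batch size
core = Core()
model = core.read_model("resnet50-v1-7.onnx")
model.reshape([batch_size, 3, 224, 224])
compiled_model = core.compile_model(model, "CPU")

# input_image holds batch_size images stacked along the first axis
input_image = np.zeros((batch_size, 3, 224, 224), dtype=np.float32)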

Timing code used earlier

import time

num_images = 1000

start = time.perf_counter()

# Run inference num_images times
for _ in range(num_images):
    compiled_model([input_image])

end = time.perf_counter()
time_ir = end - start

print('time', time_ir)
print(
    f"Batch size: {batch_size} \n"
    f"IR model in OpenVINO Runtime/CPU: {(time_ir / num_images):.4f} \n"
    f"seconds per image, FPS: {num_images / time_ir:.2f}")

 

Modified timing code

import time

num_images = 1000

start = time.perf_counter()

# Run inference num_images times; each call processes batch_size images
for _ in range(num_images):
    compiled_model([input_image])

end = time.perf_counter()
time_ir = end - start
print('time:', time_ir)

# Count every image in every batch when computing per-image latency
latency = time_ir / (num_images * batch_size)
print("latency:", latency)
print("throughput:", 1 / latency)

Results obtained from modified code:

Shravanthi_0-1698217518094.png

 

Regards,

Shravanthi J

 

6 Replies
Iffa_Intel
Moderator
2,131 Views

Hi,


Could you share:

  1. OpenVINO version you are using
  2. Did you use OpenVINO Notebooks? (The code you shared is in Notebook format, .ipynb.)
  3. Relevant model files
  4. Did you customize/change anything in the model?



Cordially

Iffa



Shravanthi
Beginner
2,119 Views

Hi,

 

  1. OpenVINO version you are using - 2023.1
  2. Did you use OpenVINO Notebooks? - Yes, I used the sample code shared in the documentation, but I made some customizations for multi-batch inferencing
  3. Relevant model files - resnet50-v1-7.onnx; source: vision/classification/resnet/model/resnet50-v1-7.onnx in the onnx/models repository on GitHub
  4. Did you customize/change anything in the model? - Only the batch size was changed

 

Regards,

Shravanthi J

Iffa_Intel
Moderator
2,089 Views

From my side, the FPS for multiple batches is slightly better compared to a single batch.

 

Iffa_Intel_0-1698368555245.png

However, for the multiple-batch run, I didn't see the part where you re-define batch_size to be larger than 1; hence, that is actually still single-batch inference.

 

This batching is usually handled by the OpenVINO Model Optimizer; specifically, it is defined during the conversion from the original model format into IR.

To use a larger batch, the code is not as straightforward as the one you shared. You must convert the model again, specifying a new input shape, and reshape the input frames (see the sketch after the link below).

You may refer to the 109-throughput-tricks notebook, specifically the section "OpenVINO IR + bigger batch".
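Roughly, the idea is to stack individual preprocessed frames into one batched tensor before inference (a minimal sketch; frame_0 .. frame_3 stand for your own preprocessed CHW arrays):

import numpy as np

# Stack N frames of shape (3, H, W) into one (N, 3, H, W) batch
batch = np.stack([frame_0, frame_1, frame_2, frame_3])
results = compiled_model([batch])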

 

Cordially,

Iffa

 

 

Shravanthi
Beginner
2,071 Views

Hi,

 

Below is the code for re-defining the batch size. Here I have reshaped the model and the input image to batch size 16 and then collected inferences for the larger batch. Is this not the right way to collect inferences for multiple batches, and is it mandatory to convert to IR format when re-defining the batch size or making any other changes to the model?

 

Shravanthi_1-1698393699031.png

 

Shravanthi_0-1698393670289.png

Regards,

Shravanthi J

 

Iffa_Intel
Moderator
1,961 Views

Hi,

For better understanding & simplicity, I suggest you use the OpenVINO PyPI package.

 

Your original-format model (ONNX) has a dynamic shape, [?,3,224,224].

To infer it, you need to feed in its data shape, as shown below:

1_benchmarkONNX.png
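For reference, the command in the screenshot is along these lines (the shape value here is an example):

benchmark_app -m resnet50-v1-7.onnx -data_shape [1,3,224,224]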

 

 

As I mentioned previously, the shape is usually managed by the OpenVINO Model Optimizer.

For conversion, you can provide the desired shape to MO (Model Optimizer).

 

IR with dynamic shape [-1,3,224,224]:

mo -m model.onnx --input_shape [-1,3,224,224]

2_convertir.png

 

Infer it with benchmark_app (you can replace the -1 with any desired value):

benchmark_app -m model.xml -data_shape [5,3,224,224]

3_benchmarkir.png

 

If you want to change the IR model's shape, simply run your ONNX model through MO again; however, bear in mind that the shape you pass MUST align with your original ONNX shape.

 

Each time you make changes to your original model, you need to run it through MO again to generate an IR model that reflects your latest changes.

 

 

Regarding your question about FPS, my findings are:

Batch 1 - 19.60 FPS

4_batch1.png

Batch 2 - 25.56 FPS

5_batch2.png

Batch 4 - 28.11 FPS

6_batch4.png

Batch 8 - 27.66 FPS

7_batch8.png

If FPS does not improve when the batch size is increased, it may be that you have already reached the optimal batch size for your model.

However, bear in mind that there are other factors that affect FPS (system background activities, inference precision, etc.).

You may refer here.

 

To write your own code like that in the OpenVINO Notebooks, you need to use the OpenVINO convert_model() function to convert the model into IR; for inference, you'll need to further study OpenCV and adapt the 109-throughput-tricks example to your use case.
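A minimal sketch of that conversion step in Python (assuming OpenVINO 2023.1, where convert_model and save_model are exposed from the openvino package):

import openvino as ov

# Convert the ONNX model to OpenVINO IR (the Python counterpart of the mo command above)
ov_model = ov.convert_model("resnet50-v1-7.onnx")
ov.save_model(ov_model, "model.xml")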

 

You may also refer to OpenVINO Open Model Zoo for more inferencing examples.

 

Cordially,

Iffa

 

 

 

Iffa_Intel
Moderator
1,875 Views

Hi,


Intel will no longer monitor this thread since this issue has been resolved. If you need any additional information from Intel, please submit a new question.


Cordially,

Iffa

