Intel® Distribution of OpenVINO™ Toolkit
Community support and discussions about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all things computer vision-related on Intel® platforms.

## Asynchronous inference on GPU results in same outputs for different inputs

Beginner

Hardware Configurations:

• CPU: Intel Core i9-10900
• RAM: 16GB (DDR4-2933)
• OS: Windows 10 Pro 21H1 (Build 19043.1288)

Software Versions:

• OpenVINO 2021.4.689 (Installed from Intel Distribution of OpenVINO toolkit package)
• Python 3.8.10
• CMake 3.21.3
• Microsoft Visual Studio 2017 Express

How to reproduce:

Add this after line 125 in classification_sample_async.py of the Image Classification Async Python Sample (https://docs.openvino.ai/latest/openvino_inference_engine_ie_bridges_python_sample_classification_sa...):

import time; time.sleep(1)

then run the sample with the following command:

python classification_sample_async.py -i image1.jpeg image2.jpeg -m path\to\classification_model.xml -d GPU

This results in the same output for both images, even though the input images are different. (The example below uses mobilenet-v3-large-1.0-224-tf from the Open Model Zoo, but the same behavior can be reproduced with other models.)
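For readers unfamiliar with the sample's structure, the flow it follows is "submit all requests asynchronously, then wait and read each result". The sketch below is a pure-Python mock of that flow (it does not use the OpenVINO API; `fake_infer` and the image lists are made up for illustration). The inserted sleep between submissions is what exposes the GPU bug discussed in this thread; with independent per-request state, as here, distinct inputs still give distinct results.

```python
# Pure-Python mock of the sample's async submit-all-then-wait flow.
# NOT OpenVINO code: fake_infer stands in for a device inference call.
import time
from concurrent.futures import ThreadPoolExecutor

def fake_infer(image):
    """Stand-in for a GPU inference call: returns a checksum of the input."""
    time.sleep(0.05)              # pretend the device is busy
    return sum(image)

images = [[1, 2, 3], [40, 50, 60]]
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = []
    for img in images:
        futures.append(pool.submit(fake_infer, img))  # start_async equivalent
        time.sleep(0.1)           # the inserted time.sleep(1), scaled down
    results = [f.result() for f in futures]           # wait() equivalent

print(results)  # distinct inputs give distinct results here
```

Because each submitted call keeps its own copy of the input, the sleep only delays completion; it cannot make the two outputs collide, which is the behavior one would also expect from the real sample.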

[ INFO ] Creating Inference Engine
[ INFO ] Reading the network: mobilenet-v3-large-1.0-224-tf\FP32\mobilenet-v3-large-1.0-224-tf.xml
[ INFO ] Configuring input and output blobs
[ WARNING ] Image image1.jpeg is resized from (212, 320) to (224, 224)
[ WARNING ] Image image2.jpeg is resized from (250, 239) to (224, 224)
[ INFO ] Starting inference in asynchronous mode
[ INFO ] Infer request 1 returned 0
[ INFO ] Image path: image2.jpeg
[ INFO ] Top 10 results:
[ INFO ] classid probability
[ INFO ] -------------------
[ INFO ] 802     0.4026092
[ INFO ] 837     0.0943944
[ INFO ] 877     0.0614209
[ INFO ] 838     0.0413581
[ INFO ] 151     0.0223625
[ INFO ] 345     0.0208963
[ INFO ] 930     0.0077736
[ INFO ] 7       0.0074957
[ INFO ] 149     0.0072616
[ INFO ] 563     0.0070992
[ INFO ]
[ INFO ] Infer request 0 returned 0
[ INFO ] Image path: image1.jpeg
[ INFO ] Top 10 results:
[ INFO ] classid probability
[ INFO ] -------------------
[ INFO ] 802     0.4026092
[ INFO ] 837     0.0943944
[ INFO ] 877     0.0614209
[ INFO ] 838     0.0413581
[ INFO ] 151     0.0223625
[ INFO ] 345     0.0208963
[ INFO ] 930     0.0077736
[ INFO ] 7       0.0074957
[ INFO ] 149     0.0072616
[ INFO ] 563     0.0070992
[ INFO ]
[ INFO ] This sample is an API example, for any performance measurements please use the dedicated benchmark_app tool
1 Solution
Moderator

Hi,

We did some digging and found that this is caused by a bug in the GPU plugin that classification_sample_async.py exposes.

The issue happens because the GPU plugin allocates a single internal input buffer shared by all infer requests within a stream, so when the first request's data is not consumed before a subsequent request runs, it gets overwritten.

The issue has been fixed, and the fix is available on GitHub: https://github.com/openvinotoolkit/openvino/pull/6922

For the OpenVINO toolkit distribution, the fix should be available in the next release, OpenVINO 2022.1.
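The buffer-sharing failure described here can be illustrated with a toy model. The sketch below is NOT the actual GPU plugin code; `SharedInputBufferPlugin` is a hypothetical stand-in that keeps one input buffer for all requests, which is enough to reproduce the symptom: two requests submitted back-to-back both compute on the second request's input.

```python
# Toy illustration (hypothetical, not OpenVINO source) of a plugin that
# shares ONE internal input buffer across all infer requests in a stream.
import threading
import time

class SharedInputBufferPlugin:
    def __init__(self):
        self.input_buffer = None              # single buffer for every request

    def start_async(self, data, callback):
        self.input_buffer = data              # each submit overwrites the buffer
        def run():
            time.sleep(0.1)                   # simulated device latency
            callback(sum(self.input_buffer))  # buffer read only at execution time
        threading.Thread(target=run).start()

plugin = SharedInputBufferPlugin()
results = {}
plugin.start_async([1, 2, 3], lambda r: results.update({0: r}))
# Second request is submitted before the first executes, so the shared
# buffer now holds the second image for *both* requests.
plugin.start_async([40, 50, 60], lambda r: results.update({1: r}))
time.sleep(0.5)
print(results)  # both requests "saw" the second input
```

This matches the log in the original post, where infer request 1 and infer request 0 report identical top-10 probabilities for different images.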

Sincerely,

Iffa

8 Replies
Moderator

Hi,

I tested the Image Classification Async Python* Sample using 2 images (frenchmotorway.jpg and trafficphoto.jpg) with the AlexNet model, which can be downloaded with the OpenVINO Model Downloader.

I don't see any issues before or after the addition of time.sleep(1). Both photos gave different results.

The only difference is that the run with sleep is delayed in giving out the result.

It is probably your model that has issues.

You may refer to my attachment and compare the results.

Sincerely,

Iffa

Beginner

Dear Iffa,

Thanks for the response!

I have checked your attachment, but it seems you only used one image per run. You need to use two images in a single run to reproduce this issue, as stated in the command in my original post.
In your case, the command to reproduce the issue would be:

python classification_sample_async.py -m "C:\Program Files (x86)\Intel\openvino_2021.4.582\deployment_tools\tools\model_downloader\public\alexnet\FP32\alexnet.xml" -i C:\Users\A221LPEN\Desktop\frenchmotorway.jpg C:\Users\A221LPEN\Desktop\traffic_photo.jpg -d GPU

Can you please try running the sample again and see if you can reproduce the issue?

Sincerely,
tak152

Moderator

I tried to infer two images in a single run.

The run without sleep works fine, but with the sleep, the two results show no difference, as you claimed.

This must be caused by the delay inserted at line 125. The program should return the inference status immediately, without blocking or interrupting; with a single image, it happens not to matter.

Sincerely,

Iffa

Beginner

Dear Iffa,

Thanks for confirming the issue.

I think this issue is a bug in OpenVINO: when you infer two images in a single run using the same sample (with the sleep) on CPU instead of GPU, the issue does not occur and different images give different results. I would expect the same behavior on GPU, but instead it returns the same result for different images.
I have attached the result of running the sample on CPU, so you can see it differs from the GPU run.

I would like this issue to be fixed, since I want to do real-time asynchronous inference on GPU and I am affected by it (the outputs for some inputs come back identical to others, so those results are effectively lost).
I have found that workarounds are to use the CPU or to avoid running more than one inference at a time, but I would rather not do either, since both reduce the throughput of the model.
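The "one inference at a time" workaround can be sketched against a toy single-buffer plugin model (hypothetical code, not the OpenVINO API): waiting for each result before submitting the next request keeps the shared buffer from being overwritten, at the cost of never overlapping requests.

```python
# Toy model (hypothetical, not OpenVINO source) showing why serializing
# requests avoids the shared-input-buffer overwrite.
import threading
import time

class SharedInputBufferPlugin:
    def __init__(self):
        self.input_buffer = None              # single buffer for every request

    def start_async(self, data, callback):
        self.input_buffer = data
        def run():
            time.sleep(0.05)                  # simulated device latency
            callback(sum(self.input_buffer))
        t = threading.Thread(target=run)
        t.start()
        return t

plugin = SharedInputBufferPlugin()
results = []
for image in [[1, 2, 3], [40, 50, 60]]:
    request = plugin.start_async(image, results.append)
    request.join()  # workaround: consume each result before the next submit

print(results)  # correct per-image results, but requests never overlap
```

Serializing restores correctness because the buffer is only rewritten after its previous contents have been consumed, but it forfeits exactly the throughput that asynchronous inference is meant to provide, which is why a plugin-level fix matters.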

Do you have any fixes/workarounds for this?

Sincerely,
tak152

Moderator

We'll look further for any possible workaround and get back to you asap.

Sincerely,

Iffa


Beginner

Dear Iffa,

Thanks for the reply; good to hear that the fix will be available in the next release.

Sincerely,
tak152

Moderator

Greetings,

Intel will no longer monitor this thread since this issue has been resolved. If you need any additional information from Intel, please submit a new question.

Sincerely,

Iffa