
Issue running several inferences with RPI + NCS2

Hello,

I have developed a C++ application using openFrameworks and OpenVINO which performs an NCS2 inference for each audio buffer. The application runs fine on Ubuntu; however, on the Raspberry Pi it breaks on the second audio buffer (the first inference runs without problems).

In summary, when it receives an audio buffer, it performs some audio signal processing which generates a 5D tensor, stores it in a cv::Mat, and then uses the following code (based on the hello_classification example) to run the inference and get the result:

// Create infer request
InferRequest infer_request = executable_network.CreateInferRequest();

// Prepare input
InferenceEngine::TensorDesc tDesc(InferenceEngine::Precision::FP32,
{1,3,37,16,32},
InferenceEngine::Layout::NCDHW);

Blob::Ptr audioBlob = InferenceEngine::make_shared_blob<float>(tDesc, networkInput.ptr<float>()); // just wrap Mat data by Blob::Ptr without allocating new memory

infer_request.SetBlob(network_input_name, audioBlob); // infer_request accepts input blob of any size
// Do inference
/* Running the request synchronously */
infer_request.Infer();

// Process output
Blob::Ptr output = infer_request.GetBlob(network_output_name);

using myBlobType = InferenceEngine::PrecisionTrait<InferenceEngine::Precision::FP32>::value_type;
InferenceEngine::TBlob<myBlobType>& tblob = dynamic_cast<InferenceEngine::TBlob<myBlobType>&>(*output);
auto networkOutput = tblob.data();

printf("x=%f,y=%f,z=%f\n",networkOutput[0], networkOutput[1], networkOutput[2]);

As I said, the first time this function is called everything runs fine, but the second time, on the call to infer_request.GetBlob(network_output_name), I get the following error:

terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException' 
    what(): 
Aborted

Since the exception does not provide any information about what caused it, and the program runs without problems on Ubuntu, I do not know what may be happening.

 

I have tried to modify the hello_classification example provided with OpenVINO to perform two inferences instead of one and, as in my application, it runs well on Ubuntu but breaks in infer_request.GetBlob on the second inference. This is the code to reproduce the error:

#include <vector>
#include <memory>
#include <string>
#include <samples/common.hpp>

#include <inference_engine.hpp>
#include <details/os/os_filesystem.hpp>
#include <samples/ocv_common.hpp>
#include <samples/classification_results.h>

using namespace std;
using namespace InferenceEngine;

#if defined(ENABLE_UNICODE_PATH_SUPPORT) && defined(_WIN32)
#define tcout std::wcout
#define file_name_t std::wstring
#define WEIGHTS_EXT L".bin"
#define imread_t imreadW
#define ClassificationResult_t ClassificationResultW
#else
#define tcout std::cout
#define file_name_t std::string
#define WEIGHTS_EXT ".bin"
#define imread_t cv::imread
#define ClassificationResult_t ClassificationResult
#endif

#if defined(ENABLE_UNICODE_PATH_SUPPORT) && defined(_WIN32)
cv::Mat imreadW(std::wstring input_image_path) {
    cv::Mat image;
    std::ifstream input_image_stream;
    input_image_stream.open(
        input_image_path.c_str(),
        std::iostream::binary | std::ios_base::ate | std::ios_base::in);
    if (input_image_stream.is_open()) {
        if (input_image_stream.good()) {
            std::size_t file_size = input_image_stream.tellg();
            input_image_stream.seekg(0, std::ios::beg);
            std::vector<char> buffer(0);
            std::copy(
                std::istream_iterator<char>(input_image_stream),
                std::istream_iterator<char>(),
                std::back_inserter(buffer));
            image = cv::imdecode(cv::Mat(1, file_size, CV_8UC1, &buffer[0]), cv::IMREAD_COLOR);
        } else {
            tcout << "Input file '" << input_image_path << "' processing error" << std::endl;
        }
        input_image_stream.close();
    } else {
        tcout << "Unable to read input file '" << input_image_path << "'" << std::endl;
    }
    return image;
}

int wmain(int argc, wchar_t *argv[]) {
#else
int main(int argc, char *argv[]) {
#endif
    try {
        // ------------------------------ Parsing and validation of input args ---------------------------------
        if (argc != 4) {
            tcout << "Usage : ./hello_classification <path_to_model> <path_to_image> <device_name>" << std::endl;
            return EXIT_FAILURE;
        }

        const file_name_t input_model{argv[1]};
        const file_name_t input_image_path{argv[2]};
#if defined(ENABLE_UNICODE_PATH_SUPPORT) && defined(_WIN32)
        const std::string device_name = InferenceEngine::details::wStringtoMBCSstringChar(argv[3]);
#else
        const std::string device_name{argv[3]};
#endif
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 1. Load inference engine instance -------------------------------------
        Core ie;
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 2. Read IR Generated by ModelOptimizer (.xml and .bin files) ------------
        CNNNetwork network = ie.ReadNetwork(input_model, input_model.substr(0, input_model.size() - 4) + WEIGHTS_EXT);
        network.setBatchSize(1);
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 3. Configure input & output ---------------------------------------------
        // --------------------------- Prepare input blobs -----------------------------------------------------
        InputInfo::Ptr input_info = network.getInputsInfo().begin()->second;
        std::string input_name = network.getInputsInfo().begin()->first;

        /* Mark input as resizable by setting of a resize algorithm.
         * In this case we will be able to set an input blob of any shape to an infer request.
         * Resize and layout conversions are executed automatically during inference */
        input_info->getPreProcess().setResizeAlgorithm(RESIZE_BILINEAR);
        input_info->setLayout(Layout::NHWC);
        input_info->setPrecision(Precision::U8);

        // --------------------------- Prepare output blobs ----------------------------------------------------
        DataPtr output_info = network.getOutputsInfo().begin()->second;
        std::string output_name = network.getOutputsInfo().begin()->first;

        output_info->setPrecision(Precision::FP32);
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 4. Loading model to the device ------------------------------------------
        ExecutableNetwork executable_network = ie.LoadNetwork(network, device_name);
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 5. Create infer request -------------------------------------------------
        InferRequest infer_request = executable_network.CreateInferRequest();
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 6. Prepare input --------------------------------------------------------
        /* Read input image to a blob and set it to an infer request without resize and layout conversions. */
        cv::Mat image = imread_t(input_image_path);
        cout << "--- Image read\n";
        cout << "------ Channels: " << image.channels() << endl;
        cout << "------ Height: " << image.size().height << endl;
        cout << "------ Width: " << image.size().width << endl;
        cout << "------ StrideH: " << image.step.buf[0] << endl;
        cout << "------ StrideW: " << image.step.buf[1] << endl;
        Blob::Ptr imgBlob = wrapMat2Blob(image);  // just wrap Mat data by Blob::Ptr without allocating of new memory
        infer_request.SetBlob(input_name, imgBlob);  // infer_request accepts input blob of any size
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 7. Do inference --------------------------------------------------------
        /* Running the request synchronously */
        infer_request.Infer();
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 8. Process output ------------------------------------------------------
        Blob::Ptr output = infer_request.GetBlob(output_name);
        // Print classification results
        ClassificationResult_t classificationResult(output, {input_image_path});
        classificationResult.print();
        // -----------------------------------------------------------------------------------------------------



        // -----------------------------------------------------------------------------------------------------
        // ------------------------------- SECOND INFERENCE ----------------------------------------------------
        // -----------------------------------------------------------------------------------------------------



        // --------------------------- 5. Create infer request -------------------------------------------------
        InferRequest infer_request_2 = executable_network.CreateInferRequest();
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 6. Prepare input --------------------------------------------------------
        /* Read input image to a blob and set it to an infer request without resize and layout conversions. */
        cv::Mat image_2 = imread_t(input_image_path);
        cout << "--- Image read\n";
        cout << "------ Channels: " << image_2.channels() << endl;
        cout << "------ Height: " << image_2.size().height << endl;
        cout << "------ Width: " << image_2.size().width << endl;
        cout << "------ StrideH: " << image_2.step.buf[0] << endl;
        cout << "------ StrideW: " << image_2.step.buf[1] << endl;
        Blob::Ptr imgBlob_2 = wrapMat2Blob(image_2);  // just wrap Mat data by Blob::Ptr without allocating of new memory
        infer_request_2.SetBlob(input_name, imgBlob_2);  // infer_request accepts input blob of any size
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 7. Do inference --------------------------------------------------------
        /* Running the request synchronously */
        infer_request_2.Infer();
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 8. Process output ------------------------------------------------------
        Blob::Ptr output_2 = infer_request_2.GetBlob(output_name);
        // Print classification results
        ClassificationResult_t classificationResult_2(output_2, {input_image_path});
        classificationResult_2.print();
        // -----------------------------------------------------------------------------------------------------
    } catch (const std::exception & ex) {
        std::cerr << ex.what() << std::endl;
        return EXIT_FAILURE;
    }
    std::cout << "This sample is an API example, for any performance measurements "
                 "please use the dedicated benchmark_app tool" << std::endl;
    return EXIT_SUCCESS;
}

However, in this case, the error gives more information:

output blob size is not equal to the network output size: got 1000 expecting 2954482816 
hello_classification: ../--/libusb/io.c:2116: handle_events: Assertion `ctx->pollfds_cnt >= internal_nfds' failed 
Aborted

I do not know if the reason for this error is the same as in my program, but I also do not know why it expects such a large output blob, since I am using AlexNet (which has 1000 output classes) and it ran fine on the first inference and on Ubuntu.

Best regards,

David
 

6 Replies

Employee

Hi David,

Thanks for reaching out. Could you please answer the following:

  • Which OpenVINO™ toolkit version are you using?
  • Which OS versions are you using (Ubuntu and Raspbian)? Is the issue present in both systems?
  • If possible, please share your model, an input sample, and an expected output. Also, please share the complete custom code you used so we can test it on our end.

Regards,

David


Hi David, thank you for your interest.

I was using OpenVINO 2020.1.023, but I have now updated to 2020.3 LTS and the problem is still there. I am using Ubuntu 18.04.4 LTS, where everything runs fine; the issue only appears on Raspbian 10 (Buster).

I don't have any problem sharing the code of my main app and my model, but it is an openFrameworks app, so you would need openFrameworks to build it, and even then you would need a specific audio interface to run it properly. Fortunately, I could reproduce a similar error just by modifying the hello_classification example that comes with OpenVINO, so I think it will be easier to work with that. I simply repeated the code to run a second inference after the first one included in the original example (the full code is in my original message). I am using AlexNet as the model; its weights exceed the upload limit of this forum, so I have shared the specific IR I am using here: https://drive.google.com/file/d/122ifnZeYAJGSoXTdTdeF4vC19rYJ_Em8/view?usp=sharing

The error is quite tricky, since it only happens on Raspbian; sometimes the code works fine the first time you run it after compiling, but it breaks after a couple of executions. I don't know what could be happening.

Best regards,

David

Employee

Hi David,


Thanks for the information provided. We are currently looking into your issue and will get back to you as soon as possible.


Best regards,

David



Employee

Hi David,


Thank you for your patience. We tested the hello_classification sample executing two inferences, as you did, on a Raspberry Pi 4 (Raspbian Buster image) with OpenVINO 2020.3 and got no errors. We also tested it using a fresh SD card with the Raspbian Buster image, still with no errors. We recommend retesting on a fresh Raspberry Pi image with either the pre-built OpenVINO™ toolkit package or a build of the open-source version of the OpenVINO™ toolkit (DLDT).


Best regards,

David C.



Thank you very much for your help, David.

 

I'll try reinstalling Raspbian and OpenVINO on my Raspberry Pi.

 

Best regards,

David


 

I've finally moved to a Raspberry Pi 4 with a fresh installation of Raspbian, and I can now run the hello_classification sample with two inferences as well. However, I'm still struggling to run my app on it (though it runs fine on Ubuntu).

 

As I said, I am using openFrameworks to build the app, so, to make the issue easier to reproduce, I tried to run my model using just C++ and OpenVINO. With that, I am able to run any number of inferences without any issue, but the first time I ran it after compiling I got the following warning:

E: [xLink] [    141251] [hello_classific] DispatcherWaitEventComplete:367	Assertion Failed: curr != NULL 

E: [global] [    141251] [hello_classific] XLinkResetRemote:246	Condition failed: DispatcherWaitEventComplete(&link->deviceHandle)

I don't know what this means or whether it could be related to my issue.

 

Since I couldn't reproduce the issue without using openFrameworks, I tried to avoid using any specific audio interface: instead of running an inference each time I receive an audio buffer, I run the inference when I press a key on the keyboard (using an all-zeros input). With that, the app breaks after a random number of inferences (usually no more than 5), and now I get the following error message:

terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
  what():  Output blob size is not equal to the network output size: got 3 expecting 146753064
Aborted

I don't know why it says it is expecting an output of such a large size (the number changes each time I run the app, but it is always absurdly high) when it should be 3, and the previous inferences worked fine with an output blob of size 3.

 

Do you have any idea why this could be happening?

 

Best regards,

David
