classification_sample_async Python sample does not actually run inference asynchronously

Truong__Dien_Hoa · ‎03-08-2019

I'm trying to measure the inference speed of ResNet-50 on my UP board with an integrated MYRIAD X. I first ran the Python benchmark and found that MYRIAD in async mode performs very well: it reaches 25 FPS, almost double the next best result (MYRIAD in sync mode: 15 FPS; GPU in sync or async mode: 12 FPS).

Then I tried the classification_sample_async Python sample. However, there is no difference in speed between the sync and async classification samples. That seemed odd, so I checked the source code of the async sample and found this in the inference part:

    for i in range(args.number_iter):
        t0 = time()
        infer_request_handle = exec_net.start_async(request_id=0, inputs={input_blob: images})
        infer_request_handle.wait()
        infer_time.append((time() - t0) * 1000)

It always submits request_id 0 and then immediately waits for it to finish. So in practice it is not async at all, but sync.

This problem is not in the C++ source code:

        for (int iter = 0; iter < FLAGS_ni + FLAGS_nireq; ++iter) {
            if (iter < FLAGS_ni) {
                inferRequests[currentInfer].StartAsync();
            }
            inferRequests[prevInfer].Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);

            currentInfer++;
            if (currentInfer >= FLAGS_nireq) {
                currentInfer = 0;
            }
            prevInfer++;
            if (prevInfer >= FLAGS_nireq) {
                prevInfer = 0;
            }
        }

So I will try to fix the Python version by following the C++ version. Is my finding correct?

P.S. Out of curiosity, why is async mode on MYRIAD so far ahead of the other modes? How does async work? My guess is that each inference request gets a separate thread, so the VPU can process requests in parallel (as with multiple cores). If so, how many requests can MYRIAD handle at once?

Thank you in advance,

Truong__Dien_Hoa · ‎03-08-2019

After porting the async logic from the C++ sample to the Python version, it runs much faster. So I think the Python async sample should be updated in the next release.
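
For readers who want to see the idea, here is a minimal sketch of a pipelined loop in the spirit of the C++ sample. It is not the code I actually ran. It only assumes the interface visible in the Python snippet above: `start_async(request_id=..., inputs=...)` returns a handle with a `wait()` method. `FakeExecNet`, `FakeRequest`, and `run_pipelined` are hypothetical names, and the fake classes sleep instead of running a network so the example is self-contained. A real executable network would also need to be created with at least `num_requests` infer requests, which is not shown here.

```python
import threading
import time


class FakeRequest:
    """Hypothetical stand-in for an infer request; sleeps instead of inferring."""

    def __init__(self, net, delay):
        self._net = net
        self._delay = delay
        self._thread = None

    def start(self):
        if self._thread is not None:
            raise RuntimeError("request slot is still busy")
        self._thread = threading.Thread(target=time.sleep, args=(self._delay,))
        self._thread.start()
        self._net.outstanding += 1
        self._net.max_outstanding = max(self._net.max_outstanding,
                                        self._net.outstanding)

    def wait(self):
        if self._thread is None:
            return
        self._thread.join()
        self._thread = None
        self._net.outstanding -= 1
        self._net.completed += 1


class FakeExecNet:
    """Hypothetical stand-in exposing only start_async(), as used in the sample."""

    def __init__(self, num_requests, delay=0.01):
        self.outstanding = 0       # started but not yet waited on
        self.max_outstanding = 0
        self.completed = 0
        self.requests = [FakeRequest(self, delay) for _ in range(num_requests)]

    def start_async(self, request_id, inputs):
        request = self.requests[request_id]
        request.start()
        return request


def run_pipelined(exec_net, inputs, number_iter, num_requests):
    """Keep up to num_requests inferences in flight instead of one at a time."""
    handles = [None] * num_requests
    for i in range(number_iter):
        slot = i % num_requests
        # Wait only when this slot is about to be reused.
        if handles[slot] is not None:
            handles[slot].wait()
        handles[slot] = exec_net.start_async(request_id=slot, inputs=inputs)
    # Drain whatever is still in flight.
    for handle in handles:
        if handle is not None:
            handle.wait()


if __name__ == "__main__":
    for n in (1, 2):
        net = FakeExecNet(num_requests=n)
        t0 = time.time()
        run_pipelined(net, inputs={}, number_iter=20, num_requests=n)
        print("num_requests=%d: %.1f ms" % (n, (time.time() - t0) * 1000))
```

With `num_requests=1` this degenerates into the start-then-wait loop of the current Python sample. With two or more requests, the next inference is submitted while the previous one is still running, which is where the speedup comes from.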

ruan__jiayang · ‎05-09-2019

Hello Truong:

May I know more about your understanding of the async mode?

Say I have one image IM, two neural networks (nn1, nn2), and an NCS2. I want to send IM to both nn1 and nn2. In sync mode, of course, I have to send IM to nn1, wait until it finishes, and then send IM to nn2. Can I save some time using async mode? Can I send IM to nn1 and then to nn2, and then wait for the results of both networks?
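
To make the question concrete, the pattern I have in mind looks roughly like the sketch below. It assumes each loaded network offers the same `start_async(...)` and `wait()` calls as in the sample quoted above. `FakeNet`, `infer_on_both`, and the input name `"data"` are hypothetical, and the fake network sleeps instead of inferring. Whether the two inferences really overlap on a single NCS2 is exactly what I am asking; the sketch only shows the call order.

```python
import threading
import time


class FakeNet:
    """Hypothetical stand-in for a loaded network; sleeps instead of inferring."""

    def __init__(self, name, delay, log):
        self.name = name
        self._delay = delay
        self._log = log
        self._thread = None

    def start_async(self, request_id, inputs):
        self._log.append("start " + self.name)
        self._thread = threading.Thread(target=time.sleep, args=(self._delay,))
        self._thread.start()
        return self  # the fake network doubles as its own request handle

    def wait(self):
        self._thread.join()
        self._log.append("done " + self.name)


def infer_on_both(nn1, nn2, image):
    """Submit the same image to both networks, then wait for both results."""
    h1 = nn1.start_async(request_id=0, inputs={"data": image})
    h2 = nn2.start_async(request_id=0, inputs={"data": image})
    h1.wait()
    h2.wait()


if __name__ == "__main__":
    log = []
    infer_on_both(FakeNet("nn1", 0.02, log), FakeNet("nn2", 0.02, log), image=None)
    print(log)
```

The point is that both `start_async` calls are issued before either `wait()`, so neither network has to sit idle while the other one runs.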