We are trying to handle the NCS2 USB connection error case in our Python code.
Our OpenVINO version is 2021.4.2, and we use the inference_engine Python API.
We load and serve the model with the following process:
1. Load the IR file using IENetwork
2. Load the IENetwork on the device (MYRIAD) using the IECore.load_network function
3. Run inference through the start_async function
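For readers following along, the three steps above might look roughly like the sketch below. It is not the poster's actual code: the function name run_once and the file paths are made up for illustration, and it assumes the IR is read with IECore.read_network, which returns an IENetwork in the 2021.4 Python API.

```python
def run_once(ie, model_xml, model_bin, inputs, device="MYRIAD"):
    """Load an IR model on one device and run a single async inference.

    `ie` is expected to be an openvino.inference_engine.IECore instance.
    `inputs` maps input blob names to numpy arrays.
    """
    # Step 1: read the IR files into an IENetwork object.
    net = ie.read_network(model=model_xml, weights=model_bin)
    # Step 2: load the network on the target device.
    exec_net = ie.load_network(network=net, device_name=device, num_requests=1)
    # Step 3: start the request asynchronously, then block until it finishes.
    request = exec_net.start_async(request_id=0, inputs=inputs)
    status = request.wait(-1)  # -1 blocks until the result is ready
    if status != 0:  # 0 is the OK status code
        raise RuntimeError("inference finished with status %d" % status)
    return {name: blob.buffer for name, blob in request.output_blobs.items()}


if __name__ == "__main__":
    # Requires OpenVINO 2021.4 and an attached NCS2; paths are placeholders.
    from openvino.inference_engine import IECore
    import numpy as np

    core = IECore()
    dummy = {"data": np.zeros((1, 3, 224, 224), dtype=np.float32)}  # assumed input name/shape
    print(run_once(core, "model.xml", "model.bin", dummy))
```

Passing the IECore object in as a parameter keeps the function easy to exercise without hardware attached.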
What we are trying to handle is NCS2 USB connection failure.
Since our working environment is a dynamically moving robot, the USB connection is unstable.
So, when the connection fails, we expect the model to be reloaded on another idle NCS2 device.
(e.g., 3 NCS2 devices are connected and 2 models are being served, so 1 NCS2 device is idle.)
We used a timeout interrupt on the start_async() function to determine whether the NCS2 is alive or not.
E: [global] [ 836527] [Scheduler00Thr] dispatcherEventSend:54 Write failed (header) (err -4) | event XL
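One way to picture the timeout-and-failover idea described above is the sketch below. It is only an illustration under assumptions: call_with_timeout and load_on_first_responsive are hypothetical helper names, the timeout is implemented with a daemon thread rather than a signal-based interrupt, and a call that hangs is abandoned, not cancelled, so its thread may linger in the background.

```python
import threading


def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn in a daemon thread and report ("ok" | "error" | "timeout", value)."""
    outcome = {}

    def runner():
        try:
            outcome["value"] = fn(*args, **kwargs)
        except Exception as exc:  # report any failure to the caller
            outcome["error"] = exc

    worker = threading.Thread(target=runner, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        # The call is still blocked; we stop waiting but cannot kill it.
        return "timeout", None
    if "error" in outcome:
        return "error", outcome["error"]
    return "ok", outcome["value"]


def load_on_first_responsive(ie, net, candidates, timeout_s=10.0):
    """Try each candidate device in order; return (device, exec_net) for the first that loads."""
    for device in candidates:
        status, value = call_with_timeout(
            ie.load_network, timeout_s, network=net, device_name=device
        )
        if status == "ok":
            return device, value
    raise RuntimeError("no responsive device among: " + ", ".join(candidates))
```

With three sticks attached, the candidates list would hold the idle stick's name after the one currently in use, so a failed or hung load falls through to the next device.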
Looking at your use case, it would be best to use the official OpenVINO Multi-Device Plugin. This plugin automatically assigns inference requests to the available computational devices and executes them in parallel. If a higher-priority device fails or goes missing, it falls back to another device.
You may refer here.
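As a rough sketch of the suggestion above, the MULTI device string can be built from whatever sticks the runtime currently reports. The helper name build_multi_device_name is made up for this example, and it assumes NCS2 devices show up in IECore.available_devices with names starting with "MYRIAD".

```python
def build_multi_device_name(available_devices, prefix="MYRIAD"):
    """Build a "MULTI:dev1,dev2" string from the devices matching prefix.

    The order of the list is the priority order used by the Multi-Device Plugin.
    """
    sticks = [name for name in available_devices if name.startswith(prefix)]
    if not sticks:
        raise RuntimeError("no %s device is currently available" % prefix)
    return "MULTI:" + ",".join(sticks)


if __name__ == "__main__":
    # Requires OpenVINO 2021.4; model paths are placeholders.
    from openvino.inference_engine import IECore

    ie = IECore()
    net = ie.read_network(model="model.xml", weights="model.bin")
    device_name = build_multi_device_name(ie.available_devices)
    exec_net = ie.load_network(network=net, device_name=device_name)
```

For example, passing ["CPU", "MYRIAD.1.2-ma2480", "MYRIAD.1.4-ma2480"] gives "MULTI:MYRIAD.1.2-ma2480,MYRIAD.1.4-ma2480".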
Thanks for your kind response!
It's very close to the method we were looking for!
We adopted the Multi-Device Plugin in our serving code and came up with a few additional questions.
1. Does the plugin work as a master/slave structure?
Let's say we connected 2 Neural Compute Sticks.
We assigned these 2 devices to an executable network using the Multi-Device Plugin.
If device 2 is disconnected, model serving still works. (I guess this is because of the other device, device 1.)
However, when device 1 is disconnected, it always fails and raises a runtime error, even though device 2 is still connected.
Below is the error raised in this situation.
Traceback (most recent call last):
File "/ssd/hq_patrol_vision/src/emergency_action/src/emergency_action_node.py", line 126, in main
if stat.action_on and action_model.is_get_results():
File "/ssd/hq_patrol_vision/src/emergency_action/src/action/action_wrapper.py", line 66, in is_get_results
File "/ssd/hq_patrol_vision/src/emergency_action/src/action/openvino/ncs_model.py", line 161, in is_get_results
if self.exec_net.requests[req].wait(0) == 0:
File "ie_api.pyx", line 1243, in openvino.inference_engine.ie_api.InferRequest.wait
File "ie_api.pyx", line 1268, in openvino.inference_engine.ie_api.InferRequest.wait
RuntimeError: [ GENERAL_ERROR ]
Is this because device 1 is the master node?
2. As I mentioned in the original question, when a Neural Compute Stick is disconnected during serving,
the following error messages are produced continuously and do not stop.
E: [xLink] [ 215984] [Scheduler00Thr] sendEvents:998 Event sending failed
E: [global] [ 216984] [Scheduler00Thr] dispatcherEventSend:53 Write failed (header) (err -4) | event XLINK_WRITE_REQ
This also happens with the Multi-Device Plugin (say, devices 1 and 2 are assigned, and device 2 is disconnected during serving).
Is it safe to leave the threads or processes that produce these messages running in the background?
Could you provide us with some information regarding your Multi-Device Plugin configuration (e.g., device priority assignment)?
There are 3 ways to specify the target devices for the Multi-Device Plugin, as mentioned here:
1. Pass a Prioritized List as a Parameter in ie.load_network()
2. Pass a List as a Parameter, and Dynamically Change Priorities during Execution. Note that the priorities of the devices can be changed in real time for the executable network.
3. Use Explicit Hints for Controlling Request Numbers Executed by Devices
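To make the 2nd option more concrete, here is a small sketch of reordering priorities on an already loaded executable network. The helper name promote_device is hypothetical. It assumes that ExecutableNetwork.set_config accepts the "MULTI_DEVICE_PRIORITIES" key, as in the Multi-Device Plugin documentation for 2021.4. Whether this helps when a stick is physically unplugged has not been verified here.

```python
def promote_device(exec_net, devices, preferred):
    """Move `preferred` to the front of the priority list of a MULTI network.

    `exec_net` is expected to be an ExecutableNetwork loaded with "MULTI:...".
    Returns the new priority order as a list.
    """
    if preferred not in devices:
        raise ValueError("%s is not one of the loaded devices" % preferred)
    ordered = [preferred] + [name for name in devices if name != preferred]
    # The plugin takes the priorities as one comma-separated string.
    exec_net.set_config({"MULTI_DEVICE_PRIORITIES": ",".join(ordered)})
    return ordered
```

For instance, after loading with "MULTI:MYRIAD.1.2-ma2480,MYRIAD.1.4-ma2480", calling promote_device(exec_net, ["MYRIAD.1.2-ma2480", "MYRIAD.1.4-ma2480"], "MYRIAD.1.4-ma2480") would ask the plugin to prefer the second stick.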
As for the warnings, you can ignore them as long as no error is produced.
Sorry for the late reply.
Our implementation was as follows:
all_devices = "MULTI:MYRIAD.1.2-ma2480,MYRIAD.1.4-ma2480"
exec_net = ie.load_network(network=net, device_name=all_devices)
This is the 1st way in your answer.
Based on your answer, we suspect that priority is the reason inference fails when the USB connection of the first device (MYRIAD.1.2-ma2480) fails.
In our case, inference still works when the MYRIAD.1.4-ma2480 device is detached (although it produces warning messages).
However, when MYRIAD.1.2-ma2480 (the first device) is detached, inference fails permanently even though the second device is still connected.
Is the priority of the first device to blame?
I also wonder whether this can be prevented by dynamically changing priorities (the 2nd way in your answer).
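Since the thread leaves this question open, one possible workaround on the application side is sketched below: catch the RuntimeError raised by wait() and rebuild the executable network from the devices that are still reported. This is an assumption-laden sketch, not a confirmed fix. The helper name poll_with_recovery is made up, and it has not been verified that IECore.available_devices drops an unplugged stick right away or that reloading succeeds while the old network object still exists.

```python
def poll_with_recovery(ie, net, exec_net, request_id, prefix="MYRIAD"):
    """Poll one request; on RuntimeError, reload the network on surviving devices.

    Returns (exec_net, done). After a reload, `done` is False and the caller
    must resubmit the frame, because the in-flight request is lost.
    """
    try:
        # wait(0) returns immediately; status 0 means the result is ready.
        done = exec_net.requests[request_id].wait(0) == 0
        return exec_net, done
    except RuntimeError:
        survivors = [name for name in ie.available_devices if name.startswith(prefix)]
        if not survivors:
            raise  # nothing left to fall back to
        device_name = "MULTI:" + ",".join(survivors)
        new_exec_net = ie.load_network(network=net, device_name=device_name)
        return new_exec_net, False
```

In the serving loop, the returned exec_net would replace the old one, for example exec_net, done = poll_with_recovery(ie, net, exec_net, req).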
Thanks for answering!
Intel will no longer monitor this thread since we have provided a solution. If you need any additional information from Intel, please submit a new question.