Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision-related on Intel® platforms.

[watchdog] [ 0] sendPingMessage:164 Failed send ping message: X_LINK_ERROR

lilohuang
New Contributor I

@Sahira_at_Intel 

Could you help me find the root cause of the error message below? I got it on my UP-BOARD computer (Win10 1709) with an NCS2 (connected through a USB 3.0 Y-cable with an additional power supply) and OpenVINO 2018 R5.0.1. My code is derived from the C:\Intel\computer_vision_sdk\inference_engine\samples\python_samples\object_detection_demo_yolov3.py sample with a few modifications, and the error appears after running it continuously for 1~2 hours. The modifications are:

1) The input source is an IP camera stream (30 fps) instead of a webcam, and the frames are received in a dedicated Python worker thread (i.e. threading.Thread).
2) The IEPlugin and IENetwork related objects are all instantiated and used in that dedicated worker thread rather than in the main thread.
3) The main thread renders the object detection results. It communicates with the worker thread through a synchronized queue.
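
For readers who want to picture the threading layout in the three points above, here is a minimal sketch. It is not the original code: fake_detect, worker and run are hypothetical names, the inference call is a stub rather than a real Inference Engine request, and a plain list stands in for the IP camera stream.

```python
import queue
import threading


def fake_detect(frame):
    """Hypothetical stand-in for the real inference call; returns a dummy result."""
    return {"frame": frame, "boxes": []}


def worker(frames, results, stop_event):
    # In the real application the IEPlugin/IENetwork objects would be created here,
    # inside the worker thread, and `frames` would be the IP camera stream.
    for frame in frames:
        if stop_event.is_set():
            break
        results.put(fake_detect(frame))
    results.put(None)  # sentinel: tells the main thread that nothing more is coming


def run(frames):
    results = queue.Queue(maxsize=8)  # bounded, so a slow renderer slows the worker down
    stop_event = threading.Event()
    thread = threading.Thread(
        target=worker, args=(frames, results, stop_event), daemon=True
    )
    thread.start()
    rendered = []
    while True:
        item = results.get()  # blocks until the worker hands over a result
        if item is None:
            break
        rendered.append(item["frame"])  # the main thread would draw the boxes here
    thread.join()
    return rendered
```

The queue is the only object shared between the two threads, so the main thread never touches the device objects directly.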

As you know, other people are seeing the same error code in https://ncsforum.movidius.com/discussion/1106/ncs-temperature-issue
Could you escalate the issue to the R&D core team? I would like to know what the error message means. Thank you.

WinUsb_ReadPipe: System err 2
E: [xLink] [ 0] handleIncomingEvent:240 handleIncomingEvent() Read failed -2
WinUsb_WritePipe: System err 22
W: [xLink] [ 0] dispatcherEventReceive:324
WinUsb_WritePipe failed with error:=22
E: [xLink] [ 0] dispatcherEventSend:889 Write failed header -2 | event USB_WRITE_REQ
Failed to handle incoming event
WinUsb_SetPipePolicy: System err 22
E: [xLink] [ 0] dispatcherEventReceive:308 dispatcherEventReceive() Read failed -2 | event 00000010C2B2FEA0 USB_WRITE_REQ
E: [xLink] [ 0] eventReader:256 eventReader stopped
E: [ncAPI] [ 0] ncGraphQueueInference:3538 Can't send trigger request
E: [watchdog] [ 0] sendPingMessage:164 Failed send ping message: X_LINK_ERROR
E: [watchdog] [ 0] sendPingMessage:164 Failed send ping message: X_LINK_ERROR
E: [watchdog] [ 0] sendPingMessage:164 Failed send ping message: X_LINK_ERROR
E: [watchdog] [ 0] sendPingMessage:164 Failed send ping message: X_LINK_ERROR
E: [watchdog] [ 0] sendPingMessage:164 Failed send ping message: X_LINK_ERROR
E: [watchdog] [ 0] sendPingMessage:164 Failed send ping message: X_LINK_ERROR
E: [watchdog] [ 0] sendPingMessage:164 Failed send ping message: X_LINK_ERROR

 

lilohuang
New Contributor I

TBH, I'm frustrated with the Intel NCS2, mainly because it is too unstable for my 24x7 workload (15~30 fps, TinyYOLOv3). My room temperature is under 30°C. I don't even know what the error message means. I've bought two more NCS2 devices for failover, but I don't think buying more of them is the best solution. Can anyone escalate the issue to the R&D core team? Hopefully they can reproduce the error that people have encountered by running a torture test (long-running test). Thank you!

Note: other people have encountered a similar error: https://ncsforum.movidius.com/discussion/comment/4726#Comment_4726

lilohuang
New Contributor I

My issue seems to be resolved after I replaced my previous 4-port AC-powered USB hub ($30 USD) with a brand-new 7-port AC-powered USB hub ($40 USD). The new 7-port hub has a LARGER AC adapter than the 4-port hub. I'm not 100% sure, but the X_LINK_ERROR seems to be related to USB controller chip compatibility or USB power stability. It is still running after 12 hours of torture testing without any error. Thanks.

https://ncsforum.movidius.com/discussion/1619/watchdog-0-sendpingmessage-164-failed-send-ping-message-x-link-error#latest

Lee__Hanbeen
Beginner

I have a similar issue.

One model works very well, but another model doesn't work.

I don't know what I should do.

Do you think the only way to solve this problem is to use a different cable?

I am not sure why one model works well and the other one hits X_LINK_ERROR in the same environment.

lilohuang
New Contributor I

My solution was to buy a powerful 7-port AC-powered USB hub with a LARGE AC adapter, rather than using a USB Y-cable or a small AC-powered USB hub. The error can occur after a long-running test even when the NCS2 is connected directly to my laptop or desktop. I guess power consumption is critical for the NCS2, so a POWERFUL AC-powered USB hub seems to be required. I'm now using a TP-LINK UH720 USB hub with two NCS2 devices connected simultaneously. No error occurred during 18 hours of heavy inference load. Just FYR. Thanks!

Lee__Terry
Beginner

I have a very similar issue. Thanks for sharing your experience.

I will try a good USB hub powered by USB Type-C.

Since my application needs to work 24/7 and the systems are installed at remote locations, reliability is critical.

I'm looking for ways for the system to recover when the error occurs.

Example error: E: [watchdog] [         0] sendPingMessage:164     Failed send ping message: X_LINK_ERROR

Is there any way to reset the state?

I tried resetting/recreating the inference requests:

async_infer_request_next = network.CreateInferRequestPtr();
async_infer_request_curr = network.CreateInferRequestPtr();

but I'm not able to get it working. The message continues to show up. Is there any way to reset the thread that produces the message?

Environment: NCS2, OpenVINO R5, Windows 10, object_detection_demo_ssd_async.

Any thoughts and suggestions would be greatly appreciated.

Thanks,

Terry 

 

lilohuang
New Contributor I

Recreating the async inference request pointers won't help. Before I fixed the problem with a powerful AC-powered USB hub, my workaround was to run the NCS2 detection in a dedicated child process, communicate with it through IPC (inter-process communication), kill the child process if it stopped responding when the error occurred, and restart it after sleeping for a few seconds. It worked for me, but it is very ugly.
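
The kill-and-restart workaround described above could look roughly like the sketch below. This is not the original code: SupervisedWorker and python_worker_cmd are hypothetical names, and the line-based stdin/stdout protocol is only one assumed form of IPC, since the post does not say which mechanism was used. The child is expected to answer each request line with one reply line.

```python
import queue
import subprocess
import sys
import threading
import time


def python_worker_cmd(script_path):
    """Command line that runs a (hypothetical) detection worker script."""
    return [sys.executable, script_path]


class SupervisedWorker:
    """Runs a worker in a child process and restarts it when it stops answering."""

    def __init__(self, cmd, timeout_s=5.0, restart_delay_s=3.0):
        self.cmd = cmd
        self.timeout_s = timeout_s
        self.restart_delay_s = restart_delay_s
        self.restarts = 0
        self._start()

    def _start(self):
        self.proc = subprocess.Popen(
            self.cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            text=True, bufsize=1,
        )
        self.lines = queue.Queue()
        # A reader thread turns blocking pipe reads into a queue we can poll with a timeout.
        threading.Thread(
            target=self._pump, args=(self.proc.stdout, self.lines), daemon=True
        ).start()

    @staticmethod
    def _pump(stream, out):
        for line in stream:  # ends at EOF, i.e. when the child exits or is killed
            out.put(line.rstrip("\n"))

    def _restart(self):
        self.proc.kill()
        self.proc.wait()
        time.sleep(self.restart_delay_s)  # give the device a moment before retrying
        self.restarts += 1
        self._start()

    def request(self, message):
        """Send one line to the child; return its reply, or None if it was restarted."""
        try:
            self.proc.stdin.write(message + "\n")
            self.proc.stdin.flush()
            return self.lines.get(timeout=self.timeout_s)
        except (queue.Empty, OSError):  # no answer in time, or the pipe is broken
            self._restart()
            return None

    def close(self):
        self.proc.kill()
        self.proc.wait()
```

A caller would create something like SupervisedWorker(python_worker_cmd("detector_worker.py")), where detector_worker.py is a placeholder for the script that owns the NCS2, and treat a None reply as a dropped frame.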

BTW, I now have two NCS2 devices plugged into a powerful 7-port AC-powered USB hub (TP-LINK UH720), with no issues under a 24x7 workload so far. Before switching to it, I tested a small 4-port AC-powered USB hub, which DIDN'T work well. So some weak USB hubs might not work well even if they are AC-powered.

Lee__Terry
Beginner

Thanks for the workaround.

I ran some temperature stress tests on the NCS2 and found that it becomes unstable when the ambient temperature is around 50 degrees Celsius. I'm wondering if you have done any temperature stress testing with the AC-powered USB hub (TP-LINK UH720).

Thanks,

Terry

 

lilohuang
New Contributor I

My room temperature is kept below 30°C, so I haven't hit any instability caused by temperature. Based on Sahira_at_Intel's comment in https://ncsforum.movidius.com/discussion/1106/ncs-temperature-issue (quoted below), I guess you would need an active cooling system.

Hi @lilohuang

The maximum temperature for the NCS is about 70°C, with the ideal temperature being between 0°C and 40°C. With OpenVINO, you are not able to get the device temperature (that feature is only available with the NCSDK), so I don't think you would be able to output an error message when the device heats up to a certain temperature.

Sincerely,
Sahira

 

Carlyon__Shane
Beginner

Hi,  @lilohuang

 

I am using UP AI Core X on Ubuntu 16.04 - https://up-shop.org/featured/261-up-ai-core-x.html

 

I am getting this log after running an object detection engine:

E: [xLink] [    450217] dispatcherEventReceive:368      dispatcherEventReceive() Read failed -4 | event 0x7f0e9dd00ee0 XLINK_READ_REL_REQ

E: [xLink] [    450217] eventReader:230 eventReader stopped
E: [xLink] [    450218] XLinkReadDataWithTimeOut:1377   Event data is invalid
E: [ncAPI] [    450218] ncFifoReadElem:3313     Packet reading is failed.
E: [watchdog] [    450386] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    451385] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    452387] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    453387] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    454389] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    455388] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    456387] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    457386] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    458386] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    459385] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    460385] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    461384] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    462384] sendPingMessage:132  Failed send ping message: X_LINK_ERROR
E: [watchdog] [    462385] watchdog_routine:327 [0x7f0e98086e00] device, not respond, removing from watchdog
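
For anyone who wants to detect this state automatically, for example to trigger a restart, the log lines above follow a regular enough layout to parse. The sketch below is only inferred from the lines quoted in this thread, not from any documented log format; parse_line and count_ping_failures are hypothetical names, and levels other than E and W are an assumption.

```python
import re

# Layout inferred from the quoted lines:
#   LEVEL: [module] [timestamp] function:line message
LOG_RE = re.compile(
    r"^(?P<level>[A-Z]): \[(?P<module>\w+)\] \[\s*(?P<ts>\d+)\] "
    r"(?P<func>\w+):(?P<line>\d+)\s*(?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def count_ping_failures(lines):
    """Count watchdog lines that report X_LINK_ERROR."""
    failures = 0
    for line in lines:
        record = parse_line(line)
        if record and record["module"] == "watchdog" and "X_LINK_ERROR" in record["msg"]:
            failures += 1
    return failures
```

Several consecutive ping failures, as in the log above, would be a reasonable signal that the device is no longer responding.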

Any idea?

Sahira_Intel
Moderator

Hi Shane,

I've responded to your other thread here: https://software.intel.com/en-us/forums/computer-vision/topic/814373

 

Thanks so much,

Sahira 
