Hey Everyone,
I am investigating why our two MIC cards suddenly went offline at some point during the last week or so (it is hard to tell exactly when, as they are only used occasionally). As of right now, I can neither ping nor SSH into either card. For some reason, mic0 and mic1 no longer have IPv4 (inet) addresses.
I have included the micinfo output below. Both cards are currently reported as online by micctrl --status:
mic0: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)
mic1: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)
Any help would be greatly appreciated.
Warm Regards,
Joe
MicInfo Utility Log
Created Thu Aug 16 14:37:10 2018

    System Info
        HOST OS                    : Linux
        OS Version                 : 2.6.32-431.20.5.el6.x86_64
        Driver Version             : 3.5.2-1
        MPSS Version               : 3.5.2
        Host Physical Memory       : 132124 MB

Device No: 0, Device Name: mic0

    Version
        Flash Version              : 2.1.02.0391
        SMC Firmware Version       : 1.17.6900
        SMC Boot Loader Version    : 1.8.4326
        uOS Version                : 2.6.38.8+mpss3.5.2
        Device Serial Number       : ADKC32101046

    Board
        Vendor ID                  : 0x8086
        Device ID                  : 0x2250
        Subsystem ID               : 0x2500
        Coprocessor Stepping ID    : 3
        PCIe Width                 : x16
        PCIe Speed                 : 5 GT/s
        PCIe Max payload size      : 256 bytes
        PCIe Max read req size     : 512 bytes
        Coprocessor Model          : 0x01
        Coprocessor Model Ext      : 0x00
        Coprocessor Type           : 0x00
        Coprocessor Family         : 0x0b
        Coprocessor Family Ext     : 0x00
        Coprocessor Stepping       : B1
        Board SKU                  : B1PRQ-5110P/5120D
        ECC Mode                   : Enabled
        SMC HW Revision            : Product 225W Passive CS

    Cores
        Total No of Active Cores   : 60
        Voltage                    : 1051000 uV
        Frequency                  : 1052631 kHz

    Thermal
        Fan Speed Control          : N/A
        Fan RPM                    : N/A
        Fan PWM                    : N/A
        Die Temp                   : 42 C

    GDDR
        GDDR Vendor                : Elpida
        GDDR Version               : 0x1
        GDDR Density               : 2048 Mb
        GDDR Size                  : 7936 MB
        GDDR Technology            : GDDR5
        GDDR Speed                 : 5.000000 GT/s
        GDDR Frequency             : 2500000 kHz
        GDDR Voltage               : 1501000 uV

Device No: 1, Device Name: mic1

    Version
        Flash Version              : 2.1.02.0391
        SMC Firmware Version       : 1.17.6900
        SMC Boot Loader Version    : 1.8.4326
        uOS Version                : 2.6.38.8+mpss3.5.2
        Device Serial Number       : ADKC32100888

    Board
        Vendor ID                  : 0x8086
        Device ID                  : 0x2250
        Subsystem ID               : 0x2500
        Coprocessor Stepping ID    : 3
        PCIe Width                 : x16
        PCIe Speed                 : 5 GT/s
        PCIe Max payload size      : 256 bytes
        PCIe Max read req size     : 512 bytes
        Coprocessor Model          : 0x01
        Coprocessor Model Ext      : 0x00
        Coprocessor Type           : 0x00
        Coprocessor Family         : 0x0b
        Coprocessor Family Ext     : 0x00
        Coprocessor Stepping       : B1
        Board SKU                  : B1PRQ-5110P/5120D
        ECC Mode                   : Enabled
        SMC HW Revision            : Product 225W Passive CS

    Cores
        Total No of Active Cores   : 60
        Voltage                    : 1036000 uV
        Frequency                  : 1052631 kHz

    Thermal
        Fan Speed Control          : N/A
        Fan RPM                    : N/A
        Fan PWM                    : N/A
        Die Temp                   : 41 C

    GDDR
        GDDR Vendor                : Elpida
        GDDR Version               : 0x1
        GDDR Density               : 2048 Mb
        GDDR Size                  : 7936 MB
        GDDR Technology            : GDDR5
        GDDR Speed                 : 5.000000 GT/s
        GDDR Frequency             : 2500000 kHz
        GDDR Voltage               : 1501000 uV
Have you tried restarting the MPSS daemon? I've seen this issue with my KNC cards in the past, where they would no longer be reachable. It always happened when they were left unused for several days or weeks, and a 'service mpss restart' usually did the trick.
Also, you're using quite an old version of the MPSS stack. I'd recommend upgrading to the latest version, 3.8.4; since doing so I have not experienced the network dropouts for almost a year now.
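Before restarting, it may help to confirm from the host which interfaces have actually lost their address. The following is only a rough sketch: it assumes the host-side interfaces are named mic0 and mic1 (as in the original post) and that the iproute `ip` tool is installed. The function names `has_ipv4` and `check_mic_ifaces` are made up for this example and are not part of MPSS.

```shell
#!/bin/sh
# Hypothetical helpers (not part of MPSS) for spotting host interfaces
# that have lost their IPv4 address.

# has_ipv4: reads `ip -4 -o addr show` output on stdin and succeeds
# if at least one "inet" (IPv4) address line is present.
has_ipv4() {
    grep -q ' inet '
}

# check_mic_ifaces: for each interface name given as an argument, print
# whether it currently has an IPv4 address. Returns non-zero if any
# interface has none (or does not exist).
check_mic_ifaces() {
    missing=0
    for ifc in "$@"; do
        if ip -4 -o addr show dev "$ifc" 2>/dev/null | has_ipv4; then
            echo "$ifc: IPv4 address present"
        else
            echo "$ifc: no IPv4 address"
            missing=1
        fi
    done
    return $missing
}
```

Something like `check_mic_ifaces mic0 mic1 || echo "consider: service mpss restart"` would then print the state of both interfaces and a reminder if either one has no address. It deliberately does not restart anything by itself.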
You can try the command sudo micctrl -R to restart the cards.