
MPSS Service Locks up server

Carlson__Joel
Beginner

Hey guys,
Reposting this, as it appears my last post didn't go through. I recently built a server and so far have been unable to get my Phi card working in it. Below are my server specs and some debug information. I am running CentOS 6.4 with the stock kernel. I can see the card and run some of the utilities; however, not all of the information is filled in when I run micinfo. I am unable to flash the card, start the mpss service (at least not when Autostart is enabled), or boot the card at all, as each of these causes the server to lock up. The card is seen and reports ready when I run micctrl -s, but if I try anything else with the card it hangs the server. Could anyone provide any assistance? Thanks,
Joel

Server Specs:
Motherboard: ASUS P9X79-E WS
Processor: Xeon E5-2620
Power Supply: Corsair HX850

[root@Phi1 bin]# lspci |grep 225
03:00.0 Co-processor: Intel Corporation Device 2250 (rev 11)
[root@Phi1 bin]# lsmod|grep mic
mic 583839 0
[root@Phi1 bin]# micctrl -s
mic0: ready
[root@Phi1 bin]# micctrl -rwf
mic0: resetting
mic0: ready
[root@Phi1 bin]# micinfo
MicInfo Utility Log

Created Tue Jul 2 20:55:06 2013

System Info
HOST OS : Linux
OS Version : 2.6.32-358.el6.x86_64
Driver Version : 6720-15
MPSS Version : 2.1.6720-15
Host Physical Memory : 32826 MB

Device No: 0, Device Name: mic0

Version
Flash Version : NotAvailable
SMC Firmware Version : NotAvailable
SMC Boot Loader Version : NotAvailable
uOS Version : NotAvailable
Device Serial Number : NotAvailable

Board
Vendor ID : 0x8086
Device ID : 0x2250
Subsystem ID : 0x2500
Coprocessor Stepping ID : 3
PCIe Width : x16
PCIe Speed : 5 GT/s
PCIe Max payload size : 256 bytes
PCIe Max read req size : 512 bytes
Coprocessor Model : 0x01
Coprocessor Model Ext : 0x00
Coprocessor Type : 0x00
Coprocessor Family : 0x0b
Coprocessor Family Ext : 0x00
Coprocessor Stepping : B1
Board SKU : NotAvailable
ECC Mode : NotAvailable
SMC HW Revision : NotAvailable

Cores
Total No of Active Cores : NotAvailable
Voltage : NotAvailable
Frequency : NotAvailable

Thermal
Fan Speed Control : NotAvailable
Fan RPM : NotAvailable
Fan PWM : NotAvailable
Die Temp : NotAvailable

GDDR
GDDR Vendor : NotAvailable
GDDR Version : NotAvailable
GDDR Density : NotAvailable
GDDR Size : NotAvailable
GDDR Technology : NotAvailable
GDDR Speed : NotAvailable
GDDR Frequency : NotAvailable
GDDR Voltage : NotAvailable

robert-reed
Valued Contributor II

Coprocessor ready is not the same as Intel MPSS running.  Have you tried doing a "service mpss start" before running micinfo?
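For anyone comparing micinfo output before and after starting the service, here is a minimal sketch in Python that lists which fields still report NotAvailable. It assumes the "name : value" line layout shown in the log above; the function name and the sample text are just for illustration and are not part of MPSS.

```python
def unavailable_fields(micinfo_text):
    """Return the field names whose value is 'NotAvailable' in micinfo output."""
    missing = []
    for line in micinfo_text.splitlines():
        # Field lines look like "Flash Version : NotAvailable".
        # Lines without " : " (section titles, "Device No: 0, ...") are skipped.
        name, sep, value = line.partition(" : ")
        if sep and value.strip() == "NotAvailable":
            missing.append(name.strip())
    return missing


# A few lines copied from the log in the original post.
sample = """\
Flash Version : NotAvailable
uOS Version : NotAvailable
Vendor ID : 0x8086
PCIe Width : x16
"""
print(unavailable_fields(sample))
```

Saving the output of micinfo to a file and passing its contents to the function before and after "service mpss start" should show which fields only fill in once the service is running.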

Frances_R_Intel
Employee

As Robert points out, some of the information in micinfo doesn't show up until you start the mpss. But let's back up a bit.

There is a flow chart on troubleshooting hardware vs. software problems - http://software.intel.com/sites/default/files/forum/393956/intelr-mpss-for-linux-troubleshoot-flow-chart.pdf - but it sounds like you have at least started down that path. One of the boxes on the "mpss service won't start" path asks you to check the message log on the host for messages showing that your host BIOS isn't configured for large BAR. That would be a good thing to check. Also, since you are using CentOS rather than straight RHEL, did you recompile the mic kernel module? You will need to do that; the directions are in the readme-xx.txt files that come with the MPSS. And what error message do you get when you can't flash the coprocessor?
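As a rough aid for the log check, here is a small Python sketch that narrows a host log down to lines mentioning both the mic driver and a BAR. The keywords are an assumption, since the exact wording of the message isn't quoted in this thread, and the sample lines are made up for illustration; treat the result as a shortlist to read by hand.

```python
import re

# Assumed keywords: "mic" or "mic0" etc., and "BAR" or "BAR0" etc.
# The real driver message may be worded differently.
MIC = re.compile(r"\bmic\d*\b", re.IGNORECASE)
BAR = re.compile(r"\bBAR\d*\b", re.IGNORECASE)


def candidate_lines(lines):
    """Keep only the log lines that mention both the mic driver and a BAR."""
    return [
        line.rstrip("\n")
        for line in lines
        if MIC.search(line) and BAR.search(line)
    ]


# Hypothetical log lines, not taken from a real system.
sample_log = [
    "Jul  2 20:50:01 Phi1 kernel: mic0: example line mentioning BAR size\n",
    "Jul  2 20:50:02 Phi1 kernel: usb 1-1: new device\n",
    "Jul  2 20:50:03 Phi1 kernel: dynamic barrier test\n",
]
for hit in candidate_lines(sample_log):
    print(hit)
```

On CentOS 6 the host log is normally /var/log/messages, so passing an open file handle for that path (as root) in place of sample_log would scan the real log.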

Carlson__Joel
Beginner

I am unable to start the mpss service. The machine locks up immediately when I try. I have followed every step of the flowchart, which always leads me to the blue box eventually. I have tried the RHEL6 kernel module as well as recompiling the kernel module from the source rpm. Both lead to the same result. I don't get any error messages when trying to flash the coprocessor. As soon as I run the command, the machine freezes. I am using the following command, although I have also tried providing the firmware file in the command. Both produce the same result. Thanks for getting back to me so quickly. I can start the mpss service if I disable autostart, though I don't know if that buys me anything.

micflash -update -device all

Frances_R_Intel
Employee

It looks like your earlier post just showed up. You attached a tar file of some relevant pieces in there, so I can try looking through that. But it really bugs me that the host hangs when you try to update flash. Can you go back to ASUS and ask them if they know of any reason that the coprocessor wouldn't work with the ASUS P9X79-E WS motherboard? Looking at their web site, I see that the ASUS P9X79 WS will work with the coprocessor (although it says active fan SKU only), but the ASUS P9X79-E WS page doesn't say anything about it.

Carlson__Joel
Beginner

Thanks Frances,

I submitted a ticket with ASUS earlier this morning but haven't gotten a reply. I am pretty doubtful that they will be able to help me. I am leaning towards the motherboard being the problem myself. The motherboard page doesn't say it supports the 5 series of the Phi card, although I have seen one other person on these forums who has gotten it working with the ASUS P9X79 WS motherboard. Maybe I bought the wrong version and need to switch to that version of the motherboard. Thanks, and let me know if you see anything in the logs I attached.

Carlson__Joel
Beginner

Hey guys, 

I wanted to follow up on this thread in case anyone else runs into this problem later. I was finally able to get the Phi card working. After trying to get it working to no avail, I installed SLES 11 SP2 to check whether the problem still happened, so I could rule out software as the cause. After getting everything installed, I tried starting the MPSS service and, lo and behold, it still locked up. I then decided to downgrade the BIOS. After the downgrade the server would no longer boot, so I had to remove the card, boot into the BIOS, enable Phi support, and reinstall the card. I booted back into the OS and tried starting the service, and it started perfectly. I am unsure exactly what fixed the problem, but it must have been one of two things: either the newest BIOS does not work with the Phi card, or it was the one other setting that seems to have changed, PCIE Spread Spectrum. That option is now disabled but was previously enabled.

Now I just have to figure out how to use this thing.
