My server has 4x Intel Xeon Phi 5110P accelerator cards. It runs CentOS 6.5 with kernel version 2.6.32-431.29.2.el6.x86_64.
When updating MPSS from 2.1 to 3.3.4 and 3.4.3, I receive the following error:
[root@XXXXX mpss-3.3.4]# /usr/bin/micflash -update -device all -smcbootloader
Error getting SCIF driver version
failed to open mic'0': /sys/class/mic/mic0/family: Knights Corner: not supported: Operation canceled
failed to open mic'1': /sys/class/mic/mic1/family: Knights Corner: not supported: Operation canceled
failed to open mic'2': /sys/class/mic/mic2/family: Knights Corner: not supported: Operation canceled
failed to open mic'3': /sys/class/mic/mic3/family: Knights Corner: not supported: Operation canceled
As the comparison is done on the string in the family file, are the cards hardware-incompatible with newer versions of the MPSS?
The cards have the following configuration -
Thank you!
Hello Zhong,
Can you please verify that you are using the updated Intel® Xeon Phi™ Coprocessor system administration guide? Also, I would like to know whether you were able to successfully uninstall the previous version before starting to install the newer MPSS version 3.3.x or 3.4.x.
Thanks,
Hi Zhong,
As Sunny points out, you need to uninstall MPSS 2.x before installing MPSS 3.x. To uninstall MPSS 2.x:
# yum remove intel-mic\*
Please refer to the readme file for more information. Thank you.
Dear both
Thank you for your suggestions. Most of my information came from the README files contained within each release.
I have done an `rpm -qa | grep intel-mic*` (or mpss*) to ensure that there were no leftover packages from the previous installations. None of the uninstallation attempts returned error messages. NetworkManager has been stopped, as recommended.
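One caveat with the check above: an unquoted `intel-mic*` can be expanded by the shell if a matching file exists in the current directory, and as a regular expression the `*` only applies to the final `c`. A minimal sketch with a quoted, anchored pattern follows. `list_mpss_packages` is a name made up for illustration, and the assumption that the relevant package names start with `intel-mic` or `mpss` comes only from the two patterns mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: read package names (one per line) on stdin and
# keep only those starting with "intel-mic" or "mpss".
# "|| true" makes the function return success even when nothing matches,
# which is the desired outcome after a clean uninstall.
list_mpss_packages() {
    grep -E '^(intel-mic|mpss)' || true
}

# Usage on the host (not executed here):
#   rpm -qa | list_mpss_packages
```

Empty output means no matching packages remain installed.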
I have attempted to downgrade the installation from MPSS 3.4.3 to 3.3.4 and 3.1.6, but in all cases the same error occurred. The installation would proceed as normal for the host side stack, but fail when I attempt to update the firmware.
I went through the Xeon Phi System Administrator Guide, but the flash FAQ linked from it (https://software.intel.com/sites/default/files/Flash%20FAQ.pdf) does not address the issue I'm experiencing.
A check showed that newer devices have "x100" as the value in /sys/class/mic/mic0/family, instead of "Knights Corner", which is what our cards (5110P) report. Perhaps the micflash utility is refusing to proceed based on that value. While it's tempting to rewrite the file, I'm hesitant to do so in case there's a physical incompatibility and I brick my Xeon Phi cards. As I was unable to find documentation on this, I would appreciate it if someone could fill me in.
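For comparing the family string across all cards without touching anything, a read-only loop is enough. This is only a sketch: `show_mic_family` is a hypothetical helper name, and the only path it assumes is the `/sys/class/mic/micN/family` file quoted in the error output.

```shell
#!/bin/sh
# Print "<card>: <family>" for every card directory under a sysfs root.
# Read-only: nothing is ever written to the family files.
# The optional first argument overrides the root (useful for testing).
show_mic_family() {
    root="${1:-/sys/class/mic}"
    for f in "$root"/mic[0-9]*/family; do
        # Skip the unexpanded glob when no card directories exist.
        [ -r "$f" ] || continue
        card=$(basename "$(dirname "$f")")
        printf '%s: %s\n' "$card" "$(cat "$f")"
    done
}

# Usage on the host (not executed here):
#   show_mic_family
```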
To be really obsessive about removing all the old MPSS files before doing a clean install, you should use the uninstall script/directions for the version you have installed. For example, run the uninstall script from MPSS 3.1 (and 3.3 and 3.4 if you are really obsessive; you will get a number of errors doing this because all (most?) of the files should already be uninstalled), then go into the MPSS 2.1 directory and run the uninstall script there. Also make sure you uninstall all the OFED rpm files that were installed from the MPSS 2.1 directory and the OpenFabrics release, if you were running OFED. But that is being really obsessive, and ultimately I don't think it will catch any more files than what Sunny and Loc said.
But before you do that, check the log files for error messages.
As I recall, when you run micflash, what happens behind the scenes is that the micflash command loads the mic kernel module if it isn't already loaded, starts the mpss daemon and boots the coprocessors. Only then does it actually try updating the flash and SMC on the card. So try this -
- service mpss unload
- modprobe mic
- dmesg
- tail /var/log/messages
- (look for any errors)
- service mpss start
- check dmesg and /var/log/messages again
- micctrl -b
- check dmesg and /var/log/messages again
Let us know if you see any error messages. I suspect you will.
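The sequence above could be wrapped in a small script so that none of the log checks gets skipped. This is a rough sketch, not an official procedure: `run`, `check_logs`, `diagnose_mpss`, the `DRY_RUN` variable and the 50-line tail length are all assumptions made for illustration. The commands themselves are the ones listed in the steps.

```shell
#!/bin/sh
# Echo each command; execute it only when DRY_RUN is not set to 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Show the most recent kernel and syslog messages (50 lines is arbitrary).
check_logs() {
    run sh -c 'dmesg | tail -n 50'
    run tail -n 50 /var/log/messages
}

# Unload the driver stack, reload the module, start the service and boot
# the cards, checking the logs after every step.
diagnose_mpss() {
    run service mpss unload
    run modprobe mic
    check_logs
    run service mpss start
    check_logs
    run micctrl -b
    check_logs
}

# Usage as root on the host (not executed here):
#   DRY_RUN=1; diagnose_mpss    # print the plan only
#   DRY_RUN=0; diagnose_mpss    # actually run it
```

Running with `DRY_RUN=1` first prints the plan without touching the driver or the cards.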
TLDR: Installation worked! Thank you for your help!
The /sys/class/mic/mic0/family value is now "x100"
I think I might have typed in `service mpss stop` instead of `service mpss unload` earlier. I also didn't start the mpss service before micflash (as the docs recommended).
========================================================
Here's a quick list of what I tried.
Uninstallation:
As MPSS 2.1 does not come with an uninstall script, I executed:
- service mpss unload
- yum remove intel-mic\*
as instructed in docs/readme-en.txt.
dmesg reported that the PCI devices were disabled
mic3: Resetting (Post Code 09)
mic0: Resetting (Post Code 09)
mic3: Resetting (Post Code 12)
mic3: Transition from state resetting to ready
mic0: Resetting (Post Code 12)
mic0: Transition from state resetting to ready
mic 0000:09:00.0: PCI INT A disabled
mic 0000:08:00.0: PCI INT A disabled
mic 0000:05:00.0: PCI INT A disabled
mic 0000:04:00.0: PCI INT A disabled
/var/log/messages showed the intel-mic packages being erased.
Apr 9 11:37:07 localhost yum[2218]: Erased: intel-mic-mpm
Apr 9 11:37:07 localhost yum[2218]: Erased: intel-mic-gdb
Apr 9 11:37:08 localhost yum[2218]: Erased: intel-mic-micmgmt
Apr 9 11:37:09 localhost yum[2218]: Erased: intel-mic
Apr 9 11:37:10 localhost yum[2218]: Erased: intel-mic-kmod
Apr 9 11:37:10 localhost yum[2218]: Erased: intel-mic-sysmgmt
Apr 9 11:37:12 localhost yum[2218]: Erased: intel-mic-flash
Apr 9 11:37:12 localhost yum[2218]: Erased: intel-mic-gpl
The rest of the uninstalls from MPSS 3.1/3.3/3.4 didn't result in any more messages from either dmesg or /var/log/messages. The MPSS 3.3 and 3.4 uninstall scripts halted because no arguments were passed to yum remove (no more packages remained installed on the system). After the above steps, it appears that uninstallation proceeded normally.
Installing MPSS-3.3
tail /var/log/messages - reports names of installed packages (no error messages)
service mpss unload; modprobe mic (no error messages);
dmesg output (identical to /var/log/messages):
vnet: mode: dma, buffers: 62
mic 0000:04:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
mic 0000:04:00.0: setting latency timer to 64
mic 0000:04:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
mic 0000:04:00.0: irq 121 for MSI/MSI-X
mic0: Transition from state ready to resetting
mic_probe 4:0:0 as board #0
mic 0000:05:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
mic 0000:05:00.0: setting latency timer to 64
mic 0000:05:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
mic 0000:05:00.0: irq 122 for MSI/MSI-X
mic1: Transition from state ready to resetting
mic0: Resetting (Post Code 12)
mic0: Transition from state resetting to ready
My Phys addrs: 0x8030270000 and scif_addr 0xc00ff09800
mic_probe 5:0:0 as board #1
mic 0000:08:00.0: PCI INT A -> GSI 40 (level, low) -> IRQ 40
mic 0000:08:00.0: setting latency timer to 64
mic 0000:08:00.0: PCI INT A -> GSI 40 (level, low) -> IRQ 40
mic 0000:08:00.0: irq 123 for MSI/MSI-X
mic2: Transition from state ready to resetting
mic1: Resetting (Post Code 12)
mic1: Transition from state resetting to ready
My Phys addrs: 0x801fb30000 and scif_addr 0xc010526240
mic_probe 8:0:0 as board #2
mic 0000:09:00.0: PCI INT A -> GSI 40 (level, low) -> IRQ 40
mic 0000:09:00.0: setting latency timer to 64
mic 0000:09:00.0: PCI INT A -> GSI 40 (level, low) -> IRQ 40
mic 0000:09:00.0: irq 124 for MSI/MSI-X
mic3: Transition from state ready to resetting
mic2: Resetting (Post Code 12)
mic2: Transition from state resetting to ready
My Phys addrs: 0xffa5c10000 and scif_addr 0xc00ff09140
mic_probe 9:0:0 as board #3
mic: number of devices detected 4
mic3: Resetting (Post Code 12)
mic3: Transition from state resetting to ready
My Phys addrs: 0xffa4aa0000 and scif_addr 0xc022f21200
starting mpss
stdout:
Starting Intel(R) MPSS: [ OK ]
mic0: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)
mic1: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)
mic2: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)
mic3: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)
dmesg:
mic_probe 9:0:0 as board #3
mic: number of devices detected 4
mic3: Resetting (Post Code 12)
mic3: Transition from state resetting to ready
My Phys addrs: 0xffa4aa0000 and scif_addr 0xc022f21200
mic3: Transition from state ready to booting
mic image: /usr/share/mpss/boot/bzImage-knightscorner
MIC 3 Booting
mic0: Transition from state ready to booting
mic image: /usr/share/mpss/boot/bzImage-knightscorner
MIC 0 Booting
mic1: Transition from state ready to booting
mic image: /usr/share/mpss/boot/bzImage-knightscorner
MIC 1 Booting
mic2: Transition from state ready to booting
mic image: /usr/share/mpss/boot/bzImage-knightscorner
MIC 2 Booting
Waiting for MIC 3 boot 5
Waiting for MIC 0 boot 5
Waiting for MIC 1 boot 5
Waiting for MIC 2 boot 5
Waiting for MIC 3 boot 10
Waiting for MIC 0 boot 10
Waiting for MIC 1 boot 10
Waiting for MIC 2 boot 10
Waiting for MIC 3 boot 15
Waiting for MIC 0 boot 15
Waiting for MIC 1 boot 15
Waiting for MIC 2 boot 15
MIC 3 Network link is up
MIC 0 Network link is up
MIC 1 Network link is up
MIC 2 Network link is up
Waiting for MIC 3 boot 20
Waiting for MIC 0 boot 20
Waiting for MIC 1 boot 20
Waiting for MIC 2 boot 20
mic0: Transition from state booting to online
mic2: Transition from state booting to online
mic3: Transition from state booting to online
Waiting for MIC 1 boot 25
mic1: Transition from state booting to online
micctrl -b returns an error because the cards have already booted.
micctrl -rw; sudo /usr/bin/micflash -update -device all -smcbootloader
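To confirm from the dmesg output that every card finished booting before flashing, the "booting to online" transitions can be pulled out of the log. Sketch only: `online_cards` is a hypothetical helper, and it relies solely on the message format visible in the dmesg output above.

```shell
#!/bin/sh
# Read kernel log text on stdin and print the name of each card that
# logged a "booting to online" transition, one per line, without repeats.
online_cards() {
    sed -n 's/^.*\(mic[0-9][0-9]*\): Transition from state booting to online.*$/\1/p' | sort -u
}

# Usage on the host (not executed here):
#   dmesg | online_cards
```

With four cards installed, four names should be listed; a missing one points at the card to investigate.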
