
MPSS 3.4 "micctrl --initdefaults" deletes /etc/mpss/micN.conf - SL6.5

Michael_M_1
Beginner
1,216 Views

I have a server running Scientific Linux 6.5 with two Xeon Phi cards installed. I am trying to perform the initial setup for MPSS 3.4. The installation required building the mic driver from source to get a module that would load. The kernel is:
Linux xpacc-serv-03.csl.illinois.edu 2.6.32-431.11.2.el6.x86_64 #1 SMP Tue Mar 25 11:15:18 CDT 2014 x86_64 x86_64 x86_64 GNU/Linux

modprobe works fine, and I applied the microcode update that came with the MPSS 3.4 distribution. The cards are updated and ready:

[root@xpacc-serv-03 mpss-3.4]# micctrl -s
mic0: ready
mic1: ready

The /etc/mpss directory has the initial configuration with no setup files yet:

[root@xpacc-serv-03 mpss-3.4]# ls /etc/mpss
conf.d Settings.ini te_doc.t

I ran micctrl --initdefaults -vv, and it seems it should be creating /etc/mpss/default.conf, /etc/mpss/mic0.conf, and /etc/mpss/mic1.conf:

[root@xpacc-serv-03 mpss-3.4]# micctrl --initdefaults -vv
[Filesys] mic0: Created /etc/mpss/default.conf
[Filesys] mic0: Created /etc/mpss/mic0.conf version 1.1
[Info] mic0: File System Base /usr/share/mpss/boot/initramfs-knightscorner.cpio.gz
[Info] mic0: Common files at /var/mpss/common
[Filesys] mic0: Created directory /var/mpss/common
[Info] mic0: Unique files at /var/mpss/mic0
[Filesys] mic0: Created directory /var/mpss/mic0
[Filesys] mic0: Created directory /var/mpss/mic0/etc
[Filesys] mic0: Created directory /var/mpss/mic0/etc/init.d
[Filesys] mic0: Created directory /var/mpss/mic0/etc/rc1.d
[Filesys] mic0: Created directory /var/mpss/mic0/etc/rc5.d
[Filesys] mic0: Created directory /var/mpss/mic0/etc/network
[Filesys] mic0: Created directory /var/mpss/mic0/etc/ssh
[Filesys] mic0: Created directory /var/mpss/mic0/etc/pam.d
[Filesys] mic0: Created directory /var/mpss/mic0/home
[Info] mic1: Using existing /etc/mpss/default.conf
[Filesys] mic1: Created /etc/mpss/mic1.conf version 1.1
[Info] mic1: File System Base /usr/share/mpss/boot/initramfs-knightscorner.cpio.gz
[Info] mic1: Common files at /var/mpss/common
[Info] mic1: Unique files at /var/mpss/mic1
[Filesys] mic1: Created directory /var/mpss/mic1
[Filesys] mic1: Created directory /var/mpss/mic1/etc
[Filesys] mic1: Created directory /var/mpss/mic1/etc/init.d
[Filesys] mic1: Created directory /var/mpss/mic1/etc/rc1.d
[Filesys] mic1: Created directory /var/mpss/mic1/etc/rc5.d
[Filesys] mic1: Created directory /var/mpss/mic1/etc/network
[Filesys] mic1: Created directory /var/mpss/mic1/etc/ssh
[Filesys] mic1: Created directory /var/mpss/mic1/etc/pam.d
[Filesys] mic1: Created directory /var/mpss/mic1/home
[Info] mic0: Hostname xpacc-serv-03-mic0.csl.illinois.edu
[Filesys] mic0: Created /var/mpss/mic0/etc/hostname
[Filesys] mic0: Update MacAddrs in /etc/mpss/mic0.conf
[Info] mic0: Network Static Pair MIC 172.31.1.1 Host 172.31.1.254
[Filesys] mic0: Updated /etc/sysconfig/network-scripts/ifcfg-mic0
[Warning] mic0: Generating compatibility network config file /opt/intel/mic/filesystem/mic0/etc/sysconfig/network/ifcfg-mic0 for IDB.
[Warning] This may be problematic at best and will be removed in a future release, Check with the IDB release.
[Filesys] mic0: Create /var/mpss/mic0/etc/network/interfaces
[Info] mic0: Removing conflicting existing /etc/hosts entry: 172.31.1.1 xpacc-serv-03-mic0.csl.illinois.edu mic0 #Generated-by-micctrl
[Info] mic0: Updated /etc/hosts with 172.31.1.1 xpacc-serv-03-mic0.csl.illinois.edu
[Filesys] mic0: Update Network in /etc/mpss/mic0.conf
[Info] mic1: Hostname xpacc-serv-03-mic1.csl.illinois.edu
[Filesys] mic1: Created /var/mpss/mic1/etc/hostname
[Filesys] mic1: Update MacAddrs in /etc/mpss/mic1.conf
[Info] mic1: Network Static Pair MIC 172.31.2.1 Host 172.31.2.254
[Filesys] mic1: Updated /etc/sysconfig/network-scripts/ifcfg-mic1
[Warning] mic1: Generating compatibility network config file /opt/intel/mic/filesystem/mic1/etc/sysconfig/network/ifcfg-mic0 for IDB.
[Warning] This may be problematic at best and will be removed in a future release, Check with the IDB release.
[Filesys] mic1: Create /var/mpss/mic1/etc/network/interfaces
[Info] mic1: Removing conflicting existing /etc/hosts entry: 172.31.2.1 xpacc-serv-03-mic1.csl.illinois.edu mic1 #Generated-by-micctrl
[Info] mic1: Updated /etc/hosts with 172.31.2.1 xpacc-serv-03-mic1.csl.illinois.edu
[Filesys] mic1: Update Network in /etc/mpss/mic1.conf
[Info] mic0: Verbose mode Disabled
[Info] mic0: Linux OS image /usr/share/mpss/boot/bzImage-knightscorner
System Map /usr/share/mpss/boot/bzImage-knightscorner
[Info] mic0: Boot On Start Enabled
[Info] mic0: Shutdown Timeout 300
[Info] mic0: MIC Crash Dump at /var/crash/mic size 16
[Filesys] mic0: Created /var/mpss/mic0/etc/fstab
[Filesys] mic0: Created /var/mpss/mic0/etc/passwd
[Filesys] mic0: Created /var/mpss/mic0/etc/shadow
[Filesys] mic0: Created /var/mpss/mic0/etc/group
[Filesys] mic0: Add user 'root' ID 0 GID 0 to /var/mpss/mic0/etc/passwd
[Filesys] mic0: Created directory /var/mpss/mic0/root
[Filesys] mic0: Created /var/mpss/mic0/root/.profile
[Filesys] mic0: Created directory /var/mpss/mic0/root/.ssh
[Filesys] mic0: Created /var/mpss/mic0/root/.ssh/id_rsa from /root/.ssh/id_rsa
[Filesys] mic0: Created /var/mpss/mic0/root/.ssh/id_rsa.pub from /root/.ssh/id_rsa.pub
[Filesys] mic0: Created /var/mpss/mic0/root/.ssh/authorized_keys from /root/.ssh/authorized_keys
[Filesys] mic0: Add user 'sshd' ID 74 GID 74 to /var/mpss/mic0/etc/passwd
[Filesys] mic0: Add user 'nobody' ID 99 GID 99 to /var/mpss/mic0/etc/passwd
[Filesys] mic0: Add user 'nfsnobody' ID 65534 GID 65534 to /var/mpss/mic0/etc/passwd
[Filesys] mic0: Add user 'micuser' ID 0 GID 400 to /var/mpss/mic0/etc/passwd
[Filesys] mic0: Created directory /var/mpss/mic0/home/micuser
[Filesys] mic0: Created /var/mpss/mic0/home/micuser/.profile
[Filesys] mic0: Add group 'root' GID 0 to /var/mpss/mic0/etc/group
[Filesys] mic0: Add group 'sshd' GID 74 to /var/mpss/mic0/etc/group
[Filesys] mic0: Add group 'nobody' GID 99 to /var/mpss/mic0/etc/group
[Filesys] mic0: Add group 'micuser' GID 400 to /var/mpss/mic0/etc/group
[Filesys] mic0: Add group 'nfsnobody' GID 65534 to /var/mpss/mic0/etc/group
[Filesys] mic0: Created /var/mpss/mic0/etc/resolv.conf
[Filesys] mic0: Created /var/mpss/mic0/etc/nsswitch.conf
[Filesys] mic0: Created /var/mpss/mic0/etc/pam.d/common-auth
[Filesys] mic0: Created /var/mpss/mic0/etc/pam.d/common-account
[Filesys] mic0: Created /var/mpss/mic0/etc/pam.d/common-session
[Filesys] /var/mpss/mic0/etc/ssh/ssh_host_key: Created rsa1 keys
[Filesys] /var/mpss/mic0/etc/ssh/ssh_host_rsa_key: Created rsa keys
[Filesys] /var/mpss/mic0/etc/ssh/ssh_host_dsa_key: Created dsa keys
[Filesys] mic0: Created /var/mpss/mic0/etc/localtime
[Info] mic0: ExtraCommandLine 'highres=off'
[Filesys] mic0: Update RootDevice in /etc/mpss/mic0.conf
[Info] mic0: RootDevice RAMFS /var/mpss/mic0.image.gz
[Info] mic0: Console hvc0
[Info] mic0: PowerManagement cpufreq_on;corec6_on;pc3_on;pc6_on
[Info] mic0: Cgroup memory=disabled
[Info] mic1: Verbose mode Disabled
[Info] mic1: Linux OS image /usr/share/mpss/boot/bzImage-knightscorner
System Map /usr/share/mpss/boot/bzImage-knightscorner
[Info] mic1: Boot On Start Enabled
[Info] mic1: Shutdown Timeout 300
[Info] mic1: MIC Crash Dump at /var/crash/mic size 16
[Filesys] mic1: Created /var/mpss/mic1/etc/fstab
[Filesys] mic1: Created /var/mpss/mic1/etc/passwd
[Filesys] mic1: Created /var/mpss/mic1/etc/shadow
[Filesys] mic1: Created /var/mpss/mic1/etc/group
[Filesys] mic1: Add user 'root' ID 0 GID 0 to /var/mpss/mic1/etc/passwd
[Filesys] mic1: Created directory /var/mpss/mic1/root
[Filesys] mic1: Created /var/mpss/mic1/root/.profile
[Filesys] mic1: Created directory /var/mpss/mic1/root/.ssh
[Filesys] mic1: Created /var/mpss/mic1/root/.ssh/id_rsa from /root/.ssh/id_rsa
[Filesys] mic1: Created /var/mpss/mic1/root/.ssh/id_rsa.pub from /root/.ssh/id_rsa.pub
[Filesys] mic1: Created /var/mpss/mic1/root/.ssh/authorized_keys from /root/.ssh/authorized_keys
[Filesys] mic1: Add user 'sshd' ID 74 GID 74 to /var/mpss/mic1/etc/passwd
[Filesys] mic1: Add user 'nobody' ID 99 GID 99 to /var/mpss/mic1/etc/passwd
[Filesys] mic1: Add user 'nfsnobody' ID 65534 GID 65534 to /var/mpss/mic1/etc/passwd
[Filesys] mic1: Add user 'micuser' ID 0 GID 400 to /var/mpss/mic1/etc/passwd
[Filesys] mic1: Created directory /var/mpss/mic1/home/micuser
[Filesys] mic1: Created /var/mpss/mic1/home/micuser/.profile
[Filesys] mic1: Add group 'root' GID 0 to /var/mpss/mic1/etc/group
[Filesys] mic1: Add group 'sshd' GID 74 to /var/mpss/mic1/etc/group
[Filesys] mic1: Add group 'nobody' GID 99 to /var/mpss/mic1/etc/group
[Filesys] mic1: Add group 'micuser' GID 400 to /var/mpss/mic1/etc/group
[Filesys] mic1: Add group 'nfsnobody' GID 65534 to /var/mpss/mic1/etc/group
[Filesys] mic1: Created /var/mpss/mic1/etc/resolv.conf
[Filesys] mic1: Created /var/mpss/mic1/etc/nsswitch.conf
[Filesys] mic1: Created /var/mpss/mic1/etc/pam.d/common-auth
[Filesys] mic1: Created /var/mpss/mic1/etc/pam.d/common-account
[Filesys] mic1: Created /var/mpss/mic1/etc/pam.d/common-session
[Filesys] /var/mpss/mic1/etc/ssh/ssh_host_key: Created rsa1 keys
[Filesys] /var/mpss/mic1/etc/ssh/ssh_host_rsa_key: Created rsa keys
[Filesys] /var/mpss/mic1/etc/ssh/ssh_host_dsa_key: Created dsa keys
[Filesys] mic1: Created /var/mpss/mic1/etc/localtime
[Info] mic1: ExtraCommandLine 'highres=off'
[Filesys] mic1: Update RootDevice in /etc/mpss/mic1.conf
[Info] mic1: RootDevice RAMFS /var/mpss/mic1.image.gz
[Info] mic1: Console hvc0
[Info] mic1: PowerManagement cpufreq_on;corec6_on;pc3_on;pc6_on
[Info] mic1: Cgroup memory=disabled
[root@xpacc-serv-03 mpss-3.4]#

However, only default.conf is created:

[root@xpacc-serv-03 mpss-3.4]# ls -la /etc/mpss
total 32
drwxr-xr-x. 3 root root 4096 Oct 9 14:37 .
drwxr-xr-x. 139 root root 12288 Oct 9 14:37 ..
drwxr-xr-x. 2 root root 4096 Sep 18 17:42 conf.d
-rw-r--r--. 1 root root 602 Oct 9 14:37 default.conf
-rwxr-xr-x. 1 root root 1860 Sep 18 17:54 Settings.ini
-rwxr-xr-x. 1 root root 1962 Sep 18 17:54 te_doc.t
[root@xpacc-serv-03 mpss-3.4]#
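
Since the verbose log and the directory listing disagree, it may help to check for each expected per-card file directly instead of trusting the log. This is a minimal sketch only: the function name missing_mic_confs is made up for illustration, and the card count of 2 is specific to this host.

```shell
#!/bin/sh
# Sketch: list the per-card config files that are absent.
# missing_mic_confs CONF_DIR NCARDS
#   prints "micN.conf" for every N in 0..NCARDS-1 that has no file in CONF_DIR.
missing_mic_confs() {
    conf_dir=$1
    ncards=$2
    i=0
    while [ "$i" -lt "$ncards" ]; do
        [ -f "$conf_dir/mic$i.conf" ] || echo "mic$i.conf"
        i=$((i + 1))
    done
}

# Example for this host (two cards):
#   missing_mic_confs /etc/mpss 2
```

Empty output means every expected file is present.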

I tried creating the files by hand in case there was a permissions problem, although the fact that micctrl can create default.conf suggests there is none (SELinux is disabled, but I thought I would try this anyway).

[root@xpacc-serv-03 mpss-3.4]# touch /etc/mpss/mic0.conf
[root@xpacc-serv-03 mpss-3.4]# touch /etc/mpss/mic1.conf
[root@xpacc-serv-03 mpss-3.4]# ls -la /etc/mpss
total 32
drwxr-xr-x. 3 root root 4096 Oct 9 14:40 .
drwxr-xr-x. 139 root root 12288 Oct 9 14:37 ..
drwxr-xr-x. 2 root root 4096 Sep 18 17:42 conf.d
-rw-r--r--. 1 root root 602 Oct 9 14:37 default.conf
-rw-------. 1 root root 0 Oct 9 14:40 mic0.conf
-rw-------. 1 root root 0 Oct 9 14:40 mic1.conf
-rwxr-xr-x. 1 root root 1860 Sep 18 17:54 Settings.ini
-rwxr-xr-x. 1 root root 1962 Sep 18 17:54 te_doc.t
[root@xpacc-serv-03 mpss-3.4]#

Running "micctrl --initdefaults -vv" now indicates it is updating the existing configuration files for the mic cards:

[root@xpacc-serv-03 mpss-3.4]# micctrl --initdefaults -vv
[Info] mic0: Using existing /etc/mpss/default.conf
[Info] mic0: Using existing /etc/mpss/mic0.conf
[Info] mic0: Updateing Version parameter to 1.1
[Info] mic0: File System Base /usr/share/mpss/boot/initramfs-knightscorner.cpio.gz
[Info] mic0: Common files at /var/mpss/common
[Info] mic0: Unique files at /var/mpss/mic0
[Info] mic1: Using existing /etc/mpss/default.conf
[Info] mic1: Using existing /etc/mpss/mic1.conf
[Info] mic1: Updateing Version parameter to 1.1
[Info] mic1: File System Base /usr/share/mpss/boot/initramfs-knightscorner.cpio.gz
[Info] mic1: Common files at /var/mpss/common
[Info] mic1: Unique files at /var/mpss/mic1
[Info] mic0: Hostname xpacc-serv-03-mic0.csl.illinois.edu
[Filesys] mic0: Update MacAddrs in /etc/mpss/mic0.conf
[Info] mic0: Network Static Pair MIC 172.31.1.1 Host 172.31.1.254
[Filesys] mic0: Updated /etc/sysconfig/network-scripts/ifcfg-mic0
[Warning] mic0: Generating compatibility network config file /opt/intel/mic/filesystem/mic0/etc/sysconfig/network/ifcfg-mic0 for IDB.
[Warning] This may be problematic at best and will be removed in a future release, Check with the IDB release.
[Filesys] mic0: Update /var/mpss/mic0/etc/network/interfaces
[Info] mic0: Removing conflicting existing /etc/hosts entry: 172.31.1.1 xpacc-serv-03-mic0.csl.illinois.edu mic0 #Generated-by-micctrl
[Info] mic0: Updated /etc/hosts with 172.31.1.1 xpacc-serv-03-mic0.csl.illinois.edu
[Filesys] mic0: Update Network in /etc/mpss/mic0.conf
[Info] mic1: Hostname xpacc-serv-03-mic1.csl.illinois.edu
[Filesys] mic1: Update MacAddrs in /etc/mpss/mic1.conf
[Info] mic1: Network Static Pair MIC 172.31.2.1 Host 172.31.2.254
[Filesys] mic1: Updated /etc/sysconfig/network-scripts/ifcfg-mic1
[Warning] mic1: Generating compatibility network config file /opt/intel/mic/filesystem/mic1/etc/sysconfig/network/ifcfg-mic0 for IDB.
[Warning] This may be problematic at best and will be removed in a future release, Check with the IDB release.
[Filesys] mic1: Update /var/mpss/mic1/etc/network/interfaces
[Info] mic1: Removing conflicting existing /etc/hosts entry: 172.31.2.1 xpacc-serv-03-mic1.csl.illinois.edu mic1 #Generated-by-micctrl
[Info] mic1: Updated /etc/hosts with 172.31.2.1 xpacc-serv-03-mic1.csl.illinois.edu
[Filesys] mic1: Update Network in /etc/mpss/mic1.conf
[Info] mic0: Verbose mode Disabled
[Info] mic0: Linux OS image /usr/share/mpss/boot/bzImage-knightscorner
System Map /usr/share/mpss/boot/bzImage-knightscorner
[Info] mic0: Boot On Start Enabled
[Info] mic0: Shutdown Timeout 300
[Info] mic0: MIC Crash Dump at /var/crash/mic size 16
[Info] mic0: ExtraCommandLine ''
[Filesys] mic0: Update RootDevice in /etc/mpss/mic0.conf
[Info] mic0: RootDevice RAMFS /var/mpss/mic0.image.gz
[Info] mic0: Console hvc0
[Info] mic0: PowerManagement cpufreq_on;corec6_on;pc3_on;pc6_on
[Info] mic0: Cgroup memory=disabled
[Info] mic1: Verbose mode Disabled
[Info] mic1: Linux OS image /usr/share/mpss/boot/bzImage-knightscorner
System Map /usr/share/mpss/boot/bzImage-knightscorner
[Info] mic1: Boot On Start Enabled
[Info] mic1: Shutdown Timeout 300
[Info] mic1: MIC Crash Dump at /var/crash/mic size 16
[Info] mic1: ExtraCommandLine ''
[Filesys] mic1: Update RootDevice in /etc/mpss/mic1.conf
[Info] mic1: RootDevice RAMFS /var/mpss/mic1.image.gz
[Info] mic1: Console hvc0
[Info] mic1: PowerManagement cpufreq_on;corec6_on;pc3_on;pc6_on
[Info] mic1: Cgroup memory=disabled
[root@xpacc-serv-03 mpss-3.4]#

However, instead of being updated, the micN.conf files have now disappeared:

[root@xpacc-serv-03 mpss-3.4]# ls -la /etc/mpss/
total 32
drwxr-xr-x. 3 root root 4096 Oct 9 14:42 .
drwxr-xr-x. 139 root root 12288 Oct 9 14:42 ..
drwxr-xr-x. 2 root root 4096 Sep 18 17:42 conf.d
-rw-r--r--. 1 root root 602 Oct 9 14:37 default.conf
-rwxr-xr-x. 1 root root 1860 Sep 18 17:54 Settings.ini
-rwxr-xr-x. 1 root root 1962 Sep 18 17:54 te_doc.t
[root@xpacc-serv-03 mpss-3.4]#
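
Until the cause is known, any micN.conf files that exist can be copied aside before each run of micctrl --initdefaults, so a deleted file can be restored. The following is a rough sketch, not part of MPSS: the function name backup_mic_confs and the backup directory in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: snapshot per-card config files before a command that may remove them.
# backup_mic_confs CONF_DIR BACKUP_DIR
#   copies CONF_DIR/mic*.conf into BACKUP_DIR (created if needed)
#   and prints the number of files copied.
backup_mic_confs() {
    conf_dir=$1
    backup_dir=$2
    count=0
    mkdir -p "$backup_dir" || return 1
    for f in "$conf_dir"/mic*.conf; do
        [ -f "$f" ] || continue          # the glob matched nothing
        cp -p "$f" "$backup_dir"/ || return 1
        count=$((count + 1))
    done
    echo "$count"
}

# Example (as root); /root/mpss-conf-backup is a made-up location:
#   backup_mic_confs /etc/mpss /root/mpss-conf-backup
#   micctrl --initdefaults -vv
```

default.conf is left out on purpose, because the mic*.conf pattern only matches the per-card files.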


The User Guide documentation also seems to indicate that I should have a /etc/sysconfig/mpss file as well, but that does not exist either:

[root@xpacc-serv-03 mpss-3.4]# ls -la /etc/sysconfig/
total 260
drwxr-xr-x. 7 root root 4096 Oct 8 15:40 .
drwxr-xr-x. 139 root root 12288 Oct 9 14:42 ..
-rw-r--r--. 1 root root 403 Oct 7 05:04 atd
-rw-r-----. 1 root root 647 May 28 11:05 auditd
-rw-r--r--. 1 root root 395 Sep 29 09:59 authconfig
-rw-r--r--. 1 root root 4091 Jun 9 05:07 autofs
drwxr-xr-x. 2 root root 4096 Sep 29 09:54 cbq
-rw-r--r--. 1 root root 50 Nov 3 2011 cfengine3
-rw-r--r--. 1 root root 486 Sep 30 08:05 cgconfig
-rw-r--r--. 1 root root 950 Sep 30 08:05 cgred.conf
-rw-r--r--. 1 root root 23 Sep 29 09:59 clock
drwxr-xr-x. 2 root root 4096 Sep 3 08:06 console
-rw-r--r--. 1 root root 2651 Aug 13 2013 cpuspeed
-rw-------. 1 root root 110 Nov 21 2013 crond
-rw-r--r--. 1 root root 16 Sep 29 09:59 desktop
-rw-r--r--. 1 root root 16 Sep 29 09:59 firstboot
-rw-r--r--. 1 root root 25 Sep 29 09:59 grub
-rw-r--r--. 1 root root 5824 Nov 21 2013 hsqldb
-rw-r--r--. 1 root root 529 Jul 18 01:27 htcacheclean
-rw-r--r--. 1 root root 947 Jul 18 01:27 httpd
-rw-r--r--. 1 root root 47 Sep 29 09:59 i18n
-rw-r--r--. 1 root root 1154 Sep 3 08:06 init
-rw-------. 1 root root 1988 Nov 21 2013 ip6tables-config
-rw-------. 1 root root 1974 Nov 21 2013 iptables-config
-rw-r--r--. 1 root root 903 May 5 03:17 irqbalance
-rw-r--r--. 1 root root 1212 Sep 15 11:05 kdump
-rw-r--r--. 1 root root 180 Sep 29 15:01 kernel
-rw-r--r--. 1 root root 63 Sep 29 09:59 keyboard
-rw-r--r--. 1 root root 271 Nov 21 2013 mcelogd
drwxr-xr-x. 2 root root 4096 Oct 9 11:30 modules
-rw-r--r--. 1 root root 634 Sep 3 08:06 netconsole
-rw-r--r--. 1 root root 55 Sep 29 09:51 network
drwxr-xr-x. 4 root root 4096 Nov 24 2010 networking
drwxr-xr-x. 2 root root 4096 Oct 9 14:42 network-scripts
-rw-r--r--. 1 root root 1744 Jul 2 08:05 nfs


I have been searching the blogs and posts for a resolution, but I don't see anything that seems to be related to this issue.

Any assistance and suggestions for how to resolve this would be greatly appreciated.

Thanks,
Mike Marks

16 Replies
Michael_M_1
Beginner

This is an issue with setting up a system with two MIC cards/Xeon Phi using MPSS 3.4. The "micctrl --initdefaults" command seems to delete any /etc/mpss/micN.conf files rather than updating/creating them.

TaylorIoTKidd
New Contributor I

Mike,

Are you saying this is an error? Or are you just noting the cause and solution?

--
Taylor

 

Michael_M_1
Beginner

Hi Taylor,

This is an error. The micctrl program will not create the /etc/mpss/micN.conf files even though the verbose log indicates that it is creating them. micctrl does create default.conf, but not the micN.conf files, so the cards are ready but offline. I created a mic0.conf and mic1.conf myself and I am able to load the provided Knights Corner image, but if I run micctrl --initdefaults, it deletes the micN.conf files instead of updating them:

[root@xpacc-serv-03 ~]# cd /etc/mpss
[root@xpacc-serv-03 mpss]# ls -la
total 40
drwxr-xr-x.   3 root root  4096 Oct 14 16:31 .
drwxr-xr-x. 139 root root 12288 Oct 15 09:58 ..
drwxr-xr-x.   2 root root  4096 Sep 18 17:42 conf.d
-rw-r--r--.   1 root root   602 Oct 14 16:30 default.conf
-rw-------.   1 root root  2369 Oct 14 16:31 mic0.conf
-rw-------.   1 root root  2369 Oct 14 16:31 mic1.conf
-rwxr-xr-x.   1 root root  1860 Sep 18 17:54 Settings.ini
-rwxr-xr-x.   1 root root  1962 Sep 18 17:54 te_doc.t
[root@xpacc-serv-03 mpss]# micctrl -s
mic0: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)
mic1: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)
[root@xpacc-serv-03 mpss]# service mpss stop
Shutting down Intel(R) MPSS:                               [  OK  ]
[root@xpacc-serv-03 mpss]# micctrl -s
mic0: ready
mic1: ready
[root@xpacc-serv-03 mpss]# micctrl --initdefaults -vv
   [Info] mic0: Using existing /etc/mpss/default.conf
   [Info] mic0: Using existing /etc/mpss/mic0.conf
   [Info] mic0: File System Base /usr/share/mpss/boot/initramfs-knightscorner.cpio.gz
   [Info] mic0: Common files at /var/mpss/common
   [Info] mic0: Unique files at /var/mpss/mic0
   [Info] mic1: Using existing /etc/mpss/default.conf
   [Info] mic1: Using existing /etc/mpss/mic1.conf
   [Info] mic1: File System Base /usr/share/mpss/boot/initramfs-knightscorner.cpio.gz
   [Info] mic1: Common files at /var/mpss/common
   [Info] mic1: Unique files at /var/mpss/mic1
   [Info] mic0: Hostname xpacc-serv-03-mic0.csl.illinois.edu
[Filesys] mic0: Remove /etc/sysconfig/network-scripts/ifcfg-mic0
[Filesys] mic0: Update /etc/hosts remove mic0
[Filesys] mic0: Update /etc/hosts remove hostmic0
[Filesys] mic0: Created /etc/sysconfig/network-scripts/ifcfg-mic0
[Warning] mic0: Generating compatibility network config file /opt/intel/mic/filesystem/mic0/etc/sysconfig/network/ifcfg-mic0 for IDB.
[Warning]       This may be problematic at best and will be removed in a future release, Check with the IDB release.
[Filesys] mic0: Update /var/mpss/mic0/etc/network/interfaces
   [Info] mic0: Removing conflicting existing /etc/hosts entry: 172.31.1.1      xpacc-serv-03-mic0.csl.illinois.edu mic0 #Generated-by-micctrl
   [Info] mic0: Updated /etc/hosts with 172.31.1.1 xpacc-serv-03-mic0.csl.illinois.edu
   [Info] mic1: Hostname xpacc-serv-03-mic1.csl.illinois.edu
[Filesys] mic1: Remove /etc/sysconfig/network-scripts/ifcfg-mic1
[Filesys] mic1: Update /etc/hosts remove mic1
[Filesys] mic1: Update /etc/hosts remove hostmic1
[Filesys] mic1: Created /etc/sysconfig/network-scripts/ifcfg-mic1
[Warning] mic1: Generating compatibility network config file /opt/intel/mic/filesystem/mic1/etc/sysconfig/network/ifcfg-mic0 for IDB.
[Warning]       This may be problematic at best and will be removed in a future release, Check with the IDB release.
[Filesys] mic1: Update /var/mpss/mic1/etc/network/interfaces
   [Info] mic1: Removing conflicting existing /etc/hosts entry: 172.31.2.1      xpacc-serv-03-mic1.csl.illinois.edu mic1 #Generated-by-micctrl
   [Info] mic1: Updated /etc/hosts with 172.31.2.1 xpacc-serv-03-mic1.csl.illinois.edu
   [Info] mic0: Verbose mode Disabled
   [Info] mic0: Linux OS image /usr/share/mpss/boot/bzImage-knightscorner
                System Map /usr/share/mpss/boot/bzImage-knightscorner
   [Info] mic0: Boot On Start Enabled
   [Info] mic0: Shutdown Timeout 300
   [Info] mic0: MIC Crash Dump at /var/crash/mic size 16
   [Info] mic0: ExtraCommandLine ''highres=off''
   [Info] mic0: RootDevice RamFS /var/mpss/mic0.image.gz
   [Info] mic0: Console hvc0
   [Info] mic0: PowerManagement cpufreq_on;corec6_on;pc3_on;pc6_on
   [Info] mic0: Cgroup memory=disabled
   [Info] mic1: Verbose mode Disabled
   [Info] mic1: Linux OS image /usr/share/mpss/boot/bzImage-knightscorner
                System Map /usr/share/mpss/boot/bzImage-knightscorner
   [Info] mic1: Boot On Start Enabled
   [Info] mic1: Shutdown Timeout 300
   [Info] mic1: MIC Crash Dump at /var/crash/mic size 16
   [Info] mic1: ExtraCommandLine ''highres=off''
   [Info] mic1: RootDevice RamFS /var/mpss/mic1.image.gz
   [Info] mic1: Console hvc0
   [Info] mic1: PowerManagement cpufreq_on;corec6_on;pc3_on;pc6_on
   [Info] mic1: Cgroup memory=disabled
[root@xpacc-serv-03 mpss]# ls -la
total 32
drwxr-xr-x.   3 root root  4096 Oct 15 10:00 .
drwxr-xr-x. 139 root root 12288 Oct 15 10:00 ..
drwxr-xr-x.   2 root root  4096 Sep 18 17:42 conf.d
-rw-r--r--.   1 root root   602 Oct 14 16:30 default.conf
-rwxr-xr-x.   1 root root  1860 Sep 18 17:54 Settings.ini
-rwxr-xr-x.   1 root root  1962 Sep 18 17:54 te_doc.t
[root@xpacc-serv-03 mpss]#

These are the contents of the mic0.conf I created by hand, based on the User Guide and the outputs above:

[root@xpacc-serv-03 mpss]# cat mic0.conf
Version 1 1
Include default.conf
Include conf.d/*.conf
OSimage /usr/share/mpss/boot/bzImage-knightscorner /usr/share/mpss/boot/System.map-knightscorner
BootOnStart enabled
PowerManagement cpufreq_on;corec6_on;pc3_on;pc6_on
ExtraCommandLine 'highres=off'
Cgroup memory=disabled
ShutdownTimeout 300
Base CPIO /usr/share/mpss/boot/initramfs-knightscorner.cpio.gz
RootDevice RAMFS /var/mpss/mic0.image.gz
CommonDir /var/mpss/common
MicDir /var/mpss/mic0
Hostname xpacc-serv-03-mic0.csl.illinois.edu
MacAddrs Serial
Network class=StaticPair micip=172.31.1.1 hostip=173.31.1.254 mtu=64k netbits=24
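
One detail in the hand-written file is worth a sanity check: the Network line gives micip=172.31.1.1 but hostip=173.31.1.254, while the micctrl log earlier reports "MIC 172.31.1.1 Host 172.31.1.254". A check along the following lines would flag addresses that do not share a network. It is a sketch only: the helper names are invented, it assumes the key=value layout shown in the file above, and it says nothing about how micctrl itself parses the line.

```shell
#!/bin/sh
# Sketch: verify that micip and hostip on a "Network" line share a network.

# field KEY LINE -> prints the value of KEY=value found in LINE
field() {
    for tok in $2; do
        case $tok in
            "$1"=*) echo "${tok#*=}" ;;
        esac
    done
}

# ip_to_int A.B.C.D -> prints the address as a 32-bit integer
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# same_subnet IP1 IP2 NETBITS -> exit status 0 if both fall in one network
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    [ $(( a & mask )) -eq $(( b & mask )) ]
}

# check_network_line LINE -> prints "ok: ..." or "mismatch: ..."
check_network_line() {
    mic=$(field micip "$1")
    host=$(field hostip "$1")
    bits=$(field netbits "$1")
    if same_subnet "$mic" "$host" "$bits"; then
        echo "ok: $mic and $host share a /$bits network"
    else
        echo "mismatch: $mic and $host are not in the same /$bits network"
    fi
}

# Example:
#   check_network_line "$(grep '^Network' /etc/mpss/mic0.conf)"
```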

 

I am at least able to get the cards online now and ssh to the cards as root. I am going to try to use micctrl to add other ssh users and set up the NFS mounts.

Thanks,

Mike Marks

Michael_M_1
Beginner

I forgot to mention that I have also tried MPSS 3.3.1 and MPSS 3.3, and I get the same behavior with "micctrl --initdefaults". The host is a Dell R720 with two 10-core processors and 128 GB of RAM. The processor information is as follows:

processor       : 39
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
stepping        : 4
cpu MHz         : 2799.846
cache size      : 25600 KB
physical id     : 1
siblings        : 20
core id         : 12
cpu cores       : 10
apicid          : 57
initial apicid  : 57
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips        : 5599.17
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

Thanks,

Mike Marks

Michael_M_1
Beginner

Hi Taylor,

This is an error. The default configuration files are not generated, and existing /etc/mpss/micN.conf files are deleted whenever I run "micctrl --initdefaults". As noted previously, I also see this behavior with MPSS 3.3.1 and MPSS 3.3. I did have to build the mpss-modules* packages to match our kernel for Scientific Linux 6.5 (2.6.32-431.11.2.el6.x86_64 #1 SMP) to get the mic driver to load. I don't know whether that is related to this issue, for example if some response from the driver causes micctrl to think there is a communication error with the cards.
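
To rule out a driver/kernel mismatch, the kernel release recorded in the rebuilt module can be compared with the running kernel. This is only a sketch: vermagic_matches is a made-up helper, the module name mic in the usage comment is an assumption based on the driver discussed here, and a match does not by itself explain or rule out the micctrl behavior.

```shell
#!/bin/sh
# Sketch: compare a module's vermagic string with a kernel release.
# vermagic_matches VERMAGIC KERNEL_RELEASE
#   exit status 0 if the first word of VERMAGIC equals KERNEL_RELEASE.
vermagic_matches() {
    built_for=${1%% *}        # keep everything before the first space
    [ "$built_for" = "$2" ]
}

# Example (module name "mic" is an assumption):
#   if vermagic_matches "$(modinfo -F vermagic mic)" "$(uname -r)"; then
#       echo "module matches running kernel"
#   else
#       echo "module was built for a different kernel"
#   fi
```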

-Mike Marks

Frances_R_Intel
Employee

You rebuilt only mpss-modules-*.src.rpm, right? None of the other files?

Things kind of start off bad. What are the files Settings.ini and te_doc.t in /etc/mpss? Those files shouldn't exist in the first place and I am not sure at this point where they came from. I can't find those file names anywhere in the Linux tar file. I think if we figure out where those files came from, we will be on track for figuring out why the conf files disappear.

Michael_M_1
Beginner

Hi Frances,

You are correct, I rebuilt only the mpss-modules-*.src.rpm in the mpss-3.4/src directory.

I built the mpss modules and copied them from /root/rpmbuild/RPMs/x86_64 to the mpss-3.4/ directory. Then I ran "yum install *.rpm" in mpss-3.4/. The Settings.ini and te_doc.t files and the conf.d directory exist after the "yum install" (before I run "micctrl --initdefaults").

Thanks,

Mike Marks

Paul_P_3
Novice

Those two files appeared after my upgrade to 3.4 as well on my cluster:

> pdsh -w compute[001-047] ls /etc/mpss/ | dshbak -c
----------------
compute[001-047]
----------------
conf.d
default.conf
ipoib.conf
mic0.conf
mic1.conf
mic2.conf
mic3.conf
Settings.ini
te_doc.t

It looks like mpss-sysmgmt-micdiagnostic is the culprit:

> rpm -qpl mpss-sysmgmt-micdiagnostic-3.4-1.glibc2.12.2.x86_64.rpm | egrep "Settings.ini|te_doc.t"
warning: mpss-sysmgmt-micdiagnostic-3.4-1.glibc2.12.2.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 76627d3e: NOKEY
/etc/mpss/Settings.ini
/etc/mpss/te_doc.t
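
The same query can be run across every package file in the MPSS directory to see which ones ship a given path. This is a small sketch built on the rpm -qpl query used above; the function name which_rpm_ships is made up for illustration.

```shell
#!/bin/sh
# Sketch: find which package files contain a given path.
# which_rpm_ships PATH RPM_FILE...
#   prints each RPM_FILE whose file listing contains exactly PATH.
which_rpm_ships() {
    target=$1
    shift
    for pkg in "$@"; do
        # -q query, -p operate on a package file, -l list the files it installs
        if rpm -qpl "$pkg" 2>/dev/null | grep -qxF -- "$target"; then
            echo "$pkg"
        fi
    done
}

# Example, run from the directory holding the MPSS RPMs:
#   which_rpm_ships /etc/mpss/Settings.ini *.rpm
```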

TaylorIoTKidd
New Contributor I

Hi Mike,

I need a little more information about your board. Please run micinfo. Under each device there is a Board category, which contains a "Board SKU" field. Please send me the contents of this field.

Device No: # => Board => Board SKU

Regards
--
Taylor
 

Michael_M_1
Beginner

Hi Taylor,

Here is the dump from micinfo:

[marksma@xpacc-serv-03 ~]$ micinfo
MicInfo Utility Log
Created Wed Oct 15 15:38:33 2014


        System Info
                HOST OS                 : Linux
                OS Version              : 2.6.32-431.11.2.el6.x86_64
                Driver Version          : 3.4-1
                MPSS Version            : 3.4
                Host Physical Memory    : 132110 MB

Device No: 0, Device Name: mic0

        Version
                Flash Version            : 2.1.02.0390
                SMC Firmware Version     : 1.16.5078
                SMC Boot Loader Version  : 1.8.4326
                uOS Version              : 2.6.38.8+mpss3.4
                Device Serial Number     : ADKC42900350

        Board
                Vendor ID                : 0x8086
                Device ID                : 0x225c
                Subsystem ID             : 0x7d95
                Coprocessor Stepping ID  : 2
                PCIe Width               : Insufficient Privileges
                PCIe Speed               : Insufficient Privileges
                PCIe Max payload size    : Insufficient Privileges
                PCIe Max read req size   : Insufficient Privileges
                Coprocessor Model        : 0x01
                Coprocessor Model Ext    : 0x00
                Coprocessor Type         : 0x00
                Coprocessor Family       : 0x0b
                Coprocessor Family Ext   : 0x00
                Coprocessor Stepping     : C0
                Board SKU                : C0PRQ-7120 P/A/X/D
                ECC Mode                 : Enabled
                SMC HW Revision          : Product 300W Passive CS

        Cores
                Total No of Active Cores : 61
                Voltage                  : 0 uV
                Frequency                : 1238095 kHz

        Thermal
                Fan Speed Control        : N/A
                Fan RPM                  : N/A
                Fan PWM                  : N/A
                Die Temp                 : 36 C

        GDDR
                GDDR Vendor              : Samsung
                GDDR Version             : 0x6
                GDDR Density             : 4096 Mb
                GDDR Size                : 15872 MB
                GDDR Technology          : GDDR5
                GDDR Speed               : 5.500000 GT/s
                GDDR Frequency           : 2750000 kHz
                GDDR Voltage             : 1501000 uV

Device No: 1, Device Name: mic1

        Version
                Flash Version            : 2.1.02.0390
                SMC Firmware Version     : 1.16.5078
                SMC Boot Loader Version  : 1.8.4326
                uOS Version              : 2.6.38.8+mpss3.4
                Device Serial Number     : ADKC42900318

        Board
                Vendor ID                : 0x8086
                Device ID                : 0x225c
                Subsystem ID             : 0x7d95
                Coprocessor Stepping ID  : 2
                PCIe Width               : Insufficient Privileges
                PCIe Speed               : Insufficient Privileges
                PCIe Max payload size    : Insufficient Privileges
                PCIe Max read req size   : Insufficient Privileges
                Coprocessor Model        : 0x01
                Coprocessor Model Ext    : 0x00
                Coprocessor Type         : 0x00
                Coprocessor Family       : 0x0b
                Coprocessor Family Ext   : 0x00
                Coprocessor Stepping     : C0
                Board SKU                : C0PRQ-7120 P/A/X/D
                ECC Mode                 : Enabled
                SMC HW Revision          : Product 300W Passive CS

        Cores
                Total No of Active Cores : 61
                Voltage                  : 1022000 uV
                Frequency                : 1238095 kHz

        Thermal
                Fan Speed Control        : N/A
                Fan RPM                  : N/A
                Fan PWM                  : N/A
                Die Temp                 : 60 C

        GDDR
                GDDR Vendor              : Samsung
                GDDR Version             : 0x6
                GDDR Density             : 4096 Mb
                GDDR Size                : 15872 MB
                GDDR Technology          : GDDR5
                GDDR Speed               : 5.500000 GT/s
                GDDR Frequency           : 2750000 kHz
                GDDR Voltage             : 1501000 uV
[marksma@xpacc-serv-03 ~]$

 

Michael_M_1

Hi Taylor,

I reran micinfo as root, in case the fields that showed "Insufficient Privileges" are what you need:

[root@xpacc-serv-03 ~]# micinfo
MicInfo Utility Log
Created Wed Oct 15 15:41:42 2014


        System Info
                HOST OS                 : Linux
                OS Version              : 2.6.32-431.11.2.el6.x86_64
                Driver Version          : 3.4-1
                MPSS Version            : 3.4
                Host Physical Memory    : 132110 MB

Device No: 0, Device Name: mic0

        Version
                Flash Version            : 2.1.02.0390
                SMC Firmware Version     : 1.16.5078
                SMC Boot Loader Version  : 1.8.4326
                uOS Version              : 2.6.38.8+mpss3.4
                Device Serial Number     : ADKC42900350

        Board
                Vendor ID                : 0x8086
                Device ID                : 0x225c
                Subsystem ID             : 0x7d95
                Coprocessor Stepping ID  : 2
                PCIe Width               : x16
                PCIe Speed               : 5 GT/s
                PCIe Max payload size    : 256 bytes
                PCIe Max read req size   : 512 bytes
                Coprocessor Model        : 0x01
                Coprocessor Model Ext    : 0x00
                Coprocessor Type         : 0x00
                Coprocessor Family       : 0x0b
                Coprocessor Family Ext   : 0x00
                Coprocessor Stepping     : C0
                Board SKU                : C0PRQ-7120 P/A/X/D
                ECC Mode                 : Enabled
                SMC HW Revision          : Product 300W Passive CS

        Cores
                Total No of Active Cores : 61
                Voltage                  : 0 uV
                Frequency                : 1238095 kHz

        Thermal
                Fan Speed Control        : N/A
                Fan RPM                  : N/A
                Fan PWM                  : N/A
                Die Temp                 : 42 C

        GDDR
                GDDR Vendor              : Samsung
                GDDR Version             : 0x6
                GDDR Density             : 4096 Mb
                GDDR Size                : 15872 MB
                GDDR Technology          : GDDR5
                GDDR Speed               : 5.500000 GT/s
                GDDR Frequency           : 2750000 kHz
                GDDR Voltage             : 1501000 uV

Device No: 1, Device Name: mic1

        Version
                Flash Version            : 2.1.02.0390
                SMC Firmware Version     : 1.16.5078
                SMC Boot Loader Version  : 1.8.4326
                uOS Version              : 2.6.38.8+mpss3.4
                Device Serial Number     : ADKC42900318

        Board
                Vendor ID                : 0x8086
                Device ID                : 0x225c
                Subsystem ID             : 0x7d95
                Coprocessor Stepping ID  : 2
                PCIe Width               : x16
                PCIe Speed               : 5 GT/s
                PCIe Max payload size    : 256 bytes
                PCIe Max read req size   : 512 bytes
                Coprocessor Model        : 0x01
                Coprocessor Model Ext    : 0x00
                Coprocessor Type         : 0x00
                Coprocessor Family       : 0x0b
                Coprocessor Family Ext   : 0x00
                Coprocessor Stepping     : C0
                Board SKU                : C0PRQ-7120 P/A/X/D
                ECC Mode                 : Enabled
                SMC HW Revision          : Product 300W Passive CS

        Cores
                Total No of Active Cores : 61
                Voltage                  : 0 uV
                Frequency                : 1238095 kHz

        Thermal
                Fan Speed Control        : N/A
                Fan RPM                  : N/A
                Fan PWM                  : N/A
                Die Temp                 : 46 C

        GDDR
                GDDR Vendor              : Samsung
                GDDR Version             : 0x6
                GDDR Density             : 4096 Mb
                GDDR Size                : 15872 MB
                GDDR Technology          : GDDR5
                GDDR Speed               : 5.500000 GT/s
                GDDR Frequency           : 2750000 kHz
                GDDR Voltage             : 1501000 uV
[root@xpacc-serv-03 ~]#


Thanks,

Mike Marks

TaylorIoTKidd

Mike,

It's filed as a bug. The reference # is 5161861.

Regards
--
Taylor


 

JOHNNIE_P_Intel1

I am downloading the Scientific Linux release.  I'll probably get back to you in a couple of days.

Michael_M_1

Thank you

Mike Marks

Nathaniel_H_Intel

Workaround: make sure the TMPDIR environment variable points to a directory on the same file system as /etc/mpss/, or mount /var/tmp on that same file system.
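
A minimal sketch of how one might check this before rerunning micctrl, assuming GNU stat is available and that /var/tmp is the fallback temporary directory when TMPDIR is unset (inferred from the workaround, not confirmed). The same_fs helper and the /etc/mpss/.tmp scratch directory are hypothetical names used only for illustration:

```shell
#!/bin/sh
# Succeed (exit status 0) only when both paths exist and sit on the same
# filesystem, judged by the device number that GNU stat reports (%d).
same_fs() {
    [ -e "$1" ] && [ -e "$2" ] &&
        [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

conf_dir=/etc/mpss
tmp_dir=${TMPDIR:-/var/tmp}   # assumption: /var/tmp is used when TMPDIR is unset

if same_fs "$conf_dir" "$tmp_dir"; then
    echo "OK: $tmp_dir and $conf_dir share a filesystem"
else
    echo "Mismatch (or missing path): point TMPDIR at the filesystem holding $conf_dir"
    # For example, as root (hypothetical scratch directory):
    #   mkdir -p /etc/mpss/.tmp
    #   TMPDIR=/etc/mpss/.tmp micctrl --initdefaults -vv
fi
```

If the check reports a mismatch, setting TMPDIR for the single micctrl invocation, as in the commented lines, avoids having to remount /var/tmp.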

kecoro

Sorry, I just came to listen and learn from this discussion.
