
MPSS 3.4.3 restart fails when trying to add an NFS mount entry to mic0

Qi_Z_
Beginner

Hi:

The host runs CentOS 7 with a new kernel, Linux 3.10.0-123.20.1.el7.x86_64.

Before installing MPSS, I recompiled the MPSS kernel module and reflashed the firmware to 391 for MPSS 3.4.3.

The coprocessor and MPSS look healthy according to miccheck, micinfo, and systemctl status mpss.

The NFS server was installed on the host with:  yum install nfs-utils

The rpcbind and nfs services are running. firewalld and NetworkManager have been stopped. SELinux is disabled.

I followed all the steps in <<Intel(R) Manycore Platform Software Stack MPSS 3.4.3 README>> and everything seemed OK up to this point.

 

Then I tried to mount an NFS export on mic0, following the steps in section 5.6.3 of <<System Administration for the Intel® Xeon Phi™ Coprocessor MPSS 3.4>>:

### mounted NFS

1. Append /etc/exports on the NFS server with the line in the format:

       
        /test 172.31.0.0/16(rw,no_root_squash)

2. Add in /etc/hosts.allow on the NFS server

        ALL: 172.31.0.0/16

3. Let NFS know the files have changed by executing

        > exportfs -a

4. Add an NFS mount entry to the Intel® Xeon Phi™ coprocessor’s /etc/fstab file. Execute this command on the host system to which the coprocessor is attached.

      
        > micctrl --addnfs=173.31.1.254:/test --dir=/test mic0

This step shows a warning message:

[Warning] mic0: Server 173.31.1.254 may not be reachable if the interface is not routed out of the host

5. Restart MPSS

        > systemctl restart mpss

The MPSS restart fails every time at this step, even after I uninstalled MPSS and repeated all the previous steps.

Running  > systemctl status mpss.service -l  shows:

mpss.service - Intel(R) MPSS control service
   Loaded: loaded (/etc/systemd/system/mpss.service; enabled)
   Active: failed (Result: exit-code) since Thu 2015-03-05 10:39:43 GMT; 51s ago
  Process: 13867 ExecStop=/etc/init.d/mpss stop (code=exited, status=0/SUCCESS)
  Process: 13968 ExecStart=/etc/init.d/mpss start (code=exited, status=1/FAILURE)
 Main PID: 13218 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/mpss.service

Mar 05 10:39:43 cluster0 mpss[13968]: Starting Intel(R) MPSS: [FAILED]
Mar 05 10:39:43 cluster0 systemd[1]: mpss.service: control process exited, code=exited status=1
Mar 05 10:39:43 cluster0 systemd[1]: Failed to start Intel(R) MPSS control service.
Mar 05 10:39:43 cluster0 systemd[1]: Unit mpss.service entered failed state.

micctrl -s also shows: mic0: boot failed

After about twenty minutes, mic0 comes back online and MPSS can be restarted. However, running > mount on mic0 shows no entry for the NFS mount.

Both /var/mpss/mic0/etc/fstab on the host and /etc/fstab on mic0 list the mount entry:

173.31.1.254:/test        /test     nfs             nolock          1 1

Running > mount -a on mic0 just hangs.

> micctrl --config mic0

mic0:
=============================================================
    Config Version: 1.1

    Linux Kernel:   /usr/share/mpss/boot/bzImage-knightscorner
    Map File:       /usr/share/mpss/boot/System.map-knightscorner
    BootOnStart:    Enabled
    Shutdowntimeout: 300 seconds

    ExtraCommandLine: highres=off
    PowerManagment: cpufreq_on;corec6_off;pc3_on;pc6_off

    Root Device:   Dynamic Ram Filesystem /var/mpss/mic0.image.gz from:
        Base:      CPIO /usr/share/mpss/boot/initramfs-knightscorner.cpio.gz
        CommonDir: Directory /var/mpss/common
        Micdir:    Directory /var/mpss/mic0

    Network:       Static Pair
        Hostname:  cluster0-mic0
        MIC IP:    172.31.1.1
        Host IP:   172.31.1.254
        Net Bits:  24
        NetMask:   255.255.255.0
        MtuSize:   64512
        MIC MAC:   4c:79:ba:20:04:e4
        Host MAC:  4c:79:ba:20:04:e5

    LDAP:          Disabled
     NIS:          Disabled

    Cgroup:
        Memory:    Disabled

    Console:        hvc0
    VerboseLogging: Disabled
    CrashDump:      /var/crash/mic 16GB

 

Thanks in advance for any suggestions.

 

QI

1 Reply
Frances_R_Intel
Employee

I'm afraid this question was missed somehow, but hopefully the user found the problem: there is a typo in the configuration. The IP address given to the host side of the host/mic0 connection is 172.31.1.254, but when the NFS mount was added to the coprocessor using micctrl, the address was mistyped as 173.31.1.254. Uninstalling and then reinstalling MPSS did not resolve the problem because the configuration files are preserved across MPSS uninstalls and reinstalls, which is what you would want when installing a new MPSS version. To remove the erroneous entry, the 'micctrl --remnfs=<mnt_dir>' command was needed.
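
For anyone who hits the same symptom, the following is a rough sketch of how the check and the fix might look on the host. It is not taken from the MPSS documentation. The helper names `nfs_servers` and `fix_nfs_entry` are hypothetical, and the trailing `mic0` argument on `--remnfs` is an assumption that it accepts a card name the same way `--addnfs` does in the original post. The unreachable 173.31.1.254 address is also consistent with the "may not be reachable" warning that micctrl printed and with `mount -a` hanging on the card.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of MPSS.

# Print the server address of every NFS entry in an fstab-style file,
# skipping comment lines.
nfs_servers() {
    awk '$1 !~ /^#/ && $3 == "nfs" { split($1, parts, ":"); print parts[1] }' "$1"
}

# Replace a bad NFS entry for mic0 with a corrected one, then restart MPSS.
# Run on the host as root. Arguments: <server_ip> <export_dir> <mount_dir>
# Assumption: --remnfs takes the card name as a trailing argument.
fix_nfs_entry() {
    micctrl --remnfs="$3" mic0 &&
    micctrl --addnfs="$1:$2" --dir="$3" mic0 &&
    systemctl restart mpss
}
```

Running `nfs_servers /var/mpss/mic0/etc/fstab` should print the address listed as "Host IP" in the `micctrl --config mic0` output (172.31.1.254 in this thread). If it prints something else, such as 173.31.1.254, then `fix_nfs_entry 172.31.1.254 /test /test` would remove the stale entry and add the corrected one.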
