Hi,
We are trying to use an NFS-exported directory as the RootDevice of a Xeon Phi card.
Following https://software.intel.com/sites/default/files/article/373934/system-administration-for-the-intel-xeon-phi-coprocessor.pdf, we made the changes below.
The card appears to boot, but the device nodes under /dev are not usable on mic0, so we can't log in via SSH or through screen /dev/ttyMIC0.
    [host] # micctrl --rootdev=NFS -v -c -d -t [host_ip_addr]:/srv/nfs/mic0 mic0
    [host] # ls -al /srv/nfs/mic0
    total 32
    drwxr-xr-x 18 root root 4096 Oct 16 23:18 .
    drwxr-xr-x 4 root root 28 Oct 16 23:17 ..
    drwxr-xr-x 2 root root 4096 Oct 16 23:06 bin
    drwxr-xr-x 2 root root 6 Oct 16 23:06 boot
    drwxr-xr-x 2 root root 4096 Oct 16 23:06 dev
    drwxr-xr-x 31 root root 4096 Oct 16 23:18 etc
    drwxr-sr-x 4 root root 31 Oct 16 23:06 home
    -rwxr-xr-x 1 root root 3946 Oct 16 23:06 init
    drwxr-xr-x 3 root root 20 Oct 16 23:06 lib
    drwxr-xr-x 5 root root 4096 Sep 19 07:51 lib64
    drwxr-xr-x 10 root root 94 Oct 16 23:06 media
    drwxr-xr-x 2 root root 57 Oct 16 23:06 mnt
    drwxr-xr-x 3 root root 18 Oct 16 23:06 opt
    drwxr-xr-x 2 root root 6 Oct 16 23:06 proc
    -rw-r--r-- 1 polkitd polkitd 62 Oct 16 23:17 .profile
    drwx------ 2 root root 21 Oct 16 23:17 root
    drwxr-xr-x 2 root root 4096 Oct 16 23:18 sbin
    drwxr-xr-x 2 root root 6 Oct 16 23:06 sys
    lrwxrwxrwx 1 root root 8 Oct 16 23:18 tmp -> /var/tmp
    drwxr-xr-x 12 root root 123 Sep 19 08:32 usr
    drwxr-xr-x 8 root root 142 Sep 19 07:51 var
    [host] # cat /etc/exports
    /srv/nfs/mic0 [mic0_ip_addr]/32(rw,sync,no_root_squash)
    [host] # micctrl -s mic0
    mic0: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)
    [host] # service mpss start
    [host] # tail -f /var/log/messages
    Oct 16 23:17:55 ccx10 systemd: Starting Intel(R) MPSS control service...
    Oct 16 23:17:56 ccx10 kernel: mic0: Transition from state ready to booting
    Oct 16 23:17:56 ccx10 kernel: mic image: /usr/share/mpss/boot/bzImage-knightscorner
    Oct 16 23:17:56 ccx10 kernel: MIC 0 Booting
    Oct 16 23:18:01 ccx10 kernel: Waiting for MIC 0 boot 5
    Oct 16 23:18:06 ccx10 kernel: Waiting for MIC 0 boot 10
    Oct 16 23:18:11 ccx10 kernel: Waiting for MIC 0 boot 15
    Oct 16 23:18:16 ccx10 kernel: Waiting for MIC 0 boot 20
    Oct 16 23:18:21 ccx10 kernel: Waiting for MIC 0 boot 25
    Oct 16 23:18:22 ccx10 kernel: MIC 0 Network link is up
    Oct 16 23:18:22 ccx10 kernel: br0: port 1(mic0) entered forwarding state
    Oct 16 23:18:22 ccx10 kernel: br0: port 1(mic0) entered forwarding state
    Oct 16 23:18:22 ccx10 rpc.mountd[30856]: authenticated mount request from [mic0_ip_addr]:969 for /srv/nfs/mic0 (/srv/nfs/mic0)
    [host] # tail -f /srv/nfs/mic0/var/log/messages
    Oct 16 23:53:58 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 16 23:54:08 ccx11 user.warn kernel: [ 2164.337626] Host state not PC0
    Oct 16 23:54:08 ccx11 daemon.info init: Id "0" respawning too fast: disabled for 5 minutes
    Oct 16 23:59:09 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 16 23:59:19 ccx11 user.warn kernel: [ 2475.812577] Host state not PC0
    Oct 16 23:59:19 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 16 23:59:29 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 16 23:59:40 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 16 23:59:50 ccx11 user.warn kernel: [ 2506.029530] Host state not PC0
    Oct 16 23:59:50 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 17 00:00:00 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    [host] # ping -c 5 [mic0_ip_addr]
    PING [mic0_ip_addr] ([mic0_ip_addr]) 56(84) bytes of data.
    64 bytes from [mic0_ip_addr]: icmp_seq=1 ttl=64 time=0.524 ms
    64 bytes from [mic0_ip_addr]: icmp_seq=2 ttl=64 time=0.424 ms
    64 bytes from [mic0_ip_addr]: icmp_seq=3 ttl=64 time=0.398 ms
    64 bytes from [mic0_ip_addr]: icmp_seq=4 ttl=64 time=0.268 ms
    64 bytes from [mic0_ip_addr]: icmp_seq=5 ttl=64 time=0.226 ms
    --- [mic0_ip_addr] ping statistics ---
    5 packets transmitted, 5 received, 0% packet loss, time 4000ms
    rtt min/avg/max/mdev = 0.226/0.368/0.524/0.108 ms
Does anyone know the correct way to set up an NFS export as the root file system of a MIC card?
Thanks!
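Since the card-side log complains that /dev/hvc0 is missing, one quick check from the host is whether the exported tree holds the expected static device nodes. This is only a minimal sketch for a POSIX shell: `missing_dev_nodes` is a hypothetical helper name, and apart from hvc0 (taken from the log in this post) the list of nodes the card needs is an assumption. It inspects only what is stored in the export itself, not anything that might be created on the card at boot.

```shell
# Hypothetical helper: print each device name that is NOT present as a
# character device under <root>/dev.
# Usage: missing_dev_nodes <root> <name>...
missing_dev_nodes() {
    root=$1
    shift
    for name in "$@"; do
        # -c is true for character device nodes (symlinks are followed)
        [ -c "$root/dev/$name" ] || printf '%s\n' "$name"
    done
}
```

For example, `missing_dev_nodes /srv/nfs/mic0 hvc0 console null` prints every listed name that is not a character device under /srv/nfs/mic0/dev, and prints nothing if all of them are there.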
Can you tell me how you captured the files that you put in /srv/nfs/mic0?
I know that the current (MPSS 3.x) recommendation for making a cpio file to use for the StaticRAM boot option is to boot the coprocessor using the default RAM file system, make any changes you want, and then, on the host, execute:
ssh root@mic0 "cd / ; find . /dev -xdev ! -path './etc/modprobe.d*' ! -path './var/volatile/run*' | cpio -o -H newc | gzip -9" > /usr/share/mpss/boot/custom.cpio.gz
Notice that, in addition to . (which is / in this case), the command lists /dev as a search directory, then excludes the runtime files and current kernel module loads. Because of -xdev, I don't think find walks /dev unless it is listed explicitly. Could this be your problem?
If I were going to make an NFS file system for the coprocessor's root, I think I would boot the coprocessor with the default RAM root file system, log into the coprocessor, make a directory /mnt, mount /srv/nfs/mic0 under /mnt, and then, from /, run:
find . /dev -xdev ! -path "./etc/modprobe.d*" ! -path "./var/volatile/run*" | cpio -pdumv /mnt
And be very careful not to forget the -xdev on the find, since I am copying to a mount point under /.
But we probably have cluster administrators out there who have better solutions.
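The caution about -xdev can be turned into a small guard: -xdev only keeps find out of /mnt if /mnt really is a separate mount, so it may be worth confirming that before copying. The following is a sketch under assumptions, not a tested MPSS procedure: `same_device` and `safe_to_copy_into` are hypothetical names, and it assumes a `stat` that accepts `-c %d` (GNU coreutils or BusyBox).

```shell
# Succeeds when both paths live on the same file system (same device number).
same_device() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

# Hypothetical guard: refuse to copy unless the target is a separate mount,
# because only then will a `find -xdev` started at / skip it.
safe_to_copy_into() {
    target=$1
    if same_device / "$target"; then
        echo "refusing: $target is on the same file system as /" >&2
        return 1
    fi
    return 0
}
```

On the coprocessor this could be used as `safe_to_copy_into /mnt && echo "ok to run the find | cpio copy"`, so the copy is attempted only once the NFS export is actually mounted on /mnt.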
Hi
Thanks for the comment.
I captured the log on the NFS server:
<pre>tail -f /srv/nfs/mic0/var/log/messages</pre>
I then followed your suggestion:
<pre>find . /dev -xdev ! -path "./etc/modprobe.d*" ! -path "./var/volatile/run*" | cpio -pdumv /mnt</pre>
However, after running this command and rebooting, nothing has changed.
    # tail -n 100 /srv/nfs/mic0/var/log/messages
    Oct 20 18:45:05 ccx11 user.warn kernel: [ 29.638606] Module blcr_imports loaded at 0xffffffffa0024000
    Oct 20 18:45:05 ccx11 user.warn kernel: [ 29.669338] Module blcr loaded at 0xffffffffa00a2000
    Oct 20 18:45:05 ccx11 user.info kernel: [ 29.690853] blcr: vmadump: (from bproc-"4.0.0pre8") Erik Hendriks <erik@hendriks.cx>
    Oct 20 18:45:05 ccx11 user.info kernel: [ 29.690887] blcr: vmadump: Modified for blcr 0.8.5 <http://ftg.lbl.gov/checkpoint>
    Oct 20 18:45:05 ccx11 user.info kernel: [ 29.690912] blcr: Berkeley Lab Checkpoint/Restart (BLCR) module version 0.8.5.
    Oct 20 18:45:05 ccx11 user.info kernel: [ 29.690933] blcr: Parameter cr_io_max = 0x4000000
    Oct 20 18:45:05 ccx11 user.info kernel: [ 29.690948] blcr: Supports kernel interface version 0.10.3.
    Oct 20 18:45:05 ccx11 user.info kernel: [ 29.690967] blcr: Supports context file format versions 8 though 9.
    Oct 20 18:45:05 ccx11 user.info kernel: [ 29.690986] blcr: http://ftg.lbl.gov/checkpoint
    Oct 20 18:45:05 ccx11 user.warn kernel: [ 29.805207] MPSSBOOT Boot acknowledged
    Oct 20 18:45:05 ccx11 daemon.info init: open(/dev/tty0): No such file or directory
    Oct 20 18:45:05 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 20 18:45:16 ccx11 daemon.info init: open(/dev/tty0): No such file or directory
    Oct 20 18:45:16 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 20 18:45:26 ccx11 daemon.info init: open(/dev/tty0): No such file or directory
    Oct 20 18:45:26 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 20 18:45:37 ccx11 daemon.info init: open(/dev/tty0): No such file or directory
    Oct 20 18:45:37 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 20 18:45:47 ccx11 daemon.info init: open(/dev/tty0): No such file or directory
    Oct 20 18:45:47 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 20 18:45:57 ccx11 daemon.info init: open(/dev/tty0): No such file or directory
    Oct 20 18:45:58 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 20 18:46:08 ccx11 user.warn kernel: [ 92.697559] Host state not PC0
    Oct 20 18:46:08 ccx11 daemon.info init: open(/dev/tty0): No such file or directory
    Oct 20 18:46:08 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 20 18:46:18 ccx11 daemon.info init: open(/dev/tty0): No such file or directory
    Oct 20 18:46:19 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 20 18:46:29 ccx11 user.warn kernel: [ 113.647744] Host state not PC0
    Oct 20 18:46:29 ccx11 daemon.info init: open(/dev/tty0): No such file or directory
    Oct 20 18:46:29 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 20 18:46:39 ccx11 daemon.info init: open(/dev/tty0): No such file or directory
    Oct 20 18:46:40 ccx11 auth.err getty: /dev/hvc0: No such file or directory
    Oct 20 18:46:50 ccx11 user.warn kernel: [ 134.585774] Host state not PC0
    Oct 20 18:46:50 ccx11 daemon.info init: Id "0" respawning too fast: disabled for 5 minutes
    Oct 20 18:46:50 ccx11 daemon.info init: no more processes left in this runlevel
    Oct 20 18:47:32 ccx11 user.notice shutdown[4936]: shutting down for system halt
    Oct 20 18:47:33 ccx11 daemon.info init: Switching to runlevel: 0
    Oct 20 18:47:33 ccx11 daemon.info automount[4885]: autofs stopped
    Oct 20 18:47:33 ccx11 syslog.info syslogd exiting
It would be great if anyone who has succeeded in giving a Phi card an NFS root could tell us how to do it.
Thanks in advance.
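To see at a glance which device nodes the card is failing to open, the "No such file or directory" entries in the log can be reduced to a unique list. A minimal sketch, assuming the log file has one syslog entry per line and messages shaped like the ones in this post; `missing_devices_in_log` is a hypothetical helper name.

```shell
# Hypothetical helper: list each /dev path that the log reports as
# "No such file or directory", once per path, sorted.
# Handles both "getty: /dev/hvc0: ..." and "init: open(/dev/tty0): ..." forms.
missing_devices_in_log() {
    sed -n 's|.*\(/dev/[A-Za-z0-9_]*\))*: No such file or directory.*|\1|p' "$1" | sort -u
}
```

Run on the host as `missing_devices_in_log /srv/nfs/mic0/var/log/messages`; lines that do not mention a missing device, such as the "Host state not PC0" warnings, are ignored.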
Can you post the contents of /var/mpss/mic0/etc/fstab after configuring rootdev with micctrl?
Hi,
Thanks for the comment.
/var/mpss/mic0/etc/fstab doesn't exist. I found /srv/nfs/mic0/etc/fstab, but no fstab under /var/mpss/.. Here are the contents of /srv/nfs/mic0/etc/fstab:
    # cat /srv/nfs/mic0/etc/fstab
    # stock fstab - you probably want to override this with a machine specific one
    rootfs / auto defaults 1 1
    proc /proc proc defaults 0 0
    devpts /dev/pts devpts mode=0620,gid=5 0 0
    usbdevfs /proc/bus/usb usbdevfs noauto 0 0
    tmpfs /var/volatile tmpfs defaults 0 0
    tmpfs /media/ram tmpfs defaults 0 0
    # uncomment this if your device has a SD/MMC/Transflash slot
    #/dev/mmcblk0p1 /media/card auto defaults,sync,noauto 0 0
Thanks!
It worked once I created /var/mpss/mic0/etc/fstab and set the NFS export as the mount for /.
Here's /var/mpss/mic0/etc/fstab
    # cat /var/mpss/mic0/etc/fstab
    #rootfs / auto defaults 1 1
    [host_ip_addr]:/srv/nfs/mic0 / nfs rw,hard,intr,tcp,nfsvers=3 1 1
    proc /proc proc defaults 0 0
    devpts /dev/pts devpts mode=0620,gid=5 0 0
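For anyone repeating this, a quick way to confirm which file system type an fstab assigns to / before rebooting the card is to read the first uncommented entry whose mount point is /. This is just a sketch; `root_fs_type` is a hypothetical helper name and it assumes a standard whitespace-separated fstab.

```shell
# Hypothetical helper: print the file system type (third field) of the first
# uncommented fstab entry whose mount point (second field) is "/".
root_fs_type() {
    awk '$1 !~ /^#/ && $2 == "/" { print $3; exit }' "$1"
}
```

With the fstab above, `root_fs_type /var/mpss/mic0/etc/fstab` prints `nfs`, while the stock fstab posted earlier in the thread gives `auto`.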
Thanks for all the comments!