
Connecting coprocessor to bridge to communicate with internet

Eric_B_2
Beginner
1,614 Views

Hello,

I am running Ubuntu 14.04 with a Xeon Phi 31S1P, and I have been trying to set up a bridge so that the Phi can access the internet. I have been having a lot of trouble and can't figure out what's wrong. I'm fairly sure the bridge itself is fine, but the Phi can't connect to it. Whenever I run the simple command to connect it to the bridge, I get this:

/var/mpss/mic0/etc/network# micctrl --network=static --bridge=br0 --ip=172.31.1.1
  [Error] br0: Failed - required brctl command not installed

But it is installed...some mess with the mpss tooling.
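One way to see where brctl actually lives is a small lookup like the sketch below. It is only an illustration: the helper name find_tool and the list of sbin directories are assumptions, and the thread does not confirm how micctrl itself searches for brctl. On Ubuntu, brctl is provided by the bridge-utils package.

```shell
# Hypothetical helper: print the location of a command, checking PATH first
# and then a few common sbin directories (the directory list is an assumption).
find_tool() {
  tool=$1
  if found=$(command -v "$tool" 2>/dev/null); then
    echo "$found"
    return 0
  fi
  for dir in /sbin /usr/sbin /usr/local/sbin; do
    if [ -x "$dir/$tool" ]; then
      echo "$dir/$tool"
      return 0
    fi
  done
  return 1
}

# Report where brctl is, or say that it could not be found.
find_tool brctl || echo "brctl not found on PATH or in the usual sbin directories"
```

If the tool turns up in a directory that is not on the PATH of the shell running micctrl, that mismatch would be one thing to rule out.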

The output of ifconfig is the following:

br0       Link encap:Ethernet  HWaddr 30:85:a9:a7:d0:06 
          inet addr:192.168.0.100  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::3285:a9ff:fea7:d006/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1768 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1635 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:552300 (552.3 KB)  TX bytes:294748 (294.7 KB)

eth0      Link encap:Ethernet  HWaddr 30:85:a9:a7:bb:27 
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
          Interrupt:17 Memory:f0800000-f0820000

eth1      Link encap:Ethernet  HWaddr 30:85:a9:a7:d0:06 
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:48290 errors:0 dropped:5 overruns:0 frame:0
          TX packets:33720 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:13352617 (13.3 MB)  TX bytes:6396673 (6.3 MB)
          Interrupt:18 Memory:f0700000-f0720000

lo        Link encap:Local Loopback 
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:119 errors:0 dropped:0 overruns:0 frame:0
          TX packets:119 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:18648 (18.6 KB)  TX bytes:18648 (18.6 KB)

mic0      Link encap:Ethernet  HWaddr 4a:79:ba:15:00:21 
          inet addr:172.31.1.254  Bcast:172.31.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2068 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1056 (1.0 KB)  TX bytes:321903 (321.9 KB)

I just need to figure out either what's wrong with this or how to connect the Phi to the internet another way.

Thanks,

-Eric

4 Replies
Frances_R_Intel
Employee

It worries me a bit when you say "But it is installed...some mess with the mpss tooling." but don't say what it was. Is brctl installed in a different directory from where micctrl thought it should be or was the problem more serious than that? (On RHEL, it is in /usr/sbin/brctl.) But to the question at hand -

I don't see where you used the 'micctrl --network' command to add the coprocessor to the bridge. The addbridge option sets up the bridge, but does not add any coprocessors to the bridge. (Nor does addbridge add a GATEWAY option to the host bridge configuration file, by the way.) I suspect if you had used the network option with the mic0 address you have, micctrl would have complained because it expects the bridge and the coprocessors to be on the same network.

So, if at all possible, I recommend that you select an address for the coprocessor that is on the same network as the host, then use the 'micctrl --network' command to add the coprocessor to the bridge and create the network interface file for the coprocessor. If you don't use the modcard and modhost options, you will also need to hand modify the /var/mpss/mic0/etc/hosts and /etc/hosts files to make sure the right addresses get in there. You can find the full syntax for the command by using 'micctrl --network -h'.
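The "same network" requirement can be checked with a little shell arithmetic before running micctrl. This is a minimal sketch and not part of MPSS: the helper names ip_to_int and same_subnet are made up for illustration, and the /24 prefix is taken from the Mask:255.255.255.0 shown in the ifconfig output.

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  (
    # Split on dots inside a subshell so the IFS change does not leak out.
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
  )
}

# Hypothetical helper: succeed if two addresses share a network.
# Usage: same_subnet ADDR1 ADDR2 PREFIX_BITS
same_subnet() {
  _mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
  _net1=$(( $(ip_to_int "$1") & $_mask ))
  _net2=$(( $(ip_to_int "$2") & $_mask ))
  [ "$_net1" -eq "$_net2" ]
}

# Host bridge address versus the original coprocessor address.
same_subnet 192.168.0.100 172.31.1.1 24 && echo "same network" || echo "different networks"

# Host bridge address versus an address picked from the host's network.
same_subnet 192.168.0.100 192.168.0.10 24 && echo "same network" || echo "different networks"
```

With the addresses from this thread, the first call reports different networks and the second reports the same network, which matches the advice to pick a coprocessor address from the host's own range.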

If you cannot use an address from the network that the host is on, you will need to set up the host as a router, probably not what you really want to do. In that case, you don't need the external bridge. You would use a static pair connection between the host and coprocessor or, if you have multiple cards on a host, an internal bridge.

Eric_B_2
Beginner

I worked through some of your suggestions. It seemed to be a configuration problem on the Phi, so now the Phi is accessible while on the bridge but can't communicate with the internet. The bridge appears to be working properly because the Phi can ping the router, but it can't ping any other computers inside or outside the network (the host can just fine, so it is not a general internet issue).

Here is some info to help:


root@Xeon-Phi:/home/xeon-phi# micctrl --config

mic0:
=============================================================
    Config Version: 1.1

    Linux Kernel:   /usr/share/mpss/boot/bzImage-knightscorner
    Map File:       /usr/share/mpss/boot/System.map-knightscorner
    BootOnStart:    Enabled
    Shutdowntimeout: 300 seconds

    ExtraCommandLine: highres=off
    PowerManagment: cpufreq_on;corec6_off;pc3_on;pc6_off

    Root Device:   Dynamic Ram Filesystem /var/mpss/mic0.image.gz from:
        Base:      CPIO /usr/share/mpss/boot/initramfs-knightscorner.cpio.gz
        CommonDir: Directory /var/mpss/common
        Micdir:    Directory /var/mpss/mic0

    Network:       Static bridge br0
        MIC IP:    192.168.0.10
        Host IP:   192.168.0.100
        Net Bits:  24
        NetMask:   255.255.255.0
        MtuSize:   1500
        Hostname:  Xeon-Phi-mic0
        MIC MAC:   4c:79:ba:1c:0a:8c
        Host MAC:  4c:79:ba:1c:0a:8d

 


root@Xeon-Phi:/home/xeon-phi# ifconfig

br0       Link encap:Ethernet  HWaddr 30:85:a9:a7:d0:06
          inet addr:192.168.0.100  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::3285:a9ff:fea7:d006/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1698 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1008 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:306656 (306.6 KB)  TX bytes:149073 (149.0 KB)

eth1      Link encap:Ethernet  HWaddr 30:85:a9:a7:d0:06
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:141171 errors:0 dropped:22 overruns:0 frame:0
          TX packets:149982 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:34501467 (34.5 MB)  TX bytes:41845640 (41.8 MB)
          Interrupt:18 Memory:f0700000-f0720000

mic0      Link encap:Ethernet  HWaddr 4c:79:ba:1c:0a:8d
          inet addr:192.168.0.254  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::4e79:baff:fe1c:a8d/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:245 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1120 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:32752 (32.7 KB)  TX bytes:190104 (190.1 KB)


On the Phi:


[xeon-phi@Xeon-Phi-mic0 ~]$ ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

mic0      Link encap:Ethernet  HWaddr 4C:79:BA:1C:0A:8C
          inet addr:192.168.0.10  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:1209 errors:0 dropped:457 overruns:0 frame:0
          TX packets:264 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:204272 (199.4 KiB)  TX bytes:34710 (33.8 KiB)


As of now, ping from the Phi fails:

[xeon-phi@Xeon-Phi-mic0 ~]$ ping nu.nl
PING nu.nl (62.69.166.254) 56(84) bytes of data.
^C
--- nu.nl ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms


From the host, the same ping succeeds:


root@Xeon-Phi:/home/xeon-phi# ping nu.nl
PING nu.nl (62.69.166.254) 56(84) bytes of data.
64 bytes from 62-69-166-254.ptr.as24646.net (62.69.166.254): icmp_seq=1 ttl=244 time=92.0 ms
64 bytes from 62-69-166-254.ptr.as24646.net (62.69.166.254): icmp_seq=2 ttl=244 time=92.0 ms


The Phi can ping the gateway just fine:


[xeon-phi@Xeon-Phi-mic0 ~]$ ping 192.168.0.1
PING 192.168.0.1 (192.168.0.1) 56(84) bytes of data.
64 bytes from 192.168.0.1: icmp_req=1 ttl=127 time=0.896 ms
64 bytes from 192.168.0.1: icmp_req=2 ttl=127 time=0.797 ms


That is my router, and DNS also seems to be working (see the ping above).

So I'm quite confused as to how the host can reach the outside and the Phi can reach the router, yet the Phi can't reach anything beyond it. Any suggestions?

Thanks,

-Eric

Frances_R_Intel
Employee

OK, so the coprocessor can ping the router (192.168.0.1) but not the host (192.168.0.100)? And it can retrieve an IP address from the DNS server (probably because it can talk to the router). Weird.

I notice that eth1 has had to drop a number of received packets. Are those packets for the coprocessor? (Does the count increase if you try to ping someone from the coprocessor?) I'm thinking maybe all the packets from the coprocessor are getting to their destination and the responses are getting back to the ethernet adapter on the host but only the router knows the magic incantation to get past there. What that magic incantation might be, I don't know. But I think I may know what is wrong. There shouldn't be a mic0 interface on the host - either there are vestiges of the old point-to-point host-coprocessor interface in the Linux files or something in the mic0.conf or default.conf file is making micctrl think that there is still a static pair connection between the host and coprocessor.

What I suggest is to first try deleting the bridge, adding it in fresh and recreating the network connection from the coprocessor again:

micctrl --delbridge=br0
micctrl --addbridge=br0 --type=external --ip=192.168.0.100
micctrl --network=static --bridge=br0 --ip=192.168.0.10

If that doesn't shake things out, then you will end up changing some configuration files by hand. In /etc/mpss/*.conf, remove any lines that start with:

Network class=StaticPair

Bridge br0 Internal

MacAddress

and make sure there are lines containing:

Bridge br0 External

Network class=StaticBridge bridge=br0

Check the /etc/hosts and /var/mpss/mic0/etc/hosts files and make sure the address for the host is given as 192.168.0.100, the coprocessor is given as 192.168.0.10 and the number 192.168.0.254 doesn't occur anywhere. Finally, remove the mic0 interface configuration file from /etc on the host: /etc/sysconfig/network-scripts/ifcfg-mic0.
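The manual checks above can be scripted with grep before editing anything by hand. The following is a rough sketch under stated assumptions: the function names stale_conf_lines and stale_address_lines are made up for illustration, the patterns are simply the line prefixes and address quoted in this reply, and nothing here modifies any file.

```shell
# Hypothetical helper: list leftover settings in the MPSS config files of a
# directory, i.e. lines that START with one of the prefixes named above.
# Commented-out lines (starting with '#') are deliberately not matched.
stale_conf_lines() {
  grep -nE '^(Network class=StaticPair|Bridge br0 Internal|MacAddress)' "$1"/*.conf
}

# Hypothetical helper: list lines in the given files that still mention an
# old address. -F matches the address literally; -w rejects longer numbers
# such as 192.168.0.2540.
# Usage: stale_address_lines ADDRESS FILE...
stale_address_lines() {
  addr=$1
  shift
  grep -nFw -- "$addr" "$@"
}

# Example invocations (left as comments so nothing runs against a live system):
#   stale_conf_lines /etc/mpss
#   stale_address_lines 192.168.0.254 /etc/hosts /var/mpss/mic0/etc/hosts
```

Both functions only print matching lines and exit non-zero when nothing matches, so an empty result is the outcome to hope for.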

Let me know what happens.

Eric_B_2
Beginner

Okay, so after trying what you suggested it may be working, but I am unsure how to check. Since we removed the mic0 interface configuration file from the host, the card can no longer be reached by the method I was using (ssh'ing in with the internal IP). Is there some other way to log in?

Below are the two config files, perhaps I did something wrong that you can spot?

/etc/mpss/default.conf

# Common /etc files for all embedded Linux file systems
CommonDir /var/mpss/common

ExtraCommandLine "highres=off"

# MIC Console
Console "hvc0"

# MIC Shutdown timeout - Wait for orderly shutdown to complete
# via service MPSS stop/unload and micctrl --shutdown or --reboot and --wait
# +ve integer -> Time in seconds to wait for shutdown to complete before forcing reset
# -ve integer -> Infinite wait for orderly shutdown to complete
# 0           -> Forced shutdown or reset. NOT RECOMMENDED!
ShutdownTimeout 300

# Storage location and size for MIC kernel crash dumps
CrashDump /var/crash/mic 16

Bridge br0 External 192.168.0.100 24 1500
#Subnet dhcp

 

/etc/mpss/mic0.conf

 

Version 1 1

# Include configuration common to all MIC cards
Include default.conf

# Include all additional functionality configuration files by default
Include "conf.d/*.conf"

# Base filesystem for embedded Linux file system
Base CPIO /usr/share/mpss/boot/initramfs-knightscorner.cpio.gz

# Unique per card files for embedded Linux file system
MicDir /var/mpss/mic0

# Hostname to assign to MIC card
HostName Xeon-Phi-mic0

# MAC address configuration
#MacAddrs Serial

#Network class=StaticPair micip=172.31.1.1 hostip=172.31.1.254 mtu=64512 netbits=24 modhost=yes modcard=yes
#Network class=StaticPair micip=172.31.1.1 hostip=172.31.1.254 mtu=64512 netbits=24 modhost=yes modcard=yes
#Network class=StaticPair micip=172.31.1.1 hostip=172.31.1.254 mtu=64512 netbits=24 modhost=yes modcard=yes
#Network class=StaticBridge bridge=br0 micip=172.31.1.1 hostip=172.31.1.254 mtu=1500 netbits=24 modhost=yes modcard=yes

# MIC OS Verbose messages to console
VerboseLogging Disabled

# MIC OS image
OSimage /usr/share/mpss/boot/bzImage-knightscorner /usr/share/mpss/boot/System.map-knightscorner

# Boot MIC card when MPSS stack is started
BootOnStart Enabled

# Root device for MIC card
RootDevice Ramfs /var/mpss/mic0.image.gz

# Control card power state setting
# cpufreq: P state
# corec6: Core C6 state
# pc3: Package C3 state
# pc6: Package C6 state
PowerManagement cpufreq_on;corec6_off;pc3_on;pc6_off

Cgroup memory=disabled

Network class=StaticBridge bridge=br0 micip=192.168.0.10 modhost=yes modcard=yes

# MAC address configuration
MacAddrs Serial

Thanks,

-Eric
