Using MIC on Fedora 20

Sait_U_
Beginner
2,281 Views

 

I am trying to understand how to set up the mpss-3.1.2 software on Fedora 20. I just want to use the Xeon Phi in native mode on my local machine (a Dell Precision T7610 host with dual 12-core Xeon CPUs and a 3120A). I have two options:

1. Use mpss-3.1.2

2. Kernel 3.13.5 comes with the Intel MIC drivers (mic_card and mic_host), a sample mpssd, and a micctrl, but nothing else and no documentation. I will experiment with this once I better understand the host/MIC software requirements.

For using mpss-3.1.2:

I was able to build the kmod package on kernel 3.13.5 (with a slight modification of Ahmet Inan's port to 3.12.6). I also built almost all of the RPMs for the host side, including intel-mic-scif, headers, mpss-daemon (mpssd package), miclib, micmgmt, and miccheck. This took about a day and small tweaks for each, plus I had to write spec files since they were not provided.

At this point, what is the minimal set of packages I need for the MIC card? I know I need the boot package and perhaps the flash package. Which packages are used to load the complete card OS and basic Linux commands onto the card? Do I have to recompile these with the k1om compilers, or are the packages in the 3.1.2 distribution ready to be used on the card? I would appreciate some guidance in trying to do this; my difficulty largely stems from my lack of understanding.

I am using Composer XE latest version (Fortran) for compiling and have no problems here, just need to set the card up.

Thanks

0 Kudos
17 Replies
BelindaLiviero
Employee
2,281 Views

You may want to look at the reply to this thread:  http://software.intel.com/en-us/forums/topic/505659#comment-1781756

which delineates the build process in general.  

0 Kudos
Sait_U_
Beginner
2,281 Views

Thanks for the response. What I am really after is a better understanding of the packages in the distribution, specifically, some are on the host side and some for the card. I am only interested in native runs on local machine. I can compile many of the host packages on kernel 3.13.5. If I have mpssd working on the host and install the packages that would be used to boot the card I guess things should work. Presumably I don't have to touch the packages for the card since they have already been compiled and are consistent with each other.

One of my questions is: which packages are for the card? What is the naming convention? I think I can make things work with the above scenario. I don't care about validation, certification, etc. I am used to working on experimental systems. Thanks.

0 Kudos
Fuad_O_
Beginner
2,281 Views

I'm trying to do the same.

This is how far I got:

Downloaded the documentation and scripts from: https://github.com/torvalds/linux/tree/master/Documentation/mic

1. Compile and install mpssd

2. Copy the micctrl script to /usr/bin

3. Install the mpss service using the provided script

4. reboot

 

Module "mic_host" is loaded.

When the service starts, I get these log messages in "/var/log/mpss":

Mon Mar 17 11:44:40 2014: MIC Daemon start
Mon Mar 17 11:44:40 2014: MIC name mic0 id 0
Mon Mar 17 11:44:40 2014: MIC found 1 devices
Mon Mar 17 11:44:40 2014: mic0: Opening System.map failed: 2
Mon Mar 17 11:44:40 2014: mic0: mic_config 1432 state offline
Mon Mar 17 11:44:40 2014: mic0: Command line: "clocksource=tsc highres=off nohz=off cpufreq_on;corec6_off;pc3_off;pc6_off ifcfg=static;address,172.31.0.1;netmask,255.255.255.0"
Mon Mar 17 11:44:40 2014: mic0: IPADDR: "172.31.0.1"
Mon Mar 17 11:44:40 2014: Added VIRTIO_ID_CONSOLE for mic0
Mon Mar 17 11:44:40 2014: Added VIRTIO_ID_NET for mic0
Mon Mar 17 11:44:40 2014: mic0 console message goes to /dev/pts/6
Mon Mar 17 11:44:40 2014: Created TAP mic0
Mon Mar 17 11:44:40 2014: Configuring mic0
Mon Mar 17 11:44:40 2014: init_vr mic0 vr0 0x7f585ce48000 vr0->info 0x7f585ce49406 vr_size 0x2000 vring 0x1406
Mon Mar 17 11:44:40 2014: magic 0xc0ffee03 expected 0xc0ffee03
Mon Mar 17 11:44:40 2014: init_vr mic0 vr1 0x7f585ce4a000 vr1->info 0x7f585ce4b406 vr_size 0x2000 vring 0x1406
Mon Mar 17 11:44:40 2014: magic 0xc0ffee04 expected 0xc0ffee04
Mon Mar 17 11:44:40 2014: mic0 get_device_desc d-> type 3 d 0x7f585ce47010
Mon Mar 17 11:44:40 2014: Configuring mic0 ipaddr 172.31.0.254/24
Mon Mar 17 11:44:40 2014: MIC name mic0 tap_configure 226 DONE!
Mon Mar 17 11:44:40 2014: MIC name mic0 id 0
Mon Mar 17 11:44:40 2014: init_vr mic0 vr0 0x7f585ce43000 vr0->info 0x7f585ce44406 vr_size 0x2000 vring 0x1406
Mon Mar 17 11:44:40 2014: magic 0xc0ffee01 expected 0xc0ffee01
Mon Mar 17 11:44:40 2014: init_vr mic0 vr1 0x7f585ce45000 vr1->info 0x7f585ce46406 vr_size 0x2000 vring 0x1406
Mon Mar 17 11:44:40 2014: magic 0xc0ffee02 expected 0xc0ffee02
Mon Mar 17 11:44:40 2014: mic0 get_device_desc d-> type 3 d 0x7f585ce42010
Mon Mar 17 11:44:40 2014: mic0 get_device_desc d-> type 1 d 0x7f585ce42070
Mon Mar 17 11:44:40 2014: mic0 get_device_desc d-> type 3 d 0x7f585ce42010
Mon Mar 17 11:44:40 2014: mic0 get_device_desc d-> type 1 d 0x7f585ce42070
Mon Mar 17 11:44:40 2014: mic0 wait_for_card_driver Waiting .... desc-> type 1 status 0x0

 

$ more /etc/sysconfig/network-scripts/ifcfg-mic0 
DEVICE="mic0"
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED="no"
BOOTPROTO=static
IPADDR=172.31.1.254
NETMASK=255.255.255.0
MACADDR=xx:xx:xx:xx:xx:xx

I think "mic0: Opening System.map failed: 2" is the main problem.
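For anyone scanning a similar log, here is a minimal Python sketch that picks out lines ending in "failed: N" and translates N into text. It assumes the trailing number is a standard errno value, which the log itself does not confirm; if that holds, 2 is ENOENT ("No such file or directory"), meaning mpssd could not find a System.map file. The function name and the sample excerpt are only for illustration.

```python
import os
import re

# Sample lines copied from the log above.
SAMPLE_LOG = """\
Mon Mar 17 11:44:40 2014: MIC found 1 devices
Mon Mar 17 11:44:40 2014: mic0: Opening System.map failed: 2
Mon Mar 17 11:44:40 2014: mic0: mic_config 1432 state offline
"""

# Matches a trailing "failed: <number>" on a log line.
FAILED_RE = re.compile(r"failed: (\d+)\s*$")


def explain_failures(log_text):
    """Return (log line, errno text) pairs for lines ending in 'failed: N'.

    Assumption: N is a standard errno value.
    """
    results = []
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        if match:
            code = int(match.group(1))
            results.append((line, os.strerror(code)))
    return results


if __name__ == "__main__":
    for line, meaning in explain_failures(SAMPLE_LOG):
        print(line)
        print("    possible meaning:", meaning)
```

To check a real log, the same function could be given the contents of /var/log/mpss instead of the sample string.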

I hope this helps you.

If you manage to get it to work, please let me know.

 

0 Kudos
Sait_U_
Beginner
2,281 Views

Hi,

Yes, that is one of the problems. The mic_host driver needs to load the Linux kernel onto the card, after which mic_card will presumably start working on the card and communicate with mic_host. For this to happen, a kernel compiled for the card has to be available somewhere on the host. System.map is part of the kernel package. Other files that set up the Linux environment on the card also need to be loaded from the host. If everything is working on the host side, the mic_host driver will do all this. But we do not have the rest of the packages, and we can't compile kernel 3.13.6 for the MIC architecture because the gcc we have is not set up for that architecture.

My problem is that hundreds of news articles announced that support for the MIC architecture was coming with Linux 3.13, making it sound like it was going to work out of the box. This does not seem to be the case, and nobody has published the packages needed to accomplish the task, even though some people (like myself) would be happy to help debug. Unfortunately, my machine arrived with a defective motherboard, which should be replaced today or tomorrow. One thing to try is to go back to mpss-3.1.4 and compile mpssd and some other packages for kernel 3.13.6. This has been done already. But there is no help for this either, in the sense of someone explaining the minimal set of packages needed to use the card in native mode on a local machine. If I figure this out I will post it here. These days I am a bit busy writing a proposal, so things are moving slowly.

0 Kudos
Fuad_O_
Beginner
2,281 Views

Hi,

I managed to get it to work, but it is very hacky (mpss 3.1.4).

Here are my steps:

1. Download and extract mpss RHEL 6.5: http://registrationcenter.intel.com/irc_nas/3988/mpss-3.1.4-rhel-6.5.tar

2. Install all RPMs except mpss-module*. The RPMs need to be modified with rpmrebuild.

e.g.: rpmrebuild -ep mpss-daemon-dev-3.1.4-1.glibc2.12.2.x86_64.rpm

An editor opens, and every line containing '/usr/lib' or '/usr/bin' needs to be removed.

3. Download the module source from https://github.com/xdsopl/mpss-modules (I used commit 58).

4. Build and install the module with the install.sh script.

5. modprobe mic && micctrl --initdefaults

6. start mpssd daemon

7. connect with ssh

 

I haven't run any benchmarks yet. I hope mpss 3.2 will be easier to install.

Good luck.

0 Kudos
Sait_U_
Beginner
2,281 Views

 

Thank you very much, Fuad. I will begin to follow your procedure for mpss-3.2. I recreated the mpss-module RPMs from Inan's port. I have two questions regarding the procedure:

1. When you say every line containing /usr/lib and /usr/bin should be deleted, I presume you mean the lines defining these top-level directories and not the paths to the libraries. For example, for the above package in 3.2 I have (with rpmrebuild -ep):

%files
%dir %attr(0755, root, root) "/usr"
%dir %attr(0755, root, root) "/usr/include"
%attr(0644, root, root) "/usr/include/mpssconfig.h"
%dir %attr(0755, root, root) "/usr/lib64"
%attr(0777, root, root) "/usr/lib64/libmpssconfig.so"

which ones should be deleted?

2. Did you have to update the flash?

Again, thanks a lot. 

Sait

0 Kudos
Fuad_O_
Beginner
2,281 Views

Sait U. wrote:

%dir %attr(0755, root, root) "/usr/lib64"

Only this line. You could also try the installation with "rpm -ivh --replacefiles *.rpm" (don't forget to remove the module rpm).

My MIC was already flashed with mpss 3.1.4.

0 Kudos
Sait_U_
Beginner
2,281 Views

Thanks for the info. I presume packages should not own directories that are owned by the official Fedora "filesystem" package. By the same token, they should not own any of /usr, /usr/include, etc. Official Fedora packages don't own these; they are all owned by the filesystem-3.2-19.fc20.x86_64 package.
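To make the idea concrete, here is a small Python sketch that drops the %dir entries for shared system directories from a %files section like the one quoted earlier in the thread. The directory set is an assumption for illustration and is not an exhaustive list of what the filesystem package owns. The function name is hypothetical, and this is not part of rpmrebuild; it only mimics the manual edit done in the editor that "rpmrebuild -ep" opens.

```python
import re

# Directories assumed to be owned by Fedora's "filesystem" package.
# Illustrative only, not an exhaustive list.
SYSTEM_DIRS = {"/usr", "/usr/bin", "/usr/include", "/usr/lib", "/usr/lib64"}

# Matches a %dir entry and captures the quoted path at the end of the line.
DIR_ENTRY_RE = re.compile(r'^%dir\b.*"([^"]+)"\s*$')


def drop_system_dirs(files_section, system_dirs=SYSTEM_DIRS):
    """Remove %dir lines that claim ownership of shared system directories."""
    kept = []
    for line in files_section.splitlines():
        match = DIR_ENTRY_RE.match(line.strip())
        if match and match.group(1) in system_dirs:
            continue  # skip: the directory belongs to another package
        kept.append(line)
    return "\n".join(kept)


# The %files section quoted earlier in the thread.
EXAMPLE = '''%files
%dir %attr(0755, root, root) "/usr"
%dir %attr(0755, root, root) "/usr/include"
%attr(0644, root, root) "/usr/include/mpssconfig.h"
%dir %attr(0755, root, root) "/usr/lib64"
%attr(0777, root, root) "/usr/lib64/libmpssconfig.so"'''

if __name__ == "__main__":
    print(drop_system_dirs(EXAMPLE))
```

Only the directory entries are removed; the header and library file entries stay, so the package still installs its own files. Fuad's narrower fix (removing just the /usr/lib64 line) corresponds to passing system_dirs={"/usr/lib64"}.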

0 Kudos
Sait_U_
Beginner
2,281 Views

I am having a problem starting the mic module. I downloaded the mpss-3.2 Red Hat 6.5 packages and rebuilt all the RPMs in the main directory except mpss-modules*. I installed the RPMs and also rebuilt the modules from Ahmet Inan's 3.2 port. When I try to load the mic module I get:

modprobe: ERROR: could not insert 'mic': Cannot allocate memory
Starting Intel(R) MPSS: Error getting SCIF driver version 
                                                           [FAILED]

Trying to install module directly with insmod just gives:

insmod: ERROR: could not insert module ./3.13.6-200.fc20.x86_64/extra/mic.ko: Cannot allocate memory

Am I doing something wrong?

0 Kudos
Sait_U_
Beginner
2,281 Views

Never mind.....somehow a reboot fixed the problem....will report the rest later.

0 Kudos
Sait_U_
Beginner
2,281 Views

OK, a progress report: everything is working on kernel 3.13.6. The mpssd daemon is running, boots the card, etc. I have updated the flash, which worked OK (mpss-3.2). In Linux 3.13.x the card is identified directly. The details are below.

I was able to ssh into mic0. I have not run anything yet.  I had some problems as well:

1. The mpss start/stop is not very smooth. It seems to work better after a host reboot. I had one system freeze (during reboot) whose cause I am not sure of. So I disabled the automatic restart of mpss and do it manually. How smooth is this supposed to be? Can one start/stop mpssd many times flawlessly?

2. For some reason NetworkManager suddenly made ifcfg-mic0 the default device, so the host lost its network connection. I removed the default file and now I run "ifup mic0" to bring up the connection.

More details:

lspci -vvv gives:

04:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 3120 series (rev 20)
        Subsystem: Intel Corporation Device 3608 .........

micinfo gives:

        System Info
                HOST OS                 : Linux
                OS Version              : 3.13.6-200.fc20.x86_64
                Driver Version          : 3.2-1
                MPSS Version            : 3.2

                Host Physical Memory    : 65921 MB

Device No: 0, Device Name: mic0

        Version
                Flash Version            : 2.1.02.0390
                SMC Firmware Version     : 1.16.5078
                SMC Boot Loader Version  : 1.8.4326
                uOS Version              : 2.6.38.8+mpss3.2
                Device Serial Number     : ADKC33000361

        Board
                Vendor ID                : 0x8086
                Device ID                : 0x225d
                Subsystem ID             : 0x3608
                Coprocessor Stepping ID  : 2
                PCIe Width               : x16
                PCIe Speed               : 5 GT/s
                PCIe Max payload size    : 256 bytes
                PCIe Max read req size   : 512 bytes
                Coprocessor Model        : 0x01
                Coprocessor Model Ext    : 0x00
                Coprocessor Type         : 0x00
                Coprocessor Family       : 0x0b
                Coprocessor Family Ext   : 0x00
                Coprocessor Stepping     : C0
                Board SKU                : C0PRQ-3120/3140 P/A
                ECC Mode                 : Enabled
                SMC HW Revision          : Product 300W Active CS

        Cores
                Total No of Active Cores : 57
                Voltage                  : 0 uV
                Frequency                : 1100000 kHz

        Thermal
                Fan Speed Control        : On
                Fan RPM                  : 900
                Fan PWM                  : 20
                Die Temp                 : 53 C

        GDDR
                GDDR Vendor              : Elpida
                GDDR Version             : 0x1
                GDDR Density             : 2048 Mb
                GDDR Size                : 5952 MB
                GDDR Technology          : GDDR5 
                GDDR Speed               : 5.000000 GT/s
                GDDR Frequency           : 2500000 kHz
                GDDR Voltage             : 1501000 uV
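If you want to compare micinfo output across reboots or MPSS versions, a rough Python sketch like the following can turn the text into a nested dictionary. It assumes the layout shown above: value lines use " : " as the separator, and any other non-empty line starts a new section. That is inferred from this one sample and is not a documented format. The excerpt and the function name are illustrative.

```python
# A short excerpt of the micinfo output above.
SAMPLE = """\
        System Info
                HOST OS                 : Linux
                MPSS Version            : 3.2

Device No: 0, Device Name: mic0

        Version
                Flash Version            : 2.1.02.0390
                uOS Version              : 2.6.38.8+mpss3.2

        Thermal
                Die Temp                 : 53 C
"""


def parse_micinfo(text):
    """Parse micinfo-style text into {section: {key: value}}.

    Assumption: value lines contain ' : '; every other non-empty line
    is treated as a section header (including the 'Device No' line).
    """
    sections = {}
    current = "General"
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if " : " not in line:
            current = line
            sections.setdefault(current, {})
            continue
        key, _, value = line.partition(" : ")
        sections.setdefault(current, {})[key.strip()] = value.strip()
    return sections


if __name__ == "__main__":
    info = parse_micinfo(SAMPLE)
    print(info["Version"]["Flash Version"])
    print(info["Thermal"]["Die Temp"])
```

A line with an empty value would be mistaken for a section header under this rule, so treat the result as a convenience for diffing and not as a faithful parse.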

miccheck gives:

Copyright 2013 Intel Corporation All Rights Reserved

Executing default tests for host
  Test 0: Check number of devices the OS sees in the system ... pass
  Test 1: Check mic driver is loaded ... pass
  Test 2: Check number of devices driver sees in the system ... pass
  Test 3: Check mpssd daemon is running ... pass
Executing default tests for device: 0
  Test 4 (mic0): Check device is in online state and its postcode is FF ... pass
  Test 5 (mic0): Check ras daemon is available in device ... pass
  Test 6 (mic0): Check running flash version is correct ... pass

Status: OK

 

 

0 Kudos
Sait_U_
Beginner
2,281 Views

Progress report and problems/observations:

I have installed the mpss-3.2 RPMs except mpss-modules*, as described above. I rebuilt the
mpss-modules* RPMs using the repository https://github.com/xdsopl/mpss-modules with Linux
kernel 3.13.6 under Fedora 20.

The mic.ko module loads correctly and the mpssd daemon can be started (via the mpss script in
/etc/rc.d/init.d). The card boots. The flash was older, so I updated it using micflash,
and that worked. miccheck says all is OK (details of all output below).

Observations and/or problems:

1. The mpssd daemon is using 100% of one CPU. I have 24 CPUs on this system, so it is not
   necessarily a big load, but I don't think it should be using this much CPU.
   
2. For some reason, after one of the reboots the host lost its network connection. Looking
   into it, I noticed that ifcfg-mic0 was in /etc/sysconfig/networking/profiles/default/
   together with ifcfg-em1, which is the host system's file. I removed the mic0 file from
   there and rebooted. Then I started mic0 by hand via "ifup mic0", which worked because
   there is a file /etc/sysconfig/networking/devices/ifcfg-mic0.
   
3. The mpss start/stop is not very smooth and often requires a host system reboot. I also
   had the feeling that shutting down and restarting the card was causing system hangs, so
   I decided to run "chkconfig mpss off" and start/stop it by hand.
   
4. lspci -vvv now gives the correct identification of the card:
   "04:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 3120 series (rev 20)"
    with all the other info.
   
The card has been up for two days now (without use) and I can ssh into mic0.


                            MICINFO
                
        System Info
                HOST OS                 : Linux
                OS Version              : 3.13.6-200.fc20.x86_64
                Driver Version          : 3.2-1
                MPSS Version            : 3.2

                Host Physical Memory    : 65921 MB

Device No: 0, Device Name: mic0

        Version
                Flash Version            : 2.1.02.0390
                SMC Firmware Version     : 1.16.5078
                SMC Boot Loader Version  : 1.8.4326
                uOS Version              : 2.6.38.8+mpss3.2
                Device Serial Number     : ADKC33000361

        Board
                Vendor ID                : 0x8086
                Device ID                : 0x225d
                Subsystem ID             : 0x3608
                Coprocessor Stepping ID  : 2
                PCIe Width               : x16
                PCIe Speed               : 5 GT/s
                PCIe Max payload size    : 256 bytes
                PCIe Max read req size   : 512 bytes
                Coprocessor Model        : 0x01
                Coprocessor Model Ext    : 0x00
                Coprocessor Type         : 0x00
                Coprocessor Family       : 0x0b
                Coprocessor Family Ext   : 0x00
                Coprocessor Stepping     : C0
                Board SKU                : C0PRQ-3120/3140 P/A
                ECC Mode                 : Enabled
                SMC HW Revision          : Product 300W Active CS

        Cores
                Total No of Active Cores : 57
                Voltage                  : 0 uV
                Frequency                : 1100000 kHz

        Thermal
                Fan Speed Control        : On
                Fan RPM                  : 900
                Fan PWM                  : 20
                Die Temp                 : 53 C

        GDDR
                GDDR Vendor              : Elpida
                GDDR Version             : 0x1
                GDDR Density             : 2048 Mb
                GDDR Size                : 5952 MB
                GDDR Technology          : GDDR5
                GDDR Speed               : 5.000000 GT/s
                GDDR Frequency           : 2500000 kHz
                GDDR Voltage             : 1501000 uV

 

=====================MICCHECK========
Executing default tests for host
  Test 0: Check number of devices the OS sees in the system ... pass
  Test 1: Check mic driver is loaded ... pass
  Test 2: Check number of devices driver sees in the system ... pass
  Test 3: Check mpssd daemon is running ... pass
Executing default tests for device: 0
  Test 4 (mic0): Check device is in online state and its postcode is FF ... pass
  Test 5 (mic0): Check ras daemon is available in device ... pass
  Test 6 (mic0): Check running flash version is correct ... pass

Status: OK
===============================
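For scripting this check after each start of mpss, here is a small Python sketch that parses miccheck-style result lines and reports any test that did not pass. It assumes the line format shown above ("Test N (device): description ... result"). Since this thread only shows passing results, anything other than "pass" is treated as a failure, which is a guess about how failures are worded. The function names are made up for the example.

```python
import re

# One result line, e.g. "  Test 4 (mic0): Check ... ... pass".
# Groups: test number, optional device name, description, result word.
TEST_RE = re.compile(r"^\s*Test (\d+)(?: \((\w+)\))?: (.*?) \.\.\. (\S+)\s*$")

# A few lines copied from the miccheck output above.
SAMPLE = """\
Executing default tests for host
  Test 0: Check number of devices the OS sees in the system ... pass
  Test 3: Check mpssd daemon is running ... pass
Executing default tests for device: 0
  Test 4 (mic0): Check device is in online state and its postcode is FF ... pass
"""


def parse_miccheck(output):
    """Return a list of (test number, device or None, description, result)."""
    tests = []
    for line in output.splitlines():
        match = TEST_RE.match(line)
        if match:
            number, device, description, result = match.groups()
            tests.append((int(number), device, description, result))
    return tests


def failed_tests(output):
    """Return the tests whose result is anything other than 'pass'."""
    return [test for test in parse_miccheck(output) if test[3] != "pass"]


if __name__ == "__main__":
    tests = parse_miccheck(SAMPLE)
    print(len(tests), "tests parsed,", len(failed_tests(SAMPLE)), "not passing")
```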

 

0 Kudos
Sait_U_
Beginner
2,281 Views

 

I forgot to mention that I blacklisted the mic_host and mic_card modules that ship with the Linux 3.13 kernel, since they load automatically and may interfere.
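For reference, blacklisting is normally done with "blacklist" lines in a file under /etc/modprobe.d/. The Python sketch below just renders those lines for the two in-kernel modules. The file name in the comment is hypothetical, and the post above does not say which file or method was actually used.

```python
def render_blacklist(modules):
    """Return modprobe.d-style text that blacklists each given module."""
    return "".join(f"blacklist {name}\n" for name in modules)


if __name__ == "__main__":
    # The text could be saved as root to a file such as
    # /etc/modprobe.d/mic-blacklist.conf (hypothetical name).
    print(render_blacklist(["mic_host", "mic_card"]), end="")
```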

0 Kudos
Sait_U_
Beginner
2,281 Views

Thanks for the tip. Unfortunately I have to stick with Fedora 20, since we have a lot of workstations in my research group and other software packages that I build from them. I think the card is working OK, though. I will be a bit busy for the next few weeks, so I won't be able to do much testing.

The problem with the network connection is not due to mic0. This machine comes with two NICs, and NetworkManager sometimes messes up the configuration on startup. My biggest worry is the fan noise from the host machine. The Xeon Phi fan is quiet and turns off when the card is not being used. Is it normal for the host fan to speed up when the card is only lightly used? Thanks.

0 Kudos
TaylorIoTKidd
New Contributor I
2,281 Views

This has come up a few times. The answer is that it is normal for the fan to go high.

Regards
---
Taylor
 

0 Kudos
Sait_U_
Beginner
2,281 Views

I went through the same exercise for MPSS 3.2.1 (using Ahmet Inan's metatag for 3.2.1) and everything is working again. I can execute a test program on the MIC.

The only problem, again, is the 100% CPU usage of the mpssd daemon. Is this unique to me? Can someone check this? I can kill the mpssd daemon (not stop it, but kill the process), but it would be nice to have a resolution to this.
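For anyone who wants to confirm the number on their own system, here is a minimal Python sketch that computes a process's CPU usage from two samples of its /proc stat line. It relies on the Linux proc layout, where utime and stime are fields 14 and 15, counted in clock ticks. The two sample lines are made up for illustration and are not taken from a real mpssd process; in practice you would read /proc/PID/stat twice for the mpssd PID and take the tick rate from os.sysconf("SC_CLK_TCK").

```python
def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/PID/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' instead of on whitespace alone.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU usage over the interval, as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


# Hypothetical samples taken one second apart (values invented).
BEFORE = "1234 (mpssd) S 1 1234 1234 0 -1 4202560 100 0 0 0 250 150 0 0 20 0 3 0 500"
AFTER = "1234 (mpssd) S 1 1234 1234 0 -1 4202560 100 0 0 0 300 200 0 0 20 0 3 0 500"

if __name__ == "__main__":
    # 100 ticks per second is a common value; check SC_CLK_TCK on a real system.
    usage = cpu_percent(cpu_ticks(BEFORE), cpu_ticks(AFTER), 1.0, 100)
    print(f"{usage:.1f}% of one core")
```

Sampling over several seconds helps distinguish a daemon that spins constantly from one that is only briefly busy.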

0 Kudos