
Xeon Phi 3120A boot aborted - no configuration file present

Munasinghe__Indula

Hi everyone,

I'm using a system with a Xeon Phi 3120A on an Asus Z370-P motherboard. The board supports Above 4G Decoding, and I've limited the PCIe speed to Gen2. The card works without any problem under Windows 10, which I've installed on the same system (mic0 comes online at startup and I can ssh into it). But what I want is to work with CentOS, and that's where the device fails to work. Here's what I did.

I installed CentOS 7.3 and copied the contents of mpss-3.8.3-linux.tar to a folder under /home/my_name. I then cd'd into it, and from there into the modules folder, and installed the mpss-modules and mpss-modules-dev packages for the matching kernel version (mpss-modules-3.10.0-514.el7.x86_64-3.8.3-1.x86_64.rpm and mpss-modules-dev-3.10.0-514.el7.x86_64-3.8.3-1.x86_64.rpm). Then I went back to the mpss-3.8.3 folder and installed all the other rpm files in it, which are as follows:

glibc2.12pkg-libmicaccesssdk0-3.8.3-1.glibc2.12.x86_64.rpm
glibc2.12pkg-libmicaccesssdk-dev-3.8.3-1.glibc2.12.x86_64.rpm
glibc2.12pkg-libmicmgmt0-3.8.3-1.glibc2.12.x86_64.rpm
glibc2.12pkg-libmicmgmt-dev-3.8.3-1.glibc2.12.x86_64.rpm
glibc2.12pkg-libmicmgmt-doc-3.8.3-1.glibc2.12.x86_64.rpm
glibc2.12pkg-libodmdebug0-3.8.3-1.glibc2.12.x86_64.rpm
glibc2.12pkg-libodmdebug-dev-3.8.3-1.glibc2.12.x86_64.rpm
glibc2.12pkg-libsettings0-3.8.3-1.glibc2.12.x86_64.rpm
glibc2.12pkg-libsettings-dev-3.8.3-1.glibc2.12.x86_64.rpm
glibc2.12pkg-mpss-flash-3.8.3-1.glibc2.12.x86_64.rpm
glibc2.12pkg-mpss-memdiag-kernel-3.8.3-1.glibc2.12.x86_64.rpm
glibc2.12pkg-mpss-rasmm-kernel-3.8.3-1.glibc2.12.x86_64.rpm
intel-composerxe-compat-k1om-3.8.3-1.x86_64.rpm
libscif0-3.8.3-1.glibc2.12.x86_64.rpm
libscif-dev-3.8.3-1.glibc2.12.x86_64.rpm
libscif-doc-3.8.3-1.glibc2.12.x86_64.rpm
mpss-boot-files-3.8.3-1.glibc2.12.x86_64.rpm
mpss-coi-3.8.3-1.glibc2.12.x86_64.rpm
mpss-coi-dev-3.8.3-1.glibc2.12.x86_64.rpm
mpss-coi-doc-3.8.3-1.glibc2.12.x86_64.rpm
mpss-core-3.8.3-1.glibc2.12.x86_64.rpm
mpss-core-dev-3.8.3-1.glibc2.12.x86_64.rpm
mpss-daemon-3.8.3-1.glibc2.12.x86_64.rpm
mpss-daemon-dev-3.8.3-1.glibc2.12.x86_64.rpm
mpss-eclipse-cdt-mpm-3.8.3-1.glibc2.12.x86_64.rpm
mpss-license-3.8.3-1.glibc2.12.x86_64.rpm
mpss-miccheck-3.8.3-1.glibc2.12.x86_64.rpm
mpss-miccheck-bin-3.8.3-1.glibc2.12.x86_64.rpm
mpss-micmgmt-3.8.3-1.glibc2.12.x86_64.rpm
mpss-micmgmt-doc-3.8.3-1.glibc2.12.x86_64.rpm
mpss-micmgmt-python-3.8.3-1.glibc2.12.x86_64.rpm
mpss-micsmc-gui-3.8.3-1.glibc2.12.x86_64.rpm
mpss-modules-headers-3.8.3-1.glibc2.12.x86_64.rpm
mpss-mpm-3.8.3-1.glibc2.12.x86_64.rpm
mpss-mpm-doc-3.8.3-1.glibc2.12.x86_64.rpm
mpss-myo-3.8.3-1.glibc2.12.x86_64.rpm
mpss-myo-dev-3.8.3-1.glibc2.12.x86_64.rpm
mpss-myo-doc-3.8.3-1.glibc2.12.x86_64.rpm
mpss-offload-3.8.3-1.glibc2.12.x86_64.rpm
mpss-offload-dev-3.8.3-1.glibc2.12.x86_64.rpm
mpss-sciftutorials-3.8.3-1.glibc2.12.x86_64.rpm
mpss-sciftutorials-doc-3.8.3-1.glibc2.12.x86_64.rpm
mpss-sdk-k1om-3.8.3-1.x86_64.rpm
mpss-sysmgmt-micdiagnostic-3.8.3-1.glibc2.12.x86_64.rpm
mpss-sysmgmt-micras-3.8.3-1.glibc2.12.x86_64.rpm
mpss-sysmgmt-python-3.8.3-1.glibc2.12.x86_64.rpm
netperf-2.6.0-r0.glibc2.12.x86_64.rpm
netperf-doc-2.6.0-r0.glibc2.12.x86_64.rpm
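
For reference, the install order described above can be sketched as a small shell helper. This is only a sketch: the function names are made up for illustration, the use of yum is an assumption (the post does not say which command was used), and the helper prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, the ./mpss-3.8.3 location and the use of yum
# are assumptions, not taken from the post.

# Build a glob for the kernel-specific module RPMs in the modules/ folder,
# e.g. mpss-modules-3.10.0-514.el7.x86_64-3.8.3-1.x86_64.rpm
modules_rpm_glob() {
    local kernel="${1:-$(uname -r)}"
    printf 'mpss-modules*-%s-*.rpm\n' "$kernel"
}

# Print (not run) the install commands in the order used in this post:
# kernel modules first, then every other RPM in the top-level folder.
print_install_plan() {
    local dir="${1:-./mpss-3.8.3}" kernel="${2:-$(uname -r)}"
    echo "yum install -y $dir/modules/$(modules_rpm_glob "$kernel")"
    echo "yum install -y $dir/*.rpm"
}

print_install_plan ./mpss-3.8.3 3.10.0-514.el7.x86_64
```

The point of building the glob from the kernel version is that the module packages have to match the running kernel (`uname -r`), which is the one thing that is easy to get wrong in this step.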

Then I ran the following commands in a terminal (correct me if I'm wrong about the assumptions I've made).

[root@localhost indula]# micctrl -s
mic0: ready

From this I deduced that the mic is okay. Its post code is 12, and according to the MPSS manual it seems fine for the card to be in this state.

I updated the flash and checked it with micflash -getversion:

[root@localhost mpss-3.8.3]# micflash -getversion
mic0: Flash read started
mic0: Read done
mic0: Version: 2.1.02.0391
mic0: Transitioning to ready state

Then I enabled the mpss service,

[root@localhost indula]# systemctl enable mpss

then I tried to start the mpss service.

[root@localhost indula]# systemctl start mpss
Job for mpss.service failed because the control process exited with error code. See "systemctl status mpss.service" and "journalctl -xe" for details.

That gives the error above. I examined the output of systemctl status mpss.service and journalctl -xe; these are the results:

[root@localhost mpss-3.8.3]# systemctl status mpss.service -l
● mpss.service - Intel(R) MPSS control service
   Loaded: loaded (/etc/systemd/system/mpss.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2018-10-31 23:57:30 EDT; 31min ago
  Process: 3647 ExecStart=/etc/init.d/mpss start (code=exited, status=6)

Oct 31 23:57:30 localhost.localdomain systemd[1]: Starting Intel(R) MPSS control service...
Oct 31 23:57:30 localhost.localdomain mpss[3647]: Loading MIC module: [  OK  ]
Oct 31 23:57:30 localhost.localdomain systemd[1]: mpss.service: control process exited, code=exited status=6
Oct 31 23:57:30 localhost.localdomain systemd[1]: Failed to start Intel(R) MPSS control service.
Oct 31 23:57:30 localhost.localdomain systemd[1]: Unit mpss.service entered failed state.
Oct 31 23:57:30 localhost.localdomain systemd[1]: mpss.service failed.
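
A note on the `status=6` above: the unit runs `/etc/init.d/mpss start`, and under the LSB convention for init scripts an exit status of 6 means "program is not configured". Assuming the mpss init script follows that convention (not confirmed here), the code already points at missing configuration. A small lookup sketch, with a made-up function name:

```shell
# Map an init-script exit status to its LSB meaning (non-status actions).
# lsb_exit_meaning is a hypothetical helper name.
lsb_exit_meaning() {
    case "$1" in
        0) echo "success" ;;
        1) echo "generic or unspecified error" ;;
        2) echo "invalid or excess argument(s)" ;;
        3) echo "unimplemented feature" ;;
        4) echo "user had insufficient privilege" ;;
        5) echo "program is not installed" ;;
        6) echo "program is not configured" ;;
        7) echo "program is not running" ;;
        *) echo "reserved or application-specific" ;;
    esac
}

lsb_exit_meaning 6   # prints: program is not configured
```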

Then journalctl -xe (I've copied only the parts where the mpss service failure is described):

Oct 31 23:57:30 localhost.localdomain systemd[1]: Starting Intel(R) MPSS control
-- Subject: Unit mpss.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit mpss.service has begun starting up.
Oct 31 23:57:30 localhost.localdomain mpss[3647]: Loading MIC module: [  OK  ]
Oct 31 23:57:30 localhost.localdomain systemd[1]: mpss.service: control process 
Oct 31 23:57:30 localhost.localdomain systemd[1]: Failed to start Intel(R) MPSS 
-- Subject: Unit mpss.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Oct 31 23:56:45 localhost.localdomain dbus-daemon[756]: dbus[756]: [system] Failed to activate service 'org.blue
Oct 31 23:56:45 localhost.localdomain dbus[756]: [system] Failed to activate service 'org.bluez': timed out
Oct 31 23:56:45 localhost.localdomain pulseaudio[3031]: [pulseaudio] bluez5-util.c: GetManagedObjects() failed: 
Oct 31 23:57:11 localhost.localdomain realmd[2739]: quitting realmd service after timeout
Oct 31 23:57:11 localhost.localdomain realmd[2739]: stopping service
Oct 31 23:57:12 localhost.localdomain polkitd[782]: Registered Authentication Agent for unix-process:3605:11281 
Oct 31 23:57:12 localhost.localdomain systemd[1]: Configuration file /etc/systemd/system/mpss.service is marked 
Oct 31 23:57:12 localhost.localdomain systemd[1]: Reloading.
Oct 31 23:57:12 localhost.localdomain systemd-sysv-generator[3623]: Configuration file /etc/systemd/system/mpss.
Oct 31 23:57:12 localhost.localdomain systemd[1]: Configuration file /etc/systemd/system/mpss.service is marked 
Oct 31 23:57:12 localhost.localdomain polkitd[782]: Unregistered Authentication Agent for unix-process:3605:1128
Oct 31 23:57:13 localhost.localdomain chronyd[755]: Selected source 220.247.242.85
Oct 31 23:57:30 localhost.localdomain polkitd[782]: Registered Authentication Agent for unix-process:3640:13150 
Oct 31 23:57:30 localhost.localdomain systemd[1]: Starting Intel(R) MPSS control service...
-- Subject: Unit mpss.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit mpss.service has begun starting up.
Oct 31 23:57:30 localhost.localdomain mpss[3647]: Loading MIC module: [  OK  ]
Oct 31 23:57:30 localhost.localdomain systemd[1]: mpss.service: control process exited, code=exited status=6
Oct 31 23:57:30 localhost.localdomain systemd[1]: Failed to start Intel(R) MPSS control service.
-- Subject: Unit mpss.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit mpss.service has failed.
-- 
-- The result is failed.
Oct 31 23:57:30 localhost.localdomain systemd[1]: Unit mpss.service entered failed state.
Oct 31 23:57:30 localhost.localdomain systemd[1]: mpss.service failed.
Oct 31 23:57:30 localhost.localdomain polkitd[782]: Unregistered Authentication Agent for unix-process:3640:1315
Oct 31 23:57:37 localhost.localdomain sudo[3673]:     root : TTY=pts/0 ; PWD=/home/indula ; USER=root ; COMMAND=
Oct 31 23:58:46 localhost.localdomain firefox.desktop[3737]: ATTENTION: default value of option force_s3tc_enabl
Nov 01 00:00:00 localhost.localdomain gnome-session[2803]: (evolution-alarm-notify:3236): evolution-alarm-notify
Nov 01 00:00:01 localhost.localdomain systemd[1]: Created slice user-0.slice.
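
The `journalctl -xe` output above is cut off at the terminal width and mixed with unrelated services. A narrower query may be easier to read; `-u`, `-b` and `--no-pager` are standard journalctl options. The `mpss_lines` filter below is a hypothetical helper for a log that has already been saved to a file, shown here on three lines taken from the output above.

```shell
# On the host, show only this boot's messages for the mpss unit, untruncated:
#   journalctl -u mpss.service -b --no-pager
#   systemctl status mpss.service -l --no-pager

# Hypothetical helper: keep only the MPSS-related lines from a log on stdin.
mpss_lines() {
    grep -E ' mpss\[[0-9]+\]: |mpss\.service'
}

printf '%s\n' \
  'Oct 31 23:57:13 localhost.localdomain chronyd[755]: Selected source 220.247.242.85' \
  'Oct 31 23:57:30 localhost.localdomain mpss[3647]: Loading MIC module: [  OK  ]' \
  'Oct 31 23:57:30 localhost.localdomain systemd[1]: mpss.service: control process exited, code=exited status=6' \
  | mpss_lines
```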

Then I ran the following, and suddenly mpss seemed to be running (I don't know the reason or what it means):

[root@localhost indula]# sudo mpssd

then checked the mpss status,

[root@localhost indula]# service mpss status
mpss is running

then I tried to stop the mpss service so I can go with systemctl to start it and see if it works,

[root@localhost mpss-3.8.3]# service mpss stop
Stopping mpss (via systemctl):                             [  OK  ]

Then I ran mpss status to see if it really stopped,

[root@localhost mpss-3.8.3]# service mpss status
mpss is running

It seems it's still running. I don't know why.
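
One possible explanation, not confirmed in this thread: `sudo mpssd` starts the daemon directly, outside of systemd, so `service mpss stop` (which is redirected to systemctl) has nothing it knows how to stop, while the status check still finds the process. A minimal sketch for comparing the two views; `diagnose_mpssd` is a made-up helper and the PID in the example call is just a placeholder value.

```shell
# Hypothetical helper: compare what systemd reports with what is running.
#   $1 = output of `systemctl is-active mpss`
#   $2 = PID(s) printed by `pgrep -x mpssd` (empty if none)
diagnose_mpssd() {
    local unit_state="$1" pids="$2"
    if [ -n "$pids" ] && [ "$unit_state" != "active" ]; then
        echo "mpssd (pid $pids) is running but the mpss unit is $unit_state"
    elif [ -n "$pids" ]; then
        echo "mpssd is running and the mpss unit is active"
    else
        echo "mpssd is not running"
    fi
}

# On the host this would be called as:
#   diagnose_mpssd "$(systemctl is-active mpss)" "$(pgrep -x mpssd)"
diagnose_mpssd failed 3701   # 3701 is a placeholder PID
```

If the first case is reported, the manually started daemon would have to be stopped by PID (for example with `kill`) before starting it through systemctl again.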

Since mpss was running, I tried to boot mic0 and got the following error, the one in the title of this post:

[root@localhost indula]# micctrl -b
  [Error] mic0: Boot aborted - no configuation file present

I also tried to reset it,

[root@localhost mpss-3.8.3]# micctrl -rw
  [Error] mic0 Reset failed - card state ready
          mic0: ready

Then I performed the following tests,

miccheck

[root@localhost mpss-3.8.3]# miccheck
MicCheck 3.8.4-1
Copyright (c) 2016, Intel Corporation.

Executing default tests for host
  Test 0: Check number of devices the OS sees in the system ... pass
  Test 1: Check mic driver is loaded ... pass
  Test 2: Check number of devices driver sees in the system ... pass
  Test 3: Check mpssd daemon is running ... pass
Executing default tests for device: 0
  Test 4 (mic0): Check device is in online state and its postcode is FF ... fail
    device is not online: ready

Status: FAIL
Failure: A device test failed

micinfo doesn't seem to be showing anything. I think it's because the device is not online.

[root@localhost mpss-3.8.3]# micinfo
MicInfo Utility Log
Created Thu Nov  1 00:48:02 2018


	System Info
		HOST OS			: Linux
		OS Version		: 3.10.0-514.el7.x86_64
		Driver Version		: 3.8.4-1
		MPSS Version		: 3.8.4

		Host Physical Memory	: 3649 MB

Device No: 0, Device Name: mic0

	Version
		Flash Version 		 : NotAvailable
		SMC Firmware Version	 : NotAvailable
		SMC Boot Loader Version	 : NotAvailable
		Coprocessor OS Version 	 : NotAvailable
		Device Serial Number 	 : NotAvailable

	Board
		Vendor ID 		 : 0x8086
		Device ID 		 : 0x225d
		Subsystem ID 		 : 0x3608
		Coprocessor Stepping ID	 : 2
		PCIe Width 		 : x16
		PCIe Speed 		 : 5 GT/s
		PCIe Max payload size	 : 256 bytes
		PCIe Max read req size	 : 512 bytes
		Coprocessor Model	 : 0x01
		Coprocessor Model Ext	 : 0x00
		Coprocessor Type	 : 0x00
		Coprocessor Family	 : 0x0b
		Coprocessor Family Ext	 : 0x00
		Coprocessor Stepping 	 : C0
		Board SKU 		 : C0PRQ-3120/3140 P/A
		ECC Mode 		 : NotAvailable
		SMC HW Revision 	 : NotAvailable

	Cores
		Total No of Active Cores : NotAvailable
		Voltage 		 : NotAvailable
		Frequency 		 : NotAvailable

	Thermal
		Fan Speed Control 	 : NotAvailable
		Fan RPM 		 : NotAvailable
		Fan PWM 		 : NotAvailable
		Die Temp		 : NotAvailable

	GDDR
		GDDR Vendor		 : NotAvailable
		GDDR Version		 : NotAvailable
		GDDR Density		 : NotAvailable
		GDDR Size		 : NotAvailable
		GDDR Technology		 : NotAvailable
		GDDR Speed		 : NotAvailable
		GDDR Frequency		 : NotAvailable
		GDDR Voltage		 : NotAvailable

Then I ran lspci to see if the system recognizes the card. The results are as follows.

[root@localhost mpss-3.8.3]# lspci | grep -i Co-processor
01:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 3120 series (rev 20)
[root@localhost mpss-3.8.3]# lspci -s 01:00.0 -vv
01:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 3120 series (rev 20)
	Subsystem: Intel Corporation Device 3608
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at 2c00000000 (64-bit, prefetchable) [size=8G]
	Region 4: Memory at f7100000 (64-bit, non-prefetchable) [size=128K]
	Capabilities: [44] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [4c] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 75.000W
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <4us, L1 unlimited
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [88] MSI: Enable- Count=1/16 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [98] MSI-X: Enable+ Count=16 Masked-
		Vector table: BAR=4 offset=00017000
		PBA: BAR=4 offset=00018000
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
	Kernel driver in use: mic

I also tried the same procedure with mpss-3.8.4 on CentOS 7.3.

I do not know what to do next. Can you please help me solve this problem?

1 Reply
Munasinghe__Indula

Hi everyone, 

Eventually I found a solution!

The problem was that the Xeon Phi was not getting the initial configuration files it needs to boot. One such file is default.conf, which is common to all coprocessors. These files can be created with the micctrl tool if they do not already exist: the command micctrl --initdefaults creates the MPSS-specific configuration files. It's therefore essential to run micctrl --initdefaults before the mpssd daemon starts.

The error that previously occurred when running service mpss start also goes away once the configuration files have been initialized. sudo mpssd can start the mpss daemon as well, so running micctrl --initdefaults before sudo mpssd solves the problem too.

You can read more about the configuration files in mpss_users_guide, in the section Configuring and booting the coprocessor OS.
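
The fix can be sketched as a small guard that only runs micctrl --initdefaults when default.conf is missing. This is a sketch under assumptions: the /etc/mpss location is assumed rather than taken from this post, and the function names are made up.

```shell
# Sketch: create the MPSS config files only if they are missing.
# CONF_DIR=/etc/mpss is an assumption about where MPSS keeps them.
CONF_DIR="${CONF_DIR:-/etc/mpss}"

# True (exit 0) when the given directory has no default.conf.
config_missing() {
    [ ! -f "$1/default.conf" ]
}

# Run micctrl --initdefaults only when needed (must be run as root).
ensure_config() {
    if config_missing "$CONF_DIR"; then
        echo "no default.conf in $CONF_DIR, running micctrl --initdefaults"
        micctrl --initdefaults
    else
        echo "default.conf already present in $CONF_DIR"
    fi
}

# Order used in this post, on the host as root:
#   ensure_config, then start the daemon (sudo mpssd or service mpss start),
#   then micctrl -b if the card does not start booting on its own.
```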

Here's what I did ultimately. 

[root@localhost indula]# service mpss status
mpss is stopped

[root@localhost indula]# miccheck
MicCheck 3.8.3-1
Copyright (c) 2016, Intel Corporation.

Executing default tests for host
  Test 0: Check number of devices the OS sees in the system ... pass
  Test 1: Check mic driver is loaded ... pass
  Test 2: Check number of devices driver sees in the system ... pass
  Test 3: Check mpssd daemon is running ... fail
    mpssd daemon not running

Status: FAIL
Failure: mpssd daemon not running

[root@localhost indula]# micctrl --initdefaults

[root@localhost indula]# micctrl -s
mic0: ready

[root@localhost indula]# micctrl -b
  [Error] Cannot boot cards - mpssd daemon is not running

[root@localhost indula]# sudo mpssd

[root@localhost indula]# miccheck
MicCheck 3.8.3-1
Copyright (c) 2016, Intel Corporation.

Executing default tests for host
  Test 0: Check number of devices the OS sees in the system ... pass
  Test 1: Check mic driver is loaded ... pass
  Test 2: Check number of devices driver sees in the system ... pass
  Test 3: Check mpssd daemon is running ... pass
Executing default tests for device: 0
  Test 4 (mic0): Check device is in online state and its postcode is FF ... fail
    device is not online: booting

Status: FAIL
Failure: A device test failed

[root@localhost indula]# micctrl -b
          mic0: booting /usr/share/mpss/boot/bzImage-knightscorner
  [Error] mic0 Boot failed - card state booting

[root@localhost indula]# micctrl -s
mic0: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)

[root@localhost indula]# micinfo
MicInfo Utility Log
Created Thu Nov  1 07:23:02 2018


    System Info
        HOST OS            : Linux
        OS Version        : 3.10.0-514.el7.x86_64
        Driver Version        : 3.8.3-1
        MPSS Version        : 3.8.3

        Host Physical Memory    : 3649 MB

Device No: 0, Device Name: mic0

    Version
        Flash Version          : 2.1.02.0391
        SMC Firmware Version     : 1.17.6900
        SMC Boot Loader Version     : 1.8.4326
        Coprocessor OS Version      : 2.6.38.8+mpss3.8.3
        Device Serial Number      : ADKC51100256

    Board
        Vendor ID          : 0x8086
        Device ID          : 0x225d
        Subsystem ID          : 0x3608
        Coprocessor Stepping ID     : 2
        PCIe Width          : x16
        PCIe Speed          : 5 GT/s
        PCIe Max payload size     : 256 bytes
        PCIe Max read req size     : 512 bytes
        Coprocessor Model     : 0x01
        Coprocessor Model Ext     : 0x00
        Coprocessor Type     : 0x00
        Coprocessor Family     : 0x0b
        Coprocessor Family Ext     : 0x00
        Coprocessor Stepping      : C0
        Board SKU          : C0PRQ-3120/3140 P/A
        ECC Mode          : Enabled
        SMC HW Revision      : Product 300W Active CS

    Cores
        Total No of Active Cores : 57
        Voltage          : 0 uV
        Frequency         : 1100000 kHz

    Thermal
        Fan Speed Control      : On
        Fan RPM          : 1200
        Fan PWM          : 20
        Die Temp         : 53 C

    GDDR
        GDDR Vendor         : Elpida
        GDDR Version         : 0x1
        GDDR Density         : 2048 Mb
        GDDR Size         : 5952 MB
        GDDR Technology         : GDDR5 
        GDDR Speed         : 5.000000 GT/s 
        GDDR Frequency         : 2500000 kHz
        GDDR Voltage         : 1501000 uV
[root@localhost indula]# miccheck
MicCheck 3.8.3-1
Copyright (c) 2016, Intel Corporation.

Executing default tests for host
  Test 0: Check number of devices the OS sees in the system ... pass
  Test 1: Check mic driver is loaded ... pass
  Test 2: Check number of devices driver sees in the system ... pass
  Test 3: Check mpssd daemon is running ... pass
Executing default tests for device: 0
  Test 4 (mic0): Check device is in online state and its postcode is FF ... pass
  Test 5 (mic0): Check ras daemon is available in device ... pass
  Test 6 (mic0): Check running flash version is correct ... pass
  Test 7 (mic0): Check running SMC firmware version is correct ... pass

Status: OK

Then micsmc-gui showed the coprocessor live, in an idle state.

xeon_idling.png
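
In the transcript above, micctrl -b reported "Boot failed - card state booting" because the card was already booting, and a later micctrl -s showed it online. A minimal sketch for waiting on that transition by polling micctrl -s; the helper names and the 60 x 2 second budget are arbitrary choices, not documented values.

```shell
# Extract the state word from one line of `micctrl -s` output,
# e.g. "mic0: online (mode: linux image: ...)" -> "online".
mic_state() {
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# Poll `micctrl -s` until mic0 reports online, or give up after N tries.
wait_online() {
    local tries="${1:-60}" line
    while [ "$tries" -gt 0 ]; do
        line="$(micctrl -s | grep '^mic0:')"
        [ "$(mic_state "$line")" = "online" ] && return 0
        sleep 2
        tries=$((tries - 1))
    done
    return 1
}

# Parsing check on the two status lines seen in this post:
mic_state "mic0: online (mode: linux image: /usr/share/mpss/boot/bzImage-knightscorner)"
mic_state "mic0: ready"
```

On the host, `wait_online && miccheck` would then run the checks only once the card is up.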

 
