
Error codes for xeon phi

Xiaoxin_T_
Beginner
1,083 Views

I get the following error when running some of the sample code: "offload error: cannot start process on the device 0 (error code 9)"

Does anyone know how to solve this problem? Is there a way to find out the meaning of different error codes?

 

7 Replies
Loc_N_Intel
Employee

Hi Xiaoxin,

It seems you need to start the MPSS stack before you can run your offload code:

% service mpss start

For the meaning of all error codes, let me do some research and get back to you. Thanks.

Loc_N_Intel
Employee

I found the list of run-time error messages here: http://software.intel.com/en-us/node/464618. In your case, it seems your coprocessor is not available yet, so you need to bring it up. Thank you.

Xiaoxin_T_
Beginner

Thanks for your response :-)

But both "service mpss start" and "micctrl -w" show that the device is online. Does that mean the service has been started?

Loc_N_Intel
Employee

Hi Xiaoxin,

That is interesting. Could you tell me the MPSS version, the compiler version, and where you got the sample code? I will try to reproduce the problem. Thank you.

Xiaoxin_T_
Beginner

[xiaoxin.tang@compute-phi-d1-002-p ~]$ micinfo
MicInfo Utility Log

Created Wed Nov 6 14:34:21 2013


System Info
HOST OS : Linux
OS Version : 2.6.32-279.el6.x86_64
Driver Version : 6720-15
MPSS Version : 2.1.6720-15
Host Physical Memory : 132132 MB

Device No: 0, Device Name: mic0

Version
Flash Version : 2.1.03.0386
SMC Firmware Version : 1.15.4830
SMC Boot Loader Version : 1.8.4326
uOS Version : 2.6.38.8-g2593b11
Device Serial Number : ADKC24200590

Board
Vendor ID : 0x8086
Device ID : 0x2250
Subsystem ID : 0x2500
Coprocessor Stepping ID : 3
PCIe Width : Insufficient Privileges
PCIe Speed : Insufficient Privileges
PCIe Max payload size : Insufficient Privileges
PCIe Max read req size : Insufficient Privileges
Coprocessor Model : 0x01
Coprocessor Model Ext : 0x00
Coprocessor Type : 0x00
Coprocessor Family : 0x0b
Coprocessor Family Ext : 0x00
Coprocessor Stepping : B1
Board SKU : B1PRQ-5110P
ECC Mode : Enabled
SMC HW Revision : Product 225W Passive CS

Cores
Total No of Active Cores : 60
Voltage : 1010000 uV
Frequency : 1052631 kHz

Thermal
Fan Speed Control : N/A
Fan RPM : N/A
Fan PWM : N/A
Die Temp : 61 C

GDDR
GDDR Vendor : Elpida
GDDR Version : 0x1
GDDR Density : 2048 Mb
GDDR Size : 7936 MB
GDDR Technology : GDDR5
GDDR Speed : 5.000000 GT/s
GDDR Frequency : 2500000 kHz
GDDR Voltage : 1501000 uV

[xiaoxin.tang@compute-phi-d1-002-p ~]$ icc --version
icc (ICC) 13.0.1 20121010
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.

The sample code ships with the compiler (Samples/en_US/C++/mic_samples). All of the samples failed with the same error.

Regards

Xiaoxin

Loc_N_Intel
Employee

Your MPSS 2.1 is current but your compiler is a bit old. I compiled and ran all the MIC sample code in ../Samples/en_US/C++/mic_samples without any problem. I am not sure what causes the COI timeout in your system.

The PCIe information for your system (shown in micinfo) is unusual:

PCIe Width : Insufficient Privileges
PCIe Speed : Insufficient Privileges
PCIe Max payload size : Insufficient Privileges
PCIe Max read req size : Insufficient Privileges

- Can you try to reflash your coprocessor? (micflash)
- If possible, get and install the new version of the Intel Composer 2013 SP1
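
To spot the fields flagged above without reading the whole micinfo log by eye, a small filter can help. This is only a minimal sketch: the function name restricted_fields is hypothetical, and it assumes micinfo prints one "Name : Value" pair per line, as in the log posted in this thread. It says nothing about why the values are hidden; the wording only suggests that micinfo was run without enough privileges.

```python
def restricted_fields(micinfo_text, marker="Insufficient Privileges"):
    """Return the names of micinfo fields whose value equals the marker."""
    fields = []
    for line in micinfo_text.splitlines():
        # micinfo lines in this thread look like "PCIe Width : <value>"
        name, sep, value = line.partition(" : ")
        if sep and value.strip() == marker:
            fields.append(name.strip())
    return fields


# Excerpt copied from the micinfo output posted earlier in this thread.
sample = """\
Coprocessor Stepping ID : 3
PCIe Width : Insufficient Privileges
PCIe Speed : Insufficient Privileges
PCIe Max payload size : Insufficient Privileges
PCIe Max read req size : Insufficient Privileges
Coprocessor Model : 0x01
"""

print(restricted_fields(sample))
```

The same function can be fed the text of a saved micinfo log to compare runs made under different user accounts.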

By the way, I learned that the error codes are defined in /opt/intel/mic/coi/include/COIResult_common.h (for MPSS 2.1) and in /usr/include/intel-coi/common/COIResult_common.h (for MPSS 3.1), but I am not sure how this COI timeout (COI_TIME_OUT_REACHED) problem occurred.
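
To turn a numeric error code into a name, one option is to parse the enum in that header. The following is a rough sketch, not an official tool: parse_enum is a hypothetical helper, it assumes the codes are declared as a plain C enum with optional integer-literal initializers, and the demo header uses made-up DEMO_ names rather than the real COI definitions.

```python
import re


def parse_enum(header_text):
    """Map numeric values to enumerator names for the first enum found."""
    # Strip C block comments and C++ line comments before parsing.
    text = re.sub(r"/\*.*?\*/", "", header_text, flags=re.S)
    text = re.sub(r"//[^\n]*", "", text)
    match = re.search(r"enum\b[^{;]*\{(.*?)\}", text, flags=re.S)
    if match is None:
        return {}
    codes = {}
    next_value = 0
    for entry in match.group(1).split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, _, explicit = entry.partition("=")
        if explicit.strip():
            # Only integer literals (decimal or 0x hex) are handled here.
            next_value = int(explicit.strip(), 0)
        codes[next_value] = name.strip()
        next_value += 1
    return codes


# Hypothetical header text, NOT the real COIResult_common.h contents.
demo_header = """
typedef enum DEMO_RESULT {
    DEMO_OK = 0,      /* made-up names for illustration */
    DEMO_FAILED,
    DEMO_TIMED_OUT,   // implicit value: previous + 1
    DEMO_LAST = 0x10
} DEMO_RESULT;
"""

codes = parse_enum(demo_header)
print(codes[2])
```

On a machine with MPSS installed, the text of COIResult_common.h (read from one of the paths above) could be passed in instead of demo_header and the reported code looked up in the result. Based on this thread, code 9 should map to COI_TIME_OUT_REACHED, though that relies on the offload error code being the raw COI result value.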

Xiaoxin_T_
Beginner

Thanks loc-nguyen :-)

I ran into the problem while using a public machine. After setting up the Xeon Phi environment and installing the latest MPSS on my own machine, the problem has not appeared again.
