
Xeon Phi not quite working

Emo_T_
Beginner
472 Views

I am trying to install a Xeon Phi 3120A in a custom machine with ASUS P9X79-E WS motherboard (compatible with the Xeon Phi according to ASUS), i7-3930K processor and 1000W power supply. I am having similar issues with both Windows 7 (beta drivers) and CentOS 6.4 (latest drivers as well as N-2 version). The machine boots and finds the Xeon Phi. The initial commands work, but everything afterwards causes the machine to freeze: the screen is still on, but no mouse, no keyboard, no Ctrl+Alt+Del.

micctrl --initdefaults

    [Warning] mic0: User 'root' does not have either rsa or dsa keys created
    [Warning] mic0: User 'todorov' does not have either rsa or dsa keys created

micctrl -s

    mic0: ready

micinfo

    MicInfo Utility Log
    Created Sat Aug 3 05:46:49 2013

    System Info
        HOST OS : Linux
        OS Version : 2.6.32-358.el6.x86_64
        Driver Version : 6720-13
        MPSS Version : 2.1.6720-13
        Host Physical Memory : 32959 MB

    Device No: 0, Device Name: mic0

    Version
        Flash Version : NotAvailable
        SMC Firmware Version : NotAvailable
        SMC Boot Loader Version : NotAvailable
        uOS Version : NotAvailable
        Device Serial Number : NotAvailable

    Board
        Vendor ID : 0x8086
        Device ID : 0x225d
        Subsystem ID : 0x3608
        Coprocessor Stepping ID : 2
        PCIe Width : x16
        PCIe Speed : 5 GT/s
        PCIe Max payload size : 256 bytes
        PCIe Max read req size : 512 bytes
        Coprocessor Model : 0x01
        Coprocessor Model Ext : 0x00
        Coprocessor Type : 0x00
        Coprocessor Family : 0x0b
        Coprocessor Family Ext : 0x00
        Coprocessor Stepping : C0
        Board SKU : NotAvailable
        ECC Mode : NotAvailable
        SMC HW Revision : NotAvailable

    Cores
        Total No of Active Cores : NotAvailable
        Voltage : NotAvailable
        Frequency : NotAvailable

    Thermal
        Fan Speed Control : NotAvailable
        Fan RPM : NotAvailable
        Fan PWM : NotAvailable
        Die Temp : NotAvailable

    GDDR
        GDDR Vendor : NotAvailable
        GDDR Version : NotAvailable
        GDDR Density : NotAvailable
        GDDR Size : NotAvailable
        GDDR Technology : NotAvailable
        GDDR Speed : NotAvailable
        GDDR Frequency : NotAvailable
        GDDR Voltage : NotAvailable

miccheck

    miccheck 2.1.6720-13, created 14:49:30 Apr 30 2013
    Copyright 2011-2013 Intel Corporation All rights reserved

    Test 1 Ensure installation matches manifest : OK
    Test 2 Ensure host driver is loaded : OK
    Test 3 Ensure driver matches manifest : OK
    Test 4 Detect all listed devices : OK
    MIC 0 Test 1 Find the device : OK
    MIC 0 Test 2 Check the POST code via PCI : FAILED
    MIC 0 Test 2> Current POST code is 12 (not FF) for MIC 0
    MIC 0 Test 3 Connect to the device : SKIPPED
    MIC 0 Test 3> Prerequisite 'Ensure the device is online' failed:
    MIC 0 Test 3> The device is not online
    MIC 0 Test 4 Check for normal mode : SKIPPED
    MIC 0 Test 4> Prerequisite 'Ensure the device is online' failed:
    MIC 0 Test 4> The device is not online
    MIC 0 Test 5 Check the POST code via SCIF : SKIPPED
    MIC 0 Test 5> Prerequisite 'Ensure the device is online' failed:
    MIC 0 Test 5> The device is not online
    MIC 0 Test 6 Send data to the device : SKIPPED
    MIC 0 Test 6> Prerequisite 'Check for normal mode' failed:
    MIC 0 Test 6> The device is not in normal mode
    MIC 0 Test 7 Compare the PCI configuration : OK
    MIC 0 Test 8 Ensure Flash version matches manifest : SKIPPED
    MIC 0 Test 8> Prerequisite 'Check for normal mode' failed:
    MIC 0 Test 8> The device is not in normal mode

    Status: The POST code was not "FF"

micsmc: shows the gui and complains about errors. When I try to reconnect, it says:

    Warning: mic0: Device connection lost!
    Information: mic0: Device connection restored.
    Warning: mic0: Device connection lost!
    (and so on)

The following commands freeze the system completely:

    micflash -getversion
    micflash -update
    service mpss start

Any idea what is going on?

Thanks,
Emo Todorov
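For a long miccheck log like the one in this post, it can help to pull out only the failing tests and the reported POST code. The following is a minimal Python sketch, not an Intel tool. It assumes the output has been saved with its original line breaks and follows the line layout quoted in the post for miccheck 2.1.6720-13; the names `summarize`, `RESULT_RE` and `POST_RE` are made up for this illustration, and other MPSS versions may format their output differently.

```python
import re

# Excerpt of the miccheck output quoted in the post.
SAMPLE = """\
Test 4 Detect all listed devices : OK
MIC 0 Test 1 Find the device : OK
MIC 0 Test 2 Check the POST code via PCI : FAILED
MIC 0 Test 2> Current POST code is 12 (not FF) for MIC 0
MIC 0 Test 3 Connect to the device : SKIPPED
MIC 0 Test 3> Prerequisite 'Ensure the device is online' failed:
MIC 0 Test 3> The device is not online
MIC 0 Test 7 Compare the PCI configuration : OK
Status: The POST code was not "FF"
"""

# Assumed layout of a result line: "[MIC <n> ]Test <k> <name> : <status>".
RESULT_RE = re.compile(r"^(?:MIC (\d+) )?Test (\d+) (.+?) : (OK|FAILED|SKIPPED)$")
# Assumed layout of the detail line that carries the POST code.
POST_RE = re.compile(r"Current POST code is ([0-9A-Fa-f]{2})")


def summarize(text):
    """Collect FAILED and SKIPPED tests and the POST code from miccheck output."""
    results = []
    post_code = None
    for raw in text.splitlines():
        line = raw.strip()
        match = RESULT_RE.match(line)
        if match:
            device, number, name, status = match.groups()
            results.append((device, int(number), name, status))
            continue
        match = POST_RE.search(line)
        if match:
            post_code = match.group(1).upper()
    return {
        "failed": [r for r in results if r[3] == "FAILED"],
        "skipped": [r for r in results if r[3] == "SKIPPED"],
        "post_code": post_code,
    }


if __name__ == "__main__":
    summary = summarize(SAMPLE)
    print("POST code:", summary["post_code"])
    for device, number, name, status in summary["failed"]:
        print(f"MIC {device} Test {number} {status}: {name}")
```

On the excerpt above this reports a single failed test (MIC 0 Test 2) and POST code 12; the skipped tests all follow from the card not being online.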

4 Replies
Emo_T_
Beginner

Sorry about the mess in the previous post, the newlines somehow became spaces... the original question is now attached as a text file.

Another thought: I read somewhere that if anything goes wrong with flashing the BIOS, the device may become unusable.  Does anyone know what exactly "unusable" means, and in particular could my device be in this state?  I never got to a point where I could actually flash the BIOS (my entire system freezes when I try to run the micflash program), but perhaps it came in this state...  although a well-written Intel driver should not freeze the host just because a PCIe device is misbehaving :)

Emo_T_
Beginner

Here is one more hypothesis.  I am using a motherboard that supports both Xeon and i7, but my processor is i7.  Could it be that some of the MPSS software is specifically compiled for Xeon and uses instructions that do not exist on i7... are there such instructions anyway?

Emo_T_
Beginner

Problem solved!  I found another thread on this forum where someone had the same problem, and the solution was to downgrade the BIOS.  For the record, Xeon Phi 3120A works with ASUS P9X79-E WS, BIOS 0211, i7-3930K, CentOS 6.4 (original kernel), latest MPSS (no recompile).

Vitaly_V__K_
Beginner

miccheck works normally only after the MIC has been booted.

You'll have to execute "micctrl -b" first; after that, "micctrl -s" will show the "online" state.
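The boot-then-check sequence described in this reply could be scripted roughly as follows. This is only a sketch under assumptions: it expects `micctrl -s` to print lines of the form `mic0: <state>` (as in the `mic0: ready` output quoted in the first post), and the helper names `parse_state`, `run_micctrl_status` and `wait_for_online` are invented for this example. It does not run `micctrl -b` itself.

```python
import subprocess
import time


def parse_state(output, device="mic0"):
    """Return the first word of the state reported for `device`, or None."""
    for line in output.splitlines():
        name, sep, state = line.partition(":")
        if sep and name.strip() == device:
            words = state.split()
            return words[0] if words else None
    return None


def run_micctrl_status():
    """Run `micctrl -s` and return its standard output as text."""
    completed = subprocess.run(
        ["micctrl", "-s"], capture_output=True, text=True, check=False
    )
    return completed.stdout


def wait_for_online(get_status=run_micctrl_status, device="mic0",
                    timeout=120.0, interval=2.0, sleep=time.sleep):
    """Poll the status until `device` reports "online" or the timeout elapses.

    `get_status` and `sleep` are injectable so the loop can be exercised
    without a coprocessor installed. The timeout and interval defaults are
    arbitrary choices for this sketch.
    """
    waited = 0.0
    while True:
        if parse_state(get_status(), device) == "online":
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    # Run "micctrl -b" beforehand, as described in the reply.
    print("online" if wait_for_online() else "mic0 did not come online")
```

Once the card reports "online", miccheck can be run again to see whether the POST code test passes.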
