Could not figure out if MIC drivers are properly installed

kinnair__andy
Beginner
1,861 Views

Hi,

I have installed Intel® C++ Studio XE 2013 for Linux and MPSS for Red Hat version 6.3 on a CentOS 6.4 operating system. When I ran miccheck, I got the following error:

miccheck 2.1.6720-13, created 18:06:51 Apr 30 2013
Copyright 2011-2013 Intel Corporation  All rights reserved

Test 1 Ensure installation matches manifest : FAILED
Test 2 Ensure host driver is loaded         : OK
Test 3 Ensure driver matches manifest       : FAILED
Test 3> Driver version mismatch: Manifest='', Live='6720-13'
Test 4 Detect all listed devices            : OK
MIC 0 Test 1 Find the device                       : OK
MIC 0 Test 2 Check the POST code via PCI           : OK
MIC 0 Test 3 Connect to the device                 : OK
MIC 0 Test 4 Check for normal mode                 : OK
MIC 0 Test 5 Check the POST code via SCIF          : OK
MIC 0 Test 6 Send data to the device               : OK
MIC 0 Test 7 Compare the PCI configuration         : OK
MIC 0 Test 8 Ensure Flash version matches manifest : FAILED
MIC 0 Test 8> Flash version mismatch. Manifest: , Running: 2.1.02.0386
Status: Test failed
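
A minimal sketch for picking the failing checks out of miccheck output like the above. The line format is inferred only from the output quoted in this thread, and the names RESULT_LINE and failed_tests are hypothetical; other miccheck versions may print differently.

```python
import re

# Result lines look like (format inferred from the output quoted in this thread):
#   "Test 1 Ensure installation matches manifest : FAILED"
#   "MIC 0 Test 8 Ensure Flash version matches manifest : FAILED"
# Detail lines such as "Test 3> Driver version mismatch ..." do not match.
RESULT_LINE = re.compile(
    r"^(?:MIC (?P<mic>\d+) )?Test (?P<num>\d+) (?P<name>.+?)\s*: "
    r"(?P<status>OK|FAILED|SKIPPED)\s*$"
)


def failed_tests(output):
    """Return (scope, test number, test name) for every FAILED result line."""
    failures = []
    for line in output.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match and match.group("status") == "FAILED":
            mic = match.group("mic")
            scope = "host" if mic is None else "mic" + mic
            failures.append((scope, int(match.group("num")), match.group("name")))
    return failures


SAMPLE = """\
Test 1 Ensure installation matches manifest : FAILED
Test 2 Ensure host driver is loaded         : OK
Test 3 Ensure driver matches manifest       : FAILED
Test 3> Driver version mismatch: Manifest='', Live='6720-13'
MIC 0 Test 8 Ensure Flash version matches manifest : FAILED
MIC 0 Test 8> Flash version mismatch. Manifest: , Running: 2.1.02.0386
"""

if __name__ == "__main__":
    for scope, num, name in failed_tests(SAMPLE):
        print(f"{scope}: test {num} ({name})")
```

Here all three failures are manifest comparisons, while the device-level checks (find, connect, send data) pass.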

How do I ensure that the installation matches the manifest? How can I be sure that my MIC is working properly?

Thanks in advance

17 Replies
kinnair__andy
Beginner

Hi, sorry, a correction: the CentOS version is 6.3.

James_T_Intel
Moderator

Hi Anirudha,

I'm going to move this thread to the Many Integrated Core Architecture forum.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

TimP
Honored Contributor III

In the MPSS distribution, docs/readme... (same text as the public readme on the web) has a section about updating the flash. You must run 'service mpss stop' (that's mentioned elsewhere in the readme) before you look for a ready response from micctrl and before you start the flash update.

In my experience, the coprocessor could run without the update.   

Various other issues can arise, such as the IP addresses changing without warning after MPSS installation.

Luckily, the KNC documents posted online have begun to show up in web searches like "yourtopic site:software.intel.com".

Frances_R_Intel
Employee

As Tim says, the complete directions for installing the MPSS are in the readme file in the doc directory of your MPSS files. (You can also find a link to it on http://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss.) Unfortunately, the directions need to take into account a number of possible contingencies which can make them difficult to follow.

I know it is a hassle, but I would suggest you do a complete reinstall of the MPSS. It shouldn't be necessary to reinstall the Composer as well.

For the current release: First do a complete uninstall of the MPSS (section 2.3 of the readme). Then reinstall the MPSS (section 2.2). Do not skip the initdefaults step! After step 3, before you go on to step 4, do 'service mpss stop'. Then execute step 4, which sends you to section 7.2.

When you are following the directions in section 7.2, be careful to check the ready state of the coprocessor before each flash --update command, be sure to reboot each time it says to, and be sure you execute both flash update commands (once for the flash itself and once for the boot loader).

If you decide to reinstall the compilers and other tools, rerun the 'micctrl --resetconfig' command (not always necessary, but it can head off problems with VTune if you have installed it). Finally, do 'service mpss start' and see if things are OK.
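
The two preconditions stressed above (MPSS service stopped, coprocessor in the ready state) could be checked before each flash update with something like the sketch below. It is not part of the MPSS tools: the function name pre_flash_problems is hypothetical, the sysfs path is the one that appears in the micflash error message quoted in this thread, and treating the text "ready" in that file as the go-ahead is an assumption.

```python
# Path taken from the micflash error message in this thread; whether its
# content reads exactly "ready" on a given MPSS release is an assumption.
STATE_FILE = "/sys/class/mic/mic0/state"


def pre_flash_problems(mpss_running, state_text):
    """Return a list of reasons not to start a flash update (empty if none)."""
    problems = []
    if mpss_running:
        problems.append("mpss service is running; do 'service mpss stop' first")
    state = state_text.strip().lower()
    if state != "ready":
        problems.append(f"coprocessor state is {state!r}, expected 'ready'")
    return problems


if __name__ == "__main__":
    # Dry run with literal values; on a real host, state_text would be read
    # from STATE_FILE and mpss_running determined from the service status.
    for problem in pre_flash_problems(mpss_running=True, state_text="online\n"):
        print("do not flash yet:", problem)
```

Keeping the check separate from any command execution means it can be rerun after each reboot the readme asks for.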

kinnair__andy
Beginner

I did exactly as suggested. While updating the flash as per section 7.2 of the readme, I got the following error:

[root@hpc31 mpss_gold_update_3]# /opt/intel/mic/bin/micflash -update -device all
No image path specified - Searching: /opt/intel/mic/flash
mic0: Flash image: /opt/intel/mic/flash/EXT_HP2_B0_0386-02.rom.smc
micflash: mic0: Failed to switch to maintenance mode: write: /sys/class/mic/mic0/state: Input/output error

I made sure that I had executed 'service mpss stop' and that the MIC was in the ready state. I also realised that I had overlooked this error the last time I tried the installation, so it might be the cause of the entire issue.





Frances_R_Intel
Employee

Can you reboot the host and try again? If you have the mpss service set to start when the host reboots, disable that for now. I want to see if a host reboot will clean up the state and allow you to proceed. 

Frances_R_Intel
Employee

I meant try again from the flash update command, not to start over from the beginning. (If you thought I meant that, you were probably moaning to yourself in pain.)

kinnair__andy
Beginner

You mean I should run 'chkconfig mpss off', reboot the host system, and then try updating the flash again, right?

kinnair__andy
Beginner

I disabled starting the MPSS service at boot with: chkconfig mpss off
Then I rebooted the host system, checked that the MIC was 'ready', and ran the flash update command.
However, the same error persists.

kinnair__andy
Beginner

I ran the miccheck and got the following failure:

[root@hpc31 ~]# /opt/intel/mic/bin/miccheck

miccheck 2.1.6720-13, created 14:49:30 Apr 30 2013
Copyright 2011-2013 Intel Corporation  All rights reserved

Test 1 Ensure installation matches manifest : OK
Test 2 Ensure host driver is loaded         : OK
Test 3 Ensure driver matches manifest       : OK
Test 4 Detect all listed devices            : OK
MIC 0 Test 1 Find the device                       : OK
MIC 0 Test 2 Check the POST code via PCI           : FAILED
MIC 0 Test 2> Current POST code is 40 (not FF) for MIC 0
MIC 0 Test 3 Connect to the device                 : SKIPPED
MIC 0 Test 3> Prerequisite 'Ensure the device is online' failed:
MIC 0 Test 3>  The device is not online
MIC 0 Test 4 Check for normal mode                 : SKIPPED
MIC 0 Test 4> Prerequisite 'Ensure the device is online' failed:
MIC 0 Test 4>  The device is not online
MIC 0 Test 5 Check the POST code via SCIF          : SKIPPED
MIC 0 Test 5> Prerequisite 'Ensure the device is online' failed:
MIC 0 Test 5>  The device is not online
MIC 0 Test 6 Send data to the device               : SKIPPED
MIC 0 Test 6> Prerequisite 'Check for normal mode' failed:
MIC 0 Test 6>  The device is not in normal mode
MIC 0 Test 7 Compare the PCI configuration         : OK
MIC 0 Test 8 Ensure Flash version matches manifest : SKIPPED
MIC 0 Test 8> Prerequisite 'Check for normal mode' failed:
MIC 0 Test 8>  The device is not in normal mode
Status: The POST code was not "FF"

Could the POST issue be related to the maintenance mode problem? If so, what do I need to do to troubleshoot it?

Frances_R_Intel
Employee

The POST code of 40 means "Begin Coprocessor OS authentication"; the POST code of FF would have meant "Bootstrap finished execution". A POST code of 40 makes sense since the bootloader couldn't be updated. I need to talk to some people and get back to you.

kinnair__andy
Beginner

I was going through some of the threads on this forum and realized that the latest versions of MPSS and the host OS are preferred. So I have updated both of them, to MPSS 6.4 and CentOS 6.4. However, I am facing the same maintenance mode issue while flashing the MIC.

Andrey_Vladimirov
New Contributor III

Hi Anirudha,

Was this issue resolved? We are having a similar issue (different output in newer MPSS).

Andrey

Frances_R_Intel
Employee

Andrey,

You are getting POST code 40? Is that what you meant by having a similar issue? Were you upgrading the MPSS at the time? Is this associated with any other issues or problems you have been having?

Frances

Andrey_Vladimirov
New Contributor III

The common thing that we see is the message

"micflash: mic0: Failed to switch to maintenance mode: write: /sys/class/mic/mic0/state: Input/output error"

However, deeper digging showed that this may be a different issue. Our card fails with POST code F2 after attempting operations at POST code 30, so it does not even get to POST code 40. I will continue troubleshooting on IPS.

 

Frances_R_Intel
Employee

Are you installing a new MPSS on a previous system or is this a completely new system? I am wondering if there is actually a hardware problem.

There is a troubleshooting chart at https://software.intel.com/en-us/forums/topic/393956, but it doesn't seem to have any information about GDDR training problems, which is what your POST codes are showing:

"30" Begin memory training 

 

"F2" GDDR failed memory training 

I am going to need to get some guidance and get back to you.
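
For reference, the four POST codes quoted in this thread can be collected into a small lookup, sketched below. The table contains only the meanings given in the posts here and is not a complete POST code list; the names POST_CODES and describe_post_code are hypothetical, and the regular expression assumes the "Current POST code is XX" wording seen in the miccheck output above.

```python
import re

# Only the codes and meanings quoted in this thread; not a complete list.
POST_CODES = {
    "30": "Begin memory training",
    "40": "Begin Coprocessor OS authentication",
    "F2": "GDDR failed memory training",
    "FF": "Bootstrap finished execution",
}

# Wording assumed from the miccheck detail line quoted in this thread.
POST_LINE = re.compile(r"Current POST code is ([0-9A-Fa-f]{2})\b")


def describe_post_code(line):
    """Return (code, meaning) for a miccheck POST detail line, else None."""
    match = POST_LINE.search(line)
    if match is None:
        return None
    code = match.group(1).upper()
    return code, POST_CODES.get(code, "not listed in this thread")


if __name__ == "__main__":
    print(describe_post_code("MIC 0 Test 2> Current POST code is 40 (not FF) for MIC 0"))
```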

 

Andrey_Vladimirov
New Contributor III

Thank you, Frances! I can bet my goldfish that this is a hardware problem. If you have access, please see details in issue 6000051526 on Premier.
