Hi,
I am using a Xeon Phi 3120A.
It is working, but with an issue that, according to the documentation, can lead to erratic behavior. Currently I am using MPSS 3.1 on RHEL 6.4.
When I run micinfo, I get the following output:
MicInfo Utility Log
Created Wed Nov 13 16:01:03 2013
System Info
HOST OS : Linux
OS Version : 2.6.32-358.23.2.el6.x86_64
Driver Version : 3.1-0.1.build0
MPSS Version : 3.1
Host Physical Memory : 32826 MB
Device No: 0, Device Name: mic0
Version
Flash Version : NotAvailable
SMC Firmware Version : NotAvailable
SMC Boot Loader Version : NotAvailable
uOS Version : NotAvailable
Device Serial Number : NotAvailable
Board
Vendor ID : 0x8086
Device ID : 0x225d
Subsystem ID : 0x3608
Coprocessor Stepping ID : 2
PCIe Width : x8
PCIe Speed : 5 GT/s
PCIe Max payload size : 256 bytes
PCIe Max read req size : 512 bytes
Coprocessor Model : 0x01
Coprocessor Model Ext : 0x00
Coprocessor Type : 0x00
Coprocessor Family : 0x0b
Coprocessor Family Ext : 0x00
Coprocessor Stepping : C0
Board SKU : C0 QS-3120 P/A
ECC Mode : NotAvailable
SMC HW Revision : NotAvailable
Cores
Total No of Active Cores : 57
Voltage : 0 uV
Frequency : 1100000 kHz
Thermal
Fan Speed Control : NotAvailable
Fan RPM : NotAvailable
Fan PWM : NotAvailable
Die Temp : NotAvailable
GDDR
GDDR Vendor : Elpida
GDDR Version : 0x1
GDDR Density : 2048 Mb
GDDR Size : 5952 MB
GDDR Technology : GDDR5
GDDR Speed : 5.000000 GT/s
GDDR Frequency : 2500000 kHz
GDDR Voltage : 0 uV
When I tried to update the Flash and SMC, I got the following output:
$ /usr/bin/micflash -update -device all
No image path specified - Searching: /usr/share/mpss/flash
mic0: Flash image: /usr/share/mpss/flash/EXT_HP2_C0_0386-03.rom.smc
mic0: Flash update started
mic0: Flash update done
mic0: SMC update started
micflash: mic0: SMC update failed: SMC buffer size exceeded (0x1)
mic0: Transitioning to ready state
Please restart host for flash changes to take effect
Please help.
I can't speak as to why this appears only in the MPSS 3.1 windows-readme.pdf (available on the Intel® Manycore Platform Software Stack (MPSS) page), but that very error is mentioned at the bottom of page 4 (in step 4 of section 2.2.3 Update the Flash) and the suggestion is to proceed with the other steps in that section.
You should at least first reboot your host, then boot your card (if you had not previously set it up to boot at host reboot time) and then use micinfo to check all the flash/SMC levels.
Here are the Flash/SMC versions for MPSS 3.1 under a Linux system:
System Info
HOST OS : Linux
OS Version : 2.6.32-358.el6.x86_64
Driver Version : 3.1-0.1.build0
MPSS Version : 3.1
Host Physical Memory : 5979 MB
Device No: 0, Device Name: mic0
Version
Flash Version : 2.1.03.0386
SMC Firmware Version : 1.15.4830
SMC Boot Loader Version : 1.8.4326
uOS Version : 2.6.38.8+mpss3.1
If Flash/SMC versions on your card are not at expected levels, then you could try the flash step again (as per the Linux readme-en.txt), or try updating the Bootloader flash only as per the specifics in the windows-readme.pdf starting at step 6 of those instructions. If they are at expected levels then the card should be good to go.
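If you want to script that check, here is a minimal sketch. It assumes micinfo prints the version fields in the `Name : Value` form shown in this thread. The function name `check_versions` is made up for illustration and is not part of MPSS.

```shell
# Hypothetical helper (not an MPSS tool): reads micinfo output on stdin and
# lists the version fields that report NotAvailable.
# Returns 1 if any of them is missing, 0 otherwise.
check_versions() {
    awk -F' *: *' '
        { sub(/^[ \t]+/, "") }   # drop leading indentation, if any
        /^(Flash Version|SMC Firmware Version|SMC Boot Loader Version|uOS Version)/ {
            if ($2 ~ /^NotAvailable/) { print "missing: " $1; bad = 1 }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

With the card booted, you would run it as `micinfo | check_versions` and compare anything it does not flag against the expected levels listed above.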
Hi Kevin,
Thank you very much for your reply. I just tried to follow the steps from that file.
In step 7, I get the following error.
$ micflash -update /usr/share/mpss/flash/EXT_HP2_SMC_Bootloader_1_8_4326.css_ab -device all
mic0: Flash image: /usr/share/mpss/flash/EXT_HP2_SMC_Bootloader_1_8_4326.css_ab
mic0: SMC update started
micflash: mic0: SMC update not permitted: Unknown error (0x0)
mic0: Transitioning to ready state
Are you operating under the root account or at least a sudo shell?
So you rebooted the host first and checked the Firmware/SMC levels, and the result was still the same as before?
What happens when you:
1. Reset the card (micctrl -r)
2. Verify the card is "ready" (micctrl -s)
3. Boot the card (micctrl -b)
4. micinfo
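The four steps above can be sketched as a small shell function, to be run as root. This is only a sketch under a few assumptions: that `micctrl -s` prints the word "ready" for a card in that state, and that `micctrl -w` (which comes up later in this thread) waits for the state change to finish. The name `reset_and_boot` is made up for illustration.

```shell
# Hypothetical wrapper around the steps above; run as root.
reset_and_boot() {
    micctrl -r || return 1          # 1. reset the card
    micctrl -w                      # assumed: wait for the reset to finish
    # 2. verify the card is "ready" (assumes the state name appears in the output)
    if ! micctrl -s | grep -q ready; then
        echo "card is not in the ready state" >&2
        return 1
    fi
    micctrl -b || return 1          # 3. boot the card
    micctrl -w                      # assumed: wait for the boot to finish
    micinfo                         # 4. look at the Flash/SMC versions
}
```

If the function stops at step 2, the state that `micctrl -s` reports is worth posting here.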
I just ran into a similar failed flash on Windows, and once I was able to boot the cards, the Firmware/SMC versions all reported NotAvailable.
If you have not had any success yet, you might try:
1. Shut down the host and physically power cycle it.
2. Boot the host and reset the Xeon Phi card.
3. Ensure the card is in the "ready" state and then repeat the original micflash (as root): /usr/bin/micflash -update -device all
When I repeated the flash, both the Flash and SMC updates succeeded where previously the SMC flash had failed.
Hi Kevin,
Yes, I was in a root shell.
I tried all the steps from your last two comments. I still get the same errors.
One thing I probably should have mentioned: before using mpss-3.1, I used mpss-2.1.
But even then I couldn't see the Flash/SMC version.
I have never been able to see the Flash/SMC version since I purchased the card.
Could it be a faulty card? In that case, what should I do?
One thing I noticed in your original post is that you have a C0 stepping. For C0 cards, you do not update the bootloader as a separate step. That is why it said "SMC update not permitted: Unknown error (0x0)". Instead, you should run the first command you tried, "/usr/bin/micflash -update -device all", twice. It is expected that you will get the message "No image path specified - Searching: /usr/share/mpss/flash". That tells you that you did not explicitly specify a flash file, so micflash looks in the default location for the correct file, which the next line shows it did find. It is normally OK to get the message "SMC update failed: SMC buffer size exceeded (0x1)". I don't know why, but this is a known bug, and you get the message even when the update finished successfully.
The micinfo command will not show you the flash version unless the card is booted, i.e. unless 'micctrl -s' shows online. If the card is online and you still can't see the information, there is a problem.
So make sure the card is in the ready state: "micctrl -rw"
Try doing "/usr/bin/micflash -update -device all" twice in a row
Reboot the host
Check to see if the card is online: "micctrl -w" and if not, boot it
Then rerun micinfo.
If it still will not give you the flash information, with the card online and you running as root, there is a problem.
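Put together, the sequence above might look like the sketch below, split in two because of the host reboot in the middle. The function names are made up for illustration. I do not know what exit status micflash returns when it prints the "SMC buffer size exceeded" message, so the sketch only logs a failure and carries on. It also assumes that `micctrl -s` prints the word "online" for a booted card and that micflash is on root's PATH.

```shell
# Phase 1 (hypothetical name): run as root before rebooting the host.
flash_c0_card() {
    micctrl -rw || return 1         # reset and wait for the ready state
    # Run the combined Flash/SMC update twice, as described for C0 cards.
    for pass in 1 2; do
        micflash -update -device all ||
            echo "pass $pass: micflash reported an error (may be the known SMC message)" >&2
    done
    echo "Now reboot the host, then run check_after_reboot."
}

# Phase 2 (hypothetical name): run as root after the host is back up.
check_after_reboot() {
    micctrl -w                      # wait for the card to settle
    if ! micctrl -s | grep -q online; then
        micctrl -b && micctrl -w    # not online yet, so boot it and wait
    fi
    micinfo                         # the version fields should now be filled in
}
```

If micinfo still reports NotAvailable after `check_after_reboot`, that is the "there is a problem" case.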
Hi Frances,
Thank you for your reply.
I just ran everything in the order you mentioned, but the Flash/SMC versions are still not available.
So there must be a problem with the card. Should I contact Intel support to replace the card?
As a software person, I really, really, really would have liked to find a nice, simple software solution. And maybe there is one and I just don't see it. But given that this has been an ongoing problem, maybe it is time to talk to support and see what they say.