I am running Windows Server 2012 R2 (.NET Framework 4.5) and recently upgraded the MPSS drivers to 3.7.2 (also performing the required flash update on the MIC). Now the MIC does not appear to be mapped to any IP address when I run "ipconfig /all". I also tried pinging it on the default IP address (192.168.1.100), without success.
My problem is that I cannot ssh into the MIC.
After installing the drivers, I generated (micctrl -g) and checked the MIC configuration files "global.xml" and "mic0.xml" (attached as ".txt" files to this post), and they seem OK. I did not modify the files generated by MPSS.
I followed the installation procedure found here (download page for the MPSS driver tools).
The MIC itself seems to be working fine. I ran miccheck and all tests pass.
Also, I can see activity with micsmx-gui.
Furthermore, I can run some of the COI examples that come with the MPSS (inside the "sdk\tutorial\coi" directory) where COI launches the MIC side code from the host.
Also, the driver update as well as the flash and SMC update seem to have been done successfully (see attached log files showing the micinfo and miccheck outputs).
Nevertheless, when I look at the log for the MIC (micctrl -l mic0) there are some messages that seem troubling (see attached mic_log file).
Some of the messages follow:
<7>[ 17.940471] mic0: no IPv6 routers present
<7>[ 22.250346] mic0: no IPv6 routers present
I have tried this process many times, as well as regenerating the MIC configuration files with "micctrl -g" and then restarting the MIC.
Alternatively, I have also restarted the MIC with "micctrl -[r & w & b]".
Neither has the desired effect.
Also, after installing the 3.7.2 drivers and flashing the MIC, I rebooted the host (as instructed by the MPSS document I referred to above) before trying to reset the configuration.
I also tried uninstalling and reinstalling the MPSS drivers several times, manually removing the old configuration files generated in "C:\Program Files\Intel\MPSS" each time.
This had no effect either.
Just to double check: after installing the drivers, flashing the MIC and rebooting the host, the NIC on the MIC should be automatically configured and appear mapped to IP 192.168.1.100, right?
Of course assuming that one does not change the mic0.xml configuration file.
I would appreciate any other tests or procedures to follow towards solving this.
Yes, after you reboot the host, mic0's IP address is set to 192.168.1.100 by default. You should be able to ping this IP address successfully.
What is the output from the command "ipconfig"? Before upgrading to MPSS 3.7.2, did you see this problem with the previous MPSS version?
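Besides ping, it can help to check whether the card's SSH port answers at all. The following is only a minimal sketch in Python: the function name tcp_port_open is made up for illustration, and it assumes the card listens for SSH on the standard port 22, which this thread does not confirm.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Calling tcp_port_open("192.168.1.100", 22) on the host would then tell you whether anything accepts connections at the default mic0 address. A False result is consistent with the interface never having been configured, but it does not prove it.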
We had MPSS version 3.5.2 installed, and the MIC was correctly mapped onto the default address (192.168.1.100).
So when I pinged it, it would reply immediately.
I could also SSH to it. This is how I ran most of the SCIF examples, which require me to copy the Xeon Phi binary to the target MIC (along with the required libraries) and launch it there, as well as the host counterpart.
This all used to work fine!
It is possible that mic0.xml and mic1.xml are somehow corrupted. To get around that, uninstall MPSS 3.7.2, then navigate to the folder "c:\Program Files\Intel\MPSS" (assuming that you installed MPSS in the default directory) and manually delete the files mic0.xml and mic1.xml. Then reinstall MPSS 3.7.2 and test whether you can ping 192.168.1.100.
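If you would rather keep the old files for comparison than delete them outright, the cleanup step could be scripted roughly as below. This is a sketch only: it moves the files to a backup folder instead of deleting them, the function name backup_mic_configs and the backup location are hypothetical, and the pattern "mic*.xml" simply matches any file named that way (it leaves global.xml alone).

```python
import shutil
from pathlib import Path


def backup_mic_configs(mpss_dir, backup_dir):
    """Move every mic*.xml file out of mpss_dir into backup_dir.

    Returns the list of file names that were moved, in sorted order.
    """
    src = Path(mpss_dir)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for cfg in sorted(src.glob("mic*.xml")):
        shutil.move(str(cfg), str(dst / cfg.name))
        moved.append(cfg.name)
    return moved
```

It would be run from an elevated prompt after uninstalling MPSS, for example as backup_mic_configs(r"C:\Program Files\Intel\MPSS", r"C:\mpss_backup"), where the second path is just a placeholder.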
I have already tried removing the driver and manually deleting the whole MPSS folder that contains the global.xml and mic0.xml files.
The files that I currently have generated contain the same information as the two (renamed .txt files) included at the beginning of my post. I do not see anything strange there, but please take a look.
The only thing I noticed in the "global.xml" file is in the first line, where the field standalone is set to yes as follows:
Does this refer to the KNL standalone processor? I have a Xeon Phi 7120P Knights Corner (as shown in the mic_logs.txt file attached in my second posting here).
Changing this field to "no" and rebooting the MIC has no effect.
Just to emphasize, I have regenerated these files several times with the "micctrl -g" command and then rebooted the Xeon Phi. I also tried rebooting the host, and still the card is not mapped to the default IP address, or to any address at all for that matter.
The card was successfully mapped using the previous 3.5.2 drivers, but it was crashing the host often and randomly. The crashes have stopped with 3.7.2, but now the MIC card is not mapped to any IP address.
I downloaded your files (global.xml and mic0.xml) and used them in place of mine. Note that my system runs Windows 2012 R2 and has MPSS 3.7.2 installed as well. I then rebooted the coprocessor and brought the MPSS service up. Everything still worked properly, so your global.xml and mic0.xml are fine.
What is the output when you run the commands "micinfo" and "micinfo -listDevices"?
The outputs from your commands look fine.
You may want to check the Windows Event Viewer logs while you bring the MPSS service down and then up. From your Windows search dialog, search for “Event Viewer”. In the “Event Viewer” window, expand “Applications and Services Logs”, then click on “MPSSLog”. This shows the recorded logs and errors. Click on the logs to see whether any of them has event ID 12. Here is what I get for event ID 12: “IP Address 192.168.1.99 configured correctly for node 0”. This event tells me that the MIC was assigned an IP address.
For your reference, I attach three information logs (IDs 32, 30 and 12) here. Please let me know if you see an event ID 12 information log when you bring MPSS down and up.
Also, sometimes simply rebooting your system fixes the problem.
I checked my Windows "Event Viewer" under "Applications and Services Logs" and then "MPSSLog".
From here, I ordered the logs by time and looked for ID 12.
There are several ID 12 entries, but they are all from before I installed the new drivers.
After the driver installation, I can see many IDs 30-31-32, but no ID 12.
This is a clear indication that the MIC never gets through the IP configuration, which validates my previous observations.
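The check described above (is there any ID 12 entry newer than the driver installation?) can be written down as a small filter. This is an illustrative sketch only: the function name ip_events_after is hypothetical, it assumes the log entries have already been exported into (timestamp, event ID, message) tuples, and the sample records are made up rather than taken from the attached logs.

```python
from datetime import datetime


def ip_events_after(events, installed_at, event_id=12):
    """Return the records with the given event ID logged at or after installed_at.

    Each record is a (timestamp, event_id, message) tuple.
    """
    return [e for e in events if e[1] == event_id and e[0] >= installed_at]


# Hypothetical records for illustration; dates and messages are invented.
sample = [
    (datetime(2016, 1, 10, 9, 0), 12, "IP address configured"),
    (datetime(2016, 2, 1, 14, 0), 30, "(message omitted)"),
    (datetime(2016, 2, 1, 14, 1), 32, "(message omitted)"),
]
```

An empty result for the installation date means no IP-configuration event has been logged since the upgrade, which is the situation I am seeing.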
Now how should I proceed to fix it?
P.S. From the installation of the new drivers to the commands issued to check the status of the MIC, I have used administrative privileges on our system.
Please recall from my previous posts that after I installed the drivers I rebooted the host several times. I even reinstalled the drivers a couple of times. None of these reboots had any effect.
Also recall that I am able to run some COI examples in the SDK that comes with the drivers, but I have no way to access the device through SSH. Therefore I cannot run the SCIF examples, which require me to copy the programs to the MIC and launch them there manually.