
OFFLOAD_DEVICES broke

jimdempseyatthecove
Honored Contributor III
1,071 Views

Windows 7 x64 Pro, MS VS 2013, Intel Parallel Studio XE 2015 update 4, MPSS 3.4.3

I built the MIC sample LEO_tutorial as Release x64 and ran it with "Start Without Debugging".

Without OFFLOAD_DEVICES environment variable set, runs OK

With OFFLOAD_DEVICES=0 environment variable set, runs OK

(I have 2 5110P's)

With OFFLOAD_DEVICES=0,1 environment variable set, hangs 60 seconds, reports error
With OFFLOAD_DEVICES=1 environment variable set, hangs 60 seconds, reports error
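
The four cases above can be reproduced from a small harness that sets OFFLOAD_DEVICES per run and times each one. This is only a sketch: `run_with_devices` and `sweep` are hypothetical helper names, the command to run is whatever executable you built, and the 90-second default timeout is an assumption chosen to outlast the 60-second hang so the error text is still captured.

```python
import os
import subprocess
import time

def run_with_devices(cmd, devices, timeout_s=90):
    """Run cmd with OFFLOAD_DEVICES set to `devices` (unset when None)."""
    env = dict(os.environ)
    env.pop("OFFLOAD_DEVICES", None)      # start from a clean slate
    if devices is not None:
        env["OFFLOAD_DEVICES"] = devices
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, env=env, capture_output=True,
                              text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return {"devices": devices, "status": "timeout",
                "seconds": time.monotonic() - start, "output": ""}
    return {"devices": devices,
            "status": "ok" if proc.returncode == 0 else "error",
            "seconds": time.monotonic() - start,
            "output": proc.stdout + proc.stderr}

def sweep(cmd, settings=(None, "0", "0,1", "1")):
    """Run cmd once per OFFLOAD_DEVICES setting and collect the results."""
    return [run_with_devices(cmd, devices) for devices in settings]
```

Calling `sweep([path_to_sample])` and printing each result's `devices`, `status`, and `seconds` gives a compact record of which settings hang and for how long.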

Jim Dempsey

6 Replies
jimdempseyatthecove

micsmc shows both devices.

The error message is: offload error: cannot get device handle 1 (error code 1)

Also the call to OFFLOAD_NUMBER_OF_DEVICES() performs a similar 60 second hang (when OFFLOAD_DEVICES is 0,1 or 1)

Jim Dempsey

Frances_R_Intel
Employee

I'm not sure what's up. It knows the device is there and it is trying to use it. The errno = 1 should be EPERM - Operation not permitted. At first I thought maybe it was coming from scif but I don't think any of the scif routines return EPERM. My cube mate mentioned a problem he had a while back where he was getting that error, both when he tried to use offload and when he tried to ssh into the coprocessor. In that case, it turned out that the coprocessor was trying to automount a file when he logged in and that the mount is what was causing the EPERM.
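
The errno reading above is easy to double-check. The sketch below assumes the "error code 1" in the offload message really is a C errno value, which the message itself does not confirm; `describe_errno` is a hypothetical helper name.

```python
import errno
import os

def describe_errno(code):
    """Return the symbolic name and the platform's message for an errno value."""
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)

name, text = describe_errno(1)
print(name, "-", text)
```

The symbolic name for 1 is EPERM. The message text comes from the platform's C library, so its exact wording can vary.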

Let me start running through some things. You have probably already checked them all but maybe a brain dump will jog something.

In micsmc, you checked under the advanced button for the error logs? Also, in the Windows device manager, you checked to be sure the two devices matched, except for address, and the event lists had nothing weird in them? You ran micinfo to make sure there was nothing funky about the configuration of one of the cards? And you installed something like PuTTY and ssh'ed directly into both cards?

Being thorough, you have probably also tried pulling the card that is currently mic0 and then seeing whether the other card would work if it thought itself to be mic0.

If I think of anything more, I will let you know.

jimdempseyatthecove

Frances,

Thanks for your help. System-wise (hardware) they should be OK. I can use both/either when I boot to Linux. It is in Windows that this appears. BTW, I just installed MPSS 3.4.3 on the Windows system; the Linux installation is earlier (3.2.3, I think).

I haven't tried sshing into the MIC... let me try...

nuts.. have to regen ssh keys and accounts. Misplaced Windows MIC password :s

Interesting thing is offload to mic 0 works and I can shutdown and restart each mic individually, micsmc sees both. So I think it is a software/configuration issue.

When updating MPSS, both units re-flashed OK. Hmm, haven't rebooted Linux since, don't know if the flash update to 3.4.3 will interfere with Linux installed 3.2.3.

Getting the second MIC running is, for now, a lower-priority item. I have another post running on the MIC forum relating to problems using the MIC within a C# application calling a C# assembly, calling a C++ DLL (managed) containing Fortran object files (static library) containing offload regions. It crashes the application upon entry to the offload. If you have ideas, please reply to the other thread. I think the trick may be:

C# application calling a C# assembly, calling a C++ DLL (managed) containing Fortran object files (static library) calling an unmanaged DLL (assuming that can be done).

Jim Dempsey

jimdempseyatthecove

I forgot to add...

In an attempt to diagnose this issue I tried setting Offload_Report(3), but this being code without a console window, if anything came out, it went into the bit bucket. Explicitly opening a file on units 0 and 6, at a known location, with buffered='no' creates an empty file. So either nothing was PRINTed or WRITE'd to the output, or it was but the output was not flushed after each write. I get zero-length files.
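
One way to keep output from a program that has no console is to launch it from a wrapper that points the child's stdout and stderr at a file. The sketch below makes several assumptions: that the report is written to the process's standard output or error streams, that the OFFLOAD_REPORT environment variable (used later in this thread) is how the report is requested, and that the child picks up redirected handles at all, which is not guaranteed for every Windows GUI program. `run_and_log` and the log path are hypothetical.

```python
import os
import subprocess

def run_and_log(cmd, log_path, report_level="3"):
    """Run cmd with OFFLOAD_REPORT set, sending stdout and stderr to log_path."""
    env = dict(os.environ, OFFLOAD_REPORT=report_level)
    with open(log_path, "wb") as log:
        # stderr is merged into stdout so both streams land in one file
        proc = subprocess.run(cmd, env=env, stdout=log,
                              stderr=subprocess.STDOUT)
    return proc.returncode
```

If the log is still empty after a normal exit, that points toward nothing having been written at all rather than toward a missing flush. After a crash the two cases remain hard to tell apart, because buffered data can be lost.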

Jim Dempsey

jimdempseyatthecove

Frances,

I am revisiting this thread because I am now trying to get multiple MICs to run on Windows (7 Pro x64).

I've updated MPSS to January's version.

micctrl shows both cards booting, starting, running ok.

micsmc shows both cards running

ping 192.168.1.100 works to mic0
ping 192.168.2.100 works to mic1

However,

PuTTY works to mic0 (ip address), but gets "connection refused" when attempting to go to mic1
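
Since ping succeeds but the SSH connection is refused, it helps to test the TCP port directly, separately from ICMP reachability. A minimal sketch follows; `probe_tcp` is a hypothetical helper, and port 22 is assumed to be the SSH port on the cards.

```python
import socket

def probe_tcp(host, port=22, timeout_s=3.0):
    """Classify a TCP connection attempt as open, refused, timeout, or error."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return "open"
    except ConnectionRefusedError:
        # The host answered but rejected the connection attempt
        return "refused"
    except socket.timeout:
        # No answer within timeout_s, e.g. packets silently dropped
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Running `probe_tcp("192.168.1.100")` and `probe_tcp("192.168.2.100")` side by side shows whether mic1 actively rejects the connection ("refused", which usually means nothing is listening on that port or something is rejecting it) or simply never answers ("timeout").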

If I set the environment variable OFFLOAD_DEVICES=0,1
and set OFFLOAD_REPORT to 1, 2, or 3,

I get a device 1 error.

Setting OFFLOAD_DEVICES=0

I can get valid OFFLOAD_REPORT(s)

And I've tried reloading the ssh keys.

Is there something else I need to do to get offload access to mic1 (as well as the occasional PuTTY access)?

Jim Dempsey

jimdempseyatthecove

Additional info.

https://software.intel.com/sites/default/files/managed/bd/53/System_Administration_Guide_Intel(R)XeonPhi(TM)Coprocessor.pdf

Page 12 of the Linux MPSS guide states:

It may be necessary to disable certain security features before installing the MPSS. Failing to do this can result in unexplained ‘Connection refused’ messages when you attempt to log into the coprocessor.

Is there a similar requirement for Windows installations? (If so, please give details rather than a flat "yes".)
E.g., firewall settings to permit the second subnet, the one containing 192.168.2.100.

Jim Dempsey
