Offload problem

André_P_1
Beginner

Hi

I'm working on a fairly complex application and set out to try the MIC by offloading only a small part of the code to it. I ran into some dependency issues: the runtime required a MIC build of a library I use, even though the offloaded code only performs simple math operations on individual array elements. I set MIC_LD_LIBRARY_PATH to point to it (even though it is a static library...). That should copy the library to the MIC; however, I still get this error:

On the sink, dlopen() returned NULL. The result of dlerror() is "/tmp/coi_procs/1/7127/load_lib/icpcoutbzHVYr: undefined symbol: _ZN20TLorentzVectorWFlagsC1ERKS_"
On the remote process, dlopen() failed. The error message sent back from the sink is /tmp/coi_procs/1/7127/load_lib/icpcoutbzHVYr: undefined symbol: _ZN20TLorentzVectorWFlagsC1ERKS_
offload error: cannot load library to the device 0 (error code 20)

TLorentzVectorWFlags is a class in that library which inherits from a class in another library, and it is not used in the offloaded code. Note that the library was built with the command xiar -qoffload-build, ensuring that a version was created for the host.

Kevin_D_Intel
Employee

When building your app, try adding the following option to the final link step: -offload-option,mic,ld,"--no-undefined"

Hopefully this will expose the origin of the unresolved reference in the offload image, which might help us understand how to address the issue.
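
It can also help to decode the mangled name from the dlerror() message to see exactly which member is unresolved. Below is a minimal sketch, assuming a GCC-compatible toolchain on Linux that provides abi::__cxa_demangle in <cxxabi.h>; the helper name demangle is hypothetical and not part of any Intel tool.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (assumes a GCC-compatible C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the human-readable form of an Itanium-ABI
// mangled symbol, or the input unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so the caller must release it with free.
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

Passing "_ZN20TLorentzVectorWFlagsC1ERKS_" to this helper should yield TLorentzVectorWFlags::TLorentzVectorWFlags(TLorentzVectorWFlags const&), that is, the copy constructor of the class. That suggests something in the offload image copies a TLorentzVectorWFlags object, even if the offloaded region itself never names the class.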

André_P_1
Beginner

That actually helped a lot: the build now fails with an unresolved-dependency error, so I know where the problem is. However, I now get this error:

offload error: cannot start process on the device 0 (error code 9)

I can't find what it means; the OFFLOAD_REPORT output only states: [Offload] [HOST] [State] Unregister data tables...

Kevin_D_Intel
Employee

Feels like an offload initialization failure.

So you changed something and now receive that error?

Any chance you altered MIC_LD_LIBRARY_PATH and removed paths like these:

MIC_LD_LIBRARY_PATH=/opt/intel/composer_xe_2013.5.192/compiler/lib/mic:/opt/intel/mic/coi/device-linux-release/lib:/opt/intel/mic/myo/lib

These three are essential. (Your compiler-specific version may differ from the 5.192 I'm showing.)
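
If you want to verify this programmatically, here is a minimal sketch that reports which required directories are absent from a colon-separated path value. The function names split_path and missing_entries are hypothetical, and the list of required directories has to be adapted to your installed compiler version.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Splits a colon-separated search path into its non-empty entries.
std::vector<std::string> split_path(const std::string& value) {
    std::vector<std::string> entries;
    std::stringstream stream(value);
    std::string item;
    while (std::getline(stream, item, ':')) {
        if (!item.empty()) {
            entries.push_back(item);
        }
    }
    return entries;
}

// Returns every required directory that does not appear in the path value,
// in the order given in `required`. Comparison is an exact string match,
// so trailing slashes or symlinked paths are not normalized.
std::vector<std::string> missing_entries(const std::string& value,
                                         const std::vector<std::string>& required) {
    const std::vector<std::string> present = split_path(value);
    std::vector<std::string> missing;
    for (const std::string& dir : required) {
        if (std::find(present.begin(), present.end(), dir) == present.end()) {
            missing.push_back(dir);
        }
    }
    return missing;
}
```

A caller would read the variable with std::getenv("MIC_LD_LIBRARY_PATH"), guard against a null result, and pass the three directories listed above as the required set; an empty return value means none of them was dropped.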

André_P_1
Beginner

I have exactly the same MIC_LD_LIBRARY_PATH entries, except that I'm using version 4.183. Are these error codes described somewhere? I can't find anything...

Kevin_D_Intel
Employee

Offload error messages continue to be a work in progress. There is no single collection of them at present.

The text accompanying the error offers some explanation as derived from COI result codes whose descriptions can be found in /opt/intel/mic/coi/include/common/COIResult_common.h.

Development was unable to explain error code 9 much beyond noting that a timeout may have occurred somewhere inside COI, as per the COI_TIME_OUT_REACHED result code (COIResult_common.h). They could not say more because offload sets no timeouts when starting the target process.
