After a system and compiler upgrade my previously working program now gives the following error during an offload:
offload error: cannot find offload entry __offload_entry_offload_handler_cpp_370transfer__8c912811342804b6ec8a993e510992a1
offload error: process on the device 0 unexpectedly exited with code 1
What is really strange is that this is not the first offload. Multiple similar offloads from the same file in the same library work fine, and then this one fails every time. I checked the compiler's offload opt-report and everything looks normal; in particular, the offload of that region is listed. I then looked at the symbols in the shared library that should contain the entry, and it is there:
$ objdump -t lib/libnormaliz.so | grep __offload_entry_offload_handler_cpp_370
000000000253cb50 l O .OffloadEntryTable. 0000000000000010 __offload_entry_offload_handler_cpp_370transfer__8c912811342804b6ec8a993e510992a1_$entry
I attached the log of a run with OFFLOAD_REPORT=3.
Our setup is: CentOS 7, MPSS 3.6.1, Intel Parallel Studio XE 2016 Cluster Edition for Linux Update 1
Does anybody have an idea what the problem could be? I can upload more information if it could be helpful.
Please try this:
Run "mic_extract <your_binary>" to extract the MIC code. It will appear on the file system as <your_binary>MIC.
Then run objdump to see the entry symbols in both binaries. See if they match.
objdump -t <your_binary> | grep '$entry'
objdump -t <your_binary>MIC | grep '$entry'
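With many offload regions, comparing the two listings by eye gets tedious. Here is a rough sketch of how the comparison could be scripted. The function names `entry_names` and `unmatched_entries` and the `.sym` file names are made up for illustration; the only assumption about the tools is that the entry symbols look like `__offload_entry_..._$entry` in the `objdump -t` output.

```shell
# Print the offload entry symbol names found in `objdump -t` output on stdin.
entry_names() {
    grep -o '__offload_entry_[^[:space:]]*\$entry' | sort -u
}

# Print entry names that occur in only one of the two symbol listings.
#   $1: file holding `objdump -t` output of the host binary
#   $2: file holding `objdump -t` output of the extracted MIC binary
# Each listing is de-duplicated first, so a name present on both sides
# shows up exactly twice and is dropped by `uniq -u`.
unmatched_entries() {
    { entry_names < "$1"; entry_names < "$2"; } | sort | uniq -u
}

# Hypothetical usage (substitute your own binary name):
#   objdump -t your_binary    > host.sym
#   objdump -t your_binaryMIC > mic.sym
#   unmatched_entries host.sym mic.sym
```

If this prints nothing, the entry tables agree. Any name it does print exists on only one side, so an entry whose hash suffix differs between host and MIC should show up as a pair of lines.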
I did as you said on my shared library which contains the offload segments. The entries match except the critical one:
From the lib:
000000000253cb50 l O .OffloadEntryTable. 0000000000000010 __offload_entry_offload_handler_cpp_370transfer__8c912811342804b6ec8a993e510992a1_$entry
From the mic_extract output:
000000000072cb30 l O .OffloadEntryTable. 0000000000000010 __offload_entry_offload_handler_cpp_370transfer__6b65f83eb01cfeb45ccb4575c33520c8_$entry
The mismatch is most likely caused (indirectly) by a difference in gcc version on host and MIC.
Does the function containing the offload use parameters of types defined in gcc headers? If yes, try and change to types defined by yourself, rather than types inherited from included headers. If you can't do that, then you'll need to match up gcc versions, but try this first.
The function takes a std::list< std::vector<int> > as an argument. I moved the offload section into a separate function, and now it works. Thanks for this solution!
The proper fix would still be to get matching headers. Do you have any pointers on what to change for that?
Would it be possible for you to create a small reproducer for this situation?
Our developers are interested in that for testing a possible solution.
Here is a minimal example of what I'm doing. The new method is my workaround and the old method is how I ran into the problem. When I run it on our server, I get:
New transfer method:
recieved 8 pyramids.
mic 0: transfered 8 pyramids.
Old transfer method:
offload error: cannot find offload entry __offload_entry_transfer_pyramids_cpp_103transfer__c4209102481176599a3116eb09cf2fbbicpc0101639380120Iu65al
offload error: process on the device 0 unexpectedly exited with code 1
I was advised you should be able to work around the issue by adding the option -D_GLIBCXX_USE_CXX11_ABI=0 when compiling.
Also, we should have a fix for this in the upcoming PSXE 2016 Update 3 (16.0 compiler) release in a few months, and in our next major release due out later this year.
Thanks, that option is also working!
