Andrea_D_1 · ‎09-16-2015

Hello,

We are having an issue compiling C++ code that uses the C++11 thread_local feature with non-POD types. The same code compiles without problems for the host architecture, but it fails to compile for the MIC architecture.

These are the relevant lines from a unit test that will not compile for MIC:

// [...]
  defaults& theDefaults() {
    static thread_local A anA;
    static thread_local B aB;
    static thread_local defaults theDefaults(anA,aB);
    return theDefaults;
  }
// [...]

The code compiles for the host architecture, but for MIC (i.e. when adding -mmic) the build fails. A warning is emitted while compiling the unit:

warning #1897: thread-local variable cannot be dynamically initialized

And at link-time the following error is emitted:

testme.cc:(.text+0x2af): undefined reference to `__cxa_thread_atexit'

Is there a solution to this problem? What is the origin of it?

A complete unit test that shows the problem can be found here and is attached to this thread:

https://github.com/andreadotti/thread_local-Test-ICC

Thank you,

Andrea Dotti

KitturGanesh · ‎09-17-2015

Hi Andrea,
Yes, I could reproduce the issue. thread_local is supported only in GCC 4.8 and above. The MPSS stack I had on the system only has GCC 4.4, so the error was expected. I need to check which GCC version the latest MPSS supports and will get back to you, thanks.

_Kittur

KitturGanesh · ‎09-18-2015

Hi Andrea,
I confirmed that the latest MPSS release (3.5) doesn't contain the GCC version necessary to compile your application, since thread_local requires GCC 4.8 or above. The next version, MPSS 3.6, is in the pipeline (it should be released around the first week of October) and should contain GCC 5.1, which should resolve your issue. I'll let you know as soon as it's released. I appreciate your patience until then and will keep you updated.

_Kittur

Andrea_D_1 · ‎09-18-2015

Many thanks for the feedback. We'll test the new mpss when available. In the meantime a possible workaround exists:

// [...]
  defaults& theDefaults() {
#ifdef __MIC__
    static thread_local A* anA = nullptr;
    static thread_local B* aB = nullptr;
    static thread_local defaults* theDefaults = nullptr;
    if ( anA == nullptr ) {
      anA = new A;
      aB = new B;
      theDefaults = new defaults(*anA,*aB);
    }
    return *theDefaults;
#else
    static thread_local A anA;
    static thread_local B aB;
    static thread_local defaults theDefaults(anA,aB);
    return theDefaults;
#endif
  }
// [...]

This is sub-optimal since it creates a memory-leak at exit, but at least seems to make our code compile correctly.
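
One possible way to reduce the leak, sketched here under assumptions rather than tested on MIC, is to keep the per-thread objects behind a POSIX thread-specific key, whose destructor callback runs when a thread finishes. The classes A, B and defaults below are stand-ins (their real definitions are not shown in this thread), and the names Bundle, bundleKey, destroyBundle and createKey are hypothetical. This relies only on pthreads, not on __cxa_thread_atexit.

```cpp
#include <pthread.h>

// Stand-ins for the real classes, which this thread does not show.
struct A { int value = 1; };
struct B { int value = 2; };
struct defaults {
    defaults(A& a, B& b) : a_(a), b_(b) {}
    int sum() const { return a_.value + b_.value; }
    A& a_;
    B& b_;
};

namespace {

// All three per-thread objects live in one heap block, so a single
// delete destroys them in reverse declaration order (theDefaults first).
struct Bundle {
    A anA;
    B aB;
    defaults theDefaults;
    Bundle() : anA(), aB(), theDefaults(anA, aB) {}
};

pthread_key_t bundleKey;
pthread_once_t bundleOnce = PTHREAD_ONCE_INIT;

// Called by the pthread runtime when a thread that set the key exits.
void destroyBundle(void* p) { delete static_cast<Bundle*>(p); }

// Runs exactly once per process; error handling is omitted in this sketch.
void createKey() { pthread_key_create(&bundleKey, destroyBundle); }

}  // namespace

defaults& theDefaults() {
    pthread_once(&bundleOnce, createKey);
    Bundle* bundle = static_cast<Bundle*>(pthread_getspecific(bundleKey));
    if (bundle == nullptr) {
        // First call on this thread: build the objects and register them.
        bundle = new Bundle;
        pthread_setspecific(bundleKey, bundle);
    }
    return bundle->theDefaults;
}
```

With this sketch, worker threads release their objects when they terminate. The main thread's copy is normally still reclaimed only by the operating system at process exit, because key destructors do not run on a plain return from main.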

Andrea

KitturGanesh · ‎09-18-2015

Great, thanks very much, Andrea. I'll pass that along as well in the issue filed with the developers. Again, I will update you as soon as the new version is released.
_Kittur

KitturGanesh · ‎10-06-2015

Hi Andrea,
MPSS 3.6 is now available and contains GCC 5.1. The prominent features of MPSS 3.6 are described in detail in the following blog:

https://software.intel.com/en-us/blogs/2015/10/05/prominent-features-of-the-intel-manycore-platform-software-stack-intel-mpss-version

Please try it and see whether the issue is resolved. Much appreciated.

_Kittur

Andrea_D_1 · ‎10-30-2015

Dear Kittur,
thank you for your help.
I've installed mpss 3.6 on my box and done some testing. Unfortunately I've found other issues on which I would really appreciate your feedback. On my Linux box, using the latest ICC (16.0.0.109), I can demonstrate the issue with a few lines of code:

#include <iostream>
#include <vector>

int main(int,char**) {
   std::vector<double> vv;
   vv.assign(5,1); // <<<<< Changing 1 to 1. solves the issue; why are "ints" misinterpreted as iterators?
   for ( auto e : vv ) {
       std::cout<<e<<std::endl;
   }
   return 0;
}

This code compiles fine for the host with icc and various flavors of gcc (4.9.2 and 5.1), but compilation fails for the MIC (after the upgrade to mpss 3.6):

icpc -mmic -std=c++11 test.cc
/usr/linux-k1om-4.7/linux-k1om/../x86_64-k1om-linux/include/c++/4.7.0/bits/stl_iterator_base_types.h(154): error: name followed by "::" must be a class or namespace name
        typedef typename _Iterator::iterator_category iterator_category;
[...]

As additional information, the same code also fails to compile for the host if I set up gcc 5.1 before compiling with icc:

source /opt/gcc-5.1/setup.sh #Setup gcc 5.1
icpc -E -x c++ -v - </dev/null
icpc version 16.0.0 (gcc version 5.1.0 compatibility)     <<<< PLEASE NOTE THIS LINE >>>>>>
[...]
icpc -std=c++11 test.cc
<<< SAME ERROR AS FOR THE MIC >>>

This second observation, together with the fact that mpss has a 5.1 layer, suggests to me that the problem could be in the compatibility layer for the recent gcc 5.1.
Do you have any idea about this?

Thank you very much,
Andrea

KitturGanesh · ‎10-30-2015

Thanks Andrea, I'll look into this and get back to you after some investigation. Appreciate your patience, as always.
_Kittur

KitturGanesh · ‎10-30-2015

Hi Andrea,
Thanks for reporting the issue. It is a bug and has been filed with the developers. I will keep you updated as soon as the release with the fix is out :-( Again, thanks for your patience through this.
_Regards,
Kittur

Andrea_D_1 · ‎10-30-2015

Dear Kittur,

thank you very much.

Andrea

KitturGanesh · ‎11-02-2015

Hi Andrea,
Thanks for your patience. In the meantime, with the latest 16.0 release you can try the workaround you noted for that source code: “Changing 1 to 1. solves the issue…”. Also, I'll keep you updated as soon as the release with the fix for this issue is out, thanks.
_Kittur

Andrea_D_1 · ‎11-02-2015

Hi Kittur,

For this particular example I can do that, but please note that the code I posted was only a simple case to show the problem. Unfortunately our code base is quite large (2M LOCs) and I am afraid it is not feasible for me to go through all the code right now. We have a workaround for the moment to avoid c++11 code on the MIC (using mpss 3.4 and adding #ifdef __MIC__ here and there), so I'll wait for the bug-fix.

Thank you very much for your feedback, really appreciated.

Andrea

KitturGanesh · ‎11-02-2015

I completely agree, Andrea. I just wanted to mention the workaround for that code snippet per se. Sure, I'll let you know as soon as the release with the fix for the issue is out. I appreciate your patience, as always.

KitturGanesh · ‎11-03-2015

Hi Andrea,
After some investigation, I have some feedback.

The problem with the test case is that there are many cases in the standard library where function overloads are ambiguous. In this case (vector assign) there are these two overloads:

   void assign(size_type __n, const value_type& __val)

   template<typename _InputIterator>
   void assign(_InputIterator __first, _InputIterator __last)

In the example, vector::value_type is double and the code calls assign() with two ints, so the overload that takes an iterator range (the second one) is actually the better match: its template parameter deduces to int exactly, whereas the first overload needs conversions to size_type and double. Recent versions of the GNU headers rely on a new C++11 rule (related to SFINAE) so that the right overload is chosen without an error being issued. This hasn't been implemented in the Intel compiler yet, but it is high on the priority list and should be implemented soon.
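
To illustrate the overload selection described above, here is a minimal sketch using a hypothetical ToyVector class. It is not the libstdc++ implementation: the constraint used here (rejecting integral types via std::enable_if) is a simplified assumption standing in for whatever check the GNU headers actually perform.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical toy container that mirrors the two assign() overloads.
class ToyVector {
public:
    // Overload 1: n copies of val.
    void assign(std::size_t n, const double& val) {
        data_.assign(n, val);
    }

    // Overload 2: iterator range. The defaulted template parameter makes
    // deduction fail (SFINAE) when InputIt is an integer type, so this
    // overload silently drops out of the candidate set for assign(5, 1).
    template <typename InputIt,
              typename = typename std::enable_if<
                  !std::is_integral<InputIt>::value>::type>
    void assign(InputIt first, InputIt last) {
        data_.assign(first, last);
    }

    const std::vector<double>& data() const { return data_; }

private:
    std::vector<double> data_;
};
```

Without the enable_if constraint, a call such as assign(5, 1) would pick the template overload and then fail inside its body, because an int cannot be treated as an iterator. A compiler that does not apply the constraint can end up on that failing path.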

Workarounds:
1) As mentioned earlier, either change the literal or cast the second argument so that it exactly matches the container's value_type (as suggested in the comment, use 1.0, which is of type double, instead of 1, which is of type int).

2) The only other option is to specify -D__cplusplus=199711L, but this will turn off all C++11 functionality in the GNU headers.
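
As a sketch of workaround 1, the two variants below either use a double literal or cast an integer value before the call. The helper names fiveOnes and filledWith are hypothetical and only serve the illustration. In both cases the two arguments have different types, so the iterator-range template cannot deduce a single type and only the count/value overload remains viable.

```cpp
#include <cstddef>
#include <vector>

// Variant a: write the literal as a double (1.0 instead of 1).
std::vector<double> fiveOnes() {
    std::vector<double> vv;
    vv.assign(5, 1.0);
    return vv;
}

// Variant b: cast an integer value to the container's value_type.
std::vector<double> filledWith(std::size_t n, int value) {
    std::vector<double> vv;
    vv.assign(n, static_cast<double>(value));
    return vv;
}
```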

BTW, because the code freeze for the next update has already taken place, the fix for this issue should be in the 16.0 update release after the next one. I'll keep you updated as soon as the release with the fix is out. Again, I appreciate your patience until then.

_Kittur

Bug/Limitation: Cannot compile thread_local dynamic non-POD object for MIC architecture (c++11)