Using a thread_local non-POD type seems to cause all static but non-thread_local variables (which should only be initialized once) to be reconstructed on every access to the thread_local variable.
Observe:
// icpc version 15.0.0 (gcc version 4.9.0 compatibility)
// icpc -std=c++11 tls.cpp && ./a.out
// -- Expected output --
// MakePointer()
// MyInt constructor <pointer>
// -- Actual output --
// MakePointer()
// MyInt constructor <pointer>
// MakePointer() x 10 (= numReps)

#include <cstdio>

// non-POD type
struct MyInt {
    MyInt() { printf("MyInt constructor %p\n", this); }
    int v = 1;
};

// should be called exactly once
inline void *MakePointer() {
    printf("MakePointer()\n");
    return nullptr;
}

// normal static initialization, should happen ONCE
static void *pointer = MakePointer();

// thread_local non-POD variable, should be constructed as many times as there are threads
static thread_local MyInt v1;

const int numReps = 10;

int main(int argc, char const *argv[]) {
    for (int i = 0; i < numReps; ++i)
        ++v1.v; // each access seems to reinitialize all static variables?!?
    return 0;
}
We have a static variable 'pointer', which should be initialized exactly once (twice if you count the zero-initialization of statics and globals), and a non-POD thread_local variable 'v1', which should also be constructed exactly once, since this example uses only a single thread.
Unfortunately - at least using icpc version 15.0.0 (gcc version 4.9.0 compatibility) - every access to the thread_local variable 'v1' seems to cause reinitialization of 'pointer'. This is both a performance problem (depending on the cost of initialization of all statics in your translation unit) and a correctness problem (in my actual use case a static variable shared between threads got re-initialized by later threads while it was being used by earlier ones).
When I change the thread_local 'v1' to a POD-type (e.g. int), this phenomenon no longer seems to occur.
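To make the problem easier to check on a given compiler version, the reproducer can be turned into a self-checking variant that counts initializer runs instead of printing them. This is only a sketch under some assumptions: `MakePointerCounted`, `initRuns` and `countInitializerRuns` are hypothetical names that are not part of the original test case, and the counter is assumed to survive a rerun of the dynamic initializers because it is constant-initialized.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Constant-initialized counter. Assumption: because it needs no dynamic
// initialization, a rerun of the translation unit's initializers does
// not reset it.
static std::atomic<int> initRuns{0};

// Hypothetical stand-in for MakePointer(): records every invocation.
inline void *MakePointerCounted() {
    ++initRuns;
    return nullptr;
}

// Non-POD type, as in the reproducer above (printf removed).
struct MyInt {
    MyInt() {}
    int v = 1;
};

// Dynamic initialization; should run exactly once per process.
static void *pointer = MakePointerCounted();

// One instance per thread.
static thread_local MyInt v1;

// Touches v1 from numThreads threads, numReps times each, and returns
// how often the static initializer has run so far.
int countInitializerRuns(int numThreads, int numReps = 10) {
    assert(numThreads > 0);
    (void)pointer;  // silence "unused variable" warnings
    std::vector<std::thread> threads;
    for (int t = 0; t < numThreads; ++t) {
        threads.emplace_back([numReps] {
            for (int i = 0; i < numReps; ++i) ++v1.v;
        });
    }
    for (auto &th : threads) th.join();
    return initRuns.load();
}
```

Calling `countInitializerRuns(4)` from `main` should return 1 on a conforming compiler. A larger value would indicate that accesses to 'v1' reran the static initializers. On Linux the program may need to be built with `-pthread`.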
Hi Daniel,
Could you please upgrade your compiler? The problem reproduces with the initial 15.0 release, but it works with the latest 15.0 U2 release (and with the 15.0 U1 release). See below:
$ source /opt/intel/composer_xe_2015.0.090/bin/compilervars.sh intel64
$ icc temp.cpp -std=c++11 && ./a.out
MakePointer()
MyInt constructor 0x7f3c2fcd9718
MakePointer()
MakePointer()
MakePointer()
MakePointer()
MakePointer()
MakePointer()
MakePointer()
MakePointer()
MakePointer()
MakePointer()
$ source /opt/intel/composer_xe_2015.1.133/bin/compilervars.sh intel64
$ icc temp.cpp -std=c++11 && ./a.out
MakePointer()
MyInt constructor 0x7fae6756f718
$ source /opt/intel/composer_xe_2015.2.164/bin/compilervars.sh intel64
$ icc temp.cpp -std=c++11 && ./a.out
MakePointer()
MyInt constructor 0x7f1bb853a718
$
Thanks,
Shenghong