Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

thread_local compiler bug

Thiago_I_
Beginner
567 Views

Using "Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 16.0 Build 20160415" on Windows will cause the following simple code to crash pretty much every time.  Is this a compiler bug?  Anyway around this now or will I have to wait for icc 17 and if so, how long will that take?

// icl local_thread.cpp
#include <thread>
#include "stdio.h"

struct ThreadObj
{
	void get_tls()
	{
		thread_local int tid;
		printf("&tid %p\n", &tid);
		printf("tid  %d\n", tid);
	}
};

void ThreadFunc(ThreadObj* t)
{
	t->get_tls();
}

int main()
{
	const int num_threads = 256;
	std::thread threads[num_threads];

	for (int i = 0; i < num_threads; ++i)
		threads = std::thread(ThreadFunc, new ThreadObj());

	for (int i = 0; i < num_threads; ++i)
		threads.join();

	return 0;
}

 

0 Kudos
5 Replies
Melanie_B_Intel
Employee
567 Views

Yes I can reproduce this problem and I opened DPD200414181 in our internal bug tracking database.  Thank you for the report and the nice test case.  

I can't comment on when the fix will be available. 

0 Kudos
jimdempseyatthecove
Honored Contributor III
567 Views

Wouldn't this be the correct way:

// icl local_thread.cpp
#include <thread>
#include "stdio.h"

struct ThreadObj
{
 thread_local static int tid = 0xBAADBEEF; // indicate uninitialized
 void get_tls()
 {
  if(tid == 0xBAADBEEF)
   tid = thread::get_id(); // once-only
  printf("&tid %p\n", &tid);
  printf("tid  %d\n", tid);
 }
};

void ThreadFunc(ThreadObj* t)
{
 t->get_tls();
}

int main()
{
 const int num_threads = 256;
 std::thread threads[num_threads];

 for (int i = 0; i < num_threads; ++i)
  threads = std::thread(ThreadFunc, new ThreadObj());

 for (int i = 0; i < num_threads; ++i)
  threads.join();

 return 0;
}

Note, do not place the get_id in the ctor, as this will get you the id of the thread performing the new.

Jim Dempsey

0 Kudos
Thiago_I_
Beginner
567 Views

Thanks for the suggestion, but that doesn't seem to be valid C++11 code.  I don't currently have access right now to icc with which to test this, but both gcc and clang will complain:

6 : error: non-const static data member must be initialized out of line
thread_local static int tid = 0xBAADBEEF; // indicate uninitialized
^ ~~~~~~~~~~

6 : error: ISO C++ forbids in-class initialization of non-const static member 'ThreadObj::tid'
thread_local static int tid = 0xBAADBEEF; // indicate uninitialized
^~~~~~~~~~

Also, the example I posted has been overly simplified from what I'm actually trying to do.  The real use I had involves the thread_local being a struct with a constructor and destructor. This is then used to let me get a unique thread ID that is between 0 and num_threads.  As threads are created and destroyed, the constructor and destructor keep things in sync so that the active threads can always get an available ID within this range, assuming of course that there are fewer than num_threads threads at any time.

0 Kudos
jimdempseyatthecove
Honored Contributor III
567 Views

The trick I use for this is to declare a static int initialized to 0. Then use XCHGADD (InterlockedIncrement, etc...)

Note, this is a reason why one should be able to initialize a non-const static member variable. How many times has a programmer used code like this. Barring that you can expose the sequence number in global scope.

Members of the "committee" weren't thinking when the added this restriction IMHO.

Jim Dempsey

0 Kudos
Thiago_I_
Beginner
567 Views

I'm not sure I follow.  Suppose I create 64 threads.  Each of these will atomically increment the static int and get a unique ID from 0 to 63. So far so good.  But now suppose the thread with ID 10 is destroyed and a new thread is created.  How do I know that ID 10 is now free and can be assigned to this new thread?

Short of requiring threads to register and deregister themselves, which I don't want to require, I don't see any portable way other than with thread_local.

0 Kudos
Reply