I have an OpenMP program written without GPU offload.
I am in the process of adapting it to use GPU offload, should a GPU be present.
I would also like to adapt it to run OpenMP concurrently on both the CPU and the GPU.
The original code used thread local storage:
#if defined(__linux)
__thread int myCore = -1; // logical core (may be subset of physical cores and not necessarily core(0))
__thread int myHT = -1; // logical thread (may be subset of hw threads in core and not necessarily hwThread(0) in core)
#else
__declspec(thread) int myCore = -1; // logical core (may be subset of physical cores and not necessarily core(0))
__declspec(thread) int myHT = -1; // logical thread (may be subset of hw threads in core and not necessarily hwThread(0) in core)
#endif
I now get an error:
1>C:\Diffusion\diffusion_tiled_HT1\HyperThreadPhalanx.h(83,19): : error : thread-local storage is not supported for the current target
1> 83 | extern __declspec(thread) int myCore;
While I can see that the CPU core and HT are meaningless within the GPU, they are significant within the CPU parallel regions.
I can circumvent the error by replacing each variable with an array indexed by omp_get_thread_num(), but this isn't as efficient as using TLS.
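For reference, the array workaround could look roughly like the sketch below. This is only an illustration under assumptions: `ThreadPlacement`, `placement`, `kMaxThreads`, `initPlacement` and `myPlacement` are hypothetical names, the 64-byte alignment is an assumed cache-line size used to keep neighbouring slots from false sharing, and the `tid / threadsPerCore` mapping is a placeholder for the real core/HT assignment.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks so the sketch still builds when OpenMP is not enabled.
static int omp_get_thread_num() { return 0; }
static int omp_get_max_threads() { return 1; }
#endif

// One slot per host thread, padded to an assumed 64-byte cache line
// so that adjacent threads do not false-share.
struct alignas(64) ThreadPlacement {
    int core = -1;  // logical core, same meaning as myCore
    int ht   = -1;  // logical HW thread within the core, same meaning as myHT
};

// Hypothetical upper bound on host threads; adjust to the target machine.
constexpr int kMaxThreads = 1024;

// Host-side table that stands in for the two thread-local variables.
static ThreadPlacement placement[kMaxThreads];

// Fill the table once from a host parallel region.
// The tid-to-core mapping here is a placeholder, not the real assignment.
void initPlacement(int threadsPerCore) {
    assert(threadsPerCore > 0);
    assert(omp_get_max_threads() <= kMaxThreads);
    #pragma omp parallel
    {
        const int tid = omp_get_thread_num();
        placement[tid].core = tid / threadsPerCore;
        placement[tid].ht   = tid % threadsPerCore;
    }
}

// Accessor used where the code previously read myCore / myHT.
inline ThreadPlacement& myPlacement() {
    return placement[omp_get_thread_num()];
}
```

Inside a CPU parallel region, `myPlacement().core` and `myPlacement().ht` would then replace reads of `myCore` and `myHT`. The sketch assumes a single, non-nested level of parallelism, since omp_get_thread_num() returns the thread number within the current team only. The extra cost compared with TLS is the omp_get_thread_num() call plus an indexed load on each access.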
Any comments on this would be welcomed.
Jim Dempsey