Hello,
I'm wondering if anyone has used the GNU pthread_alloc (a thread-safe allocator that uses a different memory pool for each thread, according to the SGI STL site) with STL containers on IA64-2.
Since most of the local data can be created within an OpenMP parallel region, I expect that an allocator that ensures data locality would greatly improve performance.
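To make the pattern concrete, here is a minimal sketch of a container created inside an OpenMP parallel region, so that each thread allocates and fills its own data. `LocalAlloc` and `sum_of_squares` are hypothetical names used only for illustration; `LocalAlloc` is an alias for `std::allocator` here and marks the spot where a per-thread pool allocator such as pthread_alloc would be substituted. Whether this actually improves locality depends on the allocator and on the memory placement policy of the system.

```cpp
#include <cstddef>
#include <memory>
#include <numeric>
#include <vector>

// Hypothetical placeholder: replace std::allocator with a per-thread
// pool allocator (for example pthread_alloc) to compare behaviour.
template <typename T>
using LocalAlloc = std::allocator<T>;

// Each thread builds a private vector inside the parallel region,
// then contributes its partial sum through the reduction clause.
double sum_of_squares(std::size_t n) {
    double total = 0.0;
    #pragma omp parallel reduction(+ : total)
    {
        // Constructed inside the region, so the memory is requested
        // and first written by the thread that uses it.
        std::vector<double, LocalAlloc<double>> local;

        #pragma omp for schedule(static)
        for (long i = 0; i < static_cast<long>(n); ++i) {
            local.push_back(static_cast<double>(i) * static_cast<double>(i));
        }

        total += std::accumulate(local.begin(), local.end(), 0.0);
    }
    return total;
}
```

The code also compiles without OpenMP enabled, in which case the pragmas are ignored and the block runs on a single thread.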
Any comment or suggestion?
Does the Intel compiler provide an equivalent allocator?
Or can you recommend any STL-compliant allocator that does better than pthread_alloc?
Thanks.
4 Replies
As I understand it, the OpenMP runtime supplied with the Intel compilers is built on Linux pthreads. It should place data in memory local to each thread for statically scheduled parallel regions, provided the application is organized so that "first touch" occurs in such a parallel region. On Altix, this requires use of the dplace tool.
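As a rough illustration of "first touch", the sketch below allocates an array without initializing it, writes it for the first time inside a statically scheduled parallel loop, and then processes it with the same static schedule so that each thread works on the chunk it touched. The function names `make_first_touched` and `scale_and_sum` are hypothetical, and the sketch assumes the operating system places a page on the node of the thread that first writes to it; it does not cover thread pinning, which tools such as dplace handle.

```cpp
#include <cstddef>
#include <memory>

// Allocate without value-initialization, then let each thread write
// (first-touch) its own statically scheduled chunk.
std::unique_ptr<double[]> make_first_touched(std::size_t n) {
    std::unique_ptr<double[]> data(new double[n]);  // elements not yet written

    #pragma omp parallel for schedule(static)
    for (long i = 0; i < static_cast<long>(n); ++i) {
        data[i] = 0.0;  // first write to this element happens here
    }
    return data;
}

// Later work uses the same static schedule, so with an unchanged thread
// count each thread revisits the chunk it initialized.
double scale_and_sum(double* data, std::size_t n, double factor) {
    double total = 0.0;

    #pragma omp parallel for schedule(static) reduction(+ : total)
    for (long i = 0; i < static_cast<long>(n); ++i) {
        data[i] = data[i] * factor + 1.0;
        total += data[i];
    }
    return total;
}
```

If the array were instead initialized serially before the parallel region, all pages would be touched by one thread, which is the situation this organization tries to avoid.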
In real applications, clearly, a great deal remains to be accomplished in getting this all to work effectively, including efficient placement of shared regions.
The question might be more topical on the Intel Threading forum.
Thank you for the information.
Since you suggested the threading forum, would you recommend posting questions about the Intel C++ compiler with OpenMP there in general?
Can you kindly direct me to the threading forum? I see VTune-related discussions, and I am not sure a C++/OpenMP question can be answered there without my being asked to use VTune. Our system (SGI Altix, running a 2.4 Linux kernel) does not have VTune, and I doubt that will change.
Thanks again.
Threading forum:
http://softwareforums.intel.com/ids/board?board.id=42
You should be able to find the Intel forum index page simply by going up the URL tree.
I agree that analyzing the thread performance problems you raise requires, at a minimum, some type of thread profiling. I am not familiar with what support Altix may provide for this.
Among the Intel tools, the -openmp_profile compiler link option is a useful start. It produces a file which can be analyzed by a separate installation of VTune with Thread Profiler on a 32-bit desktop. You can analyze performance by thread, vary the number of threads, and determine which threads may have performance scaling problems, such as those related to non-local data placement.