How do you do,
I have a few questions regarding the hyper-threading technology. I'm quite experienced in multithreaded computations in general, but have virtually no experience with HT.
(i) AFAIK, the very essence of HT is that when the currently running thread does not use some resources of the processor/core, another thread can use them. Does this mean that the potential performance is better for threads running completely different tasks (e.g., one thread does floating-point computations and the other does not)? Otherwise they may compete for the same resources. In other words: does task parallelism perform better with HT than data parallelism?
(ii) I'm particularly interested in computations requiring directed rounding (interval computations). Does setting the rounding mode affect *both* threads running on a core when HT is used? This would be very unpleasant...
(iii) According to the Wikipedia article about Hyper-threading, practical performance boosts are about 10-20%. What are your experiences?
Thanks in advance for all answers or hints
Best regards
Bartlomiej
1 Solution
(i) There are a few examples of HT being effective for threads working on the same task but using different CPU resources, such as one thread performing floating-point calculations and the other managing MPI communication. Of course, the case used to justify the expenditure on HT is the resolution of frequent data page misses, as in certain database applications. Another case where I see HT improve throughput by more than 15% is an application that spends most of its time in floating-point division. The Core i7 update 3 architecture should cut down these delays significantly, consequently cutting the advantage seen with HT.
These few cases hardly generalize into a rule that task parallelism can be expected to work better with HT.
(ii) Some of the CPU settings which become important in applications which benefit from HT are controlled by MSR settings (typically set on the BIOS option screen), which apply to all jobs running on the CPU. Rounding modes set at user level by the application, by contrast, apply to individual threads: each logical processor keeps its own floating-point control state (the x87 control word and MXCSR), and the operating system saves and restores it per thread. If you had floating-point applications where neither thread ties up 50% of the floating-point unit resources or cache and hardware buffers, this shouldn't stop you from using HT.
(iii) Assuming you're considering Intel64 CPUs, a 10% boost from HT is better than average. The numbers you mention date back to the days of single-core HT CPUs; more effort is involved in taking advantage of HT on a CPU with 4 or more physical cores (8 or more threads supported by HT).
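To illustrate point (ii), here is a minimal C++ sketch, not a definitive test: two threads each set their own rounding mode with `std::fesetround` and divide 1.0 by 3.0. The helper names `one_third`, `Bounds` and `enclose_one_third` are made up for this example. It only shows that the rounding mode is per-thread state; it cannot force the two threads onto sibling logical processors of one core, and it assumes the compiler does not fold the division at compile time (hence the `volatile` operands; some compilers may additionally want an option such as `-frounding-math`).

```cpp
#include <cassert>
#include <cfenv>
#include <thread>

// Divide 1.0 by 3.0 under the given rounding mode, in the calling thread only.
// The volatile operands force the division to happen at run time,
// after the rounding mode has been changed.
double one_third(int rounding_mode) {
    int rc = std::fesetround(rounding_mode);
    assert(rc == 0);  // nonzero would mean the mode is not supported
    (void)rc;
    volatile double num = 1.0;
    volatile double den = 3.0;
    return num / den;
}

// Hypothetical result type: a lower and an upper bound.
struct Bounds {
    double lo;
    double hi;
};

// Start two threads, one rounding downward and one rounding upward.
// Each thread changes only its own floating-point environment.
Bounds enclose_one_third() {
    Bounds b{0.0, 0.0};
    std::thread down([&b] { b.lo = one_third(FE_DOWNWARD); });
    std::thread up([&b] { b.hi = one_third(FE_UPWARD); });
    down.join();
    up.join();
    return b;
}
```

If the rounding mode leaked between threads, the two bounds could coincide; with per-thread modes, `lo` should come out strictly below `hi`, and the calling thread should still be in its original mode afterwards.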
Thanks a lot for your in-depth answer.
Well, the HT technology is interesting as a gadget, but it does not seem very promising. At least for my needs, of course!
Many thanks and best regards
You would probably need to post an actual example, preferably using a support forum specific to the compiler you have in mind.
>>Well, the HT technology is interesting as a gadget, but it does not seem very promising. At least for my needs, of course!
The system you run your application on is likely performing other functions useful to your application. Examples: display management, file I/O management and buffering, networking, etc. A processor with, say, 4 cores, each with HT, has ample capacity to run these other non-FP functions with little impact on your application (assuming you restrict your application to 1 thread on each core).
Jim Dempsey
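As a rough sketch of "1 thread of each core": on Linux (an assumption; the thread does not name an OS), each logical CPU N lists its HT siblings in `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`, as text such as `0,4` or `2-3`. The hypothetical helpers below take those strings, already read into a vector indexed by logical CPU number, and keep only the lowest-numbered logical CPU of each core. Reading the files and actually pinning threads (for example with `pthread_setaffinity_np`) is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Return the first CPU id in a Linux-style CPU list such as "0,4" or "2-3".
// std::stoi stops parsing at the first non-digit (',' or '-').
// An empty or non-numeric string makes std::stoi throw.
int first_cpu_in_list(const std::string& list) {
    return std::stoi(list);
}

// siblings[i] is the sibling list of logical CPU i.
// A CPU is kept only if it is the first entry of its own sibling list,
// so exactly one logical CPU per physical core is selected.
std::vector<int> one_cpu_per_core(const std::vector<std::string>& siblings) {
    std::vector<int> chosen;
    for (std::size_t cpu = 0; cpu < siblings.size(); ++cpu) {
        if (first_cpu_in_list(siblings[cpu]) == static_cast<int>(cpu)) {
            chosen.push_back(static_cast<int>(cpu));
        }
    }
    return chosen;
}
```

For a 4-core machine whose siblings are numbered `0,4`, `1,5`, `2,6`, `3,7`, this selects logical CPUs 0 to 3, leaving the other four hardware threads free for the operating system's background work.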
There are several technical articles on Intel® Developer Zone about Intel® Hyper-Threading Technology:
Performance Insights to Intel® Hyper-Threading Technology
Intel® Hyper-Threading Technology: Analysis of the HT Effects on a Server Transactional Workload
Hyper-Threading: Be Sure You Know How to Correctly Measure Your Server’s End-User Response Time