Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Performance is worse with hyperthreading on dual Xeon W5580

shyaki
Beginner
Our tests show that the live HD capture performance is much worse with hyperthreading than without it on the new HP Z800 system.

The new Nehalem processor has a small individual L2 cache (512K per core) with a relatively large L3 cache shared by the four cores. Our threads all execute data-hungry tasks. Do you think the small L2 cache may cause too many misses with hyperthreading?
TimP
Honored Contributor III
It's certainly possible. If a single thread of your application needs more than half of the L1 or L2 cache, or if the 8 threads together need more than the entire L3, cache capacity problems would produce the effect you reported.
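The capacity rule of thumb above can be sketched as a quick check. This is only a rough model that assumes threads sharing a cache each get an equal share; real hardware does not partition caches this strictly. The L2 size is the figure quoted in the question, while the L3 size and the per-thread working set are illustrative assumptions, so substitute values for your own processor and workload.

```python
def fits_in_cache(working_set_kib, cache_kib, threads_sharing):
    """Return True if one thread's working set fits in its equal share of a cache."""
    if threads_sharing < 1:
        raise ValueError("threads_sharing must be at least 1")
    return working_set_kib <= cache_kib / threads_sharing


# Illustrative sizes only; check your processor's documentation.
L2_KIB = 512             # per-core L2 figure quoted in the question
L3_KIB = 8 * 1024        # assumed size of the shared L3
WORKING_SET_KIB = 300    # hypothetical per-thread working set

# Without HT, one thread has the core's L2 to itself.
print(fits_in_cache(WORKING_SET_KIB, L2_KIB, 1))   # True
# With HT, two threads share the same L2.
print(fits_in_cache(WORKING_SET_KIB, L2_KIB, 2))   # False
# Eight threads sharing the L3.
print(fits_in_cache(WORKING_SET_KIB, L3_KIB, 8))   # True
```

In this made-up case the working set fits in L2 with one thread per core but not with two, which is the kind of situation where enabling HT could increase misses.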
If it helps to control which threads share a core and its caches, you would need to set thread affinity explicitly (KMP_AFFINITY for Intel OpenMP).
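As a minimal sketch of the affinity suggestion, the helper below builds an explicit KMP_AFFINITY value and an environment for launching a program. It only applies if the application uses the Intel OpenMP runtime. The helper names are hypothetical, and which logical CPU numbers sit on separate physical cores depends on the OS and BIOS, so the CPU list here is an assumption to verify (for example by adding the `verbose` modifier to KMP_AFFINITY).

```python
import os


def kmp_affinity_explicit(cpu_ids):
    """Build a KMP_AFFINITY value binding OpenMP threads to the given logical CPUs, in order."""
    if not cpu_ids:
        raise ValueError("cpu_ids must not be empty")
    proclist = ",".join(str(c) for c in cpu_ids)
    return f"granularity=fine,proclist=[{proclist}],explicit"


def env_with_affinity(cpu_ids, base_env=None):
    """Return a copy of the environment with KMP_AFFINITY and OMP_NUM_THREADS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["KMP_AFFINITY"] = kmp_affinity_explicit(cpu_ids)
    env["OMP_NUM_THREADS"] = str(len(cpu_ids))
    return env


# Assumption: logical CPUs 0-3 are on four different physical cores.
print(kmp_affinity_explicit([0, 1, 2, 3]))
# To launch a (hypothetical) program with this binding:
#   subprocess.run(["./capture_app"], env=env_with_affinity([0, 1, 2, 3]))
```

Running one thread per physical core this way lets you compare against the HT configuration without changing BIOS settings.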
Data-intensive applications with good memory locality are unlikely to benefit from HT, since one thread per core can already use up the entire memory bandwidth.
Tom_Spyrou
Beginner
If you use VTune, you should be able to see the cache behavior reported. If you are not a VTune user, it is a good tool that uses the on-chip hardware profiling counters to measure in detail what is happening on the processor. You could compare the reports for the runs with and without hyperthreading and see what is going on.
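One way to compare two such reports is to normalize the miss counts by instruction count. The sketch below assumes you have already read a cache-miss count and an instruction count out of each report by hand; the dictionary keys and the numbers are placeholders, not VTune output or measured data.

```python
def misses_per_kilo_instructions(cache_misses, instructions):
    """Cache misses per 1000 retired instructions (MPKI)."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return 1000.0 * cache_misses / instructions


def mpki_ratio(no_ht, with_ht):
    """Ratio of MPKI with HT to MPKI without HT; above 1 means more misses with HT."""
    base = misses_per_kilo_instructions(no_ht["misses"], no_ht["instructions"])
    other = misses_per_kilo_instructions(with_ht["misses"], with_ht["instructions"])
    if base == 0:
        raise ValueError("baseline run has no misses; ratio is undefined")
    return other / base


# Placeholder counts for illustration only.
no_ht = {"misses": 2_000, "instructions": 1_000_000}
with_ht = {"misses": 6_000, "instructions": 1_000_000}
print(mpki_ratio(no_ht, with_ht))  # 3.0
```

A ratio well above 1 for the L2 or L3 counters would support the cache-capacity explanation; a ratio near 1 would point elsewhere, such as memory bandwidth.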
Roman_D_Intel
Employee
I agree with the previous posts. You might find this guide useful for finding out whether cache misses are hurting your performance: http://software.intel.com/en-us/articles/using-intel-vtune-performance-analyzer-to-optimize-software-on-intel-core-i7-processors/