We have recently purchased a dual Intel X5650 workstation to run an internally-developed floating-point intensive simulation, under Ubuntu 10.04.
Each X5650 has 6 cores, so there are 12 cores in total. The code is trivially parallel, so I have been running it mostly with 12 threads, and observing approximately "1200%" processor utilization through "top".
HyperThreading is enabled in the BIOS, so the operating system nominally sees 24 logical cores. If I increase the number of threads to 24, top reports approximately 2000% processor utilization; however, the actual code performance does not appear to increase by the corresponding factor of 20/12.
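To make the 12-versus-24 distinction concrete, here is a minimal sketch of how the logical and physical core counts could be checked on Linux. It assumes the usual x86 `/proc/cpuinfo` layout, with `processor`, `physical id`, and `core id` fields in blank-line-separated blocks; the function name `count_cpus` is just an illustrative choice.

```python
def count_cpus(cpuinfo_text):
    """Return (logical_cpus, physical_cores) from /proc/cpuinfo-style text.

    Assumes one blank-line-separated block per logical CPU, each with
    'processor', 'physical id' and 'core id' fields (typical on x86 Linux).
    """
    logical = 0
    cores = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue  # not a per-CPU block
        logical += 1
        # A physical core is identified by its socket id plus its core id;
        # two hyperthreads on the same core share both values.
        cores.add((fields.get("physical id"), fields.get("core id")))
    return logical, len(cores)
```

On this machine I would expect `count_cpus(open("/proc/cpuinfo").read())` to report 24 logical CPUs on 12 physical cores, if the assumptions about the file layout hold.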
My question is: how does HyperThreading actually work on the latest generation of Xeons? Would a floating-point intensive code benefit from scheduling more than one thread per core? Does the answer change if the working set is on the order of the cache size, as compared to several times larger, or if there are substantial I/O operations (e.g. writing simulation outputs to disk)?
Additionally, how should I interpret processor utilization percentages from "top" when HyperThreading is enabled?
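For clarity, this is the comparison I have in mind, as a rough sketch. It sets the ratio of utilization figures reported by top against the speedup in wall-clock time for the same amount of work. The function name `scaling_summary` and its parameters are hypothetical, and the wall-clock times would have to come from actual timed runs.

```python
def scaling_summary(util_a, util_b, wall_a, wall_b):
    """Compare reported utilization with measured speedup between two runs.

    util_a, util_b: %CPU reported by top for run A and run B (e.g. 1200, 2000)
    wall_a, wall_b: wall-clock seconds each run took for the same workload
    Returns (utilization_ratio, speedup, speedup / utilization_ratio).
    """
    util_ratio = util_b / util_a   # how much "busier" top says run B is
    speedup = wall_a / wall_b      # how much faster run B actually finished
    return util_ratio, speedup, speedup / util_ratio
```

With the utilization figures above, the first value is 2000/1200 = 20/12, roughly 1.67. If the third value comes out well below 1, the extra utilization reported by top is not translating into proportionally more work done.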