Hyperthreading should be very useful, but I guess that if I launch an MPI program on a dual-processor machine, I have two threads running on the CPUs. To use HT properly, I would like the two threads to run on separate physical CPUs, even if each thread can be balanced between the logical CPUs of its own physical CPU.
What I want to avoid is having the two threads run on two logical CPUs belonging to the same physical processor.
Using Google search for
set thread affinity site:microsoft.com
Dual core introduced an intermediate level which schedulers must take into account, requiring corresponding information from the BIOS. When combined with HyperThreading, it makes all these demands on the scheduler even on a single package system.
The usual problem is to spread the work evenly across separate physical CPUs, and then across separate cores, before using multiple HyperThread logical processors on the same CPU. Red Hat EL3_U2, SuSE 9.3, and equivalent Linux distros introduce multi-core aware scheduling which does that often enough to show a measurable advantage over older schedulers.
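To make that ordering concrete, here is a minimal sketch of the placement policy described above: fill one logical processor per physical package first, then the remaining cores, and only then the HyperThread siblings. The names `LogicalCpu` and `placement_order` are made up for illustration, and this is not how any particular kernel scheduler is implemented.

```python
from collections import namedtuple

# Hypothetical description of one logical processor:
# cpu = OS logical CPU number, package = physical CPU, core = core within it.
LogicalCpu = namedtuple("LogicalCpu", "cpu package core")


def placement_order(cpus):
    """Return logical CPU numbers in the order work should be placed on them.

    Order: one logical CPU per package, then the other cores of each
    package, and the HyperThread siblings last.
    """
    cores_seen = {}    # package -> core ids in order of first appearance
    threads_seen = {}  # (package, core) -> logical CPUs already seen there
    keyed = []
    for c in sorted(cpus, key=lambda c: c.cpu):
        cores = cores_seen.setdefault(c.package, [])
        if c.core not in cores:
            cores.append(c.core)
        core_rank = cores.index(c.core)
        thread_rank = threads_seen.get((c.package, c.core), 0)
        threads_seen[(c.package, c.core)] = thread_rank + 1
        # Sibling index is the most significant key, package the least,
        # so siblings are used last and packages alternate first.
        keyed.append((thread_rank, core_rank, c.package, c.cpu))
    return [cpu for _, _, _, cpu in sorted(keyed)]


# Dual-processor, single-core box with HT: CPUs 0,1 on package 0; 2,3 on package 1.
dual_ht = [LogicalCpu(0, 0, 0), LogicalCpu(1, 0, 0),
           LogicalCpu(2, 1, 0), LogicalCpu(3, 1, 0)]
print(placement_order(dual_ht))
```

With two ranks, the first two entries of the returned list are on different physical packages, which is the case the original question is about.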
It may happen that MS MPI will incorporate some kind of dual core aware scheduling, since it doesn't appear to be coming any time soon in the Windows scheduler.
Message Edited by tim18 on 10-28-2005 06:46 PM
From what I can interpret, HT suffers from a cache that was designed for a single (virtual) processor. If more effort were put into the cache design to eliminate aliasing of addresses, most of the adverse cache interaction would be eliminated (though there will undoubtedly be a second most adverse cache interaction). Except on dual-core, multi-core, or multi-chip systems, the cache interaction is likely to remain (why put the effort into fixing an old design?).
This brings me to a question that someone might be able to answer. On a single core with HT, can caching be disabled for one of the virtual processors? That would let one thread run slower, but without trashing the other thread's cache.
So, if I understand correctly, to get the best use of hyperthreading I should use a Linux 2.5 or 2.6 kernel, in which the scheduler is optimized for HT.
Again, if I understand correctly, with a 2.6 kernel on a dual-processor SMP machine with HT, when I run an MPI program on the two processors, the calculation is split into two threads, one thread is launched on each physical CPU, and each thread only switches between the logical CPUs belonging to its own physical CPU. If things go well, at any time the two calculation threads are running on two logical CPUs, each belonging to a different physical CPU.
I don't know if this is really clear, but what I really want to avoid when using HT on a dual-CPU box is having the two threads of a calculation run on the same physical CPU while the second physical CPU is free and occupied only by system tasks.
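If the scheduler cannot be trusted to do this, one option is to pin each process by hand. The following is only a sketch, assuming Linux on x86, where `/proc/cpuinfo` lists a `processor` and a `physical id` field for each logical CPU, and assuming the MPI rank is already known to the program. The helper names (`logical_cpus_by_package`, `cpus_for_rank`, `pin_rank`) are made up for illustration. It restricts each rank to the logical CPUs of one physical package, so the thread can still move between the two HT siblings of its own package but never lands on the other rank's package.

```python
import os


def logical_cpus_by_package(cpuinfo_text):
    """Map each 'physical id' to the sorted logical CPU numbers it contains."""
    packages = {}
    cpu = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line between processor entries
        key = key.strip()
        if key == "processor":
            cpu = int(value)
        elif key == "physical id" and cpu is not None:
            packages.setdefault(int(value), []).append(cpu)
    return {pkg: sorted(cpus) for pkg, cpus in packages.items()}


def cpus_for_rank(rank, packages):
    """Logical CPUs of the package given to this rank (round-robin over packages)."""
    package_ids = sorted(packages)
    return set(packages[package_ids[rank % len(package_ids)]])


def pin_rank(rank):
    """Restrict the calling process to one physical package (Linux only)."""
    with open("/proc/cpuinfo") as f:
        packages = logical_cpus_by_package(f.read())
    # pid 0 means "the calling process".
    os.sched_setaffinity(0, cpus_for_rank(rank, packages))
```

Each MPI process would call `pin_rank(rank)` once at startup with its own rank. With two ranks on a dual-CPU HT box, rank 0 gets the two logical CPUs of the first package and rank 1 those of the second.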
But unfortunately neither the Fortran ALLOCATE statement nor the C/C++ allocation routines permit a hint about preferred alignment restrictions. E.g., if your application determines it is on a single-core HT-capable system with 4MB aliasing, there is no means to have the allocation express a preference for memory from a particular 2MB-aligned portion of memory. The C++ programmer has the means to correct for this by replacing operator new, but the Fortran programmer does not have this functionality.
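Lacking such a hint, one workaround is to over-allocate and then skip forward to an address in the wanted portion. The sketch below shows only the address arithmetic, under my own assumptions: the aliasing stride is the 4MB mentioned above, each 4MB window is split into two 2MB halves (one per thread), and the addresses are the virtual addresses the process sees. `skip_to_half` and `buffer_in_half` are hypothetical names, and a custom allocator in C++ or a C helper called from Fortran would have to do the equivalent.

```python
import ctypes

WINDOW = 4 * 1024 * 1024  # assumed aliasing stride (4MB)
HALF = WINDOW // 2        # one 2MB portion per thread


def skip_to_half(address, size, half_index, window=WINDOW):
    """Bytes to skip from `address` so that `size` bytes fit in the chosen half.

    half_index 0 selects offsets [0, 2MB) of every window, 1 selects [2MB, 4MB).
    """
    half = window // 2
    if size > half:
        raise ValueError("buffer does not fit in one half-window")
    pos = address % window
    lo = half_index * half
    hi = lo + half
    if lo <= pos and pos + size <= hi:
        return 0  # already inside the wanted half
    return (lo - pos) % window  # jump to the start of the next wanted half


def buffer_in_half(size, half_index):
    """Over-allocate by one window, then return a view placed in the wanted half."""
    raw = ctypes.create_string_buffer(size + WINDOW)
    skip = skip_to_half(ctypes.addressof(raw), size, half_index)
    view = (ctypes.c_char * size).from_buffer(raw, skip)
    return raw, view  # keep `raw` alive for as long as `view` is in use
```

The cost is up to 4MB of wasted address space per allocation, so this only makes sense for a few large working buffers, not for every small allocation.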