The Fortran ALLOCATE statement does not allocate memory from the thread stack, even when it is executed within an OpenMP parallel region. Thread stacks are generally small, and it is not always necessary to allocate memory in thread-private storage.
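To illustrate ALLOCATE executed inside a parallel region, here is a minimal sketch. It assumes a compiler with OpenMP enabled that accepts an unallocated allocatable array in a PRIVATE clause. The program name, the array name work, and the size n are made up for illustration and do not come from any real application.

```fortran
program alloc_in_parallel
  use omp_lib                          ! provides omp_get_thread_num
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), allocatable :: work(:)     ! unallocated on entry; made private below
  integer :: n, i, tid

  n = 1000000                          ! hypothetical array size, for illustration only

  !$omp parallel private(work, i, tid)
  tid = omp_get_thread_num()
  allocate(work(n))                    ! each thread performs its own allocation
  do i = 1, n
     work(i) = real(tid, dp)           ! each thread writes only to its own copy
  end do
  !$omp critical
  print *, 'thread', tid, 'holds', size(work), 'elements'
  !$omp end critical
  deallocate(work)                     ! release before leaving the region
  !$omp end parallel
end program alloc_in_parallel
```

Each thread gets its own private descriptor for work, but the storage itself is requested through the normal ALLOCATE path rather than carved out of the thread's stack. Which physical NUMA node ends up backing those pages is decided by the operating system, not by this code.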
As far as I know, the Intel 9.0 compiler does not generate NUMA-specific code. The Fortran 90 ALLOCATE statement simply allocates memory from the global heap. I'll try to get someone from the compiler team to verify this.
Thanks. When you refer the question, please include the following additional information.
The platform is WinXP Pro SP2, installed from my MSDN subscription; i.e., I installed WinXP and then ran Windows Update (several times) until it was through SP2. I also modified BOOT.INI to include /PAE.
I see no performance difference as I migrate a pair of threads between processors on a 2-node NUMA system with 4 cores.
I believe I have the system BIOS set to not interleave the NUMA nodes, but maybe that setting isn't functioning on this system. If all the memory were allocated on one node, you would expect a performance change as the processing moved from one node to the other (while the data remained in the node of allocation).
I am trying to get the most out of the system.