Hello,
I was wondering what the difference is between a system being NUMA and a system having NUMA enabled. Moreover, how can I tell from inside a compiler whether the system is NUMA? Does being a NUMA system depend solely on the processor inside the system? In other words, if I have processor X, can I tell from that alone that systems with such processors are NUMA systems?
Thank you,
Iulia
Except for the Xeon Phi 200 series, a system with a single CPU will have only one memory system. The Xeon Phi 200 series can optionally be partitioned as 1, 2 or 4 compute nodes (with MCDRAM configurable either as cache or as additional memory nodes).
If your system has multiple Xeon CPUs of the Nehalem generation or later, then it is likely capable of being configured as multiple NUMA nodes. This is a BIOS setting (you also need a NUMA-enabled O/S).
To check NUMA capability on Windows, inspect the Task Manager. If NUMA is enabled, the available NUMA nodes are listed on the Performance tab.
Jim Dempsey
Thank you for the answer. I still have several questions; please answer each if possible. I am using Windows 10, and in Task Manager -> Performance tab I see the number of sockets equal to 1. Are those the NUMA nodes/CPUs? I also see Virtualization enabled; does this mean NUMA is enabled on my system? How do I know if my OS is NUMA enabled? My system has an Intel Core i7 processor. Can you please tell me whether only Xeon processors make a NUMA system? My understanding is that NUMA can be enabled if the system has several CPUs, regardless of which processors are inside the system. Is that correct?
Iulia
If you have 1 socket, you have 1 CPU. This one CPU has several logical processors. The logical processors are the hardware threads that the CPU is configured to run (HyperThreading is enabled or disabled as a BIOS setting). As stated earlier, except for KNL, the number of NUMA nodes is typically tied to the number of CPUs (sockets), as each socket may have separate memory slots adjacent to the CPU. Note that some motherboards with two or more sockets can have a single memory subsystem, and thus 1 node.
NUMA nodes have nothing to do with virtualization. On a 2 socket system, each socket has its own memory subsystem. Each CPU (socket) can access its own memory subsystem as well as access the other memory subsystem(s). Access to the local memory subsystem is faster than access to the other memory subsystem(s).
Core i7 only supports 1 socket, thus only one memory node (which can have 1, 2, ... memory channels).
Xeon processors have processor numbers such as E5-1620, E5-2620, E5-4620, E5-8620, where the first digit after the dash (1, 2, 4, 8) is the maximum number of sockets that the CPU can be used in.
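The numbering rule just described can be sketched in C. This is only an illustration of the rule as stated in this post, assuming E5/E7-style model strings of that era; the function name max_sockets_from_model is hypothetical, and it is not an official decoding of Intel processor numbers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the digit that follows the first '-' in a
 * model string such as "E5-2670", which the post describes as the maximum
 * socket count. Returns -1 if the string does not fit that pattern. */
int max_sockets_from_model(const char *model)
{
    const char *dash;

    assert(model != NULL);
    dash = strchr(model, '-');
    if (dash == NULL || !isdigit((unsigned char)dash[1]))
        return -1;
    return dash[1] - '0';
}
```

Called with "E5-2670" this returns 2, and with a string that has no dash, such as "Core i7", it returns -1.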
Now, to have multiple NUMA nodes:
1) motherboard must have multiple sockets
2) motherboard must have multiple memory subsystems (populated)
3) motherboard must have multiple CPUs (not all sockets need to be filled)
4) only memory subsystems with CPU are usable
5) BIOS must configure memory subsystems for NUMA configuration (else configure for interleaved use)
Jim Dempsey
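Whether all of the conditions above ended up producing more than one node can be checked from a program. The following is a minimal sketch for Linux only (it does not apply to the Windows 10 machine mentioned earlier), assuming the kernel exposes /sys/devices/system/node/online with content such as "0" or "0-1". The function names are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Counts the IDs in a Linux list string such as "0", "0-1" or "0-3,8-11".
 * Returns -1 on malformed input. */
int count_ids_in_list(const char *list)
{
    const char *p = list;
    int count = 0;

    assert(list != NULL);
    while (*p != '\0' && *p != '\n') {
        char *end;
        long first = strtol(p, &end, 10);
        long last = first;

        if (end == p)
            return -1;              /* no number where one was expected */
        p = end;
        if (*p == '-') {            /* a range "first-last" */
            ++p;
            last = strtol(p, &end, 10);
            if (end == p || last < first)
                return -1;
            p = end;
        }
        count += (int)(last - first + 1);
        if (*p == ',')
            ++p;
    }
    return count;
}

/* Reads the list of online NUMA nodes. Returns the node count, or -1 if the
 * file cannot be read (for example, a kernel built without NUMA support). */
int online_numa_nodes(void)
{
    char buf[256];
    char *ok;
    FILE *f = fopen("/sys/devices/system/node/online", "r");

    if (f == NULL)
        return -1;
    ok = fgets(buf, sizeof buf, f);
    fclose(f);
    return ok != NULL ? count_ids_in_list(buf) : -1;
}
```

A single-socket Core i7 machine would be expected to report one node here, while a 2-socket Xeon with NUMA enabled in the BIOS would be expected to report two.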
What I would like to know as well is:
1. The SYSTEM_INFO data structure is used in C to get, at runtime, the number of logical processors of the underlying architecture (by accessing the field dwNumberOfProcessors). Is there any way to get the number of sockets or the number of CPUs/NUMA nodes? How about the number of cores per NUMA node?
2. @Sergey Kostrov, is your answer above restricted to Intel Technologies or is it a general one?
3. @Jim Dempsey, regarding Intel Xeon Phi, what other processors besides it are part of MIC Architecture?
4. @Jim Dempsey, @Sergey Kostrov, I know MIC Architecture allows programming using OpenMP. Do all Intel Xeon processors support OpenMP or only Intel Xeon Phi?
4. All Intel(R) Xeon processors and compatibles are supported by OpenMP in the widely used compiler systems such as Intel, GNU, Clang, Oracle and PGI.
1. You can use CPUID instructions:
https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
The above is just about everything there is to know about Intel CPUs... but may be too daunting for many readers.
More approachable material can be found by Googling: cpuid sockets cores
Jim Dempsey
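As a rough illustration of the CPUID route, the sketch below reads CPUID leaf 0BH (extended topology enumeration) using the `<cpuid.h>` header shipped with GCC and Clang. The function names are hypothetical. It assumes an x86 build; on other targets it simply reports failure. Note that CPUID describes only the package the calling thread runs on, so the number of sockets still has to be derived elsewhere, for example by dividing the OS-reported logical processor count by the per-package count, and that the counts reflect the processor as shipped rather than what the BIOS has enabled.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Level type reported in ECX[15:8] of leaf 0BH: 1 = SMT, 2 = core. */
unsigned level_type_from_ecx(unsigned ecx) { return (ecx >> 8) & 0xFFu; }

/* Number of logical processors at this level, reported in EBX[15:0]. */
unsigned logical_count_from_ebx(unsigned ebx) { return ebx & 0xFFFFu; }

/* Fills in threads per core and logical processors per package.
 * Returns 0 on success, -1 if leaf 0BH is missing or incomplete. */
int query_topology(unsigned *threads_per_core, unsigned *logical_per_package)
{
    assert(threads_per_core != NULL && logical_per_package != NULL);
#if defined(__x86_64__) || defined(__i386__)
    {
        unsigned sub;
        int found = 0;

        if (__get_cpuid_max(0, NULL) < 0x0Bu)
            return -1;
        for (sub = 0; sub < 8; ++sub) {
            unsigned eax, ebx, ecx, edx, type;

            __cpuid_count(0x0B, sub, eax, ebx, ecx, edx);
            (void)eax;
            (void)edx;
            type = level_type_from_ecx(ecx);
            if (type == 0)          /* no more levels */
                break;
            if (type == 1) {
                *threads_per_core = logical_count_from_ebx(ebx);
                found |= 1;
            } else if (type == 2) {
                *logical_per_package = logical_count_from_ebx(ebx);
                found |= 2;
            }
        }
        return found == 3 ? 0 : -1;
    }
#else
    return -1;
#endif
}
```

Cores per package would then be logical_per_package divided by threads_per_core. Virtual machines often hide or alter this leaf, so a -1 result should be treated as "unknown" rather than as an error.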
The original post seemed possibly to refer to the NUMA-enabling BIOS option frequently provided on early 2-socket NUMA platforms. They were shipped with the BIOS set to "NUMA disabled", meaning that cache lines were stored alternately by each CPU in remote and local memory. In order to take advantage of local memory when applying affinity settings, it was necessary to turn on the NUMA setting in the BIOS.
Technically, NUMA may refer to a variety of features which don't appear to be under discussion here.
I just did some more tests on the OpenMP support in the most recent Microsoft Visual Studio. It is still limited to the combination of OpenMP 2.0 and, in the case of C source code, the C89 standard, even though a number of C99 features have been introduced for use outside of OpenMP constructs. For example, for(int i=0; ....) is accepted when not preceded by #pragma omp for, but not
#pragma omp for
for(int i=0; ....
which is rejected with the same error message as for( ; .....)
Reluctantly, I have changed much OpenMP code to use #if _OPENMP >= 201307 so as not to use recent features with the Microsoft compiler, and to conform to the Microsoft subset of OpenMP elsewhere. As Microsoft OpenMP code is supported by Intel libiomp5 (and maybe by some version of the LLVM runtime), it is possible to set affinities by replacing the Microsoft library linkage.
Note that http://www.openmp.org/resources/openmp-compilers/ gives supported platforms for many present or past OpenMP implementations.
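The version guard described above might look like the sketch below. It assumes the standard _OPENMP date macro, where 201307 corresponds to OpenMP 4.0; the function sum_array is invented for illustration and is not taken from anyone's code in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n doubles, using newer OpenMP features only when the compiler
 * advertises them through the _OPENMP macro. */
double sum_array(const double *a, int n)
{
    double sum = 0.0;
    int i; /* declared before the loop, which OpenMP 2.0 / C89 compilers accept */

    assert(a != NULL || n <= 0);
#if defined(_OPENMP) && _OPENMP >= 201307
    /* OpenMP 4.0 or later: the combined "parallel for simd" construct exists. */
    #pragma omp parallel for simd reduction(+:sum)
#elif defined(_OPENMP)
    /* Older implementations: stay within the OpenMP 2.0 subset. */
    #pragma omp parallel for reduction(+:sum)
#endif
    for (i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}
```

Built without any OpenMP flag, the same source compiles as a plain serial loop, since neither pragma is seen by the compiler.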
Thank you all for the answers.
I would want to pose two more questions:
A. Having the following processor:
Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
I looked at the link you provided (https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf) and I see that this processor can only be v2 or v3. How can I find out which one it is? What do v2 and v3 mean: are they sub-architectures?
B. I am using the following OS:
Linux 4.4.0-66-generic #87-Ubuntu SMP Fri Mar 3 2017 x86_64 x86_64 x86_64 GNU/Linux
What is the ABI (Application Binary Interface) for the processor and the OS above?
Thank you.
Many E5-2670 Sandy Bridge CPUs were shipped. v2 would be Ivy Bridge and v3 Haswell.
The Linux x86_64 ABI (the System V AMD64 ABI) is well documented.
Thank you. I know v2 is Ivy Bridge and v3 is Haswell, but I don't know how to find out which of the two the processor I am using is.
I would still bet on Sandy Bridge, but you could test that by seeing whether Ivy Bridge specific code fails, or by checking for Sandy Bridge specifics such as very slow unaligned access. There are more differences in v3, such as support for FMA.
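One way to probe for these differences without running code that might fault is to read the CPUID feature bits. The sketch below is a heuristic under stated assumptions, not a definitive identification: it uses the bit positions documented in the Intel SDM (FMA, AVX, F16C and RDRAND in CPUID.01H:ECX, AVX2 in CPUID.07H:EBX) and the fact that Ivy Bridge added F16C and RDRAND while Haswell added FMA and AVX2. The names are invented for illustration, hypervisors can mask bits, and non-Intel CPUs expose the same flags.

```c
#include <stddef.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

#define ECX_FMA    (1u << 12)  /* CPUID.01H:ECX bit 12 */
#define ECX_AVX    (1u << 28)  /* CPUID.01H:ECX bit 28 */
#define ECX_F16C   (1u << 29)  /* CPUID.01H:ECX bit 29 */
#define ECX_RDRAND (1u << 30)  /* CPUID.01H:ECX bit 30 */
#define EBX7_AVX2  (1u << 5)   /* CPUID.(EAX=07H,ECX=0):EBX bit 5 */

enum generation {
    GEN_UNKNOWN,
    GEN_SANDY_BRIDGE,      /* E5-2670 with no version suffix */
    GEN_IVY_BRIDGE,        /* v2 */
    GEN_HASWELL_OR_LATER   /* v3 or newer */
};

/* Hypothetical heuristic covering only the three candidates discussed here. */
enum generation guess_generation(unsigned leaf1_ecx, unsigned leaf7_ebx)
{
    if ((leaf1_ecx & ECX_FMA) && (leaf7_ebx & EBX7_AVX2))
        return GEN_HASWELL_OR_LATER;
    if ((leaf1_ecx & ECX_F16C) && (leaf1_ecx & ECX_RDRAND))
        return GEN_IVY_BRIDGE;
    if (leaf1_ecx & ECX_AVX)
        return GEN_SANDY_BRIDGE;
    return GEN_UNKNOWN;
}

/* Queries the CPU the calling thread runs on (x86 builds only). */
enum generation guess_this_cpu(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx, ecx, edx, ebx7 = 0;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return GEN_UNKNOWN;
    if (__get_cpuid_max(0, NULL) >= 7u) {
        unsigned a, c, d;
        __cpuid_count(7, 0, a, ebx7, c, d);
        (void)a; (void)c; (void)d;
    }
    return guess_generation(ecx, ebx7);
#else
    return GEN_UNKNOWN;
#endif
}
```

Separating the bit test from the CPUID query keeps the decision logic checkable on any machine, whatever CPU it happens to have.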
Iulia,
cat /proc/cpuinfo | grep 'model name'
Jim Dempsey
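The same lookup can be done from inside a program. The following is a minimal Linux-only sketch that scans /proc/cpuinfo for the first "model name" line; the function names are made up for this example, and on systems whose /proc/cpuinfo has no such line it simply reports that nothing was found.

```c
#include <stdio.h>
#include <string.h>

/* If line is a "model name : ..." entry, copies the value into out and
 * returns 1; otherwise returns 0 and leaves out untouched. */
int parse_model_name(const char *line, char *out, size_t out_size)
{
    static const char key[] = "model name";
    const char *value;
    size_t len;

    if (out_size == 0 || strncmp(line, key, sizeof key - 1) != 0)
        return 0;
    value = strchr(line, ':');
    if (value == NULL)
        return 0;
    ++value;
    while (*value == ' ' || *value == '\t')
        ++value;                       /* skip padding after the colon */
    len = strcspn(value, "\n");        /* drop the trailing newline */
    if (len >= out_size)
        len = out_size - 1;            /* truncate to fit the buffer */
    memcpy(out, value, len);
    out[len] = '\0';
    return 1;
}

/* Reads the first model name from /proc/cpuinfo. Returns 1 if found. */
int read_model_name(char *out, size_t out_size)
{
    char line[512];
    int found = 0;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return 0;
    while (!found && fgets(line, sizeof line, f) != NULL)
        found = parse_model_name(line, out, out_size);
    fclose(f);
    return found;
}
```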
Could you please answer: if Ivy Bridge is considered v2 and Haswell is v3, which version is considered Sandy Bridge for the above processor? Thank you.
The absence of a version specification (that is, no v0 or v1) would be expected to signify Sandy Bridge. I guess they didn't get approval to plan ahead when the Sandy Bridge BIOS report was set up.
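Putting the thread's mapping together (no suffix for Sandy Bridge, v2 for Ivy Bridge, v3 for Haswell), a small C sketch could classify a model name string. This assumes the string comes from an E5-2670 family part as discussed here; for any other processor the "no suffix" result says nothing. The function names are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Returns the digit N of a " vN" token in the model name, or 0 if the
 * name carries no version suffix. */
int xeon_version_suffix(const char *model_name)
{
    const char *p = model_name;

    while ((p = strstr(p, " v")) != NULL) {
        /* Accept " vN" only when N is a digit followed by a space or the end. */
        if (isdigit((unsigned char)p[2]) && (p[3] == ' ' || p[3] == '\0'))
            return p[2] - '0';
        p += 2;
    }
    return 0;
}

/* Maps the suffix to the generation names used in this thread. */
const char *e5_2670_generation(const char *model_name)
{
    switch (xeon_version_suffix(model_name)) {
    case 0:  return "Sandy Bridge";
    case 2:  return "Ivy Bridge";
    case 3:  return "Haswell";
    default: return "unknown";
    }
}
```

For the string quoted earlier in the thread, "Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz", there is no " vN" token, so the sketch reports Sandy Bridge.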