Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Where are the other 4 cores?

dingjun_chencmgl_ca

Hi, Intel support guys,

I am running tests on our Skylake computers. I am surprised to see that 4 cores per package are missing. Where are they?

Our computer system information is below:

Processor: Intel Xeon Gold 6148 CPU @ 2.40GHz 2.39GHz (2 processors)

Installed memory: 384GB

System type: 64-bit operating system x64-based processor

OS: Windows Server 2016 Standard

Please see the following output; it shows that 4 cores per package are gone. Where are these 8 cores in total?

I am looking forward to hearing from you.

Thanks in advance

Best regards,

Dingjun

Computer Modelling Group Ltd.

Calgary, AB, Canada

VECTOR_SIMD_OPENMP_TEST
OMP: Info #211: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #209: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31}
OMP: Info #156: KMP_AFFINITY: 32 available OS procs
OMP: Info #158: KMP_AFFINITY: Nonuniform topology
OMP: Info #179: KMP_AFFINITY: 2 packages x 20 cores/pkg x 1 threads/core (32 total cores)
OMP: Info #213: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 core 0
OMP: Info #171: KMP_AFFINITY: OS proc 1 maps to package 0 core 1
OMP: Info #171: KMP_AFFINITY: OS proc 2 maps to package 0 core 2
OMP: Info #171: KMP_AFFINITY: OS proc 3 maps to package 0 core 3
OMP: Info #171: KMP_AFFINITY: OS proc 4 maps to package 0 core 4
OMP: Info #171: KMP_AFFINITY: OS proc 5 maps to package 0 core 8
OMP: Info #171: KMP_AFFINITY: OS proc 6 maps to package 0 core 9
OMP: Info #171: KMP_AFFINITY: OS proc 7 maps to package 0 core 10
OMP: Info #171: KMP_AFFINITY: OS proc 8 maps to package 0 core 11
OMP: Info #171: KMP_AFFINITY: OS proc 9 maps to package 0 core 12
OMP: Info #171: KMP_AFFINITY: OS proc 10 maps to package 0 core 16
OMP: Info #171: KMP_AFFINITY: OS proc 11 maps to package 0 core 17
OMP: Info #171: KMP_AFFINITY: OS proc 12 maps to package 0 core 18
OMP: Info #171: KMP_AFFINITY: OS proc 13 maps to package 0 core 19
OMP: Info #171: KMP_AFFINITY: OS proc 14 maps to package 0 core 20
OMP: Info #171: KMP_AFFINITY: OS proc 15 maps to package 0 core 24
OMP: Info #171: KMP_AFFINITY: OS proc 16 maps to package 0 core 25
OMP: Info #171: KMP_AFFINITY: OS proc 17 maps to package 0 core 26
OMP: Info #171: KMP_AFFINITY: OS proc 18 maps to package 0 core 27
OMP: Info #171: KMP_AFFINITY: OS proc 19 maps to package 0 core 28
OMP: Info #171: KMP_AFFINITY: OS proc 20 maps to package 1 core 0
OMP: Info #171: KMP_AFFINITY: OS proc 21 maps to package 1 core 1
OMP: Info #171: KMP_AFFINITY: OS proc 22 maps to package 1 core 2
OMP: Info #171: KMP_AFFINITY: OS proc 23 maps to package 1 core 3
OMP: Info #171: KMP_AFFINITY: OS proc 24 maps to package 1 core 4
OMP: Info #171: KMP_AFFINITY: OS proc 25 maps to package 1 core 8
OMP: Info #171: KMP_AFFINITY: OS proc 26 maps to package 1 core 9
OMP: Info #171: KMP_AFFINITY: OS proc 27 maps to package 1 core 10
OMP: Info #171: KMP_AFFINITY: OS proc 28 maps to package 1 core 11
OMP: Info #171: KMP_AFFINITY: OS proc 29 maps to package 1 core 12
OMP: Info #171: KMP_AFFINITY: OS proc 30 maps to package 1 core 16
OMP: Info #171: KMP_AFFINITY: OS proc 31 maps to package 1 core 17
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 4004 thread 0 bound to OS proc set {0}
  The number of processors available =       32
  The number of threads available    =       20
  HELLO from process        0
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8956 thread 1 bound to OS proc set {1}
  HELLO from process        1
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8820 thread 2 bound to OS proc set {2}
  HELLO from process        2
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9292 thread 3 bound to OS proc set {3}
  HELLO from process        3
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9752 thread 4 bound to OS proc set {4}
  HELLO from process        4
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 3776 thread 5 bound to OS proc set {5}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8464 thread 6 bound to OS proc set {6}
  HELLO from process        5
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 1416 thread 7 bound to OS proc set {7}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 3868 thread 8 bound to OS proc set {8}
  HELLO from process        6
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 7396 thread 9 bound to OS proc set {9}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9772 thread 10 bound to OS proc set {10}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9280 thread 11 bound to OS proc set {11}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9948 thread 12 bound to OS proc set {12}
  HELLO from process        7
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8712 thread 13 bound to OS proc set {13}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 6092 thread 14 bound to OS proc set {14}
  HELLO from process       11
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8532 thread 15 bound to OS proc set {15}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9892 thread 16 bound to OS proc set {16}
  HELLO from process       12
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 10640 thread 17 bound to OS proc set {17}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9060 thread 18 bound to OS proc set {18}
  HELLO from process       14
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8908 thread 19 bound to OS proc set {19}
  HELLO from process       18
  HELLO from process       16
  HELLO from process       19
  HELLO from process       13
  HELLO from process        8
  HELLO from process       17
  HELLO from process       15
  HELLO from process       10
  HELLO from process        9
matrix multiplication completed
  Elapsed wall clock time 2 =    133.379

McCalpinJohn

The Xeon Gold 6148 is a 20-core processor, and your output shows 20 threads running.

The output of KMP_AFFINITY includes 20 core identifiers that are not sequential: 0,1,2,3,4,...,8,9,10,11,12,...,16,17,18,19,20,...,24,25,26,27,28.  
The specific values of these identifiers are not particularly meaningful -- the important thing is that they have a 1:1 mapping to the logical processors, so the KMP_AFFINITY output confirms that each of the OpenMP threads is bound to a distinct core.
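The 1:1 property described above can be checked mechanically. The following is only a minimal Python sketch: the function names `parse_proc_map` and `is_one_to_one` are hypothetical, and it assumes the log lines have exactly the "Info #171" format shown in the original post.

```python
import re

# Matches the "Info #171" lines, e.g.
# "OMP: Info #171: KMP_AFFINITY: OS proc 5 maps to package 0 core 8"
MAP_LINE = re.compile(r"OS proc (\d+) maps to package (\d+) core (\d+)")

def parse_proc_map(log_text):
    """Return {os_proc: (package, core)} for every Info #171 line found."""
    return {
        int(proc): (int(pkg), int(core))
        for proc, pkg, core in MAP_LINE.findall(log_text)
    }

def is_one_to_one(proc_map):
    """True when no two OS procs share the same (package, core) pair."""
    return len(set(proc_map.values())) == len(proc_map)

# Three lines copied from the log in the original post.
sample_log = """\
OMP: Info #171: KMP_AFFINITY: OS proc 4 maps to package 0 core 4
OMP: Info #171: KMP_AFFINITY: OS proc 5 maps to package 0 core 8
OMP: Info #171: KMP_AFFINITY: OS proc 20 maps to package 1 core 0
"""

proc_map = parse_proc_map(sample_log)
print(proc_map)
print(is_one_to_one(proc_map))
```

Feeding the full log into `parse_proc_map` instead of the three sample lines would apply the same check to every logical processor.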

I have a pile of 24-core Xeon Platinum 8160 processors, and they show similar behavior, except that a different set of values is missing from the core identifiers. On these processors the KMP_AFFINITY core identifier list is: 0,1,2,3,4,5,...,8,9,10,11,12,13,...,16,17,18,19,20,21,...,24,25,26,27,28,29.

On both these systems, the KMP_AFFINITY core identifiers are scaled versions of the x2APIC identifiers that you can obtain using the CPUID instruction. The x2APIC identifiers are low-level hardware identifiers for the 3rd (?) generation of Intel's Advanced Programmable Interrupt Controller technology, so they are generally only of interest to a small number of system software developers. Their use here is probably convenient for Intel, but it has confused many users (including me...).

On your 20-core parts, the KMP_AFFINITY core identifiers are grouped into 4 sets of 5, while on my 24-core parts the KMP_AFFINITY core identifiers are grouped into 4 sets of 6.   This probably means something to Intel engineers/architects, but does not appear to have any significance to end users.
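The grouping is easy to see by splitting each identifier list into runs of consecutive values. This is a small Python sketch, not anything taken from the Intel runtime; `consecutive_runs` is a hypothetical helper, and the two lists are the identifier sets quoted in this thread.

```python
def consecutive_runs(ids):
    """Split a collection of integers into sorted runs of consecutive values."""
    runs = []
    for value in sorted(ids):
        if runs and value == runs[-1][-1] + 1:
            runs[-1].append(value)   # extends the current run
        else:
            runs.append([value])     # a gap: start a new run
    return runs

# Core identifiers reported for one package of each processor in this thread.
gold_6148 = [0, 1, 2, 3, 4, 8, 9, 10, 11, 12,
             16, 17, 18, 19, 20, 24, 25, 26, 27, 28]
platinum_8160 = [0, 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13,
                 16, 17, 18, 19, 20, 21, 24, 25, 26, 27, 28, 29]

print([len(run) for run in consecutive_runs(gold_6148)])      # [5, 5, 5, 5]
print([len(run) for run in consecutive_runs(platinum_8160)])  # [6, 6, 6, 6]
```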

On the Xeon Platinum 8160 platform, I have verified that the missing numbers are the same on all processors, and are not related to the set of 4 cores that are disabled on the die.   It is probably safe to assume that this property holds on the Xeon Gold 6148 as well.   (This property does not hold on the Xeon Phi x200 -- on that processor, the missing numbers do correspond to the disabled cores.)

dingjun_chencmgl_ca

Thanks, John McCalpin, for your reply.

I think the following information is not correct. Please look at this line:

OMP: Info #179: KMP_AFFINITY: 2 packages x 20 cores/pkg x 1 threads/core (32 total cores)

Why are there only 32 cores in total?



 

dingjun_chencmgl_ca

This error occurs only when the executable is built with MS Visual Studio. If ifort.exe is used from the command line, everything is normal and no such error occurs. Could someone explain more about it?

Please see the following output from the executable built with command-line ifort.exe:

ifort /Qopenmp /Qxcommon-avx512 /align:array64byte vector_openmp_test_v9.f90 -o paralleltests_commonavx512

D:\users\dingjun\exe>paralleltests_coreavx512.exe


VECTOR_SIMD_OPENMP_TEST
OMP: Info #211: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #209: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39}
OMP: Info #156: KMP_AFFINITY: 40 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #179: KMP_AFFINITY: 2 packages x 20 cores/pkg x 1 threads/core (40 total cores)
OMP: Info #213: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 core 0
OMP: Info #171: KMP_AFFINITY: OS proc 1 maps to package 0 core 1
OMP: Info #171: KMP_AFFINITY: OS proc 2 maps to package 0 core 2
OMP: Info #171: KMP_AFFINITY: OS proc 3 maps to package 0 core 3
OMP: Info #171: KMP_AFFINITY: OS proc 4 maps to package 0 core 4
OMP: Info #171: KMP_AFFINITY: OS proc 5 maps to package 0 core 8
OMP: Info #171: KMP_AFFINITY: OS proc 6 maps to package 0 core 9
OMP: Info #171: KMP_AFFINITY: OS proc 7 maps to package 0 core 10
OMP: Info #171: KMP_AFFINITY: OS proc 8 maps to package 0 core 11
OMP: Info #171: KMP_AFFINITY: OS proc 9 maps to package 0 core 12
OMP: Info #171: KMP_AFFINITY: OS proc 10 maps to package 0 core 16
OMP: Info #171: KMP_AFFINITY: OS proc 11 maps to package 0 core 17
OMP: Info #171: KMP_AFFINITY: OS proc 12 maps to package 0 core 18
OMP: Info #171: KMP_AFFINITY: OS proc 13 maps to package 0 core 19
OMP: Info #171: KMP_AFFINITY: OS proc 14 maps to package 0 core 20
OMP: Info #171: KMP_AFFINITY: OS proc 15 maps to package 0 core 24
OMP: Info #171: KMP_AFFINITY: OS proc 16 maps to package 0 core 25
OMP: Info #171: KMP_AFFINITY: OS proc 17 maps to package 0 core 26
OMP: Info #171: KMP_AFFINITY: OS proc 18 maps to package 0 core 27
OMP: Info #171: KMP_AFFINITY: OS proc 19 maps to package 0 core 28
OMP: Info #171: KMP_AFFINITY: OS proc 20 maps to package 1 core 0
OMP: Info #171: KMP_AFFINITY: OS proc 21 maps to package 1 core 1
OMP: Info #171: KMP_AFFINITY: OS proc 22 maps to package 1 core 2
OMP: Info #171: KMP_AFFINITY: OS proc 23 maps to package 1 core 3
OMP: Info #171: KMP_AFFINITY: OS proc 24 maps to package 1 core 4
OMP: Info #171: KMP_AFFINITY: OS proc 25 maps to package 1 core 8
OMP: Info #171: KMP_AFFINITY: OS proc 26 maps to package 1 core 9
OMP: Info #171: KMP_AFFINITY: OS proc 27 maps to package 1 core 10
OMP: Info #171: KMP_AFFINITY: OS proc 28 maps to package 1 core 11
OMP: Info #171: KMP_AFFINITY: OS proc 29 maps to package 1 core 12
OMP: Info #171: KMP_AFFINITY: OS proc 30 maps to package 1 core 16
OMP: Info #171: KMP_AFFINITY: OS proc 31 maps to package 1 core 17
OMP: Info #171: KMP_AFFINITY: OS proc 32 maps to package 1 core 18
OMP: Info #171: KMP_AFFINITY: OS proc 33 maps to package 1 core 19
OMP: Info #171: KMP_AFFINITY: OS proc 34 maps to package 1 core 20
OMP: Info #171: KMP_AFFINITY: OS proc 35 maps to package 1 core 24
OMP: Info #171: KMP_AFFINITY: OS proc 36 maps to package 1 core 25
OMP: Info #171: KMP_AFFINITY: OS proc 37 maps to package 1 core 26
OMP: Info #171: KMP_AFFINITY: OS proc 38 maps to package 1 core 27
OMP: Info #171: KMP_AFFINITY: OS proc 39 maps to package 1 core 28

 

McCalpinJohn

Sorry I misunderstood your issue.   Something is clearly broken in the combination of MS Visual Studio and Windows Server 2016....

The KMP_AFFINITY "Info #171" messages show all 20 cores in package 0, but show only 12 of the 20 cores in package 1.  The KMP_AFFINITY "Info #249" messages show that the 20 threads are all running in package 0.
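Both counts can be pulled out of a log with a short script. The sketch below is only an illustration in Python: `summarize` is a hypothetical name, and the two regular expressions assume the "Info #171" and "Info #249" formats shown in the first post, with each thread bound to a single OS proc.

```python
import re
from collections import Counter

MAP_LINE = re.compile(r"OS proc (\d+) maps to package (\d+) core (\d+)")
# Only handles bindings to a single OS proc, as in the log above.
BIND_LINE = re.compile(r"thread (\d+) bound to OS proc set \{(\d+)\}")

def summarize(log_text):
    """Return (distinct cores per package, bound threads per package)."""
    proc_to_pkg = {}
    cores = set()
    for proc, pkg, core in MAP_LINE.findall(log_text):
        proc_to_pkg[int(proc)] = int(pkg)
        cores.add((int(pkg), int(core)))
    cores_per_pkg = Counter(pkg for pkg, _ in cores)
    threads_per_pkg = Counter(
        proc_to_pkg[int(proc)] for _, proc in BIND_LINE.findall(log_text)
    )
    return dict(cores_per_pkg), dict(threads_per_pkg)

# A few lines taken from the first log in this thread.
sample_log = """\
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 core 0
OMP: Info #171: KMP_AFFINITY: OS proc 19 maps to package 0 core 28
OMP: Info #171: KMP_AFFINITY: OS proc 20 maps to package 1 core 0
OMP: Info #171: KMP_AFFINITY: OS proc 31 maps to package 1 core 17
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 4004 thread 0 bound to OS proc set {0}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8908 thread 19 bound to OS proc set {19}
"""

cores_per_pkg, threads_per_pkg = summarize(sample_log)
print("cores per package:  ", cores_per_pkg)
print("threads per package:", threads_per_pkg)
```

A package that is absent from the second dictionary has no threads bound to it, which is the situation described for package 1 above.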

In your first example, did you request a particular number of threads, or set any environment variables that might influence the runtime selection of threads or the binding of the job?

dingjun_chencmgl_ca

Hi, John,

Thanks again for your reply.

The error I mentioned above is this line:

The number of processors available =       32

On my Skylake computer there are 2 processors, each with 20 cores, so there should be 40 CPU cores in total.

I think the number of processors available does not depend on the setting of OMP_NUM_THREADS.

So the number of processors available should be 40 rather than 32.

If my test code is built from the command line with ifort.exe rather than with MS Visual Studio 2015, this error does NOT occur. Is there a problem with the integration between MS Visual Studio and Intel Fortran Compiler version 18.0.2.185?

I am looking forward to hearing from you again.

 

Dingjun

McCalpinJohn

This certainly looks like a bug in the combination of the compiler, the runtime libraries, and the OS.   I don't use Windows, so I can't speculate on exactly what is going wrong.

1. My previous question was trying to look at why the numbers were different on these two lines:

  The number of processors available =       32
  The number of threads available    =       20

Clearly the 32 is wrong, but the 20 is a bit confusing as well. 

2. The software knows what the hardware is because of this line:

OMP: Info #179: KMP_AFFINITY: 2 packages x 20 cores/pkg x 1 threads/core (32 total cores)

Again, the 32 is obviously wrong, but the line also clearly shows that the software sees two 20-core packages.
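The inconsistency inside that single line can be made explicit with a little arithmetic. This is a hedged Python sketch: `check_topology` is a hypothetical helper, and it assumes that "total cores" is meant to equal packages times cores per package, which is how the 40-core output from the command-line build reads.

```python
import re

TOPOLOGY_LINE = re.compile(
    r"(\d+) packages x (\d+) cores/pkg x (\d+) threads/core \((\d+) total cores\)"
)

def check_topology(line):
    """Return (expected_cores, reported_cores) for an Info #179 line."""
    match = TOPOLOGY_LINE.search(line)
    if match is None:
        raise ValueError("not an Info #179 topology line")
    packages, cores_per_pkg, _threads_per_core, reported = map(int, match.groups())
    # Assumption: total cores should be packages * cores per package.
    return packages * cores_per_pkg, reported

# Both lines are copied from the logs in this thread.
visual_studio_build = ("OMP: Info #179: KMP_AFFINITY: 2 packages x 20 cores/pkg "
                       "x 1 threads/core (32 total cores)")
command_line_build = ("OMP: Info #179: KMP_AFFINITY: 2 packages x 20 cores/pkg "
                      "x 1 threads/core (40 total cores)")

print(check_topology(visual_studio_build))  # (40, 32)
print(check_topology(command_line_build))   # (40, 40)
```

The difference of 8 in the first case matches the 8 cores of package 1 that never appear in the "Info #171" list.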

Both of these issues make me wonder if there are environment variables or other OS settings that the runtime is looking at to limit the resources that it thinks it can use.
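One way to look for such settings is to dump the relevant environment in both launch contexts (from the IDE and from a plain command prompt) and compare. The Python sketch below is only an assumption about what is worth checking: `openmp_settings` is a hypothetical helper, and filtering on the `OMP_` and `KMP_` prefixes is a guess based on the variable names that appear in this thread (OMP_NUM_THREADS, KMP_AFFINITY).

```python
import os

def openmp_settings(env):
    """Return the variables in env whose names start with OMP_ or KMP_."""
    return {
        name: value
        for name, value in sorted(env.items())
        if name.upper().startswith(("OMP_", "KMP_"))
    }

# Illustrative values only; this is not the poster's real environment.
example_env = {"OMP_NUM_THREADS": "20", "PATH": "C:\\Windows", "KMP_AFFINITY": "verbose"}
print(openmp_settings(example_env))

# The same filter applied to the environment this script was launched with,
# plus the logical CPU count that the Python interpreter itself reports.
print(openmp_settings(os.environ))
print(os.cpu_count())
```

If the two launch contexts print different settings or CPU counts, that would point at the environment rather than at the compiled code.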
