Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

Calculate the Max Flops on Skylake

GHui
Novice

I calculate the max FLOPS on Skylake as CPU frequency * 16.

Is it CPU frequency * 32 on the Gold version of Skylake?

31 Replies
Giuseppe_C_
Beginner
McCalpinJohn
Honored Contributor III

The "Base AVX-512 Core Frequency (GHz)" is the frequency that the processor will use when running 512-bit SIMD instructions in the absence of any Turbo boost.  The columns to the right show the maximum frequency that the processor will use when running 512-bit SIMD instructions with various numbers of active cores.

The maximum frequency is only available if the temperature does not exceed the temperature limits, the package power does not exceed the package power limits, and the electrical current does not exceed the current limits.    If any of these limits are exceeded, the frequencies will be reduced until none of the limits are exceeded.

The "Base AVX-512 Core Frequency (GHz)" is also intended to represent the minimum frequency that will ever be seen for any power-limited workload (assuming a correctly configured cooling system). So this frequency can be used to compute a lower bound on the peak GFLOPS. For example, the Xeon Platinum 8180 has a "Base AVX-512 Core Frequency" of 1.7 GHz, with 28 cores and two AVX-512 units per core. Each unit can issue one FMA on 8 double-precision elements per cycle (2 FLOPs per element), so each core delivers 2 * 8 * 2 = 32 FLOPS per cycle, giving a lower bound:

Lower Bound:     28 cores * 1.7 GHz * 32 FLOPS/Hz = 1523.2 GFLOPS (per socket).

The maximum 28-core AVX-512 frequency of 2.3 GHz provides an upper bound on the peak performance:

Upper Bound:     28 cores * 2.3 GHz * 32 FLOPS/Hz = 2060.8 GFLOPS (per socket)
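
The two bounds above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name `peak_gflops` and the constant names are made up for illustration, and the default of 32 FLOPS per cycle assumes double-precision FMA on two AVX-512 units per core, as described above.

```python
def peak_gflops(cores: int, freq_ghz: float, flops_per_cycle: int = 32) -> float:
    """Theoretical peak GFLOPS per socket: cores * GHz * FLOPS per cycle per core."""
    return cores * freq_ghz * flops_per_cycle

# Xeon Platinum 8180 figures quoted in the post above.
CORES = 28
BASE_AVX512_GHZ = 1.7          # "Base AVX-512 Core Frequency"
MAX_ALLCORE_AVX512_GHZ = 2.3   # maximum 28-core AVX-512 Turbo frequency

lower = peak_gflops(CORES, BASE_AVX512_GHZ)
upper = peak_gflops(CORES, MAX_ALLCORE_AVX512_GHZ)
print(f"lower bound: {lower:.1f} GFLOPS, upper bound: {upper:.1f} GFLOPS")
# prints: lower bound: 1523.2 GFLOPS, upper bound: 2060.8 GFLOPS
```

For a different processor, substitute the core count and the two AVX-512 frequencies from the Specification Update tables for that model.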

The actual frequency when running compute-intensive AVX512 workloads depends on the unique characteristics of the specific piece of silicon  (particularly leakage current), as well as the characteristics of the cooling system (ambient temperature, heat sink thermal conductivity, air flow rate, etc).

We have 3472 Xeon Platinum 8160 (24-core) processors in 1736 two-socket nodes. The Base AVX-512 Core Frequency for these processors is 1.4 GHz and the maximum 24-core AVX-512 frequency is 2.0 GHz. When running Intel's optimized LINPACK benchmark, we see that the average frequency of these processors varies between about 1.52 GHz and about 1.73 GHz, with sustained (LINPACK) performance varying by the same proportions.
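
To put the Xeon Platinum 8160 numbers in perspective, the sketch below computes the theoretical per-socket peak at each of the four frequencies mentioned. It assumes 32 double-precision FLOPS per cycle per core (two AVX-512 FMA units), which is stated above for the 8180 and taken here as an assumption for the 8160. These are peak values at a given frequency, not measured LINPACK results.

```python
CORES = 24
FLOPS_PER_CYCLE = 32  # assumption: two AVX-512 FMA units, double precision

# Frequencies (GHz) quoted in the post above for the Xeon Platinum 8160.
freqs_ghz = {
    "base AVX-512": 1.4,
    "observed low (LINPACK)": 1.52,
    "observed high (LINPACK)": 1.73,
    "max 24-core AVX-512": 2.0,
}

# Theoretical peak per socket at each frequency.
peaks = {label: CORES * f * FLOPS_PER_CYCLE for label, f in freqs_ghz.items()}
for label, gflops in peaks.items():
    print(f"{label:>24}: {gflops:7.1f} GFLOPS peak per socket")

# Relative spread between the slowest and fastest observed average frequency.
spread = freqs_ghz["observed high (LINPACK)"] / freqs_ghz["observed low (LINPACK)"] - 1
print(f"observed frequency spread: {spread:.1%}")
```

Because sustained performance tracks frequency, the same relative spread would be expected in the LINPACK results across processors.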

GHui
Novice

For example, for the Xeon Platinum 8180, the Processor Base Frequency is 2.5 GHz and the Max Turbo Frequency is 3.8 GHz (from ark.intel.com). Should I use 2.5 or 3.8 to calculate the max GFLOPS?

fang__wei
Beginner

You should use 2.5 to calculate Max Gflops. You can't achieve the Max Turbo if you are using all the cores.

McCalpinJohn
Honored Contributor III

The rightmost column of the first row of the table shows that 2.3 GHz is the maximum Turbo frequency that the Xeon Platinum 8180 allows when using all cores and running AVX512 code.  2.3*32*28 = 2060.8 GFLOPS for double-precision FMA operations on 512-bit vectors.
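
The factor of 32 in that product comes from the vector width, the element size, and the number of FMA units. A minimal sketch of the breakdown follows; the helper name `flops_per_cycle` is hypothetical, and the default of two FMA units matches what is described in this thread for the Xeon Platinum 8180. The number of units can differ between models, so check the specifications for the processor in question.

```python
def flops_per_cycle(vector_bits: int, element_bits: int = 64, fma_units: int = 2) -> int:
    """Peak FLOPS per cycle per core for FMA on SIMD vectors.

    Each FMA counts as two floating-point operations (multiply + add)
    per vector element.
    """
    lanes = vector_bits // element_bits
    return fma_units * lanes * 2

print(flops_per_cycle(512))                   # 32: 512-bit vectors, double precision
print(flops_per_cycle(256))                   # 16: 256-bit vectors, double precision
print(flops_per_cycle(512, fma_units=1))      # 16: 512-bit vectors, a single FMA unit
print(flops_per_cycle(512, element_bits=32))  # 64: 512-bit vectors, single precision
```

With the all-core AVX-512 frequency from the table, `2.3 * flops_per_cycle(512) * 28` gives the 2060.8 GFLOPS quoted above (up to floating-point rounding).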

GHui
Novice

The processor's Max Turbo Frequency is 3.8 GHz, but the AVX-512 Max Turbo Frequency is only 2.3 GHz. Why is this?

Also, until now I have always calculated the max GFLOPS as (CPU base frequency * cores * 32 (or 16, 8, 4)).

GHui
Novice

Is there any relationship between the AVX-512 frequency and the CPU processor frequency?

McCalpinJohn
Honored Contributor III

The nomenclature can be confusing.  The "non-AVX" frequencies, the "AVX 2.0" frequencies, and the "AVX-512" frequencies are all CPU core frequencies, with the different tables applying to code with different SIMD register widths in use.

For any CPU model, the maximum frequency that each core can run at is determined by both the number of active cores and by the width of the SIMD units that are active in each core.   In figure 3 above, the label "AVX-512 Turbo Frequencies" means the maximum frequency that any core can run at when the 512-bit SIMD units are active.  (The 512-bit SIMD units are active whenever the processor has executed an instruction using 512-bit registers in the last millisecond.)

As an example, we can compare the values in Figures 1, 2, 3 from the "Intel Xeon Processor Family Specification Update" (document 336065-005, February 2018) for the Xeon Platinum 8180.   If there are 20 active cores:

  • any of the cores with the 512-bit SIMD units active can run at up to 2.6 GHz (Figure 3, row 1, column for 20 cores active)
  • any of the cores with the 256-bit SIMD units active can run at up to 3.1 GHz (Figure 2, row 1, column for 20 cores active)
  • any of the cores with neither 256-bit nor 512-bit SIMD units active can run at up to 3.5 GHz (Figure 1, row 1, column for 20 cores active)

These tables provide the maximum turbo frequencies for each case -- the actual frequencies will be reduced if needed to keep the running average power under the TDP limit for the processor model. 
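
As a rough illustration of how these three frequencies translate into peak double-precision throughput with 20 active cores, the sketch below pairs each frequency with an assumed FLOPS-per-cycle figure. The per-cycle values are assumptions, not taken from the tables: 32 for 512-bit FMA, 16 for 256-bit FMA, and 8 for 128-bit FMA, each assuming two FMA units per core. Scalar code would be lower still.

```python
ACTIVE_CORES = 20

# (max Turbo GHz with 20 active cores, assumed DP FLOPS per cycle per core)
cases = {
    "512-bit SIMD active": (2.6, 32),
    "256-bit SIMD active": (3.1, 16),
    "neither active (128-bit FMA assumed)": (3.5, 8),
}

# Upper bound on peak GFLOPS for each case: cores * GHz * FLOPS per cycle.
upper_bounds = {
    label: ACTIVE_CORES * ghz * flops
    for label, (ghz, flops) in cases.items()
}
for label, gflops in upper_bounds.items():
    print(f"{label:>38}: {gflops:7.1f} GFLOPS upper bound")
```

Under these assumptions the higher frequencies available to narrower SIMD code do not make up for the halved vector width, so 512-bit code still has the highest peak.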

Roberson
Beginner

Hi McCalpin, your comments are really helpful. But I still have a question, and I couldn't find the answer on Google or on Intel's official web site.

1. We usually calculate CPU performance as Frequency * IPC * Cores. But Intel does not seem to publish the IPC of each CPU. So my first question is how to get the IPC of each CPU.

2. According to your comments above, it seems we cannot use the Processor Base Frequency published by Intel on the official site. Is there a formula that calculates the CPU performance from the Processor Base Frequency?

Frequency.png

Thank you very much.

McCalpinJohn
Honored Contributor III

(1) The information you need on max operations per cycle is not easy to find, but all the available information is contained in the posts above.

(2) There is no formula to determine the maximum all-core Turbo frequency from the "nominal" frequency.  You have to look up the model-specific information in the "Specification Update" document for the processor family you are interested in.

Roberson
Beginner

Thank you very much. I have already got the information you mentioned.
