Hi,
I have a few servers, each equipped with dual Ice Lake 8358 processors.
I would like to know whether the following is the correct method to calculate theoretical double-precision FLOPS (Rpeak):
= cores/socket * sockets * frequency * operations/cycle * elements/operation
= 32 * 2 * 2.6 * 2 * ( 512 register size / 64 bits DP )
= 32 * 2 * 2.6 * 2 * 8
= 2662.4 GFLOPS (frequency in GHz)
Also, apart from the frequency and cores/socket values, will anything else differ if I calculate FLOPS for the Ice Lake 6338 or the Cascade Lake 6230?
The formula above is missing a factor of two. There are two AVX512 FMA units. Each "lane" of each AVX512 unit performs two FP64 operations per cycle (Multiply + Add).
- 2 FMA units/core * 8 lanes/FMA unit * 2 FP ops/lane/cycle = 32 FP Ops/core/cycle
Then
- 2 sockets * 32 cores/socket = 64 cores
Finally
- 64 cores * 32 FP Ops/core/cycle = 2048 FP ops/cycle
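The per-cycle arithmetic above can be sketched in a few lines of Python. This is only an illustration of the counting in this reply; the function name and its parameters are hypothetical, and the default of two FMA units per core is the value stated here for the 8358, not something the code looks up.

```python
def fp64_ops_per_cycle(sockets, cores_per_socket, fma_units_per_core=2, vector_bits=512):
    """Peak FP64 operations per cycle for the whole node (hypothetical helper)."""
    lanes = vector_bits // 64                       # 512-bit register / 64-bit double = 8 lanes
    ops_per_core = fma_units_per_core * lanes * 2   # an FMA counts as 2 ops (multiply + add)
    return sockets * cores_per_socket * ops_per_core


node_ops = fp64_ops_per_cycle(sockets=2, cores_per_socket=32)
print(node_ops)  # 2048
```

For another processor model, change the core count and check how many AVX512 FMA units per core that specific model has before reusing the default.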
The frequency on the Xeon Platinum 8358 will vary with the configuration (since this chip supports a mode that can split the cores into a "high priority" pool of 12 cores and a "low priority" pool of 20 cores). In the more traditional configuration, the base AVX512 frequency is 1.9 GHz and the maximum all-core AVX512 Turbo frequency is 2.9 GHz. The actual average frequency when running compute-intensive workloads (i.e., close to 2 AVX512 FMA instructions per cycle) will be somewhere in the range of 1.9 to 2.9 GHz, but will vary depending on the current leakage characteristics of the specific piece of silicon as well as the effectiveness of the cooling system. We saw a 13% range in average frequency when running the HPL benchmark on an ensemble of 3472 Xeon Platinum 8160 processors (Skylake Xeon).
The "nominal" 2.6 GHz is within the range of 1.9 to 2.9 GHz, so it is a reasonable estimator for the maximum throughput, but it is not an upper bound. Using 2.9 GHz will give a strong upper bound because the chip will not allow all cores to run AVX512 512-bit FP code at more than 2.9 GHz. Using 1.9 GHz will give a lower bound -- if the frequency does not stay at 1.9 GHz or higher, there is a problem with the system (maybe processor, maybe cooling, maybe power supply, etc.) that you probably want to look into. (We recently saw a set of Xeon Platinum 8380 processors running at under the base AVX512 frequency -- the power supplies were overheating and throttling the processors. A simple adjustment to the fan speeds fixed the problem.)
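As a rough sketch, the lower bound, nominal estimate, and upper bound described above work out as follows for the dual-socket 8358 node. The 2048 FP ops/cycle figure and the three frequencies are taken from this reply; the function and variable names are hypothetical.

```python
def peak_gflops(fp_ops_per_cycle, freq_ghz):
    """Peak GFLOPS = FP ops per cycle * frequency in GHz (hypothetical helper)."""
    return fp_ops_per_cycle * freq_ghz


NODE_FP64_OPS_PER_CYCLE = 2048  # 2 sockets * 32 cores * 32 FP64 ops/core/cycle

# Frequencies quoted in this reply for the Xeon Platinum 8358.
frequencies_ghz = {
    "AVX512 base (lower bound)": 1.9,
    "nominal (estimate)": 2.6,
    "AVX512 all-core turbo (upper bound)": 2.9,
}

for label, ghz in frequencies_ghz.items():
    print(f"{label}: {peak_gflops(NODE_FP64_OPS_PER_CYCLE, ghz):.1f} GFLOPS")
```

With the nominal 2.6 GHz this gives about 5324.8 GFLOPS, twice the 2662.4 in the original question, which is the missing factor of two.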
Hi John, do you mind sharing the AVX-512 frequency table for the Xeon Platinum 8358? I could not find the document. It's curious, because the one for the 2nd Gen Xeon Scalable processors is relatively easy to find.
Thank you.