Intel® ISA Extensions

State of AVX 512 on Skylake-X

jan_v_
New Contributor I

As several review sites have reported, AVX-512 performance on the 6- and 8-core Skylake-X parts is compromised.
Only on the 10-core part is the hardware that is present fully enabled.
Would Intel be so kind as to explain in detail what the performance difference amounts to?
From the vague information available, it seems that one of the 2 (3?) AVX-512 ports (port 5) is disabled.
Can we get more detailed information on which ports are used for AVX-512?
Which AVX-512 instructions can each port execute, and do the ports have 512-bit data paths to the registers and cache?
How is AVX-512 gather affected on the 6/8-core versus the 10-core?
A drawing similar to the one below (for AVX2) would be appreciated.


[Image: IMG0038528_1.jpg]

10 Replies
andysem
New Contributor III

I'm not an Intel representative, but this is how I understand the article. The 6- and 8-core models have one of the two FMA units disabled (the one connected to Port 5), so FMA instructions have only half the throughput of the 10-core model. One 512-bit register contains 8 DP FP elements, so from the article it follows that FMA instructions have a reciprocal throughput of 0.5 on the 6- and 8-core models and 0.25 on the 10-core model.
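
The halving can also be expressed as peak floating-point work per cycle. The following is a minimal sketch, assuming each enabled FMA unit retires one 512-bit FMA per cycle and counting an FMA as two operations (a multiply and an add) per lane. The function name is hypothetical and the numbers are theoretical peaks, not measurements.

```python
def peak_dp_flops_per_cycle(fma_units: int, vector_bits: int = 512) -> int:
    """Theoretical peak double-precision FLOPs per core per cycle.

    Assumes every enabled FMA unit retires one full-width FMA each cycle.
    """
    lanes = vector_bits // 64   # number of 64-bit doubles in one vector register
    flops_per_lane = 2          # an FMA counts as one multiply plus one add
    return fma_units * lanes * flops_per_lane


one_unit = peak_dp_flops_per_cycle(1)   # only one 512-bit FMA unit enabled
two_units = peak_dp_flops_per_cycle(2)  # both 512-bit FMA units enabled
print(one_unit, two_units, two_units / one_unit)
```

Under these assumptions a second FMA unit doubles the peak, which is the factor of two described in the post.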

Ports 0, 1 and 5 are all enabled on all Skylake-X CPU models. Ports 0 and 1 are used for most 256-bit vector instructions and can fuse together to issue a 512-bit vector instruction (i.e. to execute the same 256-bit instruction on the two 256-bit lanes). Port 5 is 512-bit and can also issue 512-bit vector instructions. It is additionally used for cross-lane operations, such as shuffles. On the 10-core CPU it is also used for the second FMA unit.

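It would seem to follow that most 512-bit instructions can reach at most 2/3 of the instruction throughput of their 256-bit counterparts: three ports (0, 1 and 5) can each issue a 256-bit instruction, but only two issue paths (fused ports 0+1, and port 5) can take a 512-bit one. But I have not seen any numbers yet to confirm that.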
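
The 2/3 estimate is plain port counting, and it can be sketched as below. This assumes the port description given in this post (it is not measured data) and only applies to instructions that can issue on all three ports. The table and function names are hypothetical.

```python
from fractions import Fraction

# Issue paths per cycle, taken from the port description in the post
# (an assumption, not a measurement):
#   256-bit ops: port 0, port 1, port 5        -> 3 paths
#   512-bit ops: fused ports 0+1, and port 5   -> 2 paths
ISSUE_PATHS = {256: 3, 512: 2}


def instruction_ratio(wide: int = 512, narrow: int = 256) -> Fraction:
    """Ratio of peak instructions per cycle, wide versus narrow vectors."""
    return Fraction(ISSUE_PATHS[wide], ISSUE_PATHS[narrow])


def data_ratio(wide: int = 512, narrow: int = 256) -> Fraction:
    """Ratio of peak vector bits processed per cycle, wide versus narrow."""
    return Fraction(ISSUE_PATHS[wide] * wide, ISSUE_PATHS[narrow] * narrow)


print(instruction_ratio())  # 2/3
print(data_ratio())         # 4/3
```

Note that the 2/3 figure counts instructions. Because each 512-bit instruction handles twice as much data, the same model gives 4/3 when counting bits processed per cycle.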

 

jan_v_
New Contributor I

Some people who have bought the 7800X now claim, based on benchmarks, that both 512-bit FMA units are enabled on the 6-core.
Can somebody from Intel please confirm this?

jan_v_
New Contributor I

Got myself a 7820X.
I can confirm it has both FMAs enabled in AVX-512.
Thanks for the clear communication, Intel!

McCalpinJohn
Honored Contributor III

Fortunately this information is included in the Intel ARK entries for the server parts.  For example, the Xeon Platinum 8160 description at https://ark.intel.com/products/120501/Intel-Xeon-Platinum-8160-Processor-33M-Cache-2_10-GHz includes

# of AVX-512 FMA Units                2

This is the correct answer for this processor.  

In general, the Platinum series processors and the Gold 6000 series processors all have 2 FMA units, and the other processors have 1 FMA unit.  I know of at least one exception -- the Gold 5122 has 2 FMA units.   I don't know if there are other exceptions -- there are 58 processor models and the number of FMA units is not a field that can be used with the advanced search function.

TAcco1
Beginner

Thanks for the update, Jan. I wish Intel would respond; more information would be nice.

jan_v_
New Contributor I

If you have one of those Skylake-X processors and want to find out whether it has 2 AVX-512 FMAs, here is a real-time AVX2 / AVX-512 / GPU Julia/Mandelbrot zoomer:
All computations are done in double precision. It is heavily optimized with FMA computations and multi-threading.
You can switch from AVX-512 to AVX2. If you notice a big difference in frames per second, you can assume you have 2 AVX-512 FMAs.
Computation speed is up to 60 FPS at 4K resolution on an 8-core running at 4 GHz using AVX-512.
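
The reasoning behind this check: with AVX2 the core has two 256-bit FMA pipes (ports 0 and 1), which gives 16 double-precision FLOPs per cycle, the same peak as a single 512-bit FMA unit. Only a second 512-bit unit raises the peak to 32. A rough sketch of that decision rule follows. The function name, the 1.5 threshold, and the frame rates in the usage lines are all hypothetical, and the sketch ignores clock-speed differences between AVX2 and AVX-512 code.

```python
def likely_fma_units(fps_avx2: float, fps_avx512: float, threshold: float = 1.5) -> int:
    """Guess the number of 512-bit FMA units from two measured frame rates.

    A speedup close to 2x suggests two units; a speedup close to 1x suggests one.
    The threshold is an arbitrary midpoint chosen for illustration.
    """
    if fps_avx2 <= 0 or fps_avx512 <= 0:
        raise ValueError("frame rates must be positive")
    speedup = fps_avx512 / fps_avx2
    return 2 if speedup >= threshold else 1


# Hypothetical measurements, not taken from the thread:
print(likely_fma_units(fps_avx2=30.0, fps_avx512=55.0))
print(likely_fma_units(fps_avx2=30.0, fps_avx512=31.0))
```

Treat the result as a hint only: memory bandwidth, thread count and frequency throttling can all mask or exaggerate the difference.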

Jeffrey_H_Intel
Employee

As John already indicated, the AVX-512 unit count is provided for all of the parts enumerated on https://ark.intel.com/products/series/125191/Intel-Xeon-Scalable-Processors.

jan_v_
New Contributor I

Information about Xeon is totally useless when the question is about Skylake-X.
http://ark.intel.com/products/123767/Intel-Core-i7-7820X-X-series-Processor-11M-Cache-up-to-4_30-GHz
As you can see, there is no information about the number of AVX-512 units for Skylake-X.

Jeffrey_H_Intel
Employee

jan_v_
New Contributor I

So it took Intel about 1 year to add the correct information about the 2 AVX-512 FMA units for Skylake-X.
Congratulations!
