Requirements for GFNI without AVX/AVX-512

I noticed a recent commit to LLVM which mentions that Tremont supports GFNI without AVX-512.  Based on the patch, it looks like different functions require:

  • GFNI alone (since SSE2 is part of the x86_64 baseline)
  • GFNI + AVX
  • GFNI + AVX-512BW
  • GFNI + AVX-512BW + AVX-512VL

However, the intrinsics guide shows all the 128/256-bit functions as requiring AVX-512VL.  I assume this will be updated eventually now that GFNI is being split up, but how?  I was hoping to use LLVM as a reference, but they use macros for some of the functions and those don't have the attributes which tell which ISA extensions are required.

It makes sense to me that 128-bit functions would require GFNI alone, 256-bit functions would require GFNI and AVX, and 512-bit functions would require GFNI and AVX-512BW, but that fourth category (GFNI + AVX-512BW + AVX-512VL) is confusing me…

I'd like to tweak some of my code so I can correctly detect which group of GFNI functions is available… does anyone have any insight into exactly which functions require which ISA extensions?

Accepted Solutions
You can see GFNI instruction encodings in the Software Developer's Manual. There are three versions of encodings:

  • Legacy SSE with 0x66 prefix. Only supports 128-bit vectors xmm0-xmm15.
  • VEX-encoded, which is compatible with AVX/AVX2. Supports 128 and 256-bit vectors x/ymm0-x/ymm15.
  • EVEX-encoded, which is compatible with AVX-512.  Supports 128, 256 and 512-bit vectors x/y/zmm0-x/y/zmm31.

In the AVX-512 case, AVX-512VL is required in order to use the EVEX encoding with 128 and 256-bit vectors. The usual difference between SSE and AVX instructions also applies: legacy SSE-encoded instructions leave the upper bits of the destination vector register unchanged, while VEX and EVEX-encoded instructions zero them.

The SDM also describes the CPUID features that are required for each of the encodings to be supported:

  • GFNI alone for SSE encoding
  • AVX+GFNI for VEX encoding
  • AVX-512F+GFNI for EVEX encoding and additionally AVX-512VL for 128 and 256-bit vectors.
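For the detection question, the mapping above could be sketched roughly as follows. This is only an illustration: `gfni_encodings` and the `GFNI_*` flags are made-up names, the function takes raw CPUID/XCR0 values as parameters so that the logic is separate from how they are read, and the XCR0 (OS state) checks are the usual AVX/AVX-512 enablement checks that I have added on top of the feature list above.

```c
#include <stdint.h>

/* Hypothetical flags, one per group of GFNI encodings. */
enum {
    GFNI_SSE     = 1 << 0, /* legacy SSE encoding, xmm only       */
    GFNI_VEX     = 1 << 1, /* VEX encoding, xmm and ymm           */
    GFNI_EVEX512 = 1 << 2, /* EVEX encoding, zmm                  */
    GFNI_EVEX_VL = 1 << 3  /* EVEX encoding, xmm and ymm (needs VL) */
};

/*
 * leaf1_ecx: ECX from CPUID.(EAX=1)
 * leaf7_ebx: EBX from CPUID.(EAX=7,ECX=0)
 * leaf7_ecx: ECX from CPUID.(EAX=7,ECX=0)
 * xcr0:      value of XCR0 (ignored unless OSXSAVE is set)
 */
unsigned gfni_encodings(uint32_t leaf1_ecx, uint32_t leaf7_ebx,
                        uint32_t leaf7_ecx, uint64_t xcr0)
{
    unsigned out = 0;

    /* GFNI: CPUID.(EAX=7,ECX=0):ECX bit 8. Without it nothing is usable. */
    if (!((leaf7_ecx >> 8) & 1u))
        return 0;
    out |= GFNI_SSE;

    /* OSXSAVE: CPUID.1:ECX bit 27. If clear, XCR0 cannot be read,
       so treat all extended state as disabled. */
    if (!((leaf1_ecx >> 27) & 1u))
        xcr0 = 0;

    /* AVX: CPUID.1:ECX bit 28, plus XCR0 bits 1 and 2 (SSE + AVX state). */
    if (((leaf1_ecx >> 28) & 1u) && (xcr0 & 0x6) == 0x6)
        out |= GFNI_VEX;

    /* AVX-512F: leaf 7 EBX bit 16, plus XCR0 bits 5..7
       (opmask, upper ZMM halves, ZMM16-31) on top of SSE + AVX state. */
    if (((leaf7_ebx >> 16) & 1u) && (xcr0 & 0xE6) == 0xE6) {
        out |= GFNI_EVEX512;
        /* AVX-512VL: leaf 7 EBX bit 31, for 128/256-bit EVEX forms. */
        if ((leaf7_ebx >> 31) & 1u)
            out |= GFNI_EVEX_VL;
    }
    return out;
}
```

With GCC or Clang the inputs could presumably be obtained via `__get_cpuid_count` from `<cpuid.h>` and an `xgetbv` read; a Tremont-like CPU (GFNI but no AVX) would then report only `GFNI_SSE`.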

As to which encoding is used for a given intrinsic, that is the compiler's decision. I believe the compiler selects the encoding based on the target ISA, as specified on the command line or through attributes applied to the function being compiled.
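To illustrate the attribute route, here is a minimal sketch assuming GCC or Clang on x86. The same `_mm_gf2p8mul_epi8` intrinsic is wrapped in two functions with different `target` attributes; I would expect the first to be compiled to the legacy SSE encoding and the second to the VEX encoding, but I have not confirmed that for every compiler version. The function names and the scalar fallback are my own additions for the example.

```c
#include <stdint.h>

/* Portable reference for one byte: carry-less multiply reduced modulo
   x^8 + x^4 + x^3 + x + 1 (0x11B), the polynomial used by GF2P8MULB. */
static uint8_t gf2p8mul_scalar(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    while (b) {
        if (b & 1)
            r ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0));
        b >>= 1;
    }
    return r;
}

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_GFNI_PATHS 1

/* Only GFNI enabled: expected to use the legacy SSE encoding. */
__attribute__((target("gfni,sse2")))
static void mul16_gfni_sse(uint8_t *out, const uint8_t *a, const uint8_t *b)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_gf2p8mul_epi8(va, vb));
}

/* GFNI + AVX enabled: expected to use the VEX encoding. */
__attribute__((target("gfni,avx")))
static void mul16_gfni_vex(uint8_t *out, const uint8_t *a, const uint8_t *b)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_gf2p8mul_epi8(va, vb));
}
#endif

/* Multiply 16 byte pairs in GF(2^8), picking a path at run time. */
void gf2p8mul16(uint8_t out[16], const uint8_t a[16], const uint8_t b[16])
{
#ifdef HAVE_GFNI_PATHS
    if (__builtin_cpu_supports("gfni")) {
        if (__builtin_cpu_supports("avx"))
            mul16_gfni_vex(out, a, b);
        else
            mul16_gfni_sse(out, a, b);
        return;
    }
#endif
    for (int i = 0; i < 16; i++)
        out[i] = gf2p8mul_scalar(a[i], b[i]);
}
```

Compiling the whole file with `-mgfni` or `-mgfni -mavx` instead of using attributes should have the same effect on encoding selection, at the cost of losing the run-time fallback.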

 
